Thursday, November 08, 2007

Applications or data?

When I look at Facebook and the huge number of applications I can add to my profile, I tend to get lost. Not on my own profile page, which has a fairly limited number of applications, but when I see other pages with 50 or more applications added. What is this good for? Far from wanting to judge other people by the kind of applications they add (e. g. how active is your sex life), I can only speak for myself. Doing so, I would say that I am focused on those applications which display my interests (music I listen to, books I read, reviews I wrote elsewhere, postings). At least, this is what I would consider the personal benefit for other people looking at my profile. They will probably not care as much about the services I use, but rather about the output they produce. (On a side note, I really do not know how many applications I am missing because there are just too many. But this also applies to the desktop computer world.) But is Facebook really the social network operating system?

Another approach is Google's OpenSocial initiative, a set of APIs to integrate multiple social services. Personally, I am registered with so many services that I would rather manage my profile and network at one central place. But this would mean not only access to the data, but also their full integration - which OpenSocial does not seem to support. Nevertheless, I hope to be proven wrong, so that developers will be able to implement applications that integrate data from multiple sources and services - applications of true benefit to the user, as they can take my full context into account.

Monday, October 22, 2007

People networks - where to register?

So-called social services are as abundant as never before: every week, new services appear with the basic idea of connecting people. As it seems difficult to know where to register (I just joined my third virtual media shelf and am starting to lose track of how many services I am registered with at all), I would like to propose the following characterization:

  • Media-based recommendation networks - aggregate books, music and videos you own or like in order to find people with similar media consumption preferences, and have the service recommend other media items or people with similar (media) preferences. Examples are librarything, shelfmates and moviepilot (in German), to name only a few
  • Sequential media recommendation services - specify a preferred media item and have the service recommend similar items to you, such as Pandora
  • Location-based networks - share ratings of localizable entities (e. g. shops, restaurants, monuments, museums) and use the service as a kind of collaborative tourist guide. These services are often available in web-based and / or mobile versions, and some of them include automatic localization. Examples are Qype, qiro and townster.
  • Offer-and-demand-based networks - share your professional experience, your hobbies or your needs in order to find what you're looking for, e. g. a new job, a relationship, a professional (e. g. craftsman) to get a job done, etc. Examples include Xing, LinkedIn, MyHammer (in German), Friendster, not to mention the zillions of dating platforms.
  • (Micro)publishing sites, such as weblog hosting services, twitter or jaiku

I am not sure where resource aggregation services for photos, videos or bookmarks fit into this classification, as the social network and recommendation factor does not seem to be prominent with regard to services such as flick.r and YouTube.

Also, as this is only a first attempt, I do welcome comments geared towards expanding this classification. More specifically, does anyone know of other attempts to classify all these fancy services that claim to be Web 2.0?

Thursday, October 04, 2007

Finding what you're searching for

I do not have precise numbers regarding how much the amount of information on the World Wide Web has been increasing in the past couple of years, but from my searches I get the impression that the same information is available in an exceedingly high number of copies (so to speak), be it news, frequently asked questions, or product information, just to name a few.

Every now and then, the magic phrase natural language search pops up, producing about 104 million hits on Google if I just type in the words, and still 226,000 hits if put in quotation marks - which is way too much to handle. Others have written about the topic before, so I am not going to repeat what has been said; the question is what can be done to find information in something that seems like a huge haystack of Web pages.

Why is everyone using Google? Because it actually does quite well, and I can confirm that I mostly do not have to go beyond the first couple of result pages to find the information I want. If it does not appear, either my search turns out to have been too unspecific, or the information is not available at all.

Enter startup companies such as Powerset that claim to revolutionize search. I doubt that, given that it is hard to extract any semantics from most searches, which contain only about three significant search terms or fewer. I would assume that natural language search may be able to yield decent results, but is the benefit (from the point of view of the users) really as significant as claimed? I am not so sure about this.

The problem is not that our search technologies are not good enough; it is that there is too much information to search within. So I suspect that the future lies in dedicated search engines for specific domains (e. g. news) rather than in a new universal search engine.

Friday, September 07, 2007

The Future of (IP)TV

The start of internet-based TV (IPTV) is often claimed to be as much of a step forward as the introduction of color television. The most significant change, from a consumer's point of view, is the potential use of a backchannel, turning a former broadcasting device into an interactive media center.

Two alternative approaches are known: on the one hand, set-top-box based delivery; on the other, P2P-based interactive television. The former seems a suitable way for telecommunication providers to sell high-speed broadband connections; the latter is yet another attempt at bringing P2P platforms to a wider clientele, with competitors such as Joost, Babelgum or Zattoo. (A posting comparing these three P2P platforms can be found at ReadWriteWeb.)

Perhaps it is too early to say who will win the competition in the long run, but the following factors seem to be essential for IPTV, whether P2P or not:

  • Attractive and high-quality content: In order to substitute and / or expand regular TV, partnerships with both traditional broadcasting houses and niche providers of video content are a prerequisite for offering a decent selection of (streamed) media. This is also a great opportunity for professional content producers. However, the main focus should not be user-generated content (as on YouTube), although this may be offered as an addition.
  • Audio and video (technical) quality: This should be considerably superior to the PAL or NTSC standards, otherwise there is no point for end users in giving up conventional TV.
  • Extensible widgets on the software that delivers content and functions to the customer: Like Facebook, which opened its interfaces to third-party application providers, additional functions that enable interactivity may help turn TV into a collaborative experience. More precisely, this is essential for any kind of personalized content delivery that suggests specific programs based on past viewing or permits user-triggered suggestions (e. g. forwarding a program to specific user groups as a suggestion). It also permits IPTV providers to focus on developing their core platform while remaining open for future development.
  • Integration of external information sources (such as news portals, weblogs, discussion boards) via RSS feeds, with the option of filtering the currently delivered feed against the characteristics of the currently delivered media stream - call it personalized aggregation of multimedia channels
  • Intelligent filtering and forecast: The more diverse the delivered content (e. g. number of broadcasting channels), the more efficient the filtering and search mechanisms need to be. While personal recommendations require some kind of user profiling, quick access to all available content needs to be ensured via an efficient combination of search and navigation techniques.
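The feed-filtering idea from the RSS bullet above can be sketched in a few lines. This is purely illustrative: the data model and the simple keyword matching are my own assumptions, not how any actual IPTV platform works.

```python
# Hypothetical sketch: keep only the feed items that relate to the metadata
# of the currently delivered media stream. All field names are invented.

def filter_feed(feed_items, stream_metadata):
    """Keep feed items sharing at least one keyword with the current stream."""
    stream_keywords = {kw.lower() for kw in stream_metadata.get("keywords", [])}
    matches = []
    for item in feed_items:
        item_words = {w.lower().strip(".,") for w in item["title"].split()}
        if item_words & stream_keywords:
            matches.append(item)
    return matches

feed = [
    {"title": "Election results discussed live"},
    {"title": "New cooking show announced"},
]
stream = {"channel": "News 24", "keywords": ["election", "politics"]}

print([i["title"] for i in filter_feed(feed, stream)])
# ['Election results discussed live']
```

A real system would of course match on richer metadata (genre, named entities, program IDs) rather than bare title words.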

IPTV has the potential not only to substitute broadcast TV, but may also offer media distribution to a broader public (such as video on demand) and turn uni-directional viewing into a communicative experience. However, there is still quite a way to go before it becomes a true competitor in the mass markets.

Wednesday, September 05, 2007

Local portals: bridging the gap between reality and virtuality

How do you find special offers in your town? You either know which store you want to go to (either in reality or via their web page), or you just go and see (trial and error) - but where can you find specific products or services? For products, there is eBay, where you can look for goods offered via auctions or instant purchases, assuming the provider offers some means of shipping to the customer. For services, there are auctioning platforms such as MyHammer, bringing demand and providers together. But what about the real world, e. g. when you need a product that you would like to buy locally and want the best value for your money (which does not necessarily mean the lowest price)? Local platforms such as Qype may help in finding opinions about local places, but although it is possible for stores and service providers to enter their own information, they seem reluctant to do so. Perhaps this is due to lack of time, or fear of being bashed - I am not sure. My feeling is that these platforms can help raise awareness of local product or service providers, but their task does not seem to be answering specific demands.

Entrepreneurs of small businesses often have neither the time nor the money to invest in their own web presence. For them, a framework where they can easily place their products, specific offers and background information can be helpful. On the other hand, before investing their time (and possibly money), they need to be sure that such a platform will not only help them improve their business (i. e. more customers), but also that they can trust in the platform's persistence and reliability.

One example in Germany targeted at just that is CityPedia (not to be confused with the British platform bearing the same name). Of course, time will tell whether it succeeds in attracting businesses and private users alike, but I am quite confident that its founders will succeed, provided they manage to emphasize the benefits and potential of their platform for both businesses and end users.

Friday, August 31, 2007

Local portals - what's the benefit?

If you look for places related to a city or town (e. g. sights to see, restaurants, shops, businesses), there are basically three approaches: you use your favorite search engine and click through the result lists (trial and error), you know their website and load their own presentation, or you use a service driven by user-generated content (I deliberately refuse to use the term Web 2.0), such as Qype, to see whether some user has contributed something about that place.

The first approach will possibly lead to a small number of pages with some helpful background information, but the majority will just list the address and possibly a link to the homepage (if available). The second approach only works for the small fraction of companies that have the money or manpower for their own web presence (and you don't really know whether it's trustworthy information or more like biased advertising). Finally, the third approach depends on whether other users have contributed to that location or not.

My personal strategy is (in that order) 3, 1 (for additional information I may have missed) and 2 (for restaurants, I would like to see their menu before actually going there). To me, the disadvantage is that I have to look in different places, especially relating to option 1, and that information is often duplicated (even if the wording is different). Instead, I would like to have the information I am looking for aggregated in one single place.

For entrepreneurs thinking of establishing yet another local information page, I suppose the key to success is a clear understanding of what kind of information they would like to offer, and in what context they would like to see their offering relative to other services. While I understand that competition is helpful for some time (taking the example of location-based information services), it does not really help the user to have to navigate through several offerings that depend on user content and share more or less the same functions.

In summary, assuming there is some truth behind the buzz phrase that content and context are king, it has to be made really clear to investors, shareholders and end users what this actually means. For instance, one argument could be reliability and trustworthiness. Second, any information service related to the real world should provide a link between virtuality and reality that offers added value. If your users access your service on a regular basis, this may be an indication that your proposition is working.

Thursday, July 19, 2007

Mobile Tagging

Apparently a white paper on mobile tagging was just published, but somehow it is currently not available. The term seems new, and somewhat confusing to me, because tagging usually means placing a tag (i. e. a word) to describe a resource (most often, a piece of text). Mobile tagging, on the other hand, means decoding a barcode (either EAN13 or 2D) with a mobile phone and using this ID to access information regarding the "tagged" resource.

Well, the first benefit seems obvious: improved user interaction, as everyone knows how tedious it is to enter URLs on a mobile device. The second relates to bridging the divide between the real and the virtual world: any resource in real life can be associated with a tag, and any function can be associated with it that takes the resource's ID and performs an action - retrieving information, placing an eBay auction, buying a concert ticket or initiating a media download. The possibilities seem endless.
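As a toy illustration of this dispatch idea, here is a minimal sketch. The code, the registry and the actions are all invented for the example; a real mobile-tagging service would resolve IDs against a backend rather than an in-memory dictionary.

```python
# Hypothetical sketch: map a decoded barcode ID to an action.

actions = {}  # decoded code -> callable performing the associated action

def register(code, action):
    """Associate a barcode ID with a function to run when it is scanned."""
    actions[code] = action

def scan(code):
    """Dispatch a decoded barcode to its registered action, if any."""
    handler = actions.get(code)
    return handler() if handler else "unknown code"

# Invented examples: an EAN-13 product code and a ticketing shortcut.
register("4006381333931", lambda: "product info: stationery")
register("TICKET-2007", lambda: "opening ticket purchase page")

print(scan("4006381333931"))  # product info: stationery
print(scan("0000000000000"))  # unknown code
```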

Wednesday, July 18, 2007

Exodus from Second Life

I should start these thoughts by admitting that I have never been a user of LindenLab's Second Life platform, nor do I intend to use this service. Why? Simply because I have the impression that it is all about a virtual environment you can create, without an equivalent in the real world. Already some time ago, there were comments about the platform becoming increasingly unstable. It has been observed that computer manufacturer Dell and hotel chain Starwood have already left their virtual islands due to a lack of visitors. But honestly, what could be the reason for me to visit a company's virtual alter ego? If it's about an Internet-based service, I would probably have a look at their web presence, but what additional use can a virtual island have? I honestly have no explanation.

When it comes to community building and information / experience sharing, virtual communities are a great means of communication between individual end users. But when it gets too commercial, this may mean the beginning of an exodus, as seen with Second Life. Of course, that's a personal opinion, but if I want to buy something I will probably go to the relevant internet shops or auction platforms.

The remaining question, then, is whether the users are going to come back. In June, the number of active users decreased by 2.5 percent; out of 8 million registered users, only 40,000 are active at peak times. As I read it, the companies' exodus was a consequence of the decreasing number of users. Thus, if this trend continues, more companies are going to leave their virtual residences.

Another way to perceive this, however, is that the companies' virtual representation was not attractive enough for end users. But is this really the case? I don't know, but would welcome any further thoughts on this.

Bottom line, however, is the challenge to link the real to the virtual world in order to achieve a blended experience for end users.

Monday, July 02, 2007

Cooperative Tags

Tags are a great facility for assigning meaning to contributions. They are very useful when it comes to personal information management. However, they are problematic for collaborative information management - every user will generate different tags, relative to their experience, intentions, etc.

Much has been written about the issue of folksonomies, taxonomies and tags. I do understand that taxonomies are not considered very user-friendly - but on the other hand, the evolution of folksonomies does not seem to be very goal-directed. Which goal, you may ask? To help find information based on specific concepts that may come to mind.

The most well-known approach is to assign user-specific labels to a resource, or a chunk of information. For very popular resources, you may end up with lots of different tags, while other resources may end up with only the tags its author had thought of.

Today I discovered an interesting approach to keeping folksonomies tidy by adding a means of rating them (seen on MovieLens). If the appropriateness of a tag is rated (on a scale between 0 and 1) by a sufficiently large number of users, the so-called "wisdom of the crowds" should lead to an improvement in the supplied tags. As tags are always relative to a tagged entity, the other question that should be addressed is how to appropriately monitor tag evolution. Should inappropriate tags (i. e. those whose rating relative to a resource does not exceed a given threshold) be removed automatically? Should tags considered very useful be added to a tag dictionary?
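The rating-and-threshold scheme could look roughly like this in code. The 0.5 cutoff and the data layout are assumptions made up for illustration, not how MovieLens actually does it:

```python
# Sketch of crowd-rated tags: keep a tag for a resource only if its average
# rating (scale 0 to 1) meets a threshold. All data here is invented.

from collections import defaultdict

THRESHOLD = 0.5  # hypothetical cutoff for keeping a tag

def aggregate_tags(ratings):
    """ratings: (resource, tag, score) triples; returns kept tags per resource."""
    scores = defaultdict(list)
    for resource, tag, score in ratings:
        scores[(resource, tag)].append(score)
    kept = defaultdict(list)
    for (resource, tag), values in scores.items():
        if sum(values) / len(values) >= THRESHOLD:
            kept[resource].append(tag)
    return dict(kept)

ratings = [
    ("movie-42", "noir", 0.9),
    ("movie-42", "noir", 0.8),
    ("movie-42", "boring", 0.1),
]
print(aggregate_tags(ratings))  # {'movie-42': ['noir']}
```

A removal threshold like this answers the first question above mechanically; the "tag dictionary" question would need an additional, higher threshold.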

Last, but not least: when combining taxonomies and folksonomies, what should be done to relate them, e. g. should there be associated (recommended) tags for a term in the taxonomy?

As I am surely not the first to raise these questions, I would welcome any feedback on this issue.

Thursday, June 28, 2007

Keeping track of community services

The current hype is communities with user-generated content - ranging from pure weblogs with their blogrolls to media sharing sites (flick.r, YouTube), bookmarking services (Mr. Wong), recommendation sites (Qype, DaWanda), genealogy sites and others. (I'm sure someone out there has a more comprehensive overview.) Some of these services intentionally try to look flashy and innovative (especially the ones implemented in Flash). But how do I really keep track of all the services I am subscribed to (and I am not even asking where to find the time to take care of all this)?

Enter Facebook, called a social utility by its creators (for more information, look at Mashable, TechCrunch and Wikipedia). Actually, it's a kind of service aggregator plus social network, which may help in structuring one's own set of subscribed services.

Like most of the cited services, this one also lives on advertisements - but I am not sure what the consequences of alternative ad revenue models such as pay-per-action will be. While it seems relatively easy to find potential investors for services that label themselves Web 2.0, only time will tell whether the revenues will be sufficient in the long run, especially with many competitors around.

Thursday, June 21, 2007

Ratings and Trust

Some online communities offer rewards, e. g. points that one can collect (associated with specific actions like writing a contribution, sharing one's knowledge, or the number of contacts). Assuming that everyone acts according to fair principles, there seems to be no problem with that. However, taking the example of rating other people's contributions, it is not uncommon for people to create additional accounts and rate their own contributions from there - under another user account that belongs to them, which the platform is not aware of.

There seem to be several approaches to cope with this issue:

  • Require a full address upon registration, together with a phone number, in order to check it against phone listings. This has the disadvantage that not every potential user may be listed in a given directory (e. g. students sharing an apartment, where one phone is shared among several people)
  • Require first and last names at registration (with the possibility of choosing a nickname for users who do not want to reveal their identity to the general public).
  • Require passport or ID card upon registration. This requires a mechanism to verify users given their passport number.
  • Only allow one account per email address. Of course, I could generate a large number of email addresses to circumvent this, but it may still help
  • Require user photographs for any active account. But then, how can one verify that a photo really shows the user, and not some other person?
  • Allow only active accounts, where activity relates to productive actions (such as writing a contribution). That is, remove accounts whose users have not shown any social interaction with their peers for a given time (e. g. a week, a month)
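The last idea in the list, pruning accounts without recent productive actions, could be sketched like this. The 30-day window and the data layout are arbitrary illustrative choices, not a recommendation:

```python
# Sketch: flag accounts whose last productive action is older than a window.

from datetime import datetime, timedelta

def inactive_accounts(last_action, now, max_idle_days=30):
    """Return account names idle for longer than max_idle_days."""
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(name for name, ts in last_action.items() if ts < cutoff)

now = datetime(2007, 6, 21)
last_action = {
    "alice": datetime(2007, 6, 20),  # active yesterday
    "bob": datetime(2007, 4, 1),     # idle for months
}
print(inactive_accounts(last_action, now))  # ['bob']
```

In practice one would probably warn such users before removal, and count only "productive" actions (contributions, ratings) rather than mere logins.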

As I am only starting to think about valid mechanisms to ensure a community of trust, I welcome any ideas that expand on my own thoughts.

Thursday, June 14, 2007

More semantics for social networks!

Profiles are difficult to generate (you mostly need metadata), and profiles are difficult to match (it takes a substantial amount of computing power). That is why we see a large variety of services on the Web that are based on communities, such as Qype. So far, it has been German only, but today Qype has its UK launch party, so I'm quite excited to see how this interactive city magazine will evolve. Recommendations in Qype and elsewhere are then based on who your friends or acquaintances are. That is, if you trust someone to write good reviews and add that person to your list, you are regularly updated on what that person writes about.

What's even better is that some services, including blogs and other regularly updated sites that you like reading, have RSS feeds, which you can nicely feed into twitter - and then you get a mixture of interesting contributions (based on your "profile") on your mobile phone, wherever you are.

Besides these technical issues, what I learnt is that if you base your service on virtual communities, they need some real-world equivalent on the one hand (i. e. meeting the people, or at least some of them, whose contributions you like, in real life). On the other hand, you need to take care of your community by giving them some motivation to stay tuned.

These are all nice experiences, but what happens if your personal network grows too big? Then you probably need some more semantics to decide which contributions to feed to you first. But I am only starting to think about all this, so any thoughts are welcome!

Monday, June 04, 2007

Geotagged media

Well, the association of resources with geocoordinates, also known as geotagging, is not new. But one of the reasons for me to write about Panoramio, a picture-sharing platform where photos are associated with locations, is its recent announcement of being taken over by Google (after the two had already cooperated closely). While Google has not yet revealed how the service will be integrated, it makes a lot of sense, also from a user perspective, to make Google Maps a more personalized experience. Looking at the world in hybrid mode (satellite pictures) is nice, but they're not up to date. And considering the effort Google is putting into photographing the world (well, at least some cities, as it seems), why not take advantage of the Google community taking digital pictures and uploading them for other users to look at?

Of course it's all about gathering user-related information, and I assume it will remain one of Google's best-kept secrets what they will actually do with all this data. With Google Mail, Google Documents and now something that might be called Google Media, user context (time, location, interest) becomes as enriched as you could possibly imagine. But hey, you are not forced to use any of these services, right?

Anyway, if pictures can be geotagged, any other kind of resource will do as well. Videos, mash-ups, "office" documents, newspaper articles, podcasts, blogs - anything that may exist in digital form. And if you think mobile, you could receive all these geotagged resources while walking by some marked location, or instantly leave your own photographs or videos right after shooting them.
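To make the "walking by a marked location" scenario concrete, here is a minimal sketch that finds geotagged resources near a given position using the standard haversine formula. The data, field names and the 1 km radius are invented for illustration:

```python
# Sketch: proximity lookup over geotagged resources (invented example data).

from math import radians, sin, cos, asin, sqrt

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points via the haversine formula."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))  # Earth radius ~6371 km

def nearby(resources, lat, lon, radius_km=1.0):
    """Names of resources within radius_km of the given position."""
    return [r["name"] for r in resources
            if distance_km(lat, lon, r["lat"], r["lon"]) <= radius_km]

photos = [
    {"name": "brandenburg-gate.jpg", "lat": 52.5163, "lon": 13.3777},
    {"name": "eiffel-tower.jpg", "lat": 48.8584, "lon": 2.2945},
]
print(nearby(photos, 52.52, 13.38))  # ['brandenburg-gate.jpg']
```

A mobile client would run such a query continuously against the user's current position; a real service would use a spatial index instead of scanning every resource.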

Oh, I forgot about Orkut. It may be the foundation for sharing digital resources with the persons you directly relate to, as an alternative to writing your friends a postcard or SMS from your vacation.

I am sure this is only the tip of the iceberg - many more possible scenarios I have not been thinking of yet ...

Wednesday, May 30, 2007

ReCaptcha, digital libraries and OCR

Over 10 years ago, digital libraries were a hot research topic. Back then, I participated in a number of projects, which finally led to my dissertation. By now, many attempts to make books available in digital form are known. For those sources which are not available in digital format, the only way seems to be OCR. Unfortunately, it is subject to errors, which cannot always be corrected automatically.

Enter ReCaptcha, a collaborative approach that helps protect web sites from spam (comparable to a Captcha, but with real words instead of just a bunch of characters). The idea is to present two words to the user: one of them was correctly identified via OCR, the other one produced an error. Assuming that someone who correctly identifies the first word will also produce a correct identification of the second, the set of words incorrectly identified via OCR can be reduced dramatically - as a side effect of the Captcha's original purpose. Here's a demonstration of how ReCaptchas work.
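The pairing logic can be sketched as follows. This is my own rough reconstruction of the idea for illustration, not ReCaptcha's actual implementation, and all names and data are invented:

```python
# Sketch: a user who solves the known "control" word casts a vote for their
# transcription of the unknown word; failed controls discard the vote.

from collections import Counter, defaultdict

votes = defaultdict(Counter)  # unknown word id -> transcription counts

def submit(control_word, control_answer, unknown_id, unknown_answer):
    """Accept the unknown-word transcription only if the control word passes."""
    if control_answer.strip().lower() != control_word.lower():
        return False  # control failed: likely a bot, discard the vote
    votes[unknown_id][unknown_answer.strip().lower()] += 1
    return True

submit("upon", "upon", "scan-17", "morning")
submit("upon", "upon", "scan-17", "morning")
submit("upon", "xyz", "scan-17", "evening")  # rejected: control failed

print(votes["scan-17"].most_common(1))  # [('morning', 2)]
```

Once enough agreeing votes accumulate for an unknown word, its transcription can be accepted and fed back into the digitized text.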

And for those who would like to use ReCaptchas, Google Code offers plugins and libraries for the reCAPTCHA API. Well done!

Monday, April 16, 2007

Personal recommendations

Regarding recommender systems, the most familiar ones are related to audio content, i. e. personal radio. For instance, there's Pandora, which works on the basis of manually generated metadata and also explains why a title was recommended. On the other hand, there's the service subtitled the social music revolution, which apparently uses some kind of collaborative filtering (a similar approach is already familiar from Amazon). In both cases, the service "learns" from user ratings to better serve the end user. Other recommendation approaches rely on communities or content analysis.
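To illustrate what "some kind of collaborative filtering" might mean at its simplest, here is a toy overlap-based recommender. This is a generic sketch with invented data, not how any of the mentioned services actually compute their recommendations:

```python
# Sketch: recommend artists heard by users whose listening overlaps with ours.

def recommend(listens, user):
    """Suggest artists heard by users sharing at least one artist with `user`."""
    mine = listens[user]
    suggestions = set()
    for other, theirs in listens.items():
        if other != user and mine & theirs:   # any shared artist at all
            suggestions |= theirs - mine      # take what they have and we lack
    return sorted(suggestions)

listens = {
    "anna": {"Radiohead", "Portishead"},
    "ben": {"Radiohead", "Massive Attack"},
    "carl": {"Mozart"},
}
print(recommend(listens, "anna"))  # ['Massive Attack']
```

Real systems weight neighbors by similarity and rating instead of treating any overlap equally, but the "people like you liked X" core is the same.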

Now there may be several criteria for the popularity of recommendation-based systems, but those that are really built on such a feature (unlike Amazon, which uses it as an additional feature to better serve its customers) seem to depend on two core issues: the quality of their recommendations as experienced by the end user, and the effort required to handle available content (e. g. metadata management).

If IP-based entertainment services are to succeed - compared to good old radio or television - personalization seems to be a must. I am sure that broadcasting companies, many of which are already working on providing their programs in digital archives, will understand this as an added value and, possibly, an opportunity to generate additional business.

Wednesday, March 21, 2007

IPTV - open or closed?

While some companies, such as T-Online, rely on Microsoft's IPTV platform, the foundation of the Open IPTV Forum - whose founding members are AT&T Inc., Ericsson, France Telecom, Panasonic, Philips, Samsung, Siemens Networks, Sony and Telecom Italia - was announced last Monday. An excerpt from the press release reads:

The forum (...) will focus on development of open standards that could help to streamline and accelerate deployments of IPTV technologies

Well, I'm sure all participating companies have their own interests, so I hope there will be a common goal (a bit more precise than what we read here). One thing seems certain to me, though: IPTV has to offer added value compared to "ordinary" television - e. g. contextual delivery of visual content on both fixed and mobile devices, and all that at affordable cost. Which means that advertisements will play a major role, perhaps depending on how much end users are willing to invest.

The Open IPTV Forum plans to establish requirements and architecture specifications as well as protocol specifications later in 2007.

This could well be December 2007 - if taken at their word. However, I hope we'll hear something more concrete and official a bit earlier. My guess is that live broadcasting will be reduced to events where the time context is crucial (e. g. sports, news). For other programs, IPTV will be more like filtered access to archived programs (movies, documentaries etc.). At least it would be an advantage for me to have the choice of watching a program depending on whether I have the time to do so - or else leave it for later. But hey, that's only my very personal opinion.

Wednesday, March 14, 2007

David vs. Google

There's some exciting news on search engines. Powerset is a startup focused on natural language search technology and associated with the Palo Alto Research Center, Inc. (PARC). Its founder, Dr. Barney Pell, has assembled a team of excellent scientists who try to break Google's dominance by allowing users to type in whole sentences (as you would ask questions), which are supposed to lead to much better results compared to pure keyword searches. Some more background on the deal with PARC can be found over there.

Seems like a lot of hype is going on right now, and whether this is really a breakthrough I do not know. They're not the first to try natural language technology, that is for sure. Assuming the product really meets expectations, the next step may be to combine this search engine with voice recognition in order to enable mobile search that really works.

One collaborative scenario that comes to mind is the search for persons of interest beyond the limits of a predefined user community. There surely is a large potential, since almost anyone has their own homepage or weblog somewhere.

Tuesday, February 13, 2007

Trust in Social Networks

Many so-called collaborative services rely on networks of users who share the same interests or goals and contribute on a shared platform. By adding other users (and / or their respective sites) to one's own network, it is possible to find related users, following the friend-of-a-friend principle.

As the number of "friends" grows, merely linking to other users does not seem sufficient, as it is not expressive enough. If one models the relations to other users as edges between nodes, it is desirable to be able to assign meaning to these relations. A straightforward way to achieve this is by assigning trust levels to other users. As this trust is related to some context (i. e. I might trust someone to give me good recommendations on where to go out, but I might not trust this person as much when it comes to good movies), the concept of trust needs semantic indexing, which can be done via tags.

Thus, by extending the notion of so-called social networks with weighted semantics, communities in the virtual world become much more helpful, as it is possible to find users not only on the basis of what they say about themselves, but also in relation to how other users perceive them. By aggregating the typed relations for a given user, it is then possible to express how this user is perceived in a given community of many participants.
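To make the idea concrete, here is a minimal sketch in Python of a social graph whose edges carry tagged trust levels, with perceived trust aggregated per tag. All names, the 0-to-1 trust scale, and the use of a plain average are illustrative assumptions, not any real service's design:

```python
from collections import defaultdict

class TrustNetwork:
    """Social graph whose directed edges carry tagged trust levels."""

    def __init__(self):
        # edges[(from_user, to_user)] -> {tag: trust level in [0.0, 1.0]}
        self.edges = defaultdict(dict)

    def add_trust(self, from_user, to_user, tag, level):
        """Record that from_user trusts to_user for a given context tag."""
        self.edges[(from_user, to_user)][tag] = level

    def perceived_trust(self, user, tag):
        """Aggregate how the community perceives `user` for a tag by
        averaging all incoming trust ratings carrying that tag."""
        ratings = [tags[tag]
                   for (_, to_user), tags in self.edges.items()
                   if to_user == user and tag in tags]
        return sum(ratings) / len(ratings) if ratings else None

net = TrustNetwork()
net.add_trust("alice", "bob", "movies", 0.9)
net.add_trust("carol", "bob", "movies", 0.7)
net.add_trust("alice", "bob", "restaurants", 0.2)
print(net.perceived_trust("bob", "movies"))  # average of alice's and carol's ratings
```

The per-tag aggregation captures exactly the point above: Bob may be perceived as very trustworthy on movies while hardly trusted on restaurant tips.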

I am interested in sharing thoughts and ideas about this topic, as it seems very relevant both in personal as well in professional networks.

Wednesday, January 31, 2007

YouTube and TV

When comparing the usage of (traditional) TV and YouTube, Harris Interactive observes that one third of frequent YouTube users are watching less TV in order to watch videos online. On the other hand, digital video recording, combined with hard-disk storage, allows end users to become more and more time-independent when it comes to broadcast programs. Personalization of electronic program guides and online video recording, as well as IPTV and triple-play offers, will eventually cause internet and broadcast services to merge. Time dependency seems to be relevant only for events captured live and, perhaps, news.

There is still some distinction between the type of content YouTube has to offer (mostly user generated content) and the broadcast and video world. However, as Google is expanding their collaboration with music labels and broadcasting companies (involving a share of ad revenues), we may see a further decline in TV usage among younger users.

Interestingly enough, users seem to strongly vote against the idea of airing ads before the actual video. As YouTube usage is greatest among the group hardest to reach through TV advertising, the question is how to monetize video display in the long run.

On the other hand, TV channels are expanding into delivering previously broadcast content online, in an attempt to reach the part of the population that is unlikely to spend its free time watching TV. I am curious who will win in the long run when it comes to collaboration between online services, telecommunication providers and content producers.

Thursday, January 18, 2007

Twitter - who needs that?

Sending SMS to friends is fine, but who would want to tell Twitter what they are currently doing (limited to 160 characters)? This seems like one of the craziest collaborative services I have come across recently. Or maybe I'm getting old.

Instead, I just signed up for beta-testing Joost, a client-based P2P entertainment service. Formerly, it was called The Venice Project - not after the Italian city of that name, but after the conference room where the idea for the service was born.

Wednesday, January 17, 2007

COMPASS as a multimodal tourist guide

Many contextual services (and this means mobile as well) seem to be the result of research projects. One example is COMPASS 2008, which aims to support non-Chinese-speaking tourists who come to see the Olympic Games in Beijing. Core ideas seem to be profile-based recommendations and on-the-fly translation on a mobile device. According to Professor Wolfgang Wahlster from the German Research Center for Artificial Intelligence (in one of his presentations), a core requirement is that

human users should not be forced to adapt to the language of technology,
but the technology should adapt to the language of its human users.

Sounds ok to me. But do we really need machine translation for that? If, as a tourist, I need a couple of typical phrases in order to get around, I might as well buy a phrase book and manage just fine.

Another question is whether these prototypes are really easy for anyone to use. For instance, how much effort is involved in creating personal profiles? Who is taking care of managing these multilingual ontologies? What is the business model behind this? How about latency (between a user query and a response)? Was a field test conducted with average users under real-world conditions? How about end-user acceptance?

These are just a couple of straightforward questions that need to be asked when considering a real-world use. Can all these wonderful ideas stand the test of reality?

If this sounds like me being skeptical about artificial intelligence, you may be right ...

Tuesday, January 09, 2007

What will be the future of news?

Lots of people are blogging about Daylife with mixed interpretations. As news aggregation seems to be a somewhat hot topic, while on the other hand I see little media convergence here, it might be worth taking a step back and re-thinking what an innovative news brokering service might look like. Let me start with a few observations.

  • News is fast (as the name already suggests). Old news is an anachronism, so one requirement to deduce is that the brokerage should be fast, reaching the potentially interested reader without delay
  • Content creation has changed from a world of identified news creators / authors (working for newspapers, magazines, TV stations etc.) to almost anyone commenting on most anything nowadays. This raises the question of who will judge the quality of a news contribution. In the old days this was identifiable by a news brand (e. g. USA Today, or Washington Post, or BBC) - on the Web this is not clear at all.
  • With a multiplication of news contributions from all over the world (including their visibility) on almost anything you could think of, the need for filtering arises. The easiest way is to combine news aggregation with search technology.
  • How to reach the masses. A lot of news services have a technology- or economy-oriented focus, as these topics are more likely to quickly gain a large readership that uses Web sources as its first approach to news (instead of buying newspapers or magazines).
  • Expansion of news publishers from print into other forms. Most news publishers started early to also post online a selection of the news articles appearing in their publications. In parallel, an expansion into online journalism began, so some news contributions appear only in online format. Last, the brand was also expanded into television, and many news contributors also produce their own magazines, focusing on specific topics of interest. So we see media diversification, but not necessarily media convergence.

Most of the news aggregators are into filtering, but not really into personalization. Thus, in the following, I would just like to list a few requirements that I consider important for the world of news in mixed formats.

  • Double localization. As a reader, I most likely have a relatively static location, i. e. where I live (and work). On the other hand, I may change my location (business or holiday trip). Thus, I need to be informed not only about what is going on where I live, but also where I currently am.
  • Focus on specific topics requires intelligent filtering, involving context. Topics that are of interest to me evolve in a mostly linear manner: new topics related to older ones that interest me will be added, while others of temporary interest only will fade away. One strong benefit the online world can provide is putting news articles into personal context by considering what I have read in the past (related to topic, sources etc.)
  • Communities are important. If I know what other people I know are interested in, I am able to suggest articles that I read to them. Likewise, recommendations from other readers connected to me can be valuable. Thus, adding a people-networking service to news aggregation is valuable - if semantic indexing of relations is available (e. g. person X I am related to is known to give good pointers to sports-related news). Also, rating mechanisms might be valuable.
  • Aggregation of news for an overview on a topic. If I start to get interested in a topic, I might be interested in reading a number of more general articles first, before reading specialized articles later. Aggregation of articles sharing the same topic plus filtering and linear ordering, combined with an editorial selection of articles to be showcased, may be an additional value.
  • RSS for specific topics. While RSS is mostly tied to a single site where headlines are aggregated, for some more specific topics it might be worth being informed (e. g. via email) through a kind of personal newsletter.

I am not sure whether annotation is the right thing for news services. First, it is common to blog on topics that raise interest (as I am doing now), which already provides a way to annotate them. Second, reading annotations of contributions is certainly interesting to see what others have to say, but it is time-consuming. Thus, I am leaving this issue aside for now and welcome any feedback on the ideas mentioned above.

Wednesday, January 03, 2007


Just found out about a new collaborative service named newstube, which appears to be some kind of digg clone. The idea is to contribute links to IT-related news articles along with a short comment. These are first added to a queue, which they exit, to be shown on the front page, as soon as they have received at least five favourable votes.
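The mechanism as described is simple enough to sketch in Python; the class name, the threshold constant, and the in-memory data structures are my own assumptions, not newstube's actual implementation:

```python
class NewsQueue:
    """Submissions wait in a queue and are promoted to the front page
    once they collect the required number of favourable votes."""

    PROMOTE_AT = 5  # five favourable votes, as described above

    def __init__(self):
        self.queue = {}       # link -> favourable vote count so far
        self.frontpage = []   # promoted links, in promotion order

    def submit(self, link):
        """Add a contributed link to the queue with zero votes."""
        self.queue.setdefault(link, 0)

    def vote(self, link):
        """Register a favourable vote; promote the link at the threshold."""
        if link not in self.queue:
            return  # ignore votes for unknown or already-promoted links
        self.queue[link] += 1
        if self.queue[link] >= self.PROMOTE_AT:
            del self.queue[link]
            self.frontpage.append(link)

q = NewsQueue()
q.submit("http://example.org/story")
for _ in range(5):
    q.vote("http://example.org/story")
print(q.frontpage)  # ['http://example.org/story']
```

Note that promotion is one-way here: once a link reaches the front page, further votes are ignored, which keeps the moderation logic to the submission queue.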

Sounds like a pretty decent idea for preventing junk messages from flooding the officially displayed news. But I wonder who would take the time to actually rate the news they read? Usually, when I find something interesting, I take some time to write a post for my weblog, which is first of all a way to remember interesting things. If other people stumble across that and benefit from what I write, even better.

The other problem is that just about anything which can be referenced via a link and has to do with computers (well, most anything has to do with computers nowadays) could be considered news. Is there anyone who will check whether the contributed links are really news? Hopefully yes, but until then let's wait and see.

And of course, they're not the first to come up with such a service, not even in the German-speaking market - yigg has been around for somewhat longer. Time will tell whether they're going to make it or not.

My personal opinion is that Google will try to expand their news gathering service with enhanced aggregation and personalization features, either on their own or by continuing to buy innovative service platforms.