A piece of advice for news site operators: invest money in a real recommendation engine, whether tag-based, social, or even semantic filtering. Readers will stay longer on your site, increasing the value of their visits. On average, major news sites don't get more than 3 to 4 pages per visit. Sadly, those who manage to go well above those numbers rely on lame tricks such as increasing the page auto-refresh rate or stuffing their site with click-intensive items: online games or slide shows.

This is not an effective tactic. When you artificially increase the page view rate, your visitor spends less time on each page. With less time spent on a given page, the result is lower click rates on advertising banners and, therefore, less revenue. A much better tactic is increasing the "real" number of page views per visit, that is, page views resulting from news content actually relevant to the visitor. As a result, ads will be presented in a much more favorable context or, alternatively, content will be more likely to be paid for.

Don’t tell me what, tell me how... Simple answer: deploy recommendation engines comparable to those already working successfully on e-commerce or entertainment sites.

Take Amazon, perhaps the most successful example. I have left many bread crumbs on Amazon, especially on the US and UK sites. Amazon then sends notifications and offers by email, among many similar ones from other online stores I did business with. I systematically open Amazon's emails simply because I know they are the most relevant to my interests. If I searched for books on architecture six months ago, Amazon will notify me when a new biography of Rem Koolhaas is released. Every time I visit the site, the page makes suggestions based on my previous visits, and displays catalog updates and comparable purchases from other customers. And if I put an item in my cart, Amazon proposes related bundles.

Another example: the music site Deezer.com. Starting with a single entry in the search engine, simply by bouncing from one artist to the other based on other people's taste, I've been able to unearth artists I was fond of twenty years ago. These are what I call efficient recommendation engines. Pleasant and profitable.

Hence the question: Why are online newspapers unable to provide a similar experience? Let’s look at a real situation. This week, for my news consumption, I've been following items such as the painful developments of the stimulus package in the United States; in France, the bailout of the car industry and the social unrest in the former French colonies; in space, the collision between an old Russian satellite and an American one. In most cases, I got no more than a lukewarm set of "related stories" that had been dug up and inserted manually, most of the time weakly relevant and unexciting.

Fact is: this system never gets close to taking advantage of the editorial depth of news sites that add dozens, if not hundreds, of items per day. Digging deeper into a topic involves manually searching through the site or jumping elsewhere for further research. (Since I often end up on Wikipedia, I donated €30 hoping its crew will keep caring for its quality and usefulness.)

A recommendation engine ought to be able to encode my interest in a subject into a vector. (OK, a list of criteria, dimensions, and numbers in front of each.) Did I just quickly scan the story? Did I spend six minutes reading it? Did I print it for later reading (that's an interesting piece of information, easy to gather)? Did I send the story to a friend? Did I share it on Facebook? Did I click on one or two related links? Did I bookmark the URL for later retrieval? Did I look for the same topic just once or three times in the week? There are many ways to automatically measure my interest in a subject.
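To make the "vector" idea concrete, here is a minimal sketch in Python. The signal names and their weights are my own assumptions, picked only for illustration; a real site would tune them against its own traffic data.

```python
from collections import defaultdict

# Hypothetical weights: how strongly each observed action suggests interest.
# These numbers are assumptions for illustration, not measured values.
SIGNAL_WEIGHTS = {
    "scan": 0.5,           # quick look at the story
    "long_read": 2.0,      # several minutes spent reading
    "print": 3.0,          # printed for later reading
    "email": 3.0,          # sent to a friend
    "share": 3.0,          # shared on a social network
    "related_click": 1.0,  # clicked a related link
    "bookmark": 2.5,       # bookmarked the URL
    "search": 1.5,         # searched for the topic
}


def interest_vector(events):
    """Turn (topic, signal) pairs into a {topic: score} profile."""
    vector = defaultdict(float)
    for topic, signal in events:
        # Unknown signals contribute nothing rather than raising an error.
        vector[topic] += SIGNAL_WEIGHTS.get(signal, 0.0)
    return dict(vector)


# One reader's week, reduced to a handful of observed actions.
events = [
    ("stimulus", "long_read"),
    ("stimulus", "share"),
    ("stimulus", "search"),
    ("satellites", "scan"),
    ("satellites", "related_click"),
]
profile = interest_vector(events)
print(profile)
```

Each dimension of the profile is a topic, and the number in front of it grows with every action that betrays interest.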

From such measurement, serving relevant material extracted from the archives isn’t rocket science and certainly not artificial intelligence: a box in the article page refreshes and offers relevant links (instead of serving me a new ad I don't care for). Or, if I go back to the same subject at a later time, it adds suggested articles or multimedia items I missed -- exactly as Amazon proposes new books or DVDs.
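As a rough illustration of how little machinery this takes, the sketch below ranks archived stories against a reader's interest profile using cosine similarity. The profile, the archive, and the topic weights are all made-up assumptions; the point is only that the matching step is a few lines of arithmetic.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse {topic: weight} vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(profile, archive, already_read, limit=3):
    """Return archive titles closest to the profile, skipping stories already read."""
    scored = [
        (cosine(profile, topics), title)
        for title, topics in archive.items()
        if title not in already_read
    ]
    # Keep only stories with some overlap; best score first, ties by title.
    scored = [(score, title) for score, title in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]


# Hypothetical reader profile and archive, for illustration only.
profile = {"stimulus": 6.5, "satellites": 1.5}
archive = {
    "Geithner's management methods": {"stimulus": 1.0, "treasury": 1.0},
    "Orbiting junk near miss": {"satellites": 1.0, "space": 1.0},
    "Car industry bailout": {"autos": 1.0},
    "Stimulus vote": {"stimulus": 1.0},
}
suggestions = recommend(profile, archive, already_read={"Stimulus vote"})
print(suggestions)
```

The box in the article page would simply display the returned titles, refreshed as the profile changes.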

Coming back to this week's hot topics, I could have had an in-depth story on US Treasury Secretary Tim Geithner's management methods, a two-year-old story in Le Monde reporting on the racial divide in the French Indies, or a feature on how the US Space Command saved the lives of astronauts by preventing a collision with orbiting junk ten years ago. Without a doubt such a technique would have increased my “stickiness” to my preferred sites and therefore the value of my visits.

“Off-the-shelf” technology is abundant. There are three categories of recommendation engines: Active, Social and Passive filtering.

Forget about the first one, the Active one: contrary to the "Daily Me" myth, no one will spend fifteen minutes stating his/her interests by filling out a form or checking 20 boxes of news-related preferences. You do that once, maybe, for your online grocery list, but not for a moving target such as news. Cereals are easier to define. The second type, social filtering, or social recommendation, is based on a statistical analysis of other people’s tastes as expressed by their consumption patterns. They liked this, so you might like it as well. It works fine for books, movies or music, but not that well for news items, because of the unpredictability of the news cycle.
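Social filtering, in its simplest form, is a matter of counting what else the readers of a given item consumed. The sketch below assumes a made-up set of reading histories; it is a toy version of the idea, not a description of how any particular store implements it.

```python
from collections import Counter


def also_read(histories, article, limit=3):
    """List what readers of `article` read most often besides it."""
    counts = Counter()
    for read in histories.values():
        if article in read:
            counts.update(read - {article})
    # Most frequently co-read first; ties broken alphabetically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:limit]]


# Hypothetical reading histories: reader id -> set of stories read.
histories = {
    "reader_a": {"stimulus", "bailout", "geithner"},
    "reader_b": {"stimulus", "geithner"},
    "reader_c": {"stimulus", "satellite"},
    "reader_d": {"satellite", "bailout"},
}
print(also_read(histories, "stimulus"))
print(also_read(histories, "breaking_story"))
```

The second call returns an empty list: a story published minutes ago has no reading history yet, which is one way to see why this approach struggles with the news cycle.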

We’re left with the most applicable engine: Passive Filtering, which accumulates information from my previous browsing sessions. My navigation patterns draw a profile of my news interests that can be matched with those of other users, further enhancing the system. The older and thicker my history is, the better the engine will perform. It will work fine as I'm loyal to my trusted brands, whether Le Monde, 20 minutes, VGnett, or the New York Times. This leads to a virtuous circle that can also encompass the news outlets beyond my original trusted brand that it would aggregate for me. A kind of self-generated, reliable web ring.

There are technical hurdles. The most obvious is the complexity of the corpus. Thousands of articles covering a vast array of topics are not easy to categorize, as compared to a catalog of cultural goods in which an artist or an author can be connected to a genre. It is the opposite of a well-structured database. A headline won't necessarily help much from a retrieval standpoint. The essential notion of a "tag" for a story is underdeveloped. The reason for such a shortcoming is simple: journalists and editors are not taught to properly tag and categorize their stories. (The funny thing is that superblogs such as Henry Blodget's Alley Insider, using a simple tag system, are much better than big news sites at encouraging the reader to stay on the site.)
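Even a simple tag system goes a long way once stories are tagged consistently. The sketch below, with invented stories and tags, ranks related stories by tag overlap (the Jaccard index: shared tags divided by all tags on either story). It is one plausible approach, not a claim about how any existing site does it.

```python
def jaccard(a, b):
    """Share of tags two stories have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_by_tags(story, tagged, limit=3):
    """Rank other stories by tag overlap with `story`."""
    base = tagged[story]
    scored = [
        (jaccard(base, tags), title)
        for title, tags in tagged.items()
        if title != story
    ]
    # Drop stories with no shared tag; best overlap first, ties by title.
    scored = [(score, title) for score, title in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]


# Hypothetical stories and tags, for illustration only.
tagged = {
    "Satellite collision": {"space", "satellites", "russia", "usa"},
    "Space Command tracks debris": {"space", "satellites", "usa", "debris"},
    "Russia launches probe": {"space", "russia"},
    "Car industry bailout": {"france", "autos", "bailout"},
}
print(related_by_tags("Satellite collision", tagged))
```

The quality of the output depends entirely on the quality of the tags, which is why the editorial habit of tagging matters more than the code.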

Technical challenges are no excuse for doing nothing. The real hurdle is a cultural one. In many online newspapers I know of, the Internet culture is recent, almost a foreign object. What mostly occupies minds there is "transition" thinking, from print to the Internet. Technical questions are downgraded as secondary, in many instances left to the techies.

This is a mistaken way of looking at the problem. Yahoo!, for one, eBay, for another, are companies that failed to keep technology as a top-level, strategic concern, putting it down as the infamous “mere matter of implementation”. As a result, the Google and Amazon geeks ate their lunch. It is time to consider a merger between editorial and technical functions, to put both at the same top level. The result will be more valuable than the sum of the parts.  —FF
