
The Quartz Way (1)


Quartz, a web-only business publication, just turned one year old. On both editorial and business dimensions, Quartz features all components of a modern media venture. Is this a formula for the long run? To answer the question, in the first of two articles, we take a closer look at the editorial product.

Quartz is the kind of media most business writers would love to be part of. It’s smart, fun, witty, basic and sophisticated at the same time. Like Jony Ive’s designs at Apple, its apparent simplicity is the combined product of deep thought and a series of bold moves by its owner, the Atlantic Media group, publisher of the eponymous monthly. From all standpoints (content, organization, even business model), Quartz came up with innovations (see the Monday Note I wrote for the launch in September 2012).

Ten days ago, my phone interview with editor-in-chief Kevin Delaney started with a discussion of his newsroom of 25 writers and editors. On Tuesday, September 24 at 9pm Paris time, Quartz had this piece at the top of its infinite scroll:

(Illustration: Quartz)

Editorially, this epitomizes (in a way) what Quartz is about: topics addressed through well-defined angles (in this case, the idea that if Amazon hit large book retailers hard, it didn’t have much impact on small independent bookstores.) The story was short but right to the point — taking the opposite side of the now worn tale of Amazon devastating the book-selling landscape. To illustrate his piece, instead of using yet another photograph of Jeff Bezos haranguing a crowd, the writer picked this weird image of a girl showing off at a bookstore event.

Yes, at Quartz, journalists are the ones who get to select the pictures that go with their article. Most of the time, this yields better audience numbers.

Actually, explains Kevin Delaney, the staff is supposed to produce a complete package, ready to be processed by editors, with links, headline, and photos (lifted from Reuters, Getty, AP or sometimes the Creative Commons trove) properly cropped and adjusted. Everything is done within a WordPress interface, chosen for its versatility, but also because most journalists already know how to use it. As for headlines (a task usually handled by editors), the Quartz newsroom relies on team chats to quickly and collaboratively work on pieces.

Kevin Delaney (photo: Quartz)

The same goes for graphics, like in this snapshot of Twitter’s IPO prospectus, part of the magazine’s comprehensive coverage of the upcoming event. To further encourage the use of graphics and charts in stories, Quartz engineering director Michael Donohoe (a NYT alumnus) developed ChartBuilder, a bespoke, easy-to-use tool. [Correction: as pointed out by Quartz’s global news editor Gideon Lichfield, ChartBuilder was developed by David Yanofsky, one of Quartz’s journalist/coder/data hackers...] As an internet-native company, Quartz threw its software into the open-source world (see how it looks on GitHub) — an unthinkable move in the close-to-the-vest legacy media world…

While listening to Delaney describe his organization, I couldn’t help but mentally itemize what separates its super-agile setup from traditional media. A couple of months ago, I met the digital management of a major UK newspaper. There, execs kept whining about the slow pace of evolution of the news staff and the struggle to get writers to add links and basic metadata (don’t even think about pix or graphics) to their work product. By and large, most legacy media I know of, in France, the UK and the United States, are years behind swift boats such as Quartz, Politico or the older but still sharp Slate.

I used to think the breadth and depth of older large newsrooms could guarantee their survival in a digital world plagued by mediocrity and loose ethics. But considering great pure players like Quartz — which is just the latest offspring of a larger league — I have come to think we are witnessing the emergence of a new breed of smaller, digital-only outlets that are closing the gap, quality-wise, with legacy media. In the context of an increasingly segmented and short-on-time readership, I can only wonder how long the legacy newsroom’s strategic advantage of size and scope will last.

Quartz’s editorial staff has nothing to do with the low-paid, poultry-farm newsrooms of many digital outlets. Most of the 25 journalists and editors (out of a staff of 50) were drawn from well-established brands such as Bloomberg, The Economist, Reuters, New York Magazine or The Wall Street Journal (Kevin Delaney, 41, is himself a former managing editor). “Our staff is slightly younger than the average newsroom, and it is steeped in the notion of entrepreneurial journalism”, says the Quartz editor-in-chief. “With Quartz, we had many opportunities to rethink the assumptions of traditional media”.

The original idea was to imagine what The Economist would look like if it had been born in 2012 rather than in 1843, explains Delaney. It would be digital-native, designed mostly for mobile reading, and focused on contemporary economic engines such as digital, globalization, e-commerce, the future of energy, debt, China, etc. Instead of abiding by the usual classification of business news, which looks like a nomenclature from the Bureau of Labor Statistics (Industry, Services, Markets, Trade, etc.), Quartz opted for a sexier taxonomy: its coverage is based on an evolving list of “Obsessions”, a much more cognitive-friendly way to consider the news cycle than the usual “beat” (read this on the matter). An avid magazine reader, Delaney said he derived the idea from publications like New York Magazine.

The challenge is connecting this categorization to audience expectations… Hence the importance of the social reverberation of Quartz treatments. It translates into stunning numbers: according to Kevin Delaney, 85% to 90% of the site’s traffic is “earned”, and social referrals account for 50% of it. In other words, traffic coming from people typing the address in their browser accounts for only 10-15% of the volume. To put things in perspective, on a legacy media site, social traffic weighs about 5% — in some rare cases 10% — and around 40% to 50% of the page views are generated via the home page.

Since the site is nothing but an infinite rolling page of stories, there is no classic home page to serve as a jumping board. Another obsession of Quartz’s founders: “We wanted to minimize friction and encourage readers to share our stories. We designed the site first for tablets, then for mobile and as a classic website, in that order,” insists Kevin Delaney. No apps in sight, but a site built in HTML5 with a responsive design that adjusts to screen size. At first, the no-app choice sounded weird for a media aimed at a mobile audience, but considering the rising costs and complexity of building, managing, and maintaining native apps on multiple platforms, a single HTML design was probably the best approach.

I’m not through talking about Quartz. Next week, we’ll examine the venture’s business aspects, its bold ways of dealing with advertising.

Goodbye Google Reader


Three months ago, Google announced the “retirement” of Google Reader as part of the company’s second spring cleaning. On July 1st — two weeks from today — the RSS application will be given a gold watch and a farewell lunch, then it will pack up its bits and leave the building for the last time.

The other items on Google’s spring cleaning list, most of which are tools for developers, are being replaced by superior (or simpler, friendlier) services: Are you using CalDAV in your app? Use the Google Calendar API, instead; Google Map Maker will stand in for Google Building Maker; Google Cloud Connect is gone, long live Google Drive.

For Google Reader’s loyal following, however, the company had no explanation beyond a bland “usage has declined”, and it offered no replacement nor even a recommendation other than a harsh “get your data and move on”:

Users and developers interested in RSS alternatives can export their data, including their subscriptions, with Google Takeout over the course of the next four months.

The move didn’t sit well with users whose vocal cords were as strong as their bond to their favorite blog reader. James Fallows, the polymathic writer for The Atlantic, expressed a growing distrust of the company’s “experiments” in A Problem Google Has Created for Itself:

I have already downloaded the Android version of Google’s new app for collecting notes, photos, and info, called Google Keep… Here’s the problem: Google now has a clear enough track record of trying out, and then canceling, “interesting” new software that I have no idea how long Keep will be around… Until I know a reason that it’s in Google’s long-term interest to keep Keep going, I’m not going to invest time in it or lodge info there.

The Washington Post’s Ezra Klein echoed the sentiment (full article here):

But I’m not sure I want to be a Google early adopter anymore. I love Google Reader. And I used to use Picnik all the time. I’m tired of losing my services.

What exactly did Google Reader provide that got its users, myself included, so excited, and why do we take its extermination so personally?

Reading is, for some of us, an addiction. Sometimes the habit turns profitable: The hours I spent poring over computer manuals on Saturday mornings in my youth may have seemed obsessive at the time, but the “research” paid off.

Back before the Web flung open the 10,000 Libraries of Alexandria that I dreamed of in the last chapter of The Third Apple, my reading habit included a daily injection of newsprint. But as online access to real-world dailies became progressively more ubiquitous and easier to manage, I let my doorstep subscriptions lapse (although I’ll always miss the wee-hour thud of the NYT landing on our porch…an innocent pleasure unavailable in my country of birth).

Nothing greased the move to all-digital news as much as the RSS protocol (Really Simple Syndication, to which my friend Dave Winer made crucial contributions). RSS lets you syndicate your website by adding a few lines of XML code. To subscribe, a user simply pushes a button. When you update your blog, the new item is automatically posted to the user’s chosen “feed aggregator”.
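To make the mechanism concrete, here is a minimal sketch of what those “few lines of XML” look like and how an aggregator reads them. The feed below is invented for illustration (the titles and example.com URLs are placeholders, not a real feed), parsed with Python’s standard library:

```python
import xml.etree.ElementTree as ET

# A minimal RSS 2.0 feed -- the handful of XML lines a site publishes
# so that aggregators like Google Reader can poll it for updates.
# Titles and URLs are hypothetical.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Monday Note</title>
    <link>https://example.com</link>
    <description>Media and tech musings</description>
    <item>
      <title>Goodbye Google Reader</title>
      <link>https://example.com/goodbye-google-reader</link>
      <pubDate>Sun, 16 Jun 2013 00:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

def latest_items(feed_xml):
    """Return (title, link) pairs in the order the feed lists them."""
    channel = ET.fromstring(feed_xml).find("channel")
    return [(item.findtext("title"), item.findtext("link"))
            for item in channel.findall("item")]

print(latest_items(FEED))
```

An aggregator does essentially this on a timer for each subscribed feed, then shows the user whatever items it hasn’t seen before.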

RSS aggregation applications and add-ons quickly became a very active field as this link attests. Unfortunately, the user interfaces for these implementations – how you add, delete, and navigate subscriptions — often left much to be desired.

Enter Google Reader, introduced in 2005. Google’s RSS aggregator mowed down everything in its path as it combined the company’s Cloud resources with a clean, sober user interface that was supported by all popular browsers…and the price was right: free.

I was hooked. I just checked: I have 60 Google Reader subscriptions. But the number is less important than the way the feeds are presented: I can quickly search for subscriptions, group them in folders, search through past feeds, email posts to friends, fly over article summaries, and all of this is made even easier through simple keyboard shortcuts (O for Open, V for a full View on the original Web page, Shift-A to declare an entire folder as Read).

Where I once read four newspapers with my morning coffee I now open my laptop or tablet and skim my customized, ever-evolving Google Reader list. I still wonder at the breadth and depth of available feeds, from dissolute gadgetry to politics, technology, science, languages, cars, sports…

I join the many who mourn Google Reader’s impending demise. Fortunately, there are alternatives that now deserve more attention.

I’ll start with my Palo Alto neighbor, Flipboard. More than just a Google Reader replacement, Flipboard lets you compose and share personalized magazines. It’s very well done although, for my own daily use, its very pretty UI gets in the way of quickly surveying the field of news I’m interested in. Still, if you haven’t loaded it onto your iOS or Android device, you should give it a try.

Next we have Reeder, a still-evolving app that’s available on the Mac, iPhone, and iPad. It takes your Google Reader subscriptions and presents them in a “clean and well-lighted” way:

For me, Feedly looks like the best way to support one’s reading habit (at least for today). Feedly is offered as an app on iOS and Android, and as extensions for Chrome, Firefox, and Safari on your laptop or desktop (PC or Mac). Feedly is highly customizable: personally, I like the ability to emulate Reader’s minimalist presentation; others will enjoy a richer, more graphical preview of articles. For new or “transferring” users, it offers an excellent Feedback and Knowledge Base page:

Feedly makes an important and reassuring point: There might be a paid-for version in the future, a way to measure the app’s real value, and to create a more lasting bond between users and the company.

There are many other alternatives: a Google search for “Google Reader replacement” (the entire phrase) yields nearly a million hits (interestingly, Bing comes up with only 35k).

This brings us back to the unanswered question: Why did Google decide to kill a product that is well-liked and well-used by well-informed (and I’ll almost dare to add: well-heeled) users?

I recently went to a Bring Your Parents to Work day at Google. (Besides comrades from the old OS wars, we now have a child working there.) The conclusion of the event was the weekly TGIF-style bash (which is held on Thursdays in Mountain View, apparently to allow Googlers in other time zones to participate). Both founders routinely come on stage to make announcements and answer questions.

Unsurprisingly, someone asked Larry Page a question about Google Reader and got the scripted “too few users, only about a million” non-answer, to which Sergey Brin couldn’t help but quip that a million is about the number of remote viewers of the Google I/O developer conference Page had just bragged about. Perhaps the decision to axe Reader wasn’t entirely unanimous. And never mind the fact that Feedly already seems to have 3 million subscribers.

The best explanation I’ve read (on my Reader feeds) is that Google wants to draw the curtain, perform some surgery, and reintroduce its RSS reader as part of Google+, perhaps with some Google Now thrown in:

While I can’t say I’m a fan of squirrelly attempts to draw me into Google+, I must admit that RSS feeds could be a good fit… Stories could appear as bigger, better versions of the single-line entry in Reader, more like the big-photo entries that Facebook’s new News Feed uses. Even better, Google+ entries have built in re-sharing tools as well as commenting threads, encouraging interaction.

We know Google takes the long view, often with great results. We’ll see if killing Reader was a misstep or another smart way to draw Facebook users into Google’s orbit.

It may come down to a matter of timing. For now, Google Reader is headed for the morgue. Can we really expect that Google’s competitors — Yahoo!, Facebook, Apple, Microsoft — will resist the temptation to chase the ambulance?



In Bangkok, with the Fast Movers


The WAN-IFRA congress in Bangkok showed good examples of the newspaper industry’s transformation. Here are some highlights. 

Last week, I travelled to Bangkok for the 65th congress of the World Association of Newspapers (WAN-IFRA, which also includes the World Editors Forum and the World Advertising Forum). For a supposedly dying industry, the event gathered a record crowd: 1,400 delegates from all over the world (except for France, represented by at most a dozen people…). Most presentations and discussions revealed an acceleration in the transformation of the sector.

The transition is now mostly led by emerging countries seemingly eager to rid themselves as quickly as possible of the weight of the past. At a much faster pace than in the West, Latin American and Asian publishers take advantage of their relatively healthy print business to accelerate the online transition. These many simultaneous changes involve spectacular newsroom transformations in which the notion of a publication gives way to massive information factories producing print, web and mobile content in equal measure. In these new structures, journalists, multimedia producers and developers (a Costa Rican daily has one computer wizard for every five journalists…) are blended together. They all serve a vigorous form of journalism focused on the trade’s primary mission: exposing abuses of power and public or private failures (the polar opposite of the aggregation disease). To secure and boost the conversion, publishers rethink newsroom architecture, eliminate walls (physical as well as mental ones), and overhaul long-established hierarchies and desk arrangements (often an inheritance of the paper’s sections structure).

In the news business, modernity no longer resides in the Western hemisphere. In Europe and in the United States, a growing number of readers are indeed getting their news online, but in a terrifyingly scattered way. According to data compiled by media analyst Jim Chisholm, newspapers represent 50.4% of internet consumption when expressed in unique visitors, but only 6.8% in visits, 1.3% in time spent, and 0.9% in page views!… “The whole battle is therefore about engagement”, says WAN-IFRA general manager Vincent Peyregne, who underlines that the level of engagement for digital represents about 5% of what it is for print — which matches the revenue gap. This is consistent with Jim Chisholm’s views stated a year ago in this interview with Ria Novosti [emphasis mine]:

If you look at how often in a month people visit media, they visit print papers 16 times, while for digital papers it’s just six. In that time they look at 36 pages in print and just 3.5 in digital. Over a month, print continues to deliver over 50 times the audience intensity of newspaper digital websites.

One of the best ways to solve the engagement equation is to gain a better knowledge of audiences. In this regard, two English papers lead the pack: The Daily Mail and the Financial Times. The first is a behemoth: 119 million unique visitors per month (including 42 million in the UK) and proof that a profusion of vulgarity remains a weapon of choice on the web. Aside from the sleaziness, the Mail Online is a fantastic data collection machine. At the WAN conference, its CEO Kevin Beatty stated that DMG, the Mail’s parent company, reaches 36% of the UK population and, over a 10-day period, collects “50 billion things about 43 million people”.

The accumulation of data is indeed critical, but all the people I spoke with — I was there to moderate a panel about aggregation and data collection — are quick to denounce an advertising market terribly slow to reflect the value of segmentation. While many media outlets spend a great deal of resources to build data analytics, media buying agencies remain obsessed with volume. For many professionals, the ad market had better quickly understand what’s at stake here; the current status quo might actually backfire as it will favor more direct relationships between media outlets and advertisers.

As an example, I asked Casper de Bono, the B2B Manager for the Financial Times, how his company managed to extract value from the trove of user data harvested through its paywall. De Bono cited the example of an airline that asked to identify the people who had logged on to the site from at least four different places served by the airline in the last 90 days. The idea was to target these individuals with specific advertising — anyone can imagine the value of such customers… This is but one example of the FT’s ultra-precise audience segmentation.

Paywalls were also on everyone’s lips in Bangkok. “The issue is settled”, said Juan Señor, a partner at Innovation Media Consulting. “This is not the panacea but we now know that people are willing to pay for quality and depth”. Altogether, he believes that 3% to 5% of a media site’s unique visitors could become digital subscribers. And he underlined a terrible symmetry in the revenue structure of two UK papers: While the Guardian — which resists the idea of paid-for digital readers — is losing £1m per week, The Telegraph makes roughly the same amount (£50m a year, $76m or €59m) in extra revenues thanks to its digital subscriptions… No one believes paywalls will be the one and only savior of online newspapers but, at the very least, paywalls seem to prove quality journalism is back in terms of value for the reader.

Why Google Will Crush Nielsen


Internet measurement techniques need a complete overhaul. New ways have emerged, potentially displacing older panel-based technologies. This will make it hard for incumbent players to stay in the game.

The web user is the most watched consumer ever. For tracking purposes, every large site drops literally dozens of cookies in the visitor’s browser. In the most comprehensive investigation on the matter, The Wall Street Journal found that each of the 50 largest web sites in the United States, which together account for 40% of US page views, installed an average of 64 files on a user’s device. (See the WSJ’s What They Know series and a Monday Note about tracking issues.) As for server logs, they record every page sent to the user, and they tell with great accuracy which parts of a page collect most of the reader’s attention.

But when it comes to measuring a digital viewer’s commercial value, sites rely on old-fashioned panels, that is, limited samples of the user population. Why?

Panels are inherited. They go back to the old days of broadcast radio when, in order to better sell advertising, dominant networks wanted to know which stations listeners tuned in to during the day. In the late thirties, the Nielsen Company made a clever decision: it installed a monitoring box in 1,000 American homes. Twenty years later, Nielsen did the same, on a much larger scale, with broadcast television. The advertising world was happy to be fed with plenty of data — mostly unchallenged, as Nielsen dominated the field. (For a detailed history, you can read Rating the Audience, written by two Australian media academics.) As Nielsen expanded to other media (music, film, books and all sorts of polls), moving to internet measurement sounded like a logical step. As of today, Nielsen faces only smaller competitors such as ComScore.

I have yet to meet a publisher who is happy with this situation. Fearing retribution, very few people talk openly about it (twisting the dials is so easy, you know…), but they all complain about inaccurate, unreliable data. In addition, the panel system is vulnerable to cheating on a massive scale. Smarty-pants outfits sell a vast array of measurement boosters, from fake users who come in just once a month to be counted as “unique” (they are, indeed), to more sophisticated tactics such as undetectable “pop-under” sites that rely on encrypted URLs to deceive the vigilance of panel operators. In France, for instance, 20% to 30% of some audiences can be bogus — or largely inflated. To its credit, Mediametrie — the French Nielsen affiliate that produces the most-watched measurements — is expending vast resources to counter the cheating and make the whole model more reliable. It works, but progress is slow. In August 2012, Mediametrie Net Ratings (MNR) launched a Hybrid Measure taking into account site-centric analytics (server logs) to rectify panel numbers, but those corrections are still erratic. And it takes more than a month to get the data, which is not acceptable for the real-time-obsessed internet.

Publishers monitor the pulse of their digital properties on a permanent basis. In most newsrooms, Chartbeat (also imperfect, sometimes) displays the performance of every piece of content, and home pages get adjusted accordingly. More broadly, site-centric measures detail all possible metrics: page views, time spent, hourly peaks, engagement levels. This is based on server logs tracking dedicated tags inserted in each served page. But the site-centric measure is also flawed: If you use, say, four different devices — a smartphone, a PC at home, another at work, and a tablet — you will be incorrectly counted as four different users. And if you use several browsers you could be counted even more times. This inherent site-centric flaw is the best argument for panel vendors.

But, in the era of Big Data and user profiling, panels no longer have the upper hand.

The developing field of statistical pairing technology shows great promise. It is now possible to pinpoint a single user browsing the web with different devices in a very reliable manner. Say you use the four devices mentioned earlier: a tablet in the morning and the evening; a smartphone for occasional updates on the move; and two PCs (a desktop at the office and a laptop elsewhere). Now, each time you visit a new site, an audience analytics company drops a cookie that will record every move on every site, from each of your devices. Chances are your browsing patterns will be stable (basically your favorite media diet, plus or minus some services that are better fitted for a mobile device). Not only is your browsing profile determined from your navigation on a given site, but it is also quite easy to know which sites you visited before the one currently monitored, adding further precision to the measurement.

Over time, your digital fingerprint becomes more and more precise. At first, the four cookies are independent of one another. But the analytics firm compiles all the patterns in a single place. By data-mining them, analysts can determine the probability that a cookie dropped in a mobile application, a desktop browser or a mobile web site belongs to the same individual. That’s how multiple pairing works. (To get more details on the technical and mathematical side of it, you can read this paper by the founder of Drawbridge Inc.) I recently discussed these techniques with several engineers, both in France and in the United States. All were quite confident that such fingerprinting is doable and that it could be the best way to accurately measure internet usage across different platforms.
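The pairing idea can be sketched in a few lines. This is an illustration only, not Drawbridge’s actual method: real systems use rich probabilistic models over many signals, while here a simple Jaccard overlap of visited sites stands in for the “stable media diet” described above, and the device names and site lists are invented:

```python
# Illustrative sketch: pairing device cookies by the similarity of
# their browsing patterns. The threshold and data are hypothetical.

def jaccard(a, b):
    """Overlap between two sets of visited sites, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def pair_cookies(histories, threshold=0.5):
    """Return pairs of cookie IDs whose site overlap exceeds the threshold."""
    ids = list(histories)
    pairs = []
    for i, x in enumerate(ids):
        for y in ids[i + 1:]:
            if jaccard(histories[x], histories[y]) >= threshold:
                pairs.append((x, y))
    return pairs

histories = {
    "tablet": ["qz.com", "nytimes.com", "mondaynote.com", "slate.com"],
    "laptop": ["qz.com", "nytimes.com", "mondaynote.com", "github.com"],
    "random": ["espn.com", "weather.com"],
}
print(pair_cookies(histories))  # the tablet and laptop look like one reader
```

The tablet and laptop share three of five distinct sites (overlap 0.6), so they get paired; the unrelated cookie does not. A production system would weigh visit times, frequencies and many more signals before merging two cookies into one profile.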

Obviously, Google is best positioned to perform this task on a large scale. First, its Google Analytics tool is deployed on over 100 million web sites. And Google Ad Planner, even in its public version, already offers a precise view of the performance of many sites around the world. In addition, as one of the engineers pointed out, Google already performs such pairing simply to avoid showing the same ad twice to someone using several devices. Google is also most likely doing such ranking to feed the obscure “quality index” it algorithmically assigns to each site. It even does such pairing on a nominative basis, using its half-billion Gmail accounts (425 million as of June 2012) and its signed-in Chrome users. As for giving up another piece of internet knowledge to Google, it doesn’t sound like a big deal to me. The search giant already knows much more about sites than most publishers do about their own properties. The only thing that could prevent Google from entering the market of public web rankings would be the prospect of another privacy outcry. But I don’t see why it wouldn’t jump on it — eventually. When that happens, Nielsen will be in big trouble.

Google News: The Secret Sauce


A closer look at Google’s patent for its news retrieval algorithm reveals a greater than expected emphasis on quality over quantity. Can this bias stay reliable over time?

Ten years after its launch, Google News’ raw numbers are staggering: 50,000 sources scanned, 72 editions in 30 languages. Google’s crippled communication machine, plagued by bureaucracy and paranoia, has never been able to come up with tangible facts about its benefits for the news media it feeds on. Its official blog merely mentions “6 billion visits per month” sent to news sites, and Google News claims to connect “1 billion unique users a week to news content” (to put things in perspective, a site like the Huffington Post is cruising at about 40 million UVs per month). Assuming the clicks are sent to relatively fresh news pages bearing higher-value advertising, the six billion visits can translate into about $400 million per year in ad revenue. (This is based on a $5 to $6 revenue per 1,000 pages, i.e. a few dollars in CPM per single ad, depending on format, type of selling, etc.) That’s a very rough estimate. Again: Google should settle the matter and come up with accurate figures for its largest markets. (On the same subject, see a previous Monday Note: The press, Google, its algorithm, their scale.)
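The back-of-the-envelope math behind that $400 million figure is easy to check, under the article’s own assumptions (6 billion visits per month, the $5-$6 midpoint of revenue per 1,000 pages, one page per visit):

```python
# Rough check of the annual ad-revenue estimate quoted above.
visits_per_month = 6_000_000_000
revenue_per_1000_pages = 5.5  # dollars, midpoint of the $5-$6 range

annual = visits_per_month / 1000 * revenue_per_1000_pages * 12
print(f"${annual / 1e6:.0f}M per year")  # prints "$396M per year"
```

Which indeed lands at roughly $400 million a year, with all the caveats the estimate deserves.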

But how exactly does Google News work? What kind of media does its algorithm favor most? Last week, the search giant updated its patent filing with a new document detailing the thirteen metrics it uses to retrieve and rank articles and sources for its news service. (Computerworld unearthed the filing; it’s here.)

What follows is a summary of those metrics, listed in the order shown in the patent filing, along with a subjective appreciation of their reliability, vulnerability to cheating, relevancy, etc.

#1. Volume of production from a news source:

A first metric in determining the quality of a news source may include the number of articles produced by the news source during a given time period [week or month]. [This metric] may be determined by counting the number of non-duplicate articles produced by the news source over the time period [or] counting the number of original sentences produced by the news source.

This metric clearly favors production capacity. It benefits big media companies deploying large staffs. But the system can also be cheated by content farms (Google already addressed these questions); new automated content creation systems are gaining traction, many of them could now easily pass the Turing Test.
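A naive reading of metric #1 can be sketched as follows. The patent leaves the duplicate test unspecified; exact-title matching here is a stand-in, and the articles and dates are invented:

```python
# Metric #1, sketched: count a source's non-duplicate articles
# inside a time window. Deduplication by exact title is a
# simplification of whatever test Google actually uses.
from datetime import date

def production_volume(articles, start, end):
    """articles: list of (title, publication_date) for one news source."""
    seen = set()
    for title, published in articles:
        if start <= published <= end and title not in seen:
            seen.add(title)
    return len(seen)

articles = [
    ("Amazon and the indies", date(2013, 9, 24)),
    ("Amazon and the indies", date(2013, 9, 24)),  # syndicated duplicate
    ("Twitter IPO, charted", date(2013, 9, 26)),
    ("Old story", date(2013, 8, 1)),               # outside the window
]
print(production_volume(articles, date(2013, 9, 1), date(2013, 9, 30)))  # prints 2
```

The patent’s alternative, counting original sentences rather than articles, would follow the same shape with a sentence-level dedup step.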

#2. Length of articles. Plain and simple: the longer the story (on average), the higher the source ranks. This is bad news for aggregators whose digital serfs cut, paste, compile and mangle abstracts of news stories that real media outlets produce at great expense.

#3. “The importance of coverage by the news source”. To put it another way, this matches the volume of coverage by the news source against the general volume of text generated by a topic. Again, it rewards large resource allocation to a given event. (In New York Times parlance, such an effort is called “flooding the zone”.)

#4. The “Breaking News Score”:   

This metric may measure the ability of the news source to publish a story soon after an important event has occurred. This metric may average the “breaking score” of each non-duplicate article from the news source, where, for example, the breaking score is a number that is a high value if the article was published soon after the news event happened and a low value if the article was published after much time had elapsed since the news story broke.

Beware slow moving newsrooms: On this metric, you’ll be competing against more agile, maybe less scrupulous staffs that “publish first, verify later”. This requires a smart arbitrage by the news producers. Once the first headline has been pushed, they’ll have to decide what’s best: Immediately filing a follow-up or waiting a bit and moving a longer, more value-added story that will rank better in metrics #2 and #3? It depends on elements such as the size of the “cluster” (the number of stories pertaining to a given event).

#5. Usage Patterns:

Links going from the news search engine’s web page to individual articles may be monitored for usage (e.g., clicks). News sources that are selected often are detected and a value proportional to observed usage is assigned. Well known sites, such as CNN, tend to be preferred to less popular sites (…). The traffic measured may be normalized by the number of opportunities readers had of visiting the link to avoid biasing the measure due to the ranking preferences of the news search engine.

This metric is at the core of Google’s business: assessing the popularity of a website thanks to the various PageRank components, including the number of links that point to it.
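A minimal sketch of the normalization the patent describes, with illustrative numbers of my own:

```python
def normalized_usage(clicks: int, impressions: int) -> float:
    """Clicks divided by the number of chances readers had to click,
    so a source the engine already ranks high isn't favored by raw volume."""
    return clicks / impressions if impressions else 0.0

# Same raw clicks, very different exposure:
big_site = normalized_usage(5_000, 100_000)   # 0.05
small_site = normalized_usage(5_000, 20_000)  # 0.25
```

On raw clicks the two sources look identical; normalized, the less-promoted site turns out to be chosen five times more often when it is actually shown.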

#6. The “Human opinion of the news source”:

Users in general may be polled to identify the newspapers (or magazines) that the users enjoy reading (or have visited). Alternatively or in addition, users of the news search engine may be polled to determine the news web sites that the users enjoy visiting. 

Here, things get interesting. Google clearly states it will use third-party surveys to detect the public’s preference among various media — not only their websites, but also their “historic” media assets. According to the patent filing, the evaluation could also include the number of Pulitzer Prizes the organization has collected and the age of the publication. That’s for the known part. What lies behind the notion of “Human opinion” is a true “quality index” for news sources, one that is not necessarily correlated to their digital presence. Such factors clearly favor legacy media.

#7. Audience and traffic. Not surprisingly, Google relies on stats coming from Nielsen NetRatings and the like.

#8. Staff size. The bigger a newsroom is (as detected in bylines), the higher the value will be. This metric has the merit of rewarding large investments in news gathering. But it might become more imprecise as “large” digital newsrooms now tend to be staffed with news repackagers adding little value.

#9. Number of news bureaus. It’s another way to favor large organizations — even though their footprint tends to shrink both nationally and abroad.

#10. Number of “original named entities”. This is one of the most interesting metrics. A “named entity” is the name of a person, place or organization. It’s the primary tool for semantic analysis.

If a news source generates a news story that contains a named entity that other articles within the same cluster (hence on the same topic) do not contain, this may be an indication that the news source is capable of original reporting.

Of course, some cheaters insert misspelled entities to create “false” original entities and fool the system (Google took care of it). But this metric is a good way to reward original source-finding.
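A set-based sketch of the idea, with hypothetical entity names (a real system would also normalize spellings, precisely to defeat the misspelled-entity trick mentioned above):

```python
def original_entities(article: set[str], cluster: list[set[str]]) -> set[str]:
    """Entities this article mentions that no other article in the same
    cluster (i.e., covering the same news event) mentions."""
    seen_elsewhere = set().union(*cluster) if cluster else set()
    return article - seen_elsewhere

# Two competing articles on the same event, and our own piece:
others = [{"Amazon", "Jeff Bezos"}, {"Amazon", "Seattle"}]
mine = {"Amazon", "Jeff Bezos", "Melville House"}
print(original_entities(mine, others))  # {'Melville House'}
```

The entity unique to our piece is the signal: nobody else in the cluster named that source, which suggests original reporting rather than repackaging.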

#11. The “breadth” of the news source. It pertains to the ability of a news organization to cover a wide range of topics.

#12. The global reach of the news source. Again, it favors large media that are viewed, linked, quoted, “liked” and tweeted from abroad.

This metric may measure the number of countries from which the news site receives network traffic. In one implementation consistent with the principles of the invention, this metric may be measured by considering the countries from which known visitors to the news web site are coming (e.g., based at least in part on the Internet Protocol (IP) addresses of those users that click on the links from the search site to articles by the news source being measured). The corresponding IP addresses may be mapped to the originating countries based on a table of known IP block to country mappings.

#13. Writing style. In the Google world, this means statistical analysis of contents against a huge language model to assess “spelling correctness, grammar and reading levels”.

What conclusions can we draw? This enumeration clearly shows Google intends to favor legacy media (print or broadcast news) over pure players, aggregators or digital-native organizations. All the recently added features, such as Editors’ Picks, reinforce this bias. The reason might be that legacy media are less prone to tricking the algorithm. For once, a known technological weakness becomes an advantage.

The Google Fund for the French Press


At the last minute, ending three months of tense negotiations, Google and the French Press hammered out a deal. More than yet another form of subsidy, this could mark the beginning of a genuine cooperation.

Thursday night, at 11:00pm Paris time, Marc Schwartz, the mediator appointed by the French government, got a call from the Elysée Palace: Google’s chairman Eric Schmidt was en route to meet President François Hollande the next day in Paris. They both intended to sign the agreement between Google and the French press on Friday at 6:15pm. Schwartz, along with Nathalie Collin, the chief representative of the French Press, was just out of a series of conference calls between Paris and Mountain View: Eric Schmidt and Google’s CEO Larry Page had green-lighted the deal. At 3am on Friday, the final draft of the memorandum was sent to Mountain View. But at 11:00am everything had to be redone: Google had made unacceptable changes, causing Schwartz and Collin to consider calling off the signing ceremony at the Elysée. Another set of conference calls ensued. The final-final draft, unanimously approved by the members of the IPG association (General and Political Information), was printed at 5:30pm, just in time for the gathering at the Elysée half an hour later.

The French President François Hollande was in a hurry, too: That very evening, he was bound to fly to Mali, where French troops are waging a small but uncertain war to contain Al-Qaeda’s expansion in Africa. Never shy of political calculation, François Hollande seized the occasion to be seen as the one who forced Google to back down. As for Google’s chairman, co-signing the agreement alongside the French President was great PR. As a result, negotiators from the Press were kept in the dark until Eric Schmidt’s plane landed in Paris on Friday afternoon, shortly before he headed to the Elysée. Both men underlined what they called “a world premiere”, a “historical deal”…

This agreement ends — temporarily — three months of difficult negotiations. Now comes the hard part.

According to Google’s Eric Schmidt, the deal is built on two stages:

“First, Google has agreed to create a €60 million Digital Publishing Innovation Fund to help support transformative digital publishing initiatives for French readers. Second, Google will deepen our partnership with French publishers to help increase their online revenues using our advertising technology.”

As always, the devil lurks in the details, most of which will have to be ironed out over the next two months.

The €60m ($82m) fund will be provided by Google over a three-year period and will be dedicated to new-media projects. About 150 websites belonging to the IPG association’s members will be eligible to submit projects. The fund will be managed by a board of directors that will include representatives from the Press and from Google, as well as independent experts. Specific rules are designed to prevent conflicts of interest. The fund will most likely be chaired by Marc Schwartz, the mediator, who is also a partner at the global audit firm Mazars (all parties praised his mediation and want him to take the job).

Turning to the commercial part of the pact: it is less publicized but at least as important as the fund itself. In a nutshell, using a wide array of tools ranging from advertising platforms to content-distribution systems, Google wants to increase its business with the Press in France and elsewhere in Europe. Until now, publishers have been reluctant to use such tools because they don’t want to increase their reliance on a company they see as cold-blooded and ruthless.

Moving forward, the biggest challenge will be overcoming an extraordinarily high level of distrust on both sides. Google views the Press (especially the French one) as only too eager to “milk” it, and unwilling to genuinely cooperate to build and share value from the internet. The engineering-dominated, data-driven culture of the search engine is light-years away from the convoluted “political” approach of legacy media, which either don’t understand or look down on the peculiar culture of tech companies.

Dealing with Google requires a mastery of two critical elements: technology (with the associated economics), and the legal aspect. Contractually speaking, it means transparency and enforceability. Let me explain.

Google is a black box. For good and bad reasons, it fiercely protects the algorithms that are key to squeezing money from the internet, sometimes one cent at a time — literally. If Google consents to a cut of, say, advertising revenue derived from a set of contents, the partner can’t really ascertain whether the cut truly reflects the underlying value of the asset jointly created – or not. Understandably, it bothers most of Google’s business partners: they are simply asked to be happy with the monthly payment they get from Google, no questions asked. Specialized lawyers I spoke with told me there are ways to prevent such opacity. While it’s futile to hope Google will lift the veil on its algorithms, inserting an audit clause in every contract can be effective; in practical terms, it means an independent auditor can be appointed to verify specific financial records pertaining to a business deal.

Another key element: From a European perspective, a contract with Google is virtually impossible to enforce. The main reason: Google won’t give up on a governing-law clause stipulating that contracts are to be “litigated exclusively in the Federal or State Courts of Santa Clara County, California”. In other words: Forget about suing Google if things go sour. Your expensive law firm based in Paris, Madrid, or Milan will try to find a correspondent in Silicon Valley, only to be confronted with polite rebuttals: For years now, Google has been parceling out multiple pieces of litigation among local law firms simply to render them unable to litigate against it. Your brave European lawyer will end up finding someone who will ask several hundred thousand dollars merely to prepare, not litigate, the case. The only way to prevent this is to put an arbitration clause in every contract. Instead of going before a court of law, the parties agree to settle the matter through a private tribunal. Attorneys say it offers multiple advantages: It’s faster, much cheaper, the terms of the settlement are confidential, and it carries the same enforceability as a court order.

Google (and all the internet giants, for that matter) usually refuses an arbitration clause, as well as the audit provision mentioned earlier. Which brings us to a critical element: In order to develop commercial relations with the Press, Google will have to find ways to accept collective bargaining instead of segmenting negotiations one company at a time. Ideally, the next round of discussions should come up with a general framework for all commercial dealings. That would be key to restoring some trust between the parties. For Google, it means giving up some amount of tactical as well as strategic advantage… that is part of its long-term vision. As stated by Eric Schmidt in his upcoming book “The New Digital Age” (the Wall Street Journal had access to the galleys):

“[Tech companies] will also have to hire more lawyers. Litigation will always outpace genuine legal reform, as any of the technology giants fighting perpetual legal battles over intellectual property, patents, privacy and other issues would attest.”

European media are warned: they must seriously raise their legal game if they want to partner with Google — and the agreement signed last Friday in Paris could help.

Having said that, I personally believe it could be immensely beneficial for digital media to partner with Google as much as possible. The company spends roughly two billion dollars a year refining its algorithms and improving its infrastructure; thousands of engineers work on them. Contrast this with digital media: Small audiences, insufficient stickiness and low monetization plague both websites and mobile apps; the advertising model for digital information is mostly a failure — and that’s not Google’s fault. The Press should find a way to capture some of Google’s technical firepower and concentrate on what it does best: producing original, high-quality content, a business that Google is unwilling (and probably culturally unable) to engage in. Unlike Apple or Amazon, Google is relatively easy to work with (once the legal hurdles are cleared).

Overall, this deal is a good one. First of all, both sides are relieved to avoid a law (see last Monday Note Google vs. the press: avoiding the lose-lose scenario). A law declaring that snippets and links are to be paid-for would have been a serious step backward.

Second, it’s a departure from the notion of “blind subsidies” that has been plaguing the French Press for decades. Three months ago, the discussion started with irreconcilable positions: publishers were seeking absurd amounts of money (€70m per year, the equivalent of IPG members’ total ad revenue) and Google was focused on a conversion into business solutions. Now, all the people I talked to this weekend seem genuinely supportive of building projects, boosting innovation, and taking advantage of Google’s extraordinary engineering capabilities. The level of cynicism often displayed by the Press is receding.

Third, Google is changing. The fact that Eric Schmidt and Larry Page jumped in at the last minute to untangle the deal shows a shift of perception towards media. This agreement could be seen as a template for future negotiations between two worlds that still barely understand each other.

Google vs. the press: avoiding the lose-lose scenario


Google and the French press have been negotiating for almost three months now. If there is no agreement within ten days, the government is determined to intervene and pass a law instead. This would mean serious damage for both parties. 

An update about the new corporate tax system. Read this story in Forbes by the author of the report quoted below 

Since last November, about twice a week and for several hours, representatives from Google and the French press have been meeting behind closed doors. To ease up tensions, an experienced mediator has been appointed by the government. But mistrust and incomprehension still plague the discussions, and the clock is ticking.

In the currently stalled process, the whole negotiation revolves around cash changing hands. Early on, representatives of media companies were asking Google to pay €70m ($93m) per year for five years. This would be “compensation” for “abusively” indexing and linking their contents and for collecting 20-word snippets (see a previous Monday Note: The press, Google, its algorithm, their scale). For perspective, this €70m is roughly equivalent to the 2012 digital revenue of the newspapers and newsmagazines that constitute the IPG association (General and Political Information).

When the discussion came to structuring and labeling such a cash transfer, IPG representatives dismissively left the question to Google: “Dress it up!”, they said. Unsurprisingly, Google wasn’t ecstatic with this rather blunt approach. Still, the search engine feels this might be the right time to hammer out a deal with the press, instead of perpetuating a latent hostility that could later explode and cost much more. At least, this is how Google’s European team seems to feel. (In its hyper-centralized power structure, management in Mountain View seems slow to warm up to the idea.)

In Europe, bashing Google is more popular than ever. Not just Google, but all the US-based internet giants, widely accused of killing old businesses (such as Virgin Megastore — a retail chain that also made every possible mistake). But the actual core issue is tax avoidance. Most of these companies hired the best tax lawyers money can buy and devised complex schemes to avoid paying corporate taxes in EU countries, especially the UK, Germany, France, Spain and Italy. The French Digital Advisory Board — set up by Nicolas Sarkozy and generally business-friendly — estimated last year that Google, Amazon, Apple’s iTunes and Facebook had a combined revenue of €2.5bn–€3bn but each paid on average only €4m in corporate taxes, instead of about €500m (a rough 20% to 25% tax-rate estimate). At a time of fiscal austerity, most governments see this (entirely legal) tax avoidance as politically unacceptable. In such a context, Google is the target of choice. In the UK, for instance, Google made £2.5bn (€3bn or $4bn) in 2011 but paid only £6m (€7.1m or $9.5m) in corporate taxes. To add insult to injury, in an interview with The Independent, Google’s chairman Eric Schmidt defended his company’s tax strategy in the worst possible manner:

“I am very proud of the structure that we set up. We did it based on the incentives that the governments offered us to operate. It’s called capitalism. We are proudly capitalistic. I’m not confused about this.”

Ok. Got it. Very helpful.

Coming back to the current negotiation about the value of the click: the question was quickly handed over to Google’s spreadsheet jockeys, who came up with the required “dressing up”. If the media accepted the use of the full range of Google products, additional value would be created for the company, and a certain amount could then be derived from said value. That’s the basis for a deal reached last year with the Belgian press (the agreement is shrouded in a stringent confidentiality clause).

Unfortunately, the French press began to eliminate most of the eggs in the basket, one after the other, leaving almost nothing to “vectorize” the transfer of cash. Almost three months into the discussion, we are stuck with antagonistic positions. The IPG representatives are basically saying: We don’t want to subordinate ourselves further to Google by adopting opaque tools that we can find elsewhere. Google retorts: We don’t want to be considered just another deep-pocketed “fund” that the French press can tap into forever without any return for our business; plus, we strongly dispute any notion of “damages” to be paid for linking to media sites. Hence the gap between the amount of cash asked by one side and what is (reluctantly) acceptable to the other.

However, I think both parties vastly underestimate what they’ll lose if they don’t settle quickly.

The government tax howitzer is loaded with two shells. The first one is a bill (drafted by none other than IPG’s counsel, see PDF here) which introduces the disingenuous notion of “ancillary copyright”. Applied to the snippets Google harvests by the thousands every day, it creates some kind of legal ground to tax the company the hard way. The scheme is adapted from the music industry, in which the ancillary copyright levy ranges from 4% to 7% of the revenue generated by a sector or a company. A rate of 7% applied to the revenue officially declared by Google in France (€138m) would translate into less than €10m, which is pocket change for a company that in fact generates about €1.5 billion from its French operations.

That’s where the second shell could land. Last Friday, the Ministry of Finances released a report on the tax policy applied to the digital economy, titled “Mission d’expertise sur la fiscalité de l’économie numérique” (PDF here). It’s a 200-page opus supported by no fewer than 600 footnotes. Its authors, Pierre Collin and Nicolas Colin, are members of the French public elite (one from the highest administrative jurisdiction, the Conseil d’Etat, the other from the equivalent of the General Accounting Office; Nicolas Colin is also a former tech entrepreneur and a writer). The Collin & Colin Report, as it’s now dubbed, is based on a set of doctrines that are also surfacing in the United States (as demonstrated by the report’s multiple references).

To sum up:
— The core of the digital economy is now the huge amount of data created by users. The report categorizes different types of data: “Collected Data” are gathered through cookies, whether the user allows it or not; such datasets include consumer behaviors, affiliations, personal information, recommendations, search patterns, purchase history, etc. “Submitted Data” are entered knowingly through search boxes, forms, timelines or feeds in the case of Facebook or Twitter. And finally, “Inferred Data” are byproducts of various processing, analytics, etc.
— These troves of monetized data are created by the free “work” of users.
— The location of such data collection is independent from the place where the underlying computer code is executed: I create tangible value for Amazon or Google with clicks performed in Paris, while the clicks are processed in a server farm located in the Netherlands or in the United States — and most of the profits land in a tax shelter.
— The location of the value thus created by the “free work” of users is currently dissociated from the location of the tax collection. In fact, it escapes any taxation.

Again, I’m quickly summing up a lengthy analysis, but the conclusion of the Collin & Colin report is obvious: Sooner or later, the value created and the various taxes associated with it will have to be reconciled. For Google, the consequences would be severe: Instead of the €138m of revenue officially admitted in France, the tax base would grow to €1.5bn in revenue and about €500m in profit; that could translate into €150m in corporate tax alone, instead of the mere €5.5m currently paid by Google. (And I’m not counting the 20% VAT that would also apply.)
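The arithmetic behind both shells can be checked in a few lines; the 30% corporate rate is my inference from the €150m figure, and the 7% levy rate comes from the top of the music-industry range cited earlier:

```python
declared_revenue = 138e6   # € officially declared by Google in France
ancillary_levy = 0.07 * declared_revenue          # first shell, at the top rate
print(f"ancillary copyright levy: €{ancillary_levy / 1e6:.2f}m")   # €9.66m

estimated_profit = 500e6   # € profit on ~€1.5bn of actual French revenue
reconciled_tax = 0.30 * estimated_profit          # second shell, rate inferred
print(f"reconciled corporate tax: €{reconciled_tax / 1e6:.0f}m")   # €150m
```

The gap between the two shells — under €10m versus €150m — is what makes the ancillary-copyright bill look like the lighter threat from Google’s standpoint.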

Of course, this intellectual construction will be extremely difficult to translate into enforceable legislation. But the French authorities intend to rally other countries and furiously lobby the EU Commission to come around to their view. It might take years, but it could dramatically impact Google’s economics in many countries.

More immediately, for Google, a parliamentary debate over the Ancillary Copyright will open a Pandora’s box. From the Right to the Left, encouraged by François Hollande‘s administration, lawmakers will outbid each other in trashing the search engine and beyond that, every large internet company.

As for members of the press, “they will lose too”, a senior official tells me. First, because of the complications in setting up the machinery the Ancillary Copyright Act would require, they would have to wait about two years before getting any dividends. Second, the governments — the present one as well as the past Sarkozy administration — have always been displeased with what they see as the French press’s “addiction to subsidies”; they intend to drastically reduce the €1.5bn in public aid. If the press gets its way through a law, according to several administration officials, the Ministry of Finances will feel relieved of its obligations towards media companies that don’t innovate much despite large influxes of public money. Conversely, if the parties are able to strike a decent business deal on their own, the French Press will quickly get some “compensation” from Google and might still keep most of its taxpayer subsidies.

As for the search giant, it will indeed have to stand a small stab but, for a while, will be spared the chronic pain of a long and costly legislative fight — and the contagion that goes with it: The French bill would be dissected by neighboring governments who will be only too glad to adapt and improve it.   

Next week: When dealing with Google, better use a long spoon; Why European media should rethink their approach to the search giant.