
Google might not be a monopoly, after all

 

Despite its dominance, Google doesn’t fit the definition of a monopoly. Still, the Search giant’s growing disconnect from society could lead to serious missteps and, over time, to a weakened position. 

In last week’s column, I opined about the Open Internet Project’s antitrust lawsuit against Google. Reactions showed divided views of the search engine’s position. Granted, Google is an extremely aggressive company, obsessed with growth, scalability, optimization — and also with its own vulnerability.

But is it really a monopoly in the traditional and historical sense? Probably not. Here is why, in four points:

1. The consent to dependency. It is always dangerous to be too dependent on a supplier one doesn’t control. This is the case in the (illegal) drug business: price and supply will fluctuate at the whim of unpredictable people. This is what happens to those who build highly Google-dependent businesses such as e-commerce sites and content farms that churn out large quantities of cheap fodder in order to milk ad revenue through Google-friendly search tactics.

In the end, everything is a matter of trust (“Jaws”, courtesy of Louis Goldman)

Many news media brands have sealed their own fate by structuring their output so that 30% to 40% of their traffic is at the mercy of Google algorithms. I’m fascinated by the breadth and depth of the consensual ecosystem that has been built around the Google traffic pipeline: consulting firms helping media rank better in Google Search and Google News; software that rephrases headlines to make them more likely to hit the top ranks; on-the-fly A/B testing that shows what the search engine might like best, etc.

For the media industry, what should have remained a marginal audience extension has turned into a vital stream of page views and revenue. I personally think this is dangerous in two ways. One, we replace the notion of relevance (reader interest) with a purely quantitative/algorithmic construct (listicles vs. depth, BuzzFeed vs. ProPublica, for instance). Such mechanistic practices further fuel the value deflation of original content. Two, the eagerness to please the algorithms distracts newsrooms, journalists and editors from their real job: finding, developing and building intelligent news packages that lift brand perception and elevate the reader’s mind (BuzzFeed and plenty of others are the quintessence of cheapening alienation.)

2. Choice and Competition. In 1904, Standard Oil controlled 91% of American oil production and refining, and 85% of sales. This practically inescapable monopoly was able to dictate prices and supply structure. As for Google, it indeed controls 90% of the search market in some regions (Europe especially, where fragmented markets, poor access to capital and other cultural factors prevented the emergence of tech giants.) Google combines its services (search, mail, maps, Android) to produce one of the most potent data gathering systems ever created. Note the emphasis: Google (a) didn’t invent the high tech data X-ray business, nor (b) is it the largest entity to collect gargantuan amounts of data. Read the Quartz article The nine companies that know more about you than Google or Facebook and see how corporations such as Acxiom, Corelogic, Datalogix, eBureau, ID Analytics, Intelius, PeekYou, Rapleaf, and Recorded Future collect data on a gigantic scale, including court and public records information, or your gambling habits. Did they make you sign a consent form?

You want to escape Google? Use Bing, Yahoo, DuckDuckGo or Exalead for your web searches, or go here to find a list of 40 alternatives. You don’t want your site to be indexed by Google? Add a robots exclusion rule to your site, and the hated crawler won’t see your content. You’re sick of AdWords in your pages or in Gmail? Use the AdBlock plug-in; it’s even available for Google’s own Chrome browser. The same goes for storing your data, getting a digital map, or web mail services. You’re “creeped out” by Google’s ability to reconstruct your every move around the block or from one city to another by injecting data from your Android phone into Maps? You’re right! Google Maps Location History is frightening; to kill it, you can turn off your device’s geolocation, or use a Windows Phone or an iPhone (just be aware that they do exactly the same thing, they simply don’t advertise it). Unlike a public utility, Google can be escaped. It’s just that its services are more convenient, perform well and… are better integrated, which gets us to our third point:
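To make the robots-exclusion mechanism concrete, here is a minimal sketch using Python’s standard-library `urllib.robotparser`; the rules and URLs are hypothetical, invented for illustration:

```python
from urllib import robotparser

# Hypothetical robots exclusion rules that bar Google's crawler only.
rules = [
    "User-agent: Googlebot",
    "Disallow: /",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# Googlebot is barred from every page of the site...
print(rp.can_fetch("Googlebot", "http://example.com/article.html"))
# ...while crawlers with no matching rule remain free to index it.
print(rp.can_fetch("SomeOtherBot", "http://example.com/article.html"))
```

A well-behaved crawler performs exactly this check against the site’s robots.txt before fetching a page; the exclusion is a convention, not an enforcement mechanism.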

3. Transparent strategy. To Google’s credit, its strategy is, for the most part, pretty transparent. What some see as a monopoly in the making is a deliberate — and open — strategy of systematic (and systemic) integration. Here is the chart I made a few months ago:

[Chart: Google’s systematic integration of services and data]

We could include several recent additions, such as trip habits from Uber (don’t like it? Try Lyft or, better, a good old Parisian taxi – they don’t even take credit cards); or the temperature-setting patterns soon coming from Nest thermostats (if you choose to trust Tony Fadell’s promises)… Even Google X, the company’s moonshot factory (story in Fast Company), offers glimpses of Google’s future reach with the development of autonomous cars and projects to bring the internet to remote countries using balloons (see Project Loon) or other airborne platforms.

4. Innovation. Monopolies are known to kill innovation. That was the case with the oil companies, with cartels of car makers that discouraged alternative transportation systems, and even with Microsoft, which made our lives miserable thanks to a pipeline of operating systems facing no real competition. By contrast, Google is obsessed with innovative projects, seen internally as an absolute necessity for its survival. Some are good, others are bad, or remain in beta for years.

However, Google is already sowing the seeds of its own erosion. The company is terribly disconnected from the real world. This shows everywhere, from the minutest details of its employees’ daily lives, pampered in an overabundance of comfort and amenities that keep them inside a cosy bubble, to its own vital statistics (published by the company itself). Google is mostly white (61%) and male (70%), and recruits from major universities (in that order: Stanford, UC Berkeley, MIT, Carnegie Mellon, UCLA), with very little “blood” from fields other than the scientific or technical. For a company that says it wants to connect its business to a myriad of sectors, such cultural blinders are a serious issue. Combined with the certainty of its own excellence, the result is a distorted view of the world in which the distinction between right and wrong can easily blur. A business practice internally considered virtuous because it supports the perpetuation of the company’s evangelistic vision of a better world can be seen as predatory in the “real” world. Hence a growing rift between the tech giant and its partners, its customers, and the nations that host them.

frederic.filloux@mondaynote.com

Google and the European media: Back to the Ice Age

 

Prominent members of the European press are joining a major EU-induced antitrust lawsuit against Google. The move is short on rationale and long on ideology. 

A couple of weeks ago, Axelle Lemaire, France’s deputy minister for digital affairs, was quoted contending that Google’s size and market power effectively prevented the emergence of a “French Google”. A rather surprising statement from a public official whose background stands in sharp contrast to the customary high civil service mold. As an MP, Mrs Lemaire represents French citizens living overseas and holds dual French and Canadian citizenship; she earned a Ph.D. in international law at King’s College London as well as a law degree at the Sorbonne. Ms. Lemaire then practiced law in the UK and served as a parliamentary aide in the British House of Commons. Still, her distinguished and unusually “open” background didn’t help: she’s dead wrong about why there is no French Google.

The reasons for France’s “failure” to give birth to a Google-class search engine are simply summarized: education and money. Google is a pure product of what France misses the most: a strong and diversified engineering pipeline supported by a business-oriented education system, and access to abundant capital. Take the famous (though controversial) Shanghai higher education ranking in computer science: France sits in the 76-100 group with the University of Bordeaux; in the 101-150 group for the highly regarded Ecole Normale Supérieure; and the much celebrated Ecole Polytechnique lies deep in the 150-200 group – with performance slowly degrading over the last ten years and a minuscule faculty of… 7 computer science professors and assistant professors. That’s the reality of computer science education in the most prestigious engineering school in France. As for access to capital, two numbers say it all: according to its own trade association, the French venture capital sector is 1/33rd the size of America’s, while the GDP ratio is only 1 to 6. That’s for 2013; in 2012, the ratio was 1/46th, so things are improving.

The structural weakness of French tech clearly isn’t Google’s fault. Which reveals the ideological facts-be-damned nature of the blame, an attitude broadly shared by other European countries.

A few weeks ago, a surreal event took place in Paris at the Cité Universitaire Internationale de Paris (which wants to look like a Cambridge replica). There, the Open Internet Project unveiled the next European antitrust action against Google. On stage was a disparate crew: media executives from German and French companies; the antitrust litigator Gary Reback, known for his fight against Microsoft in the Nineties – and now said to be helping Microsoft in its fight against Google; Laurent Alexandre, a strange surgeon/entrepreneur and self-proclaimed visionary living in Luxembourg (his company, DNA Vision, is headquartered in Brussels), who almost got a standing ovation for explaining how Google intends to connect our brains to its gigantic neuronal network by around 2040; all of the above wrapped up with a speech from French Economy Minister Arnaud Montebourg, who never misses an opportunity to apply his government’s seal to anti-imperialist initiatives.

The lawsuit alleges market distortion practices, discrimination in several guises, anticompetitive conduct, preference for its own vertical services at the expense of fairness in its search results, illegal use of data, etc. (The summary of EU allegations is here). The complaint paves the way for painstaking litigation that will drag on for years.

Among the eleven corporations or trade groups funding the lawsuit we find seven media entities, including the giant German Axel Springer Group and Lagardère Active, whose boss invoked the “moral obligation” to fight Google. There is also CCM Benchmark Group, a large diversified digital player whose boss, Benoît Sillard, had his own epiphany while speaking with Nikesh Arora in Mountain View a while ago. There and then, Mr. Sillard saw the search giant’s grand plan to dominate the digital world. (I have paid a couple of visits to Google’s headquarters but was never granted such a religious experience – I will try again, I promise.)

Despite the media industry’s weight, the lawsuit fails to expose Google practices that directly affect the P&L of news providers. Indeed, some media companies have developed businesses that compete with Google verticals. That’s the case of Lagardère’s shopping site LeGuide.com but, again, the group’s CEO, Denis Olivennes, was long on whining and short on relevant facts. (The only fun element he mentioned was outside the scope of OIP’s legal action: with only €50m in revenue, LeGuide.com paid the same amount of taxes as Google, whose French operation generates $1.6bn in revenue.)

Needless to say, that doesn’t mean that Google couldn’t be using its power in questionable ways at the expense of scores of e-retailers. But as far as the media sector is concerned, gains largely outweigh losses as most web sites enjoy a boost in their traffic thanks to Google Search and Google News. (The value of Google-generated clicks is extremely difficult to assess — a subject for a future Monday Note.)

One fact remains obvious: In this legal action, media groups are being played to defend interests… that are not theirs.

In this whole affair, the French news media industry is putting itself in an awkward position. In February 2013, Google and the French government hammered out a deal in which the tech giant committed €60m ($81m) over a 3-year period to fund digital projects run by the French press. (In 2013, according to the fund’s report, 23 projects were started, totaling €16m in funding.) The agreement between Google and the French press stipulates that, for the duration of the deal, the French will refrain from suing Google on copyright grounds – such as over the use of snippets in search results. But those who signed the deal found themselves dragged into the OIP lawsuit through the GESTE, a legacy trade association – more talkative than effective – going back to the Minitel era, which supports the OIP lawsuit on antitrust rather than copyright grounds. (Those who signed the Google Fund agreement issued a convoluted communiqué to distance themselves from the OIP initiative.)

In Mountain View, many are upset by French media that, on one hand, get hefty subsidies and, on the other, file an anti-Google suit before the European Court of Justice. “Back home, the [Google] Fund always had its opponents”, a Google exec told me, “and now they have reasons to speak louder…” Will they be heard? It is unlikely that Google will pull the plug on the Fund, I’m told. But the people I talk to also say that any renewal, under any form, now looks unlikely. So does the extension of an innovation funding scheme to Germany — or elsewhere. “Google is at a loss when trying to develop peaceful relations with the French”, another Google insider told me… “We put our big EMEA [Europe, Middle East and Africa] headquarters in Paris, we created a nicely funded Cultural Institute, we fueled the innovation fund for the press, and now we are bitten by the same people who take our subsidies…”

Regardless of its merits, the European press’ involvement in this antitrust case is ill-advised. It might throw the relationship with Google back to the Ice Age. As another Google exec said to me: “News media should not forget that we don’t need them to thrive…”

–frederic.filloux@mondaynote.com

 

Puzzling Over Google’s Nest Acquisition

 

Looking past the glitter, big names, and big money ($3.2B), a deeper look at Google’s latest move doesn’t yield a good theory. Perhaps because there isn’t one.

Last week’s Monday Note used the “Basket of Remotes” problem as a proxy for the many challenges to the consumer version of the IoT, the Internet of Things. Automatic discovery, two-way communication, multi-vendor integration, user-interface and network management complexity… until our home devices can talk to each other, until they can report their current states, functions, and failure modes, we’re better off with individual remotes than with a confusing — and confused — universal controller.

After reading the Comments section, I thought we could put the topic to rest for a while, perhaps until devices powered by Intel’s very low-power Quark processor start shipping.

Well…

A few hours later, Google announced its $3.2B acquisition (in cash) of Nest, the maker of elegant connected thermostats and, more recently, of Nest Protect smoke and CO alarms. Nest founder Tony Fadell, often referred to as “one of the fathers of the iPod”, takes his band of 100 ex-Apple engineers and joins Google; the Mountain View giant pays a hefty premium, about 10 times Nest’s estimated yearly revenue of $300M.

Why?

Tony Fadell mentioned “scaling challenges” as a reason to sell to Google rather than go it alone. He could have raised more money — he was actually ready to close a new round, $150M at a $2B valuation — but chose adoption instead.

Let’s decode “scaling challenges”. First, the company wants to raise money because profits are too slim to finance growth. Then, management looks at the future and doesn’t like the profit picture: revenue will grow, but profits will not scale up, meaning today’s meager percentage will not expand. Hard work for low profits.

(Another line of thought would be Supply Chain Management scaling challenges, that is, the difficulty of managing manufacturing contractors in China, distributors, and customer support. This doesn’t make sense. Nest’s product line is simple: two products. Running manufacturing contractors isn’t black magic; it is now a well-understood trade. There are even contractors to manage contractors; two of my friends do just that for US companies.)

Unsurprisingly, many worry about their privacy. The volume and tone of their comments reveal a growing distrust of Google. Is Nest’s expertise at connecting the devices in our homes simply a way for Google to know more about us? What will they do with my energy and time data? In a blog post, Fadell attempts to reassure:

“Will Nest customer data be shared with Google?
Our privacy policy clearly limits the use of customer information to providing and improving Nest’s products and services. We’ve always taken privacy seriously and this will not change.”

What else could Fadell offer besides this perfunctory reassurance?  “[T]his will not change”… until it does. Let’s not forget how so many tech companies change their minds when it suits them. Google is no exception.

This Joy of Tech cartoon neatly summarizes the privacy concern:

[Joy of Tech cartoon: “Thermostats”]

The people, the brands, and the money provide enough energy to provoke less-than-thoughtful reactions. A particularly agitated blogger, who can never pass up a rich opportunity to entertain us – and to troll for pageviews – starts by arguing that Apple ought to have bought Nest:

“Nest products look like Apple products. Nest products are beloved by people who love Apple products. Nest products are sold in Apple stores.
Nest, in short, looked like a perfect acquisition for Apple, which is struggling to find new product lines to expand into and has a mountain of cash rotting away on its balance sheet with which it could buy things.
[...] Google’s aggressiveness has once again caught Apple snoozing. And now a company that looked to be a perfect future division of Apple is gone for good.”

Let’s slow down. Besides Nest itself, two companies have the best data on Nest’s sales, returns, and customer service problems: Apple and Amazon. Contrary to the “snoozing” allegation, Apple Store activity told Apple exactly the what, the how, and the how much of Nest’s business. According to local VC lore, Nest’s gross margins are low and don’t rise much above customer support costs. (You can find a list of Nest’s investors here. Some, like Kleiner Perkins and Google Ventures, have deep links to Google… This reminds many of the YouTube acquisition: several selling VCs were also Google investors, and one sat on Google’s Board. YouTube was bleeding money and Google had to “bridge” it, that is, loan it money before the transaction closed.)

See also Amazon’s product reviews page; feelings about the Nest thermostat range from enthusiastic to downright negative.

The “Apple ought to have bought Nest because it’s so Apple-like” meme points to an enduring misunderstanding of Apple’s business model. The Cupertino company has one and only one money pump: personal computers, whether in the form of smartphones, tablets, or conventional PCs. Everything else is a supporting player, helping to increase the margins and volume of the main money makers.

A good example is Apple TV: Can it possibly generate real money at $100 a puck? No. But the device expands the ecosystem, and so makes MacBooks, iPads, and iPhones more productive and pleasant. Even the App Store with its billions in revenue counts for little by itself. The Store’s only mission is to make iPhones and iPads more valuable.

With this in mind, what would be the role of an elegant $249 thermostat in Apple’s ecosystem? Would it add more value than an Apple TV does?

We now turn to the $3.2B price tag. The most Apple has ever paid for an acquisition was $429M (plus 1.5M Apple shares), and that was for… NeXT, an entire operating system that revitalized the Mac. It was a veritable bargain. More recently, in 2012, it acquired AuthenTec for $356M.

With rare exceptions (I can think of one: Quattro Wireless), Apple acquires technologies, not businesses. Even if Apple were in the business of buying businesses, a $300M enterprise such as Nest wouldn’t move the needle. For an Apple whose revenue will approach or exceed $200B this calendar year, Nest would represent about 0.15% of the company’s revenue.

Our blogging seer isn’t finished with the Nest thermostat:

“I was seduced by the sexy design, remote app control, and hyperventilating gadget-site reviews of Nest’s thermostat. So I bought one.”

But, ultimately, he never used the device. Bad user feedback turned him off:

“[…] after hearing of all these problems, I have been too frightened to actually install the Nest I bought. So I don’t know whether it will work or not.”

He was afraid to install his Nest… but Apple should have bought the company?

So, then, why Google? We can walk through some possible reasons.

First, the people. Tony Fadell’s team is justly admired for their design skills. They will come in handy if Google finally gets serious about selling hardware, if it wants to generate new revenue in multiples of $10B (its yearly revenue is approximately $56B now). Of course, this means products other than just thermostats and smoke alarms. It means products that can complement Google’s ad business with its 60% Gross Margin.

Which leads us to a possible second reason: Nest might have a patent portfolio that Google wants to add to its own IP arsenal. Fadell and his team surely have filed many patents.

But… $3.2B worth of IP?

This leaves us with the usual questions about Google’s real business model. So far, it’s even simpler than Apple’s: Advertising produces 115% or more of Google’s profits. Everything else brings the number back down to 100%. Advertising is the only money machine, all other activities are cost centers. Google’s hope is that one of these cost centers will turn into a new money machine of a magnitude comparable to its advertising quasi-monopoly.

On this topic, I once again direct you to Horace Dediu’s blog. In a post titled Google’s Three Ps, Horace takes us through the basics of a business: People, Processes, and Purpose:

“This is the trinity which allows for an understanding of a complex system: the physical, the operational and the guiding principle. The what, the how and the why.”

Later, Horace points to Google’s management reluctance to discuss its Three Ps:

“There is a business in Google but it’s a very obscure topic. The ‘business side’ of the organization is only mentioned briefly in analyst conference calls and the conversation is not conducted with the same team that faces the public. Even then, analysts who should investigate the link between the business and its persona seem swept away by utopian dreams and look where the company suggests they should be looking (mainly the future.)
There are almost no discussions of cost structures (e.g. cost of sales, cost of distribution, operations and research), operating models (divisional, functional or otherwise) or of business models. In fact, the company operates only one business model which was an acquisition, reluctantly adopted.”

As usual — or more than usual in current circumstances — the entire post is worth a meditative read. Especially for its interrogation at the end:

“The trouble lies in that organization also having de-facto control over the online (and hence increasingly offline) lives of more than one billion people. Users, but not customers, of a company whose purpose is undefined. The absence of oversight is one thing, the absence of an understanding of the will of the leadership is quite another. The company becomes an object of faith alone.  Do we believe?”

Looking past the glitter, the elegant product, the smart people, do we believe there is a purpose in the Nest acquisition? Or is Google simply rolling the dice, hoping for an IoT breakthrough?

JLG@mondaynote.com

 

News: Personalized or Serendipitous?

 

Every digital news designer faces the question: should the traditional serendipity of contents be preserved or should we go full steam for personalization? It turns out Google is already working on ways to combine both — on its usual grand scale.

Serendipity always seemed inseparable from journalism. For any media product, taking readers away from their main center of interest is part of the fabric. I go on a website for a morning update and soon find myself captured by crafty editing that will drive me to read up on a subject that was, until now, alien to me. That’s the beauty of a great news package.

Or is it still the case? Isn’t it a mostly generational inclination? Does a Gen Y individual really care about being drawn to a science story when getting online to see sports results?

Several elements contribute to the erosion of serendipity and, more generally, of curiosity.

First, behaviors among digital readers are evolving, and these changes extend far beyond generations: regardless of her age, today’s reader is short on time. At every moment of the day (except, maybe, in the loo or in bed at night), her reading time is slashed by multiple stimuli: social teases, incoming mail, alerts, or simply succumbing to distractions that lie just one click (or one app) away. That’s one of the tragedies of traditional news outlets: when it comes to retaining the commuter’s attention, for instance, Slate or The Washington Post are in direct competition with addictive products such as Facebook or Angry Birds…

Second, the old “trusted news brand” notion is going away. Young people can’t be bothered to leaf through several titles to get their feed on a variety of topics; that’s why aggregators thrive. The more innocuous ones, such as Mediagazer, mostly send traffic back to the original news provider; but legions of others (Business Insider, The Huffington Post…) melt news brands into their own, repackage contents with eye-grabbing headlines and boost the whole package with aggressive marketing.

Below, see how BuzzFeed summed up the New York Times story on the NSA monitoring social traffic: 80 words in BuzzFeed that capture the substance of a 2,000-word article by two experienced journalists who collected exclusive documents and reported from Washington, New York and Berlin.

(Note that BuzzFeed serves a more appealing headline and a livelier photograph of General Keith Alexander, head of the NSA.) How many BuzzFeed glancers clicked on the link back to the original story? I’d bet no more than 5%. (Anyway, judging by the 500 comments that followed it, the NYT did well with its article.) This trend also explains why the Times is working on new digital products that take into account both time scarcity and the Gen Y way with news.

This leads us to the third reason to wonder about personalization: the economics of digital news. In the devastated landscape of online advertising, it has become more critical than ever to structure news content with the goal of retaining readers within a site. That’s why proper tagging, use of metadata, semantic recommendation engines and topic-page entries are so important. More pages per visit means more ad exposure, and thus more revenue. Again, pure players excel at providing incentives to read more stuff within their own environment, thus generating more page views.

Coming back to the customization issue, should we turn the dial all the way? Or should we preserve at least some of the fortuitous discovery that was always part of the old media’s charm?

Let’s first get rid of the idea of the reader presetting his/her own preferences. No one does it. At least for mainstream products. Therefore, news customization must rely on technology, not human input.

Last week, I spoke with Richard Gingras, the senior director of news and social products at Google (in other words, he oversees Google News and Google+ from an editorial and business perspective). Richard is a veteran of the news business. Among many things, he headed Salon.com, one of the first and best online publications ever.

[Photo: Richard Gingras]

According to him, “Today’s news personalization is very unsophisticated. We look at your news reading patterns, we determine that you looked at five stories about the Arab Spring and we deduce you might like articles about Egypt. This is not how it should work. In fact, you might be interested in many other things, such as the fall from grace of dictators, generation-driven revolutions, etc. That requires understanding concepts.” And that’s a matter Google is working on, he says. Not only for news, but for products such as Google Now, the main application of Google’s efforts in predictive search. (Read, for example, With Personal Data, Predictive Apps Stay a Step Ahead in the MIT Technology Review, or Apps That Know What You Want, Before You Do in the NYTimes.)
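The difference between keyword matching and “understanding concepts” can be sketched in a toy example. The data below is entirely invented for illustration; real systems would infer concepts statistically rather than from a hand-written table:

```python
# What the reader actually read (invented titles):
history = ["Arab Spring protests", "Mubarak steps down"]

# A concept layer maps each story to broader ideas, so recommendations
# are not limited to stories sharing literal keywords.
concepts = {
    "Arab Spring protests": {"revolution", "Middle East"},
    "Mubarak steps down": {"fall of dictators", "Middle East"},
    "Gaddafi's last days": {"fall of dictators", "Middle East"},
    "Youth-driven uprisings worldwide": {"revolution"},
    "Egypt travel guide": {"Middle East"},
}

# Candidate stories are ranked by how many concepts they share with the
# reader's history, not by keyword overlap with past headlines.
read_concepts = set().union(*(concepts[t] for t in history))
candidates = [t for t in concepts if t not in history]
ranked = sorted(candidates,
                key=lambda t: len(concepts[t] & read_concepts),
                reverse=True)
print(ranked[0])  # "Gaddafi's last days" — no keyword in common with the history
```

The point of the sketch: a reader of Arab Spring stories gets a “fall of dictators” story that shares not a single word with what she read, which is exactly the leap Gingras describes.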

The idea is to connect all of Google’s knowledge, from the individual level to his/her social-group context, and beyond. This incredibly granular analysis of personal preferences and inclinations, set against the macro scale of the digital world, is at the core of the search giant’s strategy, as summed up below:

[Chart: Google’s data-collection and knowledge architecture]

On top of this architecture, Google is developing techniques aimed at capturing the precious “signals” needed to serve more relevant contents, explains Richard Gingras. Not only in the direct vicinity of a topic, but based on centers of interest drawn from concepts associated with individuals’ online patterns, analyzed in a wider context. In doing so, Gingras underlines the ability of Google News to develop a kind of educated serendipity (the term is mine) as opposed to narrowing the user’s mind by serving her the unrefined output of a personalization engine. In other words, based on your consumption of news, your search patterns, and a deep analysis (semantics, tonality, implied emotions) of your mail and your posts — matched against hundreds of millions of others — Google will be able to suggest a link to the profile of an artist in Harper’s when you dropped by Google News to check on Syria. That’s not customized news in the restricted sense, but it’s not straightforward serendipity either. It’s Google’s way of anticipating your intellectual and emotional wishes. Fascinating and scary.

frederic.filloux@mondaynote.com

 

Goodbye Google Reader

 

Three months ago, Google announced the “retirement” of Google Reader as part of the company’s second spring cleaning. On July 1st — two weeks from today — the RSS application will be given a gold watch and a farewell lunch, then it will pack up its bits and leave the building for the last time.

The other items on Google’s spring cleaning list, most of which are tools for developers, are being replaced by superior (or simpler, friendlier) services: Are you using CalDAV in your app? Use the Google Calendar API, instead; Google Map Maker will stand in for Google Building Maker; Google Cloud Connect is gone, long live Google Drive.

For Google Reader’s loyal following, however, the company had no explanation beyond a bland “usage has declined”, and it offered no replacement nor even a recommendation other than a harsh “get your data and move on”:

Users and developers interested in RSS alternatives can export their data, including their subscriptions, with Google Takeout over the course of the next four months.

The move didn’t sit well with users whose vocal cords were as strong as their bond to their favorite blog reader. James Fallows, the polymathic writer for The Atlantic, expressed a growing distrust of the company’s “experiments” in A Problem Google Has Created for Itself:

I have already downloaded the Android version of Google’s new app for collecting notes, photos, and info, called Google Keep… Here’s the problem: Google now has a clear enough track record of trying out, and then canceling, “interesting” new software that I have no idea how long Keep will be around… Until I know a reason that it’s in Google’s long-term interest to keep Keep going, I’m not going to invest time in it or lodge info there.

The Washington Post’s Ezra Klein echoed the sentiment (full article here):

But I’m not sure I want to be a Google early adopter anymore. I love Google Reader. And I used to use Picnik all the time. I’m tired of losing my services.

What exactly did Google Reader provide that got its users, myself included, so excited, and why do we take its extermination so personally?

Reading is, for some of us, an addiction. Sometimes the habit turns profitable: The hours I spent poring over computer manuals on Saturday mornings in my youth may have seemed cupidic at the time, but the “research” paid off.

Back before the Web flung open the 10,000 Libraries of Alexandria that I dreamed of in the last chapter of The Third Apple, my reading habit included a daily injection of newsprint. But as online access to real-world dailies became progressively more ubiquitous and easier to manage, I let my doorstep subscriptions lapse (although I’ll always miss the wee-hour thud of the NYT landing on our porch… an innocent pleasure unavailable in my country of birth).

Nothing greased the move to all-digital news as much as the RSS protocol (Really Simple Syndication, to which my friend Dave Winer made crucial contributions). RSS lets you syndicate your website by adding a few lines of XML. To subscribe, a user simply pushes a button. When you update your blog, the new post automatically appears in the user’s chosen “feed aggregator”.
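A minimal sketch of how an aggregator consumes such a feed, using only the Python standard library; the feed itself is a made-up example, not any real blog’s:

```python
import xml.etree.ElementTree as ET

# A tiny, hypothetical RSS 2.0 feed, as a blog would publish it.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>https://example.com</link>
    <item>
      <title>First post</title>
      <link>https://example.com/first-post</link>
      <pubDate>Mon, 17 Jun 2013 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

def list_entries(feed_xml):
    """Return (title, link) pairs for each item, as a feed reader would."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

print(list_entries(FEED))  # [('First post', 'https://example.com/first-post')]
```

An aggregator like Reader simply polls each subscribed feed URL on a schedule and shows whichever items it hasn’t seen before.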

RSS aggregation applications and add-ons quickly became a very active field as this link attests. Unfortunately, the user interfaces for these implementations – how you add, delete, and navigate subscriptions — often left much to be desired.

Enter Google Reader, introduced in 2005. Google’s RSS aggregator mowed down everything in its path as it combined the company’s Cloud resources with a clean, sober user interface that was supported by all popular browsers…and the price was right: free.

I was hooked. I just checked: I have 60 Google Reader subscriptions. But the number is less important than the way the feeds are presented: I can quickly search for subscriptions, group them in folders, search through past feeds, email posts to friends, fly over article summaries, and all of this is made even easier through simple keyboard shortcuts (O for Open, V for a full View on the original Web page, Shift-A to declare an entire folder as Read).

Where I once read four newspapers with my morning coffee I now open my laptop or tablet and skim my customized, ever-evolving Google Reader list. I still wonder at the breadth and depth of available feeds, from dissolute gadgetry to politics, technology, science, languages, cars, sports…

I join the many who mourn Google Reader’s impending demise. Fortunately, there are alternatives that now deserve more attention.

I’ll start with my Palo Alto neighbor, Flipboard. More than just a Google Reader replacement, Flipboard lets you compose and share personalized magazines. It’s very well done although, for my own daily use, its very pretty UI gets in the way of quickly surveying the field of news I’m interested in. Still, if you haven’t loaded it onto your iOS or Android device, you should give it a try.

Next we have Reeder, a still-evolving app that’s available on the Mac, iPhone, and iPad. It takes your Google Reader subscriptions and presents them in a “clean and well-lighted” way.

For me, Feedly looks like the best way to support one’s reading habit (at least for today). Feedly is offered as an app on iOS and Android, and as extensions for Chrome, Firefox, and Safari on your laptop or desktop (PC or Mac). Feedly is highly customizable: Personally, I like the ability to emulate Reader’s minimalist presentation; others will enjoy a richer, more graphical preview of articles. For new or “transferring” users, it offers an excellent Feedback and Knowledge Base page.

Feedly makes an important and reassuring point: There might be a paid-for version in the future, a way to measure the app’s real value, and to create a more lasting bond between users and the company.

There are many other alternatives: a Google search for “Google Reader replacement” (the entire phrase) yields nearly a million hits (interestingly, Bing comes up with only 35k).

This brings us back to the unanswered question: Why did Google decide to kill a product that is well-liked and well-used by well-informed (and I’ll almost dare to add: well-heeled) users?

I recently went to a Bring Your Parents to Work day at Google. (Besides comrades of old OS Wars, we now have a child working there.) The conclusion of the event was the weekly TGIF-style bash (which is held on Thursdays in Mountain View, apparently to allow Googlers in other time zones to participate). Both founders routinely come on stage to make announcements and answer questions.

Unsurprisingly, someone asked Larry Page a question about Google Reader and got the scripted “too few users, only about a million” non-answer, to which Sergey Brin couldn’t help but quip that a million is about the number of remote viewers of the Google I/O developer conference Page had just bragged about. Perhaps the decision to axe Reader wasn’t entirely unanimous. And never mind the fact that Feedly already seems to have 3 million subscribers.

The best explanation I’ve read (on my Reader feeds) is that Google wants to draw the curtain, perform some surgery, and reintroduce its RSS reader as part of Google+, perhaps with some Google Now thrown in:

While I can’t say I’m a fan of squirrelly attempts to draw me into Google+, I must admit that RSS feeds could be a good fit… Stories could appear as bigger, better versions of the single-line entry in Reader, more like the big-photo entries that Facebook’s new News Feed uses. Even better, Google+ entries have built in re-sharing tools as well as commenting threads, encouraging interaction.

We know Google takes the long view, often with great results. We’ll see if killing Reader was a misstep or another smart way to draw Facebook users into Google’s orbit.

It may come down to a matter of timing. For now, Google Reader is headed for the morgue. Can we really expect that Google’s competitors — Yahoo!, Facebook, Apple, Microsoft — will resist the temptation to chase the ambulance?

–JLG@mondaynote.com

 

Google News: The Secret Sauce

 

A closer look at Google’s patent for its news retrieval algorithm reveals a greater than expected emphasis on quality over quantity. Can this bias stay reliable over time?

Ten years after its launch, Google News’ raw numbers are staggering: 50,000 sources scanned, 72 editions in 30 languages. Google’s crippled communication machine, plagued by bureaucracy and paranoia, has never been able to come up with tangible facts about its benefits for the news media it feeds on. Its official blog merely mentions “6 billion visits per month” sent to news sites, and Google News claims to connect “1 billion unique users a week to news content” (to put things in perspective, NYTimes.com and the Huffington Post each cruise at about 40 million UVs per month). Assuming the clicks are sent to a relatively fresh news page bearing higher-value advertising, the six billion visits can translate into about $400 million per year in ad revenue. (This is based on a $5 to $6 revenue per 1,000 pages, i.e. a few dollars in CPM per single ad, depending on format, type of selling, etc.) That’s a very rough estimate. Again: Google should settle the matter and come up with accurate figures for its largest markets. (On the same subject, see a previous Monday Note: The press, Google, its algorithm, their scale.)
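The $400 million figure can be sanity-checked with simple arithmetic, assuming the midpoint of the $5 to $6 range and one page view per visit:

```python
# Back-of-the-envelope check of the article's estimate.
visits_per_month = 6_000_000_000      # "6 billion visits per month"
revenue_per_page = 5.5 / 1000         # dollars per page, midpoint of $5-$6 per 1,000

annual_revenue = visits_per_month * 12 * revenue_per_page
print(f"${annual_revenue / 1e6:.0f}M per year")  # roughly $400M
```

The estimate is obviously sensitive to both assumptions; at $5 per 1,000 pages the figure drops to $360M, at $6 it rises to $432M.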

But how exactly does Google News work? What kind of media does its algorithm favor most? Last week, the search giant updated its patent filing with a new document detailing the thirteen metrics it uses to retrieve and rank articles and sources for its news service. (Computerworld unearthed the filing; it’s here.)

What follows is a summary of those metrics, listed in the order shown in the patent filing, along with a subjective appreciation of their reliability, vulnerability to cheating, relevancy, etc.

#1. Volume of production from a news source:

A first metric in determining the quality of a news source may include the number of articles produced by the news source during a given time period [week or month]. [This metric] may be determined by counting the number of non-duplicate articles produced by the news source over the time period [or] counting the number of original sentences produced by the news source.

This metric clearly favors production capacity. It benefits big media companies deploying large staffs. But the system can also be cheated by content farms (Google already addressed these questions); new automated content creation systems are gaining traction, many of them could now easily pass the Turing Test.
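A toy sketch of what counting “non-duplicate articles produced by the news source” might look like; real deduplication relies on similarity measures far beyond the exact-match stand-in used here, and the sample data is invented:

```python
from collections import Counter

def volume_score(articles):
    """articles: list of (source, body) tuples for one time window.
    Returns non-duplicate article counts per source (metric #1)."""
    seen = set()
    counts = Counter()
    for source, body in articles:
        key = (source, body)        # stand-in for shingling / near-duplicate detection
        if key not in seen:
            seen.add(key)
            counts[source] += 1
    return counts

sample = [("SourceA", "story one"),
          ("SourceA", "story one"),   # duplicate, not counted twice
          ("SourceA", "story two"),
          ("SourceB", "story three")]
print(volume_score(sample))  # SourceA: 2, SourceB: 1
```

The patent’s alternative, counting original sentences rather than articles, would simply change the granularity of the `key`.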

#2. Length of articles. Plain and simple: the longer the story (on average), the higher the source ranks. This is bad news for aggregators whose digital serfs cut, paste, compile and mangle abstracts of news stories that real media outlets produce at great expense.

#3. “The importance of coverage by the news source”. To put it another way, this matches the volume of coverage by the news source against the general volume of text generated by a topic. Again, it rewards large resource allocation to a given event. (In New York Times parlance, such effort is called “flooding the zone”.)

#4. The “Breaking News Score”:   

This metric may measure the ability of the news source to publish a story soon after an important event has occurred. This metric may average the “breaking score” of each non-duplicate article from the news source, where, for example, the breaking score is a number that is a high value if the article was published soon after the news event happened and a low value if the article was published after much time had elapsed since the news story broke.

Beware, slow-moving newsrooms: On this metric, you’ll be competing against more agile, maybe less scrupulous staffs that “publish first, verify later”. This requires smart arbitrage by news producers. Once the first headline has been pushed, they’ll have to decide what’s best: immediately filing a follow-up, or waiting a bit and moving a longer, more value-added story that will rank better on metrics #2 and #3? It depends on elements such as the size of the “cluster” (the number of stories pertaining to a given event).

#5. Usage Patterns:

Links going from the news search engine’s web page to individual articles may be monitored for usage (e.g., clicks). News sources that are selected often are detected and a value proportional to observed usage is assigned. Well known sites, such as CNN, tend to be preferred to less popular sites (…). The traffic measured may be normalized by the number of opportunities readers had of visiting the link to avoid biasing the measure due to the ranking preferences of the news search engine.

This metric is at the core of Google’s business: assessing the popularity of a website thanks to the various PageRank components, including the number of links that point to it.
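The normalization described in the patent excerpt, dividing clicks by the opportunities readers had to click, can be illustrated in a few lines (the function name and numbers are illustrative, not from the patent):

```python
def normalized_usage(clicks, impressions):
    """Clicks divided by the number of times the link was shown, so a
    source ranked high by the engine itself doesn't get credit merely
    for its prominent placement."""
    return clicks / impressions if impressions else 0.0

# A less-promoted source can outscore a heavily promoted one:
print(normalized_usage(50, 1000))   # 0.05  shown often, clicked rarely
print(normalized_usage(30, 300))    # 0.1   shown less, clicked more often
```

Without this correction, the metric would be circular: the engine’s own ranking choices would inflate the apparent popularity of whatever it already ranks first.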

#6. The “Human opinion of the news source”:

Users in general may be polled to identify the newspapers (or magazines) that the users enjoy reading (or have visited). Alternatively or in addition, users of the news search engine may be polled to determine the news web sites that the users enjoy visiting. 

Here, things get interesting. Google clearly states it will use third-party surveys to detect the public’s preference among various media outlets — not only their websites, but also their “historic” media assets. According to the patent filing, the evaluation could also include the number of Pulitzer Prizes the organization has collected and the age of the publication. That’s for the known part. What lies behind the notion of “human opinion” is a true “quality index” for news sources, one that is not necessarily correlated to their digital presence. Such factors clearly favor legacy media.

#7. Audience and traffic. Not surprisingly, Google relies on stats coming from Nielsen NetRatings and the like.

#8. Staff size. The bigger a newsroom is (as detected in bylines) the higher the value will be. This metric has the merit of rewarding large investments in news gathering. But it might become more imprecise as “large” digital newsrooms tend now to be staffed with news repackagers bearing little added value.

#9. Numbers of news bureaus. It’s another way to favor large organizations — even though their footprint tends to shrink both nationally and abroad.

#10. Number of “original named entities”. That’s one of the most interesting metrics. A “named entity” is “the name of a person, place or organization”. It’s the primary tool for semantic analysis.

If a news source generates a news story that contains a named entity that other articles within the same cluster (hence on the same topic) do not contain, this may be an indication that the news source is capable of original reporting.

Of course, some cheaters insert misspelled entities to create “false” original entities and fool the system (Google took care of it). But this metric is a good way to reward original source-finding.
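One plausible way to compute “original named entities” within a cluster, assuming entity extraction has already been done upstream (the sources and entities below are invented):

```python
def original_entities(cluster):
    """cluster: {source: set of named entities found in its article}.
    Returns, per source, the entities no other article in the same
    cluster mentions -- a hint of original reporting (metric #10)."""
    return {
        source: entities - set().union(
            *(e for s, e in cluster.items() if s != source))
        for source, entities in cluster.items()
    }

cluster = {
    "WireA":  {"Syria", "Damascus"},
    "PaperB": {"Syria", "Damascus", "Col. X"},  # one entity nobody else has
}
print(original_entities(cluster))
```

Here “PaperB” gets credit for the one entity it alone surfaced, which is exactly why cheaters try to inject misspelled entities: a garbled name looks, mechanically, like an original one.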

#11. The “breadth” of the news source. It pertains to the ability of a news organization to cover a wide range of topics.

#12. The global reach of the news source. Again, it favors large media that are viewed, linked, quoted, “liked”, and tweeted from abroad.

This metric may measure the number of countries from which the news site receives network traffic. In one implementation consistent with the principles of the invention, this metric may be measured by considering the countries from which known visitors to the news web site are coming (e.g., based at least in part on the Internet Protocol (IP) addresses of those users that click on the links from the search site to articles by the news source being measured). The corresponding IP addresses may be mapped to the originating countries based on a table of known IP block to country mappings.

#13. Writing style. In the Google world, this means statistical analysis of content against a huge language model to assess “spelling correctness, grammar and reading levels”.
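As an illustration of what a mechanical “reading level” score looks like, here is the classic Flesch reading-ease formula with a crude vowel-group syllable heuristic; Google’s statistical language models are of course far richer than this sketch:

```python
import re

def syllables(word):
    # Rough heuristic: count groups of consecutive vowels, minimum 1.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    """Higher scores mean easier text (plain English sits around 60-70)."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllable_count = sum(syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / sentences)
            - 84.6 * (syllable_count / len(words)))

print(round(flesch_reading_ease("The cat sat on the mat. It was warm."), 1))
```

A score computed this way says nothing about accuracy or insight, which is precisely why writing style is only one of thirteen signals.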

What conclusions can we draw? This enumeration clearly shows Google intends to favor legacy media (print or broadcast news) over pure players, aggregators and digital-native organizations. All the features recently added, such as Editors’ Picks, reinforce this bias. The reason might be that legacy media are less prone to tricking the algorithm. For once, a known technological weakness becomes an advantage.

frederic.filloux@mondaynote.com

The Google Fund for the French Press

 

At the last minute, ending three months of tense negotiations, Google and the French Press hammered out a deal. More than yet another form of subsidy, this could mark the beginning of a genuine cooperation.

Thursday night, at 11:00pm Paris time, Marc Schwartz, the mediator appointed by the French government, got a call from the Elysée Palace: Google’s chairman Eric Schmidt was en route to meet President François Hollande the next day in Paris. They both intended to sign the agreement between Google and the French press on Friday at 6:15pm. Schwartz, along with Nathalie Collin, the chief representative for the French Press, was just out of a series of conference calls between Paris and Mountain View: Eric Schmidt and Google’s CEO Larry Page had green-lighted the deal. At 3am on Friday, the final draft of the memorandum was sent to Mountain View. But at 11:00am everything had to be redone: Google had made unacceptable changes, causing Schwartz and Collin to consider calling off the signing ceremony at the Elysée. Another set of conference calls ensued. The final-final draft, unanimously approved by the members of the IPG association (General and Political Information), was printed at 5:30pm, just in time for the gathering at the Elysée half an hour later.

The French President was in a hurry, too: That very evening, he was bound to fly to Mali, where French troops are waging a small but uncertain war to contain Al-Qaeda’s expansion in Africa. Never shy of political calculation, François Hollande seized the occasion to be seen as the one who forced Google to back down. As for Google’s chairman, co-signing the agreement along with the French President was great PR. As a result, negotiators from the Press were kept in the dark until Eric Schmidt’s plane landed in Paris Friday afternoon, shortly before heading to the Elysée. Both men underlined what they called “a world premiere”, a “historical deal”…

This agreement ends — temporarily — three months of difficult negotiations. Now comes the hard part.

According to Google’s Eric Schmidt, the deal is built on two stages:

“First, Google has agreed to create a €60 million Digital Publishing Innovation Fund to help support transformative digital publishing initiatives for French readers. Second, Google will deepen our partnership with French publishers to help increase their online revenues using our advertising technology.”

As always, the devil lurks in the details, most of which will have to be ironed out over the next two months.

The €60m ($82m) fund will be provided by Google over a three-year period; it will be dedicated to new-media projects. About 150 websites belonging to members of the IPG association will be eligible for submission. The fund will be managed by a board of directors that will include representatives from the Press and from Google, as well as independent experts. Specific rules are designed to prevent conflicts of interest. The fund will most likely be chaired by Marc Schwartz, the mediator, also a partner at the global audit firm Mazars (all parties praised his mediation and want him to take the job).

Turning to the commercial part of the pact: it is less publicized but at least as important as the fund itself. In a nutshell, using a wide array of tools ranging from advertising platforms to content distribution systems, Google wants to increase its business with the Press in France and elsewhere in Europe. Until now, publishers have been reluctant to use such tools because they don’t want to increase their reliance on a company they see as cold-blooded and ruthless.

Moving forward, the biggest challenge will be overcoming an extraordinarily high level of distrust on both sides. Google views the Press (especially the French one) as only too eager to “milk” it, and unwilling to genuinely cooperate in order to build and share value from the internet. The engineering-dominated, data-driven culture of the search engine is light-years away from the convoluted “political” approach of legacy media, which don’t understand, or look down on, the peculiar culture of tech companies.

Dealing with Google requires a mastery of two critical elements: technology (with the associated economics), and the legal aspect. Contractually speaking, it means transparency and enforceability. Let me explain.

Google is a black box. For good and bad reasons, it fiercely protects the algorithms that are key to squeezing money from the internet, sometimes one cent at a time — literally. If Google consents to a cut of, say, advertising revenue derived from a set of contents, the partner can’t really ascertain whether the cut truly reflects the underlying value of the asset jointly created – or not. Understandably, it bothers most of Google’s business partners: they are simply asked to be happy with the monthly payment they get from Google, no questions asked. Specialized lawyers I spoke with told me there are ways to prevent such opacity. While it’s futile to hope Google will lift the veil on its algorithms, inserting an audit clause in every contract can be effective; in practical terms, it means an independent auditor can be appointed to verify specific financial records pertaining to a business deal.

Another key element: From a European perspective, a contract with Google is virtually impossible to enforce. The main reason: Google won’t give up on a governing-law clause stipulating that contracts are to be “litigated exclusively in the Federal or State Courts of Santa Clara County, California”. In other words: Forget about suing Google if things go sour. Your expensive law firm based in Paris, Madrid, or Milan will try to find a correspondent in Silicon Valley, only to be confronted with polite rebuttals: For years now, Google has been parceling out multiple pieces of litigation among local law firms simply to make them unable to litigate against it. Your brave European lawyer will end up finding someone who will ask several hundred thousand dollars just to prepare, but not litigate, the case. The only way to prevent this is to put an arbitration clause in every contract. Instead of going before a court of law, the parties agree to settle the matter through a private tribunal. Attorneys say it offers multiple advantages: It’s faster and much cheaper, the terms of the settlement are confidential, and it carries the same enforceability as a court order.

Google (and all the internet giants, for that matter) usually refuses an arbitration clause, as well as the audit provision mentioned earlier. Which brings us to a critical element: In order to develop commercial relations with the Press, Google will have to find ways to accept collective bargaining instead of segmenting negotiations one company at a time. Ideally, the next round of discussions should come up with a general framework for all commercial dealings. That would be key to restoring some trust between the parties. For Google, it means giving up some amount of tactical as well as strategic advantage… that is part of its long-term vision. As Eric Schmidt states in his upcoming book “The New Digital Age” (the Wall Street Journal had access to the galleys):

“[Tech companies] will also have to hire more lawyers. Litigation will always outpace genuine legal reform, as any of the technology giants fighting perpetual legal battles over intellectual property, patents, privacy and other issues would attest.”

European media are warned: they must seriously raise their legal game if they want to partner with Google — and the agreement signed last Friday in Paris could help.

Having said that, I personally believe it could be immensely beneficial for digital media to partner with Google as much as possible. This company spends roughly two billion dollars a year refining its algorithms and improving its infrastructure. Thousands of engineers work on it. Contrast this with digital media: Small audiences, insufficient stickiness and low monetization plague both websites and mobile apps; the advertising model for digital information is mostly a failure — and that’s not Google’s fault. The Press should find a way to capture some of Google’s technical firepower and concentrate on what it does best: producing original, high-quality content, a business that Google is unwilling (and probably culturally unable) to engage in. Unlike Apple or Amazon, Google is relatively easy to work with (once the legal hurdles are cleared).

Overall, this deal is a good one. First of all, both sides are relieved to avoid a law (see last Monday Note Google vs. the press: avoiding the lose-lose scenario). A law declaring that snippets and links are to be paid-for would have been a serious step backward.

Second, it’s a departure from the notion of “blind subsidies” that has been plaguing the French Press for decades. Three months ago, the discussion started with irreconcilable positions: publishers were seeking absurd amounts of money (€70m per year, the equivalent of IPG members’ total ad revenue) and Google was focused on a conversion into business solutions. Now, all the people I talked to this weekend seem genuinely supportive of building projects, boosting innovation and also taking advantage of Google’s extraordinary engineering capabilities. The level of cynicism often displayed by the Press is receding.

Third, Google is changing. The fact that Eric Schmidt and Larry Page jumped in at the last minute to untangle the deal shows a shift of perception towards media. This agreement could be seen as a template for future negotiations between two worlds that still barely understand each other.

frederic.filloux@mondaynote.com

Google’s looming hegemony

 

If we factor in Google’s geospatial applications, its unique data-processing infrastructure, Android tracking, and more, we’re seeing the potential for absolute power over the economy. 

Large utility companies worry about Google. Why? Unlike those who mock Google for being a “one-trick pony”, with 99% of its revenue coming from AdWords, they connect the dots. Right before our eyes, the search giant is weaving a web of services and applications aimed at collecting more and more data about everyone and every activity. This accumulation of exabytes (and the ability to process such almost inconceivable volumes) is bound to impact sectors ranging from power generation to transportation and telecommunications.

Consider the following trends. At every level, Western countries are crumbling under their debt load. Nations, states, counties and municipalities are becoming unable to support the investment necessary to modernize, sometimes even to maintain, critical infrastructures. Globally, tax-raising capabilities are diminishing.

In a report about infrastructure in 2030 (500 pages PDF here), the OECD makes the following predictions (emphasis mine):

Through to 2030, annual infrastructure investment requirements for electricity, road and rail transport, telecommunications and water are likely to average around 3.5% of world gross domestic product (GDP).

For OECD countries as a whole, investment requirements in electricity transmission and distribution are expected to more than double through to 2025/30, in road construction almost to double, and to increase by almost 50% in the water supply and treatment sector. (…)

At present, governments are not well placed to meet these growing, increasingly complex challenges. The traditional sources of finance, i.e. government budgets, will come under significant pressure over the coming decades in most OECD countries – due to aging populations, growing demands for social expenditures, security, etc. – and so too will their financing through general and local taxation, as electorates become increasingly reluctant to pay higher taxes.

What’s the solution? The private sector will play a growing role through Public-Private Partnerships (PPPs). In these arrangements, a private company (or, more likely, a consortium) builds a bridge, a motorway, or a railroad for a city, region or state, at no expense to the taxpayer. It then reimburses itself from the project’s cash flow. Examples abound. In France, the elegant €320m ($413m) viaduct of Millau was built — and financed — by Eiffage, a construction group with €14 billion in revenue. In exchange for financing the viaduct, Eiffage was granted a 78-year toll concession with an expected internal rate of return ranging from 9.2% to 17.3%. Across the world, a growing number of projects are built using this type of mechanism.
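The quoted 9.2% to 17.3% range refers to the internal rate of return: the discount rate at which the concession’s cash flows have zero net present value. A minimal bisection-based sketch, applied to invented numbers (not Eiffage’s actual books):

```python
def npv(rate, cashflows):
    """Net present value of cashflows, one per year, at the given rate."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

def irr(cashflows, lo=0.0, hi=1.0):
    """Internal rate of return by bisection (npv decreases as rate rises)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if npv(mid, cashflows) > 0:
            lo = mid
        else:
            hi = mid
    return mid

# Hypothetical concession: 320 (millions) up front, 40 years of toll revenue.
flows = [-320.0] + [35.0] * 40
print(f"{irr(flows):.1%}")
```

This is the calculation, run over reams of traffic and maintenance data rather than two made-up numbers, that lets a consortium commit hundreds of millions over decades with a quantified risk.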

How can a company commit hundreds of millions of euros, dollars or pounds with an acceptable level of risk over several decades? The answer lies in data analysis and predictive models. Companies engineer credible cash-flow projections using reams of data on operations, usage patterns and component life cycles.

What does all this have to do with Google?

Take a transportation company building and managing networks of buses, subways or commuter trains in large metropolitan areas. Over the years, the analysis of tickets and passes will yield tons of data on customer flows, timings, train loads, etc. This is of the essence when assessing the market’s potential for a new project.

Now consider how Google aggregates the data it collects today — and what it will collect in the future. It’s a known fact that cellphones send geolocation data back to Mountain View (or Cupertino). Bouncing from one cell tower to another, or catching the signal of a geolocalized wifi transmitter, Android phone users are likely to be tracked in realtime even if the GPS function is turned off. Bring this (compounded and anonymized) dataset onto information-rich maps, including indoor ones, and you will get very high-definition profiles of who goes or stays where, at any time.

Let’s push it a bit further. Imagine a big city such as London, operating 500,000 security cameras, the bulk of the 1.85 million CCTVs deployed in the UK (one for every 32 citizens). 20,000 of them are in the subway system. The London Tube is the perfect candidate for partial or total privatization: it bleeds money and screams for renovations. In fact, as several people working at the intersection of geo applications and big data projects told me, Google would be well placed to provide the most helpful datasets. In addition to the circulation data coming from cellphones, Google could use facial recognition technology. As these algorithms are already able to differentiate a woman from a man, they will soon be able to identify (anonymously) ethnicities, ages, etc. Am I exaggerating? Probably not. Mercedes-Benz already has a database of 1.5 million visual representations of pedestrians to be fed into the software of its future self-driving cars. This is a type of application in which, by the way, Google possesses a strong lead, with its fleets of driverless Priuses crisscrossing Northern California and Nevada.

Coming back to the London Tube and its unhappy travelers: we have traffic data, to some degree broken down into demographic clusters; why not then add shopping data (also geo-tagged) derived from search and ad patterns, or Street View-related information? Why not also supplement all of the above with smart electrical grid analysis that could refine predictive models even further (every fraction of a percentage point counts)?

The value of such models is much greater than the sum of their parts. While public transportation operators or utility companies are already good at collecting and analyzing their own data, Google will soon be in the best position to provide powerful predictive models that aggregate and connect many layers of information. In addition, its unparalleled infrastructure and proprietary algorithms provide a unique ability to process these ever-growing datasets. That’s why many large companies around the world are concerned about Google’s ability to soon insert itself into their business.

frederic.filloux@mondaynote.com

 

The press, Google, its algorithm, their scale

 

In their fight against Google, traditional media firmly believe the search engine needs them to refine (and monetize) its algorithm. Let’s explore the facts.

The European press has gotten itself into a bitter battle against Google. In a nutshell, legacy media want money from the search engine: first, for the snippets of news it grabs and feeds into its Google News service; second, on a broader basis, for all the referencing Google builds with news media material. In Germany, the Bundestag is working on a bill to force all news aggregators to pay a toll; in France, the executive is pushing for a negotiated solution before year-end. Italy is more or less following the same path. (For a detailed and balanced background, see this Eric Pfanner story in the International Herald Tribune.)

In the controversy, an argument keeps rearing its head. According to the proponents of a “Google Tax”, media content greatly improves the contextualization of advertising. Therefore, the search engine giant ought to pay for such value. Financially speaking, without media articles Google would not perform as well as it does, hence the European media hunt for a piece of the pie.

Last week, digging for facts, I spoke with several people possessing deep knowledge of Google’s inner mechanics; they ranged from Search Engine Marketing specialists to a Stanford Computer Science professor who taught Larry Page and Sergey Brin back in the mid-’90s.

First of all, pretending to know Google is indeed… pretentious. In order to outwit both competitors and manipulators (a.k.a. Search Engine Optimization gurus), the search engine keeps tweaking its secret sauce. For the August-September period alone, Google made no fewer than 65 alterations to its algorithm (list here.) And that’s only the known part of the changes; in fact, Google allocates large resources to counter people who try to game its algorithm with an endless stream of tricks.

Maintaining such a moving target also preserves Google’s lead: along with its distributed computing capabilities (called MapReduce), its proprietary data storage system BigTable, its immense infrastructure, Google’s PageRank algorithm is at the core of the search engine’s competitive edge. Allowing anyone to catch up, even a little, is strategically inconceivable.

Coming back to the Press issues, let’s consider both quantitative and qualitative approaches. In the Google universe — currently about 40 billion indexed pages — content coming from media amounts to a small fraction. It is said to be a low single-digit percentage. To put things in perspective, on average, an online newspaper adds between 20,000 and 100,000 new URLs per year. Collectively, the scale roughly looks like millions of news articles versus a web growing by billions of pages each year.
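A quick back-of-the-envelope calculation shows why media content can only be a sliver of the index. The figures below come from the paragraph above, except the number of online newspapers worldwide, which is an assumed round number for illustration:

```python
# Rough scale comparison using the figures quoted above.
# Only the index size (~40 billion pages) and the per-newspaper URL range
# (20,000-100,000/year) come from the article; the newspaper count is assumed.

indexed_pages = 40e9               # approximate size of Google's index
newspapers = 10_000                # hypothetical count of online newspapers
urls_per_paper_per_year = 60_000   # midpoint of the 20,000-100,000 range

news_urls_per_year = newspapers * urls_per_paper_per_year
share = news_urls_per_year / indexed_pages

print(f"News URLs added per year: {news_urls_per_year:,.0f}")
print(f"Share of a 40bn-page index: {share:.2%}")
```

Even with generous assumptions, the result lands in the low single digits — consistent with the percentage cited above.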

Now, let’s consider the nature of searches. Using Google Trends for the last three months, the charts below rank the most searched terms in the United States, France and Germany (click to enlarge):


Do the test yourself by going to the actual page: you’ll notice that, except for large dominant American news topics (“Hurricane Sandy” or “presidential debate”), very few search results bring back content coming from mainstream media. As Google rewards freshness of content — as well as sharp SEO tactics — “web native” media and specialized web sites perform much better than their elder “migrants”, that is, the web versions of traditional media.

What about monetization? How does media content contribute to Google’s bottom line? Again, let’s look at independent rankings of the most expensive keywords, those that can bring $50 per click to Google — through its opaque pay-per-click bidding system. For instance, here is a recent Wordstream ranking (example keywords in parentheses):

Insurance (“buy car insurance online” and “auto insurance price quotes”)
Loans (“consolidate graduate student loans” and “cheapest homeowner loans”)
Mortgage (“refinanced second mortgages” and “remortgage with bad credit”)
Attorney (“personal injury attorney” and “dui defense attorney”)
Credit (“home equity line of credit” and “bad credit home buyer”)
Lawyer (“personal injury lawyer”, “criminal defense lawyer”)
Donate (“car donation centers”, “donating a used car”)
Degree (“criminal justice degrees online”, “psychology bachelors degree online”)
Hosting (“hosting ms exchange”, “managed web hosting solution”)
Claim (“personal injury claim”, “accident claims no win no fee”)
Conference Call (“best conference call service”, “conference calls toll free”)
Trading (“cheap online trading”, “stock trades online”)
Software (“crm software programs”, “help desk software cheap”)
Recovery (“raid server data recovery”, “hard drive recovery laptop”)
Transfer (“zero apr balance transfer”, “credit card balance transfer zero interest”)
Gas/Electricity (“business electricity price comparison”, “switch gas and electricity suppliers”)
Classes (“criminal justice online classes”, “online classes business administration”)
Rehab (“alcohol rehab centers”, “crack rehab centers”)
Treatment (“mesothelioma treatment options”, “drug treatment centers”)
Cord Blood (“cordblood bank”, “store umbilical cord blood”)

(In my research, several Search Engine Marketing specialists came up with similar results.)

You see where I’m heading. By construction, traditional media bring no money to the classification above. In addition, as an insider told me this week, no one is putting ads against keywords such as “war in Syria” or against the 3.2 billion results of a “Hurricane Sandy” query. Indeed, on the curve of ad-word value, news slides to the long tail.

Then why is Google so interested in news content? Why has it been maintaining Google News for the past ten years, in so many languages, without making a dime from it (there are no ads on the service)?

The answer pertains to the notion of Google’s general internet “footprint”. Being number one in search is fine, but not sufficient. In its goal to own the semantic universe, taking over “territories” is critical. In that context, a “territory” could be a semantic environment that is seen as critical to everyone’s daily life, or one with high monetization potential.

Here are two recent examples of monetization potential as viewed by Google: Flights and Insurance. Having (easily) determined that flight schedules were among the most sought-after information on the web, Google dipped into its deep cash reserve and, for $700m, acquired ITA Software in July 2010. ITA was the world’s largest airline search company, powering sites such as Expedia or TripAdvisor. Unsurprisingly, the search giant launched Google Flight Search in September 2011.

In passing, Google showed its ability to kill any price comparator of its choosing. As for Insurance, the most expensive keyword, Google recently launched its own insurance comparison service in the United Kingdom… just after launching a similar system for credit cards and bank services.

Over the last ten years, Google has become the search tool of choice for Patents, and for scientific papers with Google Scholar. This came after shopping, books, Hotel Finder, etc.

Aside from this strategy of making Google the main — if not the only — entry point to the web, the search engine is working hard on its next transition: going from a search engine to a knowledge engine.

Early this year, Google created the Knowledge Graph, a system that connects search terms to what are known as entities (names, places, events, things) — millions of them. This is Google’s next quantum leap. Again, you might think news-related corpuses would constitute the most abundant trove of information to be fed into the Knowledge Graph. Unfortunately, this is not the case. At the core of the Knowledge Graph resides Metaweb, acquired by Google in July 2010. One of its key assets was a database of 12 million entities (now 23m) called Freebase. This database is fed by sources (listed here) ranging from the International Federation of Association Football (FIFA) to the Library of Congress, Eurostat or the India Times. (The only French source on the list is the movie database AlloCine.)

Out of about 230 sources, fewer than 10 are media outlets. Why? Again, volume and, perhaps even more important, the ability to properly structure data. While the New York Times has about 14,000 topics, most newspapers have only hundreds, and a similar number of named entities in their databases. (As a comparison, web-native media are much more skilled at indexing: the Huffington Post assigns between 12 and 20 keywords to each story.) By building upon acquisitions such as Metaweb’s Freebase, Google now has about half a billion entries of all kinds.
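To see what “properly structured data” means here, compare the flat keyword tagging most newsrooms practice with a typed entity carrying relations to other entities, as a Freebase-style database stores them. The structures below are invented for this illustration and do not follow the actual Freebase schema:

```python
# Flat keywords (typical newsroom tagging) versus a typed entity with
# relations (Freebase-style). Both structures are simplified mock-ups;
# the ids and types below are hypothetical, not real Freebase records.

article_keywords = ["election", "debate", "Obama"]  # tags can only be matched

entity = {
    "id": "/en/barack_obama",        # hypothetical entity id
    "type": "/people/person",
    "name": "Barack Obama",
    "relations": {
        "profession": ["/en/politician"],
        "office_held": ["/en/president_of_the_united_states"],
    },
}

# Unlike keywords, entities can be traversed: from a person to an office,
# from an office to a country, and so on across the graph.
print(entity["relations"]["office_held"][0])
```

That traversability, multiplied across hundreds of millions of entries, is what a few hundred newsroom topics cannot offer.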

Legacy media must deal with a harsh reality: despite their role in promoting and defending democracy, in lifting the veil on things that mean much for society, or in propagating new ideas, when it comes to data, news media compete in the junior leagues. And for Google, the most data-driven company in the world, having newspaper articles in its search system is little more than a nice extra.

frederic.filloux@mondaynote.com

Google’s Amazing “Surveywall”

 

How Google could reshape online market research and also reinvent micro-payments. 

Eighteen months ago — under non-disclosure — Google showed publishers a new transaction system for inexpensive products such as newspaper articles. It worked like this: to gain access to a web site, the user is asked to participate in a short consumer research session. A single question, a set of images leading to a quick choice. Here are examples Google recently made public when launching its Google Consumer Surveys:

Fast, simple and efficient. As long as the question is concise and sharp, it can be anything: pure market research for a packaging or product feature, surveying a specific behavior, evaluating a service, an intention, an expectation, you name it.

This caused me to wonder how such a research system could impact digital publishing and how it could benefit web sites.

We’ll start with the big winner: Google, obviously. The giant wins on every side. First, Google’s size and capillarity put it in a unique position to probe millions of people in a short period of time. Indeed, the more marketers rely on its system, the more Google gains in reliability, accuracy and granularity (i.e. the ability to probe a segment of blue-collar pet owners in Michigan or urbanite coffee-drinkers in London). The bigger it gets, the better it performs. In the process, Google disrupts the market research sector with its customary deflationary hammer. By playing on volumes, automation (no more phone banks) and algorithms (as opposed to panels), the search engine is able to drastically cut prices — by 90% compared to traditional surveys, says Google. Expect $150 for 1,500 responses drawn from the general US internet population. Targeting a specific group can cost five times as much.
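The arithmetic behind those prices is worth spelling out, because it is the per-response figure that makes the disruption obvious. Using only the numbers quoted above:

```python
# Cost-per-response arithmetic for Google Consumer Surveys, using the
# prices quoted above: $150 for 1,500 general-population responses, and
# up to five times as much for a targeted segment.

base_cost = 150.0
responses = 1_500
targeting_multiplier = 5   # upper bound mentioned above

cost_per_response = base_cost / responses
targeted_cost_per_response = cost_per_response * targeting_multiplier

print(f"General population: ${cost_per_response:.2f} per response")
print(f"Targeted segment:   ${targeted_cost_per_response:.2f} per response")
```

Ten cents a data point for the general population, fifty for a narrow segment — a price phone banks and panels cannot approach.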

Second upside for Google: it gets a bird’s-eye view of all possible subjects of consumer research. Aggregated, anonymized, recompiled, sliced in every possible way, these multiple datasets further deepen Google’s knowledge of consumers — which is nice for a company that sells advertising. By the way, Google gets paid for research it then aggregates into its own data vault. Each answer collected contributes a smallish amount of revenue; it will be a long while, if ever, before such activity shows up in Google’s quarterly results — but the value is not there: it resides in the data the company gets to accumulate.

The marketers’ food chain should be happy. With the notable exception of those who make a living selling surveys, every company, business unit or department in charge of a product line or a set of services will be able to field a poll quickly, efficiently and cheaply. Of course, legacy pollsters will argue Google Consumer Surveys are crude and inaccurate. They will be right. For now. Over time the system will refine itself, and Google will have put a big lock on another market.

What’s in Google’s Consumer Surveys for publishers whose sites will host a surveywall? In theory, the mechanism finally solves the old quest for tiny, friction-free transactions: replace the paid-for zone with a survey-zone through which access is granted after answering a quick question. Needless to say, it can’t be recommended for all sites. We can’t reasonably expect a general news site, not to mention a business news one, to adopt such a scheme. It would immediately irritate the users and somehow taint the content.

But a young audience should be more inclined to accept such a surveywall. Younger surfers will always resist any form of payment for digital information, regardless of quality, usefulness or relevance. Free is the norm. Or its illusion. Young people have already demonstrated their willingness to give up their privacy in exchange for free services such as Facebook — they have yet to realize they paid the hard price, but that’s another subject.
By contrast, a surveywall would at least be more straightforward, more honest: users give a split second of their time by clicking on an image or checking a box to access the service (whether it is an article, a video or a specific zone). The system could even be experienced as fun, as long as the question is cleverly put.
Economically, having one survey popping up from time to time — for instance when the user reconnects to a site — makes sense. Viewed from a spreadsheet (I ran simulations with specific sites and varying parameters), it could yield more money than the cheap ads currently in use. This, of course, assumes broad deployment by Google with thousands of market research sessions running at the same time.
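A stripped-down version of that spreadsheet exercise looks like this. Every parameter — traffic, survey frequency, the publisher’s cut per answer, the ad figures — is hypothetical, chosen only to show how the comparison works, not to report actual rates:

```python
# A toy version of the simulation described above: monthly revenue from a
# surveywall (publisher paid a small amount per answered question) versus
# revenue from low-value display ads. All parameters are hypothetical.

monthly_visits = 1_000_000
survey_frequency = 0.10            # one survey shown per ten visits
publisher_cut_per_answer = 0.05    # assumed publisher share, in dollars

pages_per_visit = 4
ads_per_page = 2
cpm = 0.50                         # dollars per thousand cheap ad impressions

survey_revenue = monthly_visits * survey_frequency * publisher_cut_per_answer
ad_revenue = monthly_visits * pages_per_visit * ads_per_page * cpm / 1000

print(f"Surveywall revenue: ${survey_revenue:,.0f}/month")
print(f"Cheap-ads revenue:  ${ad_revenue:,.0f}/month")
```

With these made-up inputs the surveywall comes out ahead of remnant-priced ads; the real outcome obviously hinges on the publisher’s cut and how often a survey can pop up before it irritates readers.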

A question crosses my mind: how come Facebook didn’t invent the surveywall?

–frederic.filloux@mondaynote.com