Let’s discuss a developing contradiction in data management. People who think strategically about the monetization of digital media, publishers and marketers alike, are unanimous: collecting and poring over data has become more important than ever.
That’s one trend.

The other involves the gatekeepers. As I briefly explained last week, we now face a small club of high-tech giants (Google, Microsoft, Amazon, Yahoo and Apple) who, over the years, have acquired an unprecedented ability to gather and process data. As competition heats up among them, the data they are able to collect will continue to increase in tactical and strategic value. As a corollary, they are increasingly inclined to keep such data close to the vest.

The latest example came with the April 8th launch of Apple’s iAd initiative for the iPhone platform. As Steve Jobs delicately puts it, mobile ads “suck”, his own words for what we’ve been saying all along in the Monday Note (see The future of content navigation). Digital ads suffer from an inherent flaw: they are designed to take the viewer away from the content (again: see Jobs’ keynote speech for details, go directly to the 44-minute mark to watch the ad chapter). Hence the solution envisioned by the Cupertino boys: embedding the ad within the application, which, in the process, becomes the commercial Trojan Horse of mobile computing. Next step: connecting the ad to a transaction system that will collect a 40% commission fee.

The data? Apple keeps it. Publishers, consider yourselves lucky: you get the money and a set of basic numbers. As pointed out by Peter Kafka in The Wall Street Journal’s blog All Things D, contractually speaking, the terms are unambiguous.

Section 3.3.9 of the developer agreement stipulates:

“Notwithstanding anything else in this Agreement, Device Data may not be provided or disclosed to a third party without Apple’s prior written consent. Accordingly, the use of third party software in Your Application to collect and send Device Data to a third party for processing or analysis is expressly prohibited.”

With iAd, the kind of analytics typical of today’s websites becomes impossible for an iPhone, iPod Touch or iPad application: in the developer agreement’s words, data extraction is limited “to the same purpose as necessary to provide services or functionality for such Application”.

Analytics is key to the advertising and publishing business. In the print press, publishers used to say ‘I know my ads work half of the time, but I don’t know which half’. Unlike print, the internet comes natively with lots of customer data. With the help of third-party services, a media outlet is able to analyze precisely when and where an ad is seen, in which context, for how long, on what device, using what software and hardware, etc.

Mobile advertising yields even more data. Take location data, for instance. Using a combination of GPS, cell-tower and Wi-Fi triangulation, every mobile phone user can be tracked in real time. To get an idea, go to this Skyhook Wireless demo, or this video made by Sense Networks, which shows the movement patterns of people aggregated in social “tribes”. At the carrier level, this can even be done on a named, per-subscriber basis, thanks to the electronic serial number broadcast by the phone. Of course, what raises serious privacy concerns also causes the marketing community to drool. Hence the level of frustration triggered by Apple’s restrictions. Without the ability to use at least some anonymized datasets, placing ads in an iPhone app loses its appeal. Advertisers will be reluctant to go back to the shot-in-the-dark system that prevailed in print.

Indifferent to these concerns, Apple is proceeding with its “my-way-or-the-highway” approach. And it wants to keep the mobile advertising experience in sync with its aesthetic standards. Not only will Apple provide the platforms (applications + placement systems + toll collection), but it plans to be involved in the creative process of mobile ads embedded in applications, at least temporarily. To jump-start this new business, and to enforce visual standards, it appears early players will have access to a dedicated creative pool set up by Apple. According to Advertising Age, which quotes an exec at Quattro Wireless, the company acquired by Apple in January: “First couple of months they [Apple] will actually be the ones coding and programming the ads”. Agencies will plan and design campaigns, but Apple and Quattro developers will code the ads in HTML5. After a while, hopefully, standards will be set and advertisers will fly on their own. This, in itself, says a lot about the technological shortcomings of the advertising sector, and it is an excellent opportunity for the startup ecosystem that will fill the gap.

On the commercial internet, keeping data confidential seems to be the new obsession. This will play against many sectors. Take book publishing: in many countries, this sector is vastly under-analyzed; marketing expenses, especially on post-sales analysis, are low. Hence the hope that the ebook channel will come up with a stream of data on who buys what and, more importantly, on how books are read. This could benefit the entire book industry, as well as educators, sociologists, people studying reading patterns, etc.

But that’s not going to happen for books (or media products). A couple of months ago, Amazon tightened the release of sales data and, according to publishers I talked to who are negotiating with Apple in the US and Europe, the Cupertino guys are not willing to go beyond the bare minimum.

The sad thing is that data mining is a thriving sector. Disciplines such as machine learning and Bayesian probability are driving huge improvements in data management. The ability to crunch and decipher oodles of data can be a major driver of innovation. (On this very topic, I recommend an excellent book, The Fourth Paradigm: Data-Intensive Scientific Discovery, which compiles essays about data-driven science; free and legal download here).

The cell phone industry has made “reality mining” its bread and butter. Coming back to Boston-based Skyhook Wireless, behold their numbers: 300 million check-ins a day! These come from every iPhone, iPad and Mac OS-powered laptop, as well as from Dell devices and a growing number of Android-powered smartphones working in North America. (More in this MIT TechReview story). Think about what gold can be extracted from such an immense dataset.

The same goes for media: the New York Times recently disclosed that its stories are tweeted every 4 seconds! That’s quite a lot of popular votes, and they deserve some analysis (matching the tweeting patterns to the news cycle, for example).

Finally, what is behind this increasing restriction on data access? In a word: competition between gatekeepers. Summed up in a chart, it looks like this.

It speaks for itself, don’t you think?


