The value is in the reader’s Big Data

 

Why the right use of Big Data can change the economics of digital publishing. 

Digital publishing is vastly undervalued. Advertising has yet to fulfill its promises: it is nosediving on the web and it has failed on mobile (read JLG’s previous column Mobile Advertising: The $20 billion Opportunity Mirage). Readers come, and often go, as many digital publications are unable to retain them beyond a few dozen articles and about thirty minutes per month. Most big names in the digital news business are stuck with single-digit ARPUs. People do flock to digital, but cash doesn’t follow, at least not in the amounts required to sustain the production of quality information. Hence the crux of the situation: if publishers are unable to extract significantly more money per user than they do now, most of them will simply die. As a result, the bulk of the population (with the notable exception of the educated wealthy) will rely on high-audience web sites merely acting as echo chambers for shallow commodity news snippets.

The solution lies in the largest untapped value, which sits right before publishers’ eyes: reader profiles and content, all matched against the “noise” of the internet.

Extracting such value is a Big Data problem. But, before we go any further, what is Big Data? The simplest answer: data sets too large to be ingested and analyzed by conventional database management tools. At first, I was suspicious; this sounded like a marketing concept devised by large IT players struggling to rejuvenate their aging brands. I changed my mind when I met people with hands-on experience, from large corporations down to a 20-staff startup. They work on tangible things, collecting data streams from fleets of cars or airplanes, processing them in real time and, in some cases, matching them against other contexts. Patterns emerge and, soon, manufacturers predict what is likely to break in a car, find ways to refine the maintenance cycle of a jet engine, or realize which software modification is needed to improve the braking performance of a luxury sedan.

Phone carriers and large retail chains have been using such techniques for quite a while and have adjusted their marketing as a result. Just for fun, read this New York Times Magazine piece depicting, among other things, the predictive pregnancy model developed by Target (a large US supermarket chain). Through powerful data mining, the aptly named Target corporation is able to pinpoint customers reaching their third month of pregnancy, a pivotal moment in their consuming habits. Or look at Google Flu Trends providing better tracking of flu outbreaks than any government agency.

Now, let’s narrow the scope back to the subject of today’s column and see how these technologies could be used to extract more value from digital news.

The internet already provides the necessary tools to see who is visiting a web site, what he or she likes, etc. The idea is to know the user with greater precision and to anticipate his or her needs.

Let’s draw an analogy with Facebook. By carefully analyzing the “content” produced by its users (statements, photos, links, interactions among friends, “likes”, “pokes”, etc.), the social network has been able to develop spectacular predictive models. It is able to detect a change in someone’s status (single, married, engaged, etc.) even if the person never mentioned it explicitly. Similarly, Facebook is able to predict with great accuracy the probability that two people casually exchanging on the network will become romantically involved. The same applies to a change in someone’s financial situation or to health incidents. Without the person telling anyone, semantic analysis correlated with millions of similar behaviors will detect who is newly out of a job, depressed, bipolar, broke, high, elated, pregnant, or engaged. Unbeknownst to them, people’s online behavior makes them completely transparent. For Facebook, this could translate into an unbearable level of intrusiveness, such as showing embarrassing ads or making silly recommendations that are seen by everyone.

Applied to news content, the same techniques could help refine what is known about readers. For instance, a website could detect someone’s job change by matching his or her reading patterns against millions of other monthly site visits. Based on this, if Mrs. Laura Smith is spotted with a 70% probability of having been promoted to marketing manager in a San Diego-based biotech startup (five items), she can be served with targeted advertising, especially if she also appears to be an active hiker (sixth item). More importantly, over time, the website could subtly tailor itself: of course, Mrs. Smith will see more biotech stories in the business section than the average reader, but the Art & Leisure section will also select more content likely to fit her taste, and the Travel section will look more like an outdoor magazine than a guide for compulsive urbanites. Progressively, the content Mrs. Smith gets will become both more useful and more engaging.
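To make the idea of a “70% probability” a bit more concrete, here is a minimal sketch of how several weak reading signals could be combined into one estimate. It uses a naive Bayes style update, which is my choice for illustration and not a description of what any publisher actually runs. The function name, the signal names, the prior and the likelihood ratios are all hypothetical.

```python
import math


def segment_probability(prior, likelihood_ratios):
    """Combine signals, assumed independent, into one probability.

    prior: share of readers believed to belong to the segment (0 < prior < 1).
    likelihood_ratios: for each observed signal, how much more often it shows
    up among segment members than among other readers.
    """
    # Work in log-odds so that each signal simply adds its own evidence.
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    # Convert the accumulated log-odds back into a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical signals and ratios, invented for this example only.
signals = {
    "reads biotech funding stories": 3.0,
    "reads management advice columns": 2.0,
    "connects mostly from the San Diego area": 1.5,
}

p = segment_probability(prior=0.05, likelihood_ratios=signals.values())
print(f"Estimated probability of belonging to the segment: {p:.2f}")
```

The point of the sketch is that no single item says much, but each additional one moves the estimate. A real system would have to learn those ratios from millions of visits and would have to deal with signals that are far from independent.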

The economic consequences are obvious. Advertising (or, better, advertorial content branded as such, since users are sick of banners) will be sold at a much higher price by the web site, and more relevant content will induce Mrs. Smith to read more pages per month. (Ad targeting companies are doing this, but in such a crude and saturating way that it is now backfiring.) And since Mrs. Smith makes more money, her growing interest in the web site could make her a good candidate to become a premium subscriber; she’ll then be served with a tailor-made offer at the right time.

Unlike Facebook, which openly exploits the intimacy of its users under the pretext that they are willing to give up their privacy in exchange for a great service (a good deal for now, terrible in the future), news publishers will be more careful. First, readers will be served with ads and content that they will be the only ones to see, not their 435 Facebook “friends”. This is a big difference, one that requires a sophisticated level of customization. Also, when it comes to reading, preserving serendipity is essential. By this I mean no one will enjoy a 100% tailor-made site; inevitably, it will feel a bit creepy and cause the reader to go elsewhere to find refreshing stuff.
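One simple way to picture “preserving serendipity” is to cap the share of a page that is tailored and to fill the rest with the same editorial picks everyone gets. The sketch below assumes exactly that. The function name, the 60% default and the story identifiers are hypothetical, and a real front page would involve far more editorial judgment.

```python
def build_front_page(personalized, editorial, slots=10, tailored_share=0.6):
    """Mix tailored stories with the editors' common selection.

    personalized: stories ranked for this reader, best first.
    editorial: stories picked by editors for all readers, best first.
    tailored_share: maximum fraction of slots given to tailored stories.
    """
    # Never let tailored stories take more than their allotted share.
    n_tailored = min(len(personalized), round(slots * tailored_share))
    page = list(personalized[:n_tailored])
    # Fill the remaining slots with editorial picks, skipping duplicates.
    for story in editorial:
        if len(page) >= slots:
            break
        if story not in page:
            page.append(story)
    return page


# Hypothetical story identifiers, for illustration only.
tailored = ["p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8"]
editors = ["e1", "p1", "e2", "e3", "e4", "e5"]
print(build_front_page(tailored, editors))
```

Keeping the cap as an explicit parameter means the balance between relevance and surprise stays an editorial decision instead of a side effect of the algorithm.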

Even with this sketchy description, you get my point: by compiling and analyzing millions of behavioral data points, it is possible to make a news service far more attractive for the reader and much more profitable for the publisher.

How far-reaching is this? In the news sector, Big Data is still in its infancy. But as Moore’s Law keeps working, making the required large amounts of computing power more affordable, it will become more accessible to publishers. Twenty years ago, only the NSA was able to handle large sets of data with its stadium-size private data centers. Now publishers can work with small companies that outsource CPU time and storage capabilities to Amazon Web Services and use Hadoop, the open source version of Google’s master distributed applications software, to pore over millions of records. That’s why Big Data is booming and provides news companies with new opportunities to improve their business model.

frederic.filloux@mondaynote.com

13 Comments

  1. Antoine
    Posted September 17, 2012 at 5:35 am | Permalink

    Frederic,
    Thanks for this. Your vision is possible, but unlikely. Big data and behavioral targeting are much easier for networks that aggregate a large % of the time the user spends online. Facebook, Google, MSFT, or other behind-the-scenes players like ad networks and other cookie droppers will have an edge.
    Can news outfits build intelligence about their users and monetize it? Certainly! But they won’t be Big Data front runners, and this will certainly not save them. They should not overinvest, but rather follow and piggyback on better positioned players.
    Your piece regarding raising prices of offline subscriptions is a better avenue to salvation :)

  2. Stephen Howard-Sarin
    Posted September 17, 2012 at 6:07 am | Permalink

    FF, you are right! Media are moving to selling audience (commodity) and audience + context (premium, if your context is premium). Audience selling comes from user data, so it’s the hidden ingredient across both. Content publishers certainly have good enough data to be competitive, and they can simply buy a lot of what they lack. Except for one thing… Consistency in user identity. Web sites have a tough time knowing who users really are. And it’s not good enough to associate rich behavioral data with an anonymous cookie ID. Those registration walls turn out to be good for something after all.

  3. Walt French
    Posted September 17, 2012 at 6:26 am | Permalink

    Color me skeptical. A year or two back, I used Google to search for which cast members were performing in the SF Ballet on our subscription nights, and for months thereafter, the NYT and other sites were festooned with ads for the SF Ballet, for performances we’d bought a year earlier.
    .
    Your hypothetical just-promoted backpacker is probably NOT looking for new ways to buy hiking boots, but rather, looking for faster, more reliable ways of doing business with a trusted outlet, when she has the opportunity to get away for a couple of days.
    .
    The Target story is interesting reading, but the marginal benefit to me, a busy individual, of lifestyle ad targeting is de minimis. That means that advertisers will quickly discover (thanks to Big Data, I suppose) that Big Data doesn’t really help them move product, or form loyalties, much better than Dumb Data does.

  4. aepxc
    Posted September 17, 2012 at 9:08 am | Permalink

    I’ll add another voice to the sceptics here. If you ask most people what most makes them annoyed by or indifferent to advertising, insufficient relevancy would not be at the top of the list. Relevancy could be useful if ONLY relevant ads were shown (i.e. with non-relevant ads being replaced by blank spaces), but not if one starts to become awash with potentially interesting product messages. Responding to relevance requires attention and memory, and there is already a shortage of these.

  5. Stephen Howard-Sarin
    Posted September 17, 2012 at 4:05 pm | Permalink

    @Walt What you are describing is simple retargeting, and we’ve all experienced it. There’s no big data there, just clever “cookie hunting.” A Big Data approach would let the advertiser know and respond to the fact that you’d bought the tickets (or at least were looking at content about an event in the past). Smart retargeters also track the ads they show to retargeted individuals, changing the offer over time, and smartly stop wasting their money once you’ve seen 6 of the things without biting.

  6. Posted September 18, 2012 at 10:59 am | Permalink

    As you admit here and have often stressed elsewhere, “no one will enjoy a 100% tailor-made site; inevitably, it will feel a bit creepy and cause the reader to go elsewhere to find refreshing stuff.” Nor will they enjoy a 90% or an 80% tailor-made site: those too are unnervingly creepy.

    I’ve never understood the thinking behind giving readers what some algorithm imagines they want. It’s one of those very retro-techie notions that sounds great, smells great, but tastes appallingly awful.

  7. Posted September 18, 2012 at 7:20 pm | Permalink

    Another thing that you forgot to include in the equation is consumers’ increasing fear of revealing their personal data. Based on this research (http://corp.upstreamsystems.com/wp-content/uploads/research/research_2012-DA-attitudes.html), people are willing to opt out of advertising promos if they feel violated.

    Your angle is right. Big Data can add significant value but for me the most important part of the new era marketing is creative optimization of text, storytelling and not relevance.

  8. Walt French
    Posted September 19, 2012 at 3:35 pm | Permalink

    @Stephen Howard-Sarin wrote, “Smart retargeters also track the ads they show to retargeted individuals…”.
    .
    I suppose firms could be smarter than Google and its ad network that delivered all those useless ads (really, many dozens of appearances during the last season), but I don’t know who could have more data on me.
    .
    So Double-Click should be able to deliver ads that are more relevant than any other, and yet I see ads in one of three classes: (1) “individual relevant” ones that have a tiny glimpse into the types of performances and art that I patronize; (2) “site relevant” ads such as Samsung ads on a news article about iPhones; and (3) “location-relevant” ads that tout how Minnesotans with bad driving records should know a trick to buy cheap insurance (which I would see accompanying some news story when connecting from MSP).
    .
    I suppose the example I gave first is the most likely one to reinforce my already being a subscriber (which the ad server would not know that I was), but on balance it seems rather useless for as much as I use Google. The second and third examples seem humorous or even annoying to think that I’m seen as the sort of poor soul who would buy overpriced insurance or “lose 30 pounds in a week” diet scams.
    .
    In other words, I have yet to see any examples of actually smart targeting. If this is the best that advertising can do with the obscenely well-populated database that Google has on me, the industry will collapse pretty soon.

  9. Posted November 26, 2012 at 12:17 pm | Permalink

    Very good post. I definitely appreciate this website.
    Keep it up!

  10. Posted December 13, 2012 at 1:00 am | Permalink

    Thank you very much for the professionalism you bring to your work. Exactly what I had given up hoping for. Looking forward to reading you again.
