A few thoughts on Big Data, self-knowledge, and my hopes for the emergence of a new genre of services.

I’m about to fulfill an old fantasy — the Great American Road Trip. Over the next three weeks, we’ll be driving all the way from Key West, FL to Palo Alto. In that spirit, today I’ll luxuriate in another, more distant reverie: Mining my own data exhaust.

I’m spurred to this indulgence by the words of Satya Nadella, Microsoft’s new CEO, at an April 15th event in San Francisco:

“The core evolution of silicon, software and hardware is putting computing everywhere humans are present,” Nadella said. “And it’s generating a massive data exhaust of server logs, sensor data and unstructured social stream information. We can use that exhaust to create ambient intelligence for our users.”

Nadella’s right. I leave a cloud of data exhaust with my Web browsing, credit card purchases, cell phone use, monthly blood tests, pharmacy purchases, and airline trips. Server logs detail the length and breadth of my social interactions on Facebook and Google+… And I don’t have to be on a computer to add to the cloud: I’m tracked by toll passes and license plate snapshots as I drive my car. The car itself monitors my driving habits with its black box recording of my speed and direction. This list, far from exhaustive [no pun intended], is sobering – or exciting, full of possibilities.

Today, we’ll skip the Orwellian paranoia and fantasize about an alternate universe where I can “turn the gratis around”, where I can buy my data back.

Google, Facebook, and the like provide their services for free to induce us to lead them to the mother lode: Our cache of product preferences, search history, and web habits. They forge magic ingots from our personal data, sell the bullion to advertisers, and thus fuel the server farms that mine even more data. I’m not here to throw a monkey wrench into this business model; au contraire, I offer a modest source of additional revenues: I’d like to buy my data back. And I’ll extend that offer to any and all entities that mine my activities: For you, at a special price today, I’m buying my data.

(We all understand that this fantasy must take place in an alternate universe. If our legislators and regulators were beholden to us and not to Google, Verizon, and “Concast” [a willful typo from Twitter wags], they would have long ago made it mandatory that companies provide us with our own data exhaust.)

Pursuing this train of thought, one can conceive of brokers scouring the world for my exhausts — after having secured the right permissions from me, of course. Once this becomes an established activity, no particular feat of imagination is required to see the emergence of Big Data processing companies capable of merging and massaging the disparate flumes obtained from cell carriers, e-merchants, search engines, financial services and other service providers.

So far, especially because it lacks numbers and other annoying implementation details, the theory sounds nice. But to what end?

The impulse can be viewed as a version of the old Delphic injunction: Know Thyself, now updated as Know Thine Quantified Self: Quantify what I do with my body, time, money, relationships, traveling, reading, corresponding, driving, eating… From there, many derivations come to mind, such as probabilistic diagnoses about my health, financial situation, career, and marriage. Or I could put my data in turnaround, mandate a broker to shop facets of my refined profile to the top agencies.

Even if we set aside mounds of unresolved implementation details, objections arise. A key member of my family pointedly asks: How much do we really want to know about ourselves?

This reminds me of a conversation I once had with a politely cynical Parisian tailor. I ventured that he could help his customers choose a suit by snapping a picture and displaying it on an 80” flat-screen TV in portrait mode. My idea was that the large-scale digital picture would offer a much more realistic, more objective image than a look in the mirror does. The customer would be able to see himself as others see him, what effect the new suit would produce – which, after all, is the point of new duds.

“No way,” said the Parisian fashionista, “are you nuts? My customers, you included,” he tartly added, “really don’t want the cruel truth about their aging bodies…”

Still, I’m curious. And not just about the shape and color of the data exhaust that I leave in my wake, about the truths — pleasant or embarrassing — that might be revealed. I’m curious about the types of companies, services, and business models that would emerge from this arrangement. Even more fascinating: How would the ability to mine and sell our own data affect our cultural vocabulary and social genetics?

JLG@mondaynote.com

PS: I recently downloaded my Facebook data set, using the download option Facebook offers its users. The data doesn’t appear to be very revealing, but that could be the result of my low Facebook involvement: I’m not a very active user. I’d be curious to see the size and level of detail associated with a more involved participant.

PPS: I’ll be on the road, literally, for the next three weeks and may or may not be able to post during that time.