
Twitter, Facebook and Apps Scams

Here is the latest Twitter scam I heard about this week. Consider two fictitious media outlets, the Gazette and the Tribune, operating in the same market, targeting the same demographics, competing for the same online eyeballs (and the brains behind those). Our two online papers rely on four key traffic drivers:

  1. Their own editorial efforts, aimed at building the brand and establishing a trusted relationship with the readers. Essential but, by itself, insufficient to reach the critical mass needed to lure advertisers.
  2. Getting in bed with Google, with a two-stroke tactic: Search Engine Optimization (SEO), which helps a site climb to the top of the search results page; and Search Engine Marketing (SEM), in which a brand buys keywords to position its ads in the best possible context.
  3. An audience acquisition strategy that artificially grows page views as well as the unique visitor count. Some sites will aggregate audiences only remotely related to their core product, but that dress them up better for the advertising market (more on this in a forthcoming column).
  4. An intelligent use of social media such as Facebook, Twitter and LinkedIn, as well as of the apps ecosystem.

Coming back to the Tribune vs. Gazette competition, let’s see how they deal with the last item.

For both, Twitter is a reasonable source of audience, worth a few percentage points. More importantly, Twitter is a strong promotional vehicle. With 27,850 followers, the Tribune lags behind the Gazette and its 40,000 followers. Something must be done. The Tribune decides to work with a social media specialist. Over a couple of months, the firm gets the Tribune to follow (in the Twitter sense) most of the individuals who already are Gazette followers. This mechanically translates into a “follow-back” effect powered by implicit flattery: ‘Wow, I’ve been spotted by the Tribune, I must have a voice of some sort…’ In doing so, the Tribune will be able to vacuum up between a quarter and a third — that’s a credible follow-back rate — of the Gazette’s followers. Later, the Tribune will “unfollow” these accounts to cover its tracks.
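For a sense of the arithmetic, here is a minimal back-of-the-envelope sketch in Python. The 25–33% follow-back rate comes from the paragraph above; the function name and everything else is merely illustrative.

```python
# Back-of-the-envelope yield of the mass-following tactic described above.
# The 25-33% follow-back rate is the article's figure; the rest is illustrative.

def followback_yield(competitor_followers: int,
                     rate_low: float = 0.25,
                     rate_high: float = 0.33) -> tuple[int, int]:
    """Return low/high estimates of followers gained by mass-following."""
    return int(competitor_followers * rate_low), int(competitor_followers * rate_high)

low, high = followback_yield(40_000)  # the Gazette's 40,000 followers
print(f"Expected haul: {low:,} to {high:,} new followers")
# Expected haul: 10,000 to 13,200 new followers
```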

Compared to other, more juvenile shenanigans, this is a rather sophisticated scam. After all, in our example, one outlet is exploiting its competitor’s audience the way it would buy a database of prospects. It’s not ethical, but it’s not illegal. And it’s effective: a significant share of the followers thus “converted” to the Tribune are likely to stick with it, as the two outlets cover the same beat.

Sometimes, only size matters. Last December, the French blogger Cyroul (also a digital media consultant) uncovered a scam performed by Fred & Farid, one of the hippest advertising agencies. In his post (in French), Cyroul explained how the agency gained 5,000 followers in a matter of five days. As in the previous example, the technique is based on “mass following” but, this time, it has nothing to do with recruiting some form of “qualified” audience. Fred & Farid arranged to follow robots that, in turn, followed their account back. The result is a large number of new followers from Japan or China, all sharing the same characteristic: a following-to-followers ratio of about one, which is, Cyroul says, the signature of bot-driven mass following. Pathetic indeed. His conclusion:

One day, your “influence” will be measured against real followers or fans as opposed to bot-generated or otherwise artificial accounts. Then, brands will weep as their fan pages turn out to be worth nothing; ad agencies will cry as well when they realize that Twitter is worth nothing.
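Cyroul’s tell — a following-to-followers ratio near one — lends itself to a simple heuristic. The sketch below is mine, not Cyroul’s; the tolerance threshold and sample accounts are made up for illustration.

```python
# A naive detector for the signature Cyroul describes: bot-driven mass
# followers tend to follow about as many accounts as follow them back.
# Threshold and sample data are illustrative assumptions.

def looks_like_mass_follow_bot(following: int, followers: int,
                               tolerance: float = 0.15) -> bool:
    """Flag accounts whose following/followers ratio sits suspiciously close to 1."""
    if followers == 0:
        return False
    ratio = following / followers
    return abs(ratio - 1.0) <= tolerance

accounts = [
    {"name": "suspect_account_01", "following": 1980, "followers": 2011},
    {"name": "regular_reader",     "following": 350,  "followers": 42},
]
for a in accounts:
    flagged = looks_like_mass_follow_bot(a["following"], a["followers"])
    print(a["name"], "-> suspicious" if flagged else "-> looks organic")
```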

But wait, there are cruder schemes still: type “increase Facebook fans” into Google and you’ll get swamped with offers. Wading through the search results, I spotted one vendor carrying a wide range of products: 10,000 views on YouTube for €189; 2,000 Facebook “Likes” for €159; 10,000 followers on Twitter for €890, etc. You provide your URL, you pay on a secure server, it all stays anonymous, and the goods are delivered within 5 to 30 days.

The private sector is now allocating huge resources to fight the growing business of internet scams. Sometimes it has to be done in an opaque way: one of the reasons Google says so little about its ranking algorithm is — also — to prevent fraud.

As for Apple, its application ecosystem faces the same problem. Over time, its ranking system became questionable as bots and download farms joined the fray. In a nutshell, as with Facebook fan harvesting, the more you were willing to pay, the more notoriety you got, thanks to inflated rankings and bogus reviews. Last week, Apple issued this warning to its developer community:

Adhering to Guidelines on Third-Party Marketing Services

Feb 6, 2012
Once you build a great app, you want everyone to know about it. However, when you promote your app, you should avoid using services that advertise or guarantee top placement in App Store charts. Even if you are not personally engaged in manipulating App Store chart rankings or user reviews, employing services that do so on your behalf may result in the loss of your Apple Developer Program membership.

Evidently, Apple has a reliability issue with how its half million apps are ranked and evaluated by users. Eventually, it could affect its business, as the App Store could become a bazaar in which the true value of a product gets lost in a quagmire of mediocre apps. This, by the way, is a push in favor of the Apple-curated guide described in the Monday Note by Jean-Louis (see Why Apple Should Follow Michelin). In the UK, several print publishers have spotted the need for independent reviews; there, newsstands carry a dozen app-review magazines covering not only Apple but the Android market as well.

Obviously there is a market for that.

Because they depend heavily on advertising, preventing scams is critical for social networks such as Facebook or Twitter. In Facebook’s pre-IPO filing, I saw no mention of scams in the Risk Factors section, except in the vaguest of terms. As for Twitter, all we know is that the true audience is much smaller than the company says it is: Business Insider calculated that, of the 175 million accounts claimed by Twitter, 90 million have zero followers.

For now, the system still holds up. Brands remain convinced that their notoriety is directly tied to the number of fans or followers they claim — or that their ad agency has been able to channel to them. But how truly efficient is all this? How large is the proportion of bogus audiences? Today there appears to be no reliable metric to assess the value of a fan or a follower. And if there is one, no one wants to know.

frederic.filloux@mondaynote.com

Datamining Twitter

On its own, Twitter builds an image for companies; very few of them are aware of this fact. When a big surprise happens, it is too late: a corporation suddenly sees a facet of its business — most often a looming or developing crisis — flare up on Twitter. As always when a corporation is involved, there is money to be made by converting the problem into an opportunity: social network intelligence is poised to become a big business.

In theory, when it comes to assessing the social media presence of a brand, Facebook is the place to go. But as brands flock to the dominant social network, the noise becomes overwhelming and the signal — what people really say about the brand — becomes hard to extract.

By comparison, Twitter more swiftly reflects the mood of the users of a product or service. Everyone in the marketing and communication field is increasingly eager to know what Twitter is saying about a product defect, the perception of a strike, or an environmental crisis. Twitter is the echo chamber, the pulse of public feeling. It therefore carries tremendous value.

Datamining Twitter is not trivial. Diving into newspaper or blog archives is comparatively easy: phrases are (usually) well constructed, names are spelled out in full, slang and just-invented jargon are relatively rare. On Twitter, by contrast, the 140-character limit forces a great deal of creativity. The Twitter lingo constantly evolves, new names and characterizations flare up all the time, which rules out straightforward full-text analysis. The 250 million tweets per day are a moving target. A reliable quantitative analysis of the current mood is a big challenge.

Companies such as DataSift (launched last month) exploit the Twitter fire hose by relying on the 40-plus metadata fields attached to a post. Because, in case you didn’t know it, an innocent-looking tweet like this one…

…is a rich trove of data. A year ago, Raffi Krikorian, a developer on Twitter’s API Platform team (spotted thanks to this story in ReadWriteWeb) revealed what lies behind the 140 characters. The image below…

…is a tear-down of a much larger one (here, on Krikorian’s blog) showing the depth of metadata associated with a tweet. Each tweet comes with information such as the author’s biography, level of engagement, popularity, assiduity, location (which can be quite precise in the case of a geotagged hotspot), etc. In this Wired UK interview, DataSift’s founder Nick Halstead mentions the example of people tweeting from Starbucks cafés:

I have recorded literally everything over the last few months about people checking in to Starbucks. They don’t need to say they’re in Starbucks, they can just be inside a location that is Starbucks, it may be people allowing Twitter to record where their geolocation is. So, I can tell you the average age of people who check into Starbucks in the UK.
Companies can come along and say: “I am a retail chain, if I supply you with the geodata of where all my stores are, tell me what people are saying when they’re near it, or in it”. Some stores don’t get a huge number of check-ins, but on aggregate over a month it’s very rare you can’t get a good sampling.

Well, think about it next time you tweet from a Starbucks.
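As a sketch of how such geodata matching might work: given a list of store coordinates, flag geotagged tweets that fall within a small radius. The coordinates, radius and function names below are hypothetical; DataSift’s actual pipeline is far more elaborate.

```python
# A sketch of the geodata matching Halstead describes: given store
# coordinates, flag geotagged tweets within a small radius.
# Store locations and the radius are made-up illustration values.
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 6_371_000 * 2 * asin(sqrt(a))

STORES = [(51.5136, -0.1365), (51.5101, -0.1340)]  # hypothetical London locations

def near_a_store(tweet_lat, tweet_lon, radius_m=50):
    """True if a geotagged tweet falls within radius_m of any known store."""
    return any(haversine_m(tweet_lat, tweet_lon, la, lo) <= radius_m
               for la, lo in STORES)

print(near_a_store(51.5137, -0.1364))  # True: about 15 m from the first store
```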

DataSift further refined its service by teaming up with Lexalytics, a firm specializing in the new field of “sentiment analysis”, which measures the emotional tone of a text — very useful for assessing the perception of a brand or a product.
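To give a flavor of the technique, here is a toy lexicon-based sentiment scorer. Real systems handle negation, intensity and context; the word lists below are invented for illustration and have nothing to do with Lexalytics’ actual models.

```python
# A toy version of lexicon-based sentiment scoring, the family of
# techniques Lexalytics works in. Word lists are illustrative only.
POSITIVE = {"love", "great", "fast", "reliable"}
NEGATIVE = {"broken", "slow", "awful", "recall"}

def sentiment(text: str) -> int:
    """Positive score > 0, negative < 0, neutral == 0."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return len(words & POSITIVE) - len(words & NEGATIVE)

print(sentiment("Love the new phone, great battery"))  # 2
print(sentiment("Screen broken after a week, awful"))  # -2
```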

Mesagraph, a Paris-based startup with a beachhead in California, takes a different approach. Instead of trying to guess the feelings of a Twitter crowd, it creates a web of connections between people, terms and concepts. Put another way, it creates a “structured serendipity” in which the user naturally expands the scope of a search well beyond the original query. Through its web-based application called Meaningly, Mesagraph is set to start a private beta this week, and a public one next January.

Here is how Meaningly works. It starts with the timelines of tens of thousands of Twitter feeds. When someone registers, Meaningly crawls his Twitter timeline and adds a second layer composed of the people the new user follows. It can grow very quickly. In this ever-expanding corpus of twitterers, Meaningly detects the influencers, i.e. the people most likely to be mentioned and retweeted, and who have the largest number of qualified followers. To do so, the algorithm applies an “influence index” based on specialized outlets such as Klout or PeerIndex that measure someone’s influence on social media. (I have reservations about the actual value of such secret sauces: I see insightful people I follow lag well behind compulsive self-promoters.) Still, such metrics are used by Meaningly to reinforce a recommendation.
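As a rough illustration of what an influence index might combine, here is a naive scoring sketch. The inputs, weights and formula are entirely my assumptions; Klout’s and PeerIndex’s actual formulas are proprietary, as is Meaningly’s.

```python
# A naive stand-in for an "influence index": engagement (mentions,
# retweets) scaled by qualified reach. All weights are assumptions.

def influence_score(mentions: int, retweets: int, followers: int,
                    follower_quality: float) -> float:
    """follower_quality in [0, 1]: share of followers that look organic."""
    reach = followers * follower_quality
    engagement = mentions + 2 * retweets  # retweets weighted as a stronger signal
    return engagement * (1 + reach / 10_000)

print(round(influence_score(mentions=40, retweets=25,
                            followers=8_000, follower_quality=0.9), 1))
# 154.8
```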

Then there is the search process. To solve the problem of the ever-morphing vernacular used on Twitter, Mesagraph opted to rely on Wikipedia (in English) to analyze the data it targets. Why Wikipedia? Because it’s vast (736,000 subjects), constantly updated (including with the trendiest parlance), richly linked, and freely licensed. From it, Mesagraph’s crew extracted a first batch of 200,000 topics.
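Conceptually, such a topic layer can be pictured as a big alias table mapping Twitter parlance to canonical Wikipedia subjects. The sketch below is a minimal stand-in; the aliases and topics are invented, and Mesagraph’s real mapping is of course far larger and smarter.

```python
# A minimal sketch of topic tagging against a Wikipedia-derived
# dictionary. The alias table is a tiny invented stand-in for
# Mesagraph's 200,000 extracted topics.
TOPIC_ALIASES = {
    "quantitative easing": "Quantitative easing",
    "qe2": "Quantitative easing",   # Twitter shorthand
    "the fed": "Federal Reserve",
}

def tag_topics(tweet: str) -> set[str]:
    """Return the canonical topics whose aliases appear in the tweet."""
    text = tweet.lower()
    return {topic for alias, topic in TOPIC_ALIASES.items() if alias in text}

print(tag_topics("QE2 is back on the table says the Fed"))
# {'Quantitative easing', 'Federal Reserve'}
```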

To find tweets on a particular subject, you first fill in the usual search box; Meaningly proposes a list of predefined topics, some expressed in its own terminology; it then shows a list of tweets based on the people you’re following, the people they follow, and the “influencers” detected by Meaningly’s recommendation engine. Each tweet comes with a set of tags derived from the algorithm’s mapping table. These tags help further refine the search, with terms users would not have thought of. Naturally, it is possible to create all sorts of custom queries that will capture relevant tweets as they show up, each feeding a specific timeline of tweets pertaining to the subject. At least that’s the idea; the pre-beta version I had access to last week only gave me a sketchy view of the service’s performance. I will do a full test drive in due course.

Datamining Twitter has great potential for the news business. Think of it: instead of painstakingly building a list of relevant people who sometimes prattle endlessly, you capture in your web of interests only the relevant tweets produced by your group and the groups it follows, all adding up in real time. This could be a great tool for following developing stories and enhancing live coverage. A permanent, precise and noise-free view of what’s hot on Twitter is a key component of the 360° view of the web every media organization should now offer.

frederic.filloux@mondaynote.com

Losing value in the “Process”

Digital media zealots are confused: they mistake news activity for the health of the news business. Unfortunately, the two are not correlated. What they promote as a new kind of journalism carries almost no economic value. As great as they are from a user standpoint, live blogging and tweeting, crowdsourcing, and hosting “expert” blogs bring very little money, if any, to the news organizations that operate them. Advertising-wise, on a per-page basis, these services yield only a fraction of what premium content fetches. In some markets, a blog page will carry a CPM (cost per thousand page views) of one, while premium content will get 10 or 15 (euros or dollars). In net terms, the value can even be negative, as much of this content consumes manpower to manage, moderate, curate or edit.
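The gap is easy to make concrete. Using the CPM figures above (1 versus 10–15) and a hypothetical million page views of each kind:

```python
# The CPM gap from the paragraph above, made concrete. The CPM values
# are the article's figures; the page volumes are hypothetical.

def ad_revenue(page_views: int, cpm: float) -> float:
    """Revenue for a given number of page views at a given CPM."""
    return page_views / 1000 * cpm

blog_pages = premium_pages = 1_000_000
print(ad_revenue(blog_pages, cpm=1))      # 1000.0  -> 1,000 for a million blog pages
print(ad_revenue(premium_pages, cpm=12))  # 12000.0 -> 12,000 for the same premium volume
```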

More realistically, this content also carries some indirect but worthy value: in a powerful way, it connects the brand to the user. Therefore, I still believe news organizations should do more, not less, of such coverage. But we should not blind ourselves: the economic value isn’t there. It lies in the genuine and unique added value of original journalism deployed by organizations of varying size and scope, ranging from traditional media painfully switching to the new world to pure online players — all abiding by proven standards.

What’s behind the word standard is another area of disagreement with Jeff Jarvis, who opposes the notion of standards to what he calls “process”, or “journalism in beta” (see his interesting post Product v. process journalism: The myth of perfection v. beta culture). Personally, I’d rather stick to the quest for perfection than embrace the celebration of “process”. The former is inherently more difficult to reach, more prone to occasional ridicule (cf. the often-quoted list of mishaps by large newspapers). The latter amounts to hiding behind a comfortable “We say this, but we are not sure; don’t worry, we’ll correct it over time”.

To some extent, such a position condones mediocrity. It’s one thing to acknowledge that live reporting or covering a developing story bears the risk of factual errors. It is another to defend inaccuracy as a journalistic genre, as a French site did (until recently) by labeling its content with tags like “Verified”, “Readers’ info”, etc.

Approximation must remain accidental; it should not be advocated as normal journalistic practice.

In the digital world, the rise of the guesstimate is also a byproduct of a structure in which the professional reporter finds himself competing with the compulsive blogger or twitterer. Sometimes, the former will feel direct pressure from the latter (“Hey, Twitter is boiling with XY, could you quickly do something about it?” “Not yet, I’m unable to verify…” “Look, pal, we need to do something, right?”). Admittedly, such competition can be a good thing: we’ll never say enough how much the irruption of the reader has benefited and stimulated the journalistic crowd.

Unfortunately, the craze for instant “churnalism” tends to accommodate all the trade’s deviances. Today, J-schools are tempted to follow market demand and teach the use of Twitter or live-blogging at the expense of more complex forms of journalism. Twenty years ago, we could still hope the craft of narrative writing would be taught in newsrooms populated with great editors, but this is no longer the case. Now, many of the thirty- and forty-somethings who plunged into the live digital frenzy have already become unable to produce long-form journalism. And the obsessive productivism of digital serfdom won’t make things better (as an illustration, see this tale of a burned-out AOL writer in The Faster Times).

The business model will play an important role in solving this problem. Online organizations will soon realize there is little money to be made in “process journalism”. But as they find it is a formidable vector for driving traffic and promoting in-depth reporting, they will see it deserves careful strategizing.

Take Twitter. Its extraordinary growth makes it one of the most potent news referral engines. Two weeks ago, at the D9 conference, Twitter CEO Dick Costolo (video here) released a stunning statistic: it took three years to send the first billion tweets; today, one billion tweets are sent every six days, roughly 167 million a day.

No wonder many high-profile journalists and writers enjoy Twitter audiences larger than many news organizations’, or have become brands of their own, largely thanks to Twitter. The two-time Pulitzer Prize winner and NY Times columnist Nicholas Kristof has 1.1 million followers, about one third as many as the New York Times’ official Twitter account. And Nobel Prize economist Paul Krugman, who also writes for the New York Times, has more than 610,000 followers. Not bad for specialized writing.

In some cases, the journalist has a larger Twitter audience than the section where he or she writes: again at the NY Times, business reporter Andrew Ross Sorkin has 20 times more followers (370,000) than DealBook, the sub-site he edits. According to publisher Arthur Sulzberger, a NY Times story is tweeted every four seconds, and the Times’ Twitter accounts have four times more followers than those of any other paper in America. Similarly, tech writer Kara Swisher has 50 times more Twitter followers (757,000) than her employer, the WSJ tech site AllThingsD.

There are several ways to read this. One can marvel at the power of personal branding that works to the mother ship’s benefit. Then, on the bean-counter floor, someone will object that this stream of tweets is an unmonetized waste of time. Others, at the traffic analytics desk, will retort that Twitter’s incoming traffic represents a sizable share of the audience, and can therefore be measured in hard currency. Well… your pick.

frederic.filloux@mondaynote.com


The News Cycle Heartbeat

How do mainstream media and blogs interact? How do they feed each other? Everyone in the news media would love a better view of this mating dance. A few weeks ago, scientists at Cornell University unveiled a thorough analysis of the relationship between the two universes. Borrowing from genomics techniques, they dug into a huge corpus of politically related sentences and tracked their bounces between mainstream media (MSM) and the blogosphere.

Their dataset:

  • About 90 million documents (blog posts and news site articles) collected between August 1 and October 31, 2008, i.e. at the height of the last US Presidential race.
  • 1.65 million blogs scanned.
  • 20,000 media sites reviewed, marked as mainstream because they are part of Google News.
  • From this dataset, researchers extracted 112 million quotes, yielding 47 million phrases, of which 22 million were deemed “distinct”: important enough to be considered news. (A minimal version of the extraction step is sketched after this list.)
  • The phrases were political statements or sound bites pertaining to the presidential race and uttered by the two candidates, their running mates or their staffs.
  • Processing these 390GB of data took about nine hours of computer time (using a complex set of algorithms, involving “markers”, as in genetics).
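As a hint of what such extraction involves, here is a bare-bones quote extractor. The real pipeline went much further, clustering textual variants of each phrase with its genetics-style “markers”; this regex-only sketch is mine, not the researchers’ code.

```python
# One plausible first step of the Cornell pipeline: pulling quoted
# phrases out of raw documents. The length bounds are assumptions.
import re

QUOTE_RE = re.compile(r'"([^"]{10,200})"')

def extract_quotes(document: str) -> list[str]:
    """Return quoted spans between 10 and 200 characters."""
    return QUOTE_RE.findall(document)

doc = 'The candidate said "our best days are ahead of us" before leaving.'
print(extract_quotes(doc))  # ['our best days are ahead of us']
```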

The findings, in a nutshell:

  1. Mainstream media lead the news cycle. They are the first to report a quote, the story behind it, the context, etc.
  2. The 20,000 MSM sites generate 30% of the documents in the entire dataset and 44% of the documents that contained frequent phrases.
  3. It takes about 2.5 hours for a phrase to reverberate through the blogosphere (a simple way to estimate such a lag is sketched after this list).
  4. Phrases propagating in the opposite direction (from blogs to MSM) amount to a mere 3.5%.
  5. A news piece decays faster in the MSM than in the blogosphere.
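Finding 3, the 2.5-hour lag, can be approximated with a simple cross-correlation of hourly mention counts. The series below are fabricated to demonstrate the method; the study’s own algorithms are considerably more involved.

```python
# Estimating the blogosphere lag from two hourly mention-count series,
# MSM vs. blogs, by finding the shift that best aligns them.
# All counts here are fabricated for illustration.

def best_lag(msm: list[int], blogs: list[int], max_lag: int = 12) -> int:
    """Return the shift (in hours) that best aligns blog counts with MSM counts."""
    def score(lag: int) -> int:
        return sum(m * b for m, b in zip(msm, blogs[lag:]))
    return max(range(max_lag + 1), key=score)

msm_counts  = [0, 5, 20, 8, 2, 1, 0, 0, 0, 0]
blog_counts = [0, 0, 0, 2, 9, 25, 10, 4, 1, 0]  # same story, ~3 hours later
print(best_lag(msm_counts, blog_counts))  # 3
```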

The comparative curve looks like this:

For those who want the complete analysis, the full report is available here.

As expected, this research triggered controversy.

The end of the breaking news — as we know it

In the internet storm sweeping the media, breaking news is, without a doubt, the main casualty. This branch of the information stream is the most likely one to endure a kind of “commodity syndrome”. The breaking news circa 2010 will be ubiquitous, instantaneous and simultaneous. Its value, its market price actually, will tend to zero as a result.

Two forces are at work here: the professionalization of the blogosphere and the impact of Twitter. Dealing with this is critical for the survival of traditional media. Let’s have a closer look.

The news flow: Dealing with the fire hose

In the Seventies, Peter Herford, CBS bureau chief in Saigon, used to send his stories the physical way: rolls of 16 mm film, usually shot with an Eclair (a French camera), and sound tapes (recorded on a Swiss Nagra recorder, a jewel of those analog times) were shipped to Hong Kong, courtesy of the US Air Force, then transferred to a regular US-bound flight, with a stop in Hawaii or Okinawa. “The CBS Evening News was hosted by Walter Cronkite, who wanted half of his newscast filled with Vietnam stories”, Herford told me. Hence the daily routine. But once the stories were sent, Herford and his staff had time for reflection, for working their sources and for thinking about the next stories. No satellite link, no cell phones. “Today, I would be stuck doing live reports all the time.” Herford is in no way nostalgic about this totally analog era. When he was in Paris for a conference a couple of weeks ago, he was constantly taking pictures with a professional Canon camera. Today, he teaches journalism at Shantou University in China and still exudes unabated enthusiasm for the trade.

Walter Cronkite in Vietnam (Feb. 1968) -- National Archives


Revisited with today’s journalistic tools, coverage of the Vietnam War would be different in many ways. Live would be de rigueur. Think about it. We would have had:
- a TV correspondent doing a standup (or rather a duck-down) right in the midst of the Khe Sanh siege
- the Tet offensive twittered or live-blogged
- a retired general bashing the “delicate” tactics of carpet-bombing on his blog
- a heavily linked-to chemist-blogger, for his expert depiction of the horrendous effects of Agent Orange, a defoliant sprayed for ten years over the jungle (400,000 deaths, 500,000 birth defects). We can be sure it would have triggered national outrage in the US, forcing the Kennedy/Johnson administrations to stop
- the My Lai massacre inevitably leaked, thanks to a disgusted soldier posting a video on YouTube soon after the fateful day of March 16, 1968. Instead, we had to wait 18 months for a freelance reporter to break the story through the Dispatch News Service; his name was Seymour Hersh, and he went on to become an iconic investigative reporter, the one who would reveal the 2004 Abu Ghraib abuses in Iraq;
- for good measure, North Vietnamese bloggers would have given the world a different perspective on the war while, on the other side, US soldier-bloggers would have lifted the veil on low morale, drug abuse among the troops, and the acceptance of inevitable defeat;
- in the end, the April 30, 1975 evacuation of Saigon would have been reported live through citizens’ and evacuees’ cell phones and tweets.

The success story of a technology-enhanced media brand

‘A fan of ours wrote an iPhone application, just for the sake of it.’ How many media companies can make such a bragging statement? One can: NPR, America’s National Public Radio. Bradley Flubacher is a professional programmer who moonlights as a volunteer firefighter in a small Pennsylvania town. A few months ago, Brad decided he wanted to learn a new programming language and to develop for the iPhone. Et voilà: NPR Addict, a free app that gives access to thousands of podcasts in a simple and efficient way. The author didn’t make a dime in the process: his app is free. If you want to give a few bucks, he encourages you to donate directly to a local NPR affiliate. This is what I call a true fan, and a testament to NPR’s place in American culture.

Two thoughts can be drawn from this anecdote. First, the relationship a great media brand such as public radio enjoys with its audience. Second, how such a bond can be strengthened by a clever use of digital technology.

In France, we pride ourselves on being champions of public broadcasting. We have many brands around Radio France, great shows, excellent journalistic crews and so on. Brands such as France Inter or the all-news channel France Info appeal to a large audience; others, France Culture being one example, target only small circles and feel totally liberated from vulgar strictures such as attracting large audiences. Fine.