Sunday, November 20, 2011

Whatever Will Tweet Will Be?

A story on China Daily USA came to my attention this evening. It is about Zhong Lin, a graduate of Tsinghua University, and Zhao Siqi, from Hong Kong University. Both are engineers at Rice University, and they have designed a computer program that analyses tweets in real time. They hope to use it to predict the winner of the next US presidential election. To date, Lin and Siqi have been working on a project called SportSense, which examines tweets posted by NFL fans to infer what is happening in a game and how excited the fans are. "It does so in real-time and provides visualized results for live games."

"SportSense is part of a larger project that aims to utilize people as sensors to infer what is happening in the physical world and what people feel about it." SportSense is not the first project to harness the power of Twitter, of course. I posted on the old Geary blog about a Dublin-based start-up called "WePredict". Kevin posted on this blog a few months ago about a study in Science which shows that work, sleep and the amount of daylight people are exposed to all affect mood. There has also been work (by InboxQ) on locating the highest concentrations of tweeters with the most knowledge about a specific topic. WiseWindow is a marketing firm that uses social-media activity to forecast demand for products.

Another group to keep an eye on is Derwent Capital Markets. They use Twitter sentiment to manage their hedge fund. There's been a lot of demand for the fund, according to this article. The strategy is based on an academic study by Johan Bollen (Indiana University), Huina Mao (Indiana University), and Xiao-Jun Zeng (University of Manchester) that established the connection between emotion-related words appearing in Twitter posts and subsequent movements in the Dow Jones Industrial Average. Here's the original research paper.

This article is critical of the approach taken by Derwent. It makes a number of points, one of which is: "Beyond the difficulty of assigning sentiment to tweets, there's a much bigger issue at play. If you look at the patterns of tweets what you find is that most are reactive rather than proactive... Twitter sentiment is likely to be a lagging indicator, at least in the real-time world of algo trading."

Nonetheless, this is an area which has garnered a lot of interest. This BBC story mentions a PhD student at Munich who has done similar work on predicting the stock market (and elections) with Twitter sentiment. The Economist had a piece on the topic in their second Technology Quarterly for this year. That article raised a number of interesting issues, such as the role of meaning in Twitter updates:

"Humans excel at extracting meaning and sentiment from even the tiniest snippets of text, a task that stumps machines. To a computer, a tweet that reads “Feeling joyful after my trip to the dentist. Yeah, really” says that the author has been to the dentist and is now happy. Researchers have recently made strides in teaching machines to recognise such sarcasm, as well as double meanings or cultural references. In February Watson, a supercomputer devised by IBM, trounced two human champions at “Jeopardy!”, an American quiz show renowned for the way its clues are laden with ambiguity, irony, riddles and puns. But, for the most part, processing natural language remains a challenge."

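To make the point in that passage concrete, here is a minimal sketch of a naive word-counting sentiment scorer, the kind of approach that sarcasm defeats. The word lists and the function name are hypothetical and chosen purely for illustration; this is not the method used by any of the projects mentioned in this post.

```python
import re

# Hypothetical, tiny word lists for illustration only.
# Real sentiment lexicons are far larger.
POSITIVE = {"joyful", "happy", "great", "love"}
NEGATIVE = {"sad", "awful", "hate", "painful"}


def naive_sentiment(text):
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


tweet = "Feeling joyful after my trip to the dentist. Yeah, really"
score = naive_sentiment(tweet)
print(score)  # 1: scored as positive, although a human reads it as sarcastic
```

The scorer sees "joyful" and nothing else it recognises, so it reports a happy author. Nothing in a word count captures the "Yeah, really" that reverses the meaning for a human reader.
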
While there may be skepticism about the predictive power of Twitter (especially for use in the domains of marketing and finance), there is no doubt that the medium produces a lot of user-generated information. While I am not (yet) a Twitter user, I have been keen to tap into it as a source of information for some time now. Recently, I found the means to do so: inagist is a Twitter-based news service, probably as useful to non-users as it is to Twitter users. Even better is TweetMinster (due to its automatic updating): it's essentially a Twitter feed about current affairs (London-orientated). I'm following their live feed on breaking news.

TweetMinster tracks "the content most shared between expert users on Twitter and (we) use that data to discover and organise content for our news platform... this is what politicians, civil servants, activists, academics, business analysts and journalists think is the most important news of the day... We also feature live feeds of relevant twitter posts by the expert networks we track, so that you can follow the breaking news stories, big events and trending topics live – even if you’re not on Twitter..."

Addendum: It turns out that Twitter is only part of the story. I just read about a company called "Recorded Future" and blogged about them here: Using the Internet to Predict the Future.

6 comments:

Liam Delaney said...

Interesting stuff, Martin. The use of Twitter for research is something on my mind a lot.

Martin Ryan said...

It's certainly a select sample, but still... a fascinating one.

The selection issue is very important though: a Yahoo! Research study found that "only 20,000 people are pretty much responsible for half of all tweets on Twitter. How many of those people actually make up Twitter’s user base? Less than one percent."

Story from TIME Techland.

Yahoo! Research: Who Says What to Whom on Twitter.

The production, flow, and consumption of information will probably be an interesting area of microeconomic research to follow in the future.

Martin Ryan said...

Taken from the web-page for the study from Yahoo! Research:

"Clarification: Recent media reports have misinterpreted the result reported above that 'roughly 50% of tweets consumed are generated by just 20K elite users.' The result does not imply that 50% of tweets are broadcast by 20,000 users. In fact, the 20,000 “elite” users in question broadcast only a very small percentage of all tweets. However, many of these “elite” users have large numbers of followers, thus their tweets constitute a much larger percentage of what other users receive."
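The difference between tweets broadcast and tweets consumed can be seen in a toy calculation. The sketch below uses an invented three-user network (the names and numbers are hypothetical, not data from the Yahoo! Research study) and assumes every follower receives each tweet exactly once.

```python
# Hypothetical toy network: (tweets posted, number of followers) per user.
# All numbers are invented for illustration.
users = {
    "elite": (10, 1000),
    "user_a": (45, 5),
    "user_b": (45, 5),
}


def shares(users, name):
    """Return (share of tweets broadcast, share of tweets received) for one user."""
    posted_total = sum(tweets for tweets, _ in users.values())
    # Assumption: each tweet is received once by each follower of its author.
    received_total = sum(tweets * followers for tweets, followers in users.values())
    tweets, followers = users[name]
    return tweets / posted_total, tweets * followers / received_total


broadcast_share, consumed_share = shares(users, "elite")
print(f"broadcast: {broadcast_share:.0%}, consumed: {consumed_share:.1%}")
```

In this made-up network the "elite" user posts 10 of the 100 tweets, yet accounts for 10,000 of the 10,450 tweets received, which is why a small share of what is broadcast can be a large share of what is consumed.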

Ronan Lyons said...

Can I refer the interested reader to an Irish success in this area too?
http://www.computing.dcu.ie/news/computing-student-adam-bermingham-wins-irish-software-association-award

It's been a busy few days as Adam successfully defended his thesis in a viva yesterday too!

Liam Delaney said...

Thanks for pointing that out, Ronan. A really interesting area. I'd be interested to hear Adam speak on this at some stage.

Martin Ryan said...

That was a very interesting story Ronan; thanks for sharing the link. I hope to hear much more about SentiSense in the future. And the best of luck to Adam in commercialising his research.