Sentiment analysis in semantic search

The dictionary definition of opinion states that it is “a view or judgment formed about something, not necessarily based on fact or knowledge.” It’s been about two years since Google teamed up with university researchers to launch the Google Data Analytics Social Science Research, and we’re beginning to see some of the fruit of that labour.

The Dictionary Definition of Opinion

Semantic search (and the semantic web that Google is building around it) acquires value at the contextual and predictive levels. Data is not much use unless it can provide fast, relevant answers or provide solutions to potential problems before you begin to realise you have them (predictive search).

That kind of granularity transforms search from a product you may want to use into a service you would not want to live without. That raises the stakes, and there is one more element attached to it: the moment you can mine sentiment in social networks, you can find out what people think. This is useful not just to branding (where brands spend a lot to generate a positive reception by the online population) but also to advertisers (who will be able to target consumers at their most emotionally responsive state).

Sentiment and opinion are closely related. The difference lies, perhaps, in the clarity of articulation, with the latter being more openly expressed while the former intimates the feeling or thought through associated expressions. Before we get into the entire “mind reading/Big Brother” thing let’s clarify a few issues:

  • First, sentiment mining is hard to do. User-generated content is unstructured, noisy and potentially spammy with a large redundancy factor in its iteration of expressions used.
  • Second, because semantic search is also predicated on providing answers to qualifying questions (What’s the best restaurant? What’s the most reliable camera to buy?), its ability to mine opinion across the web and extract sentiment on products and services is critical.

The formula for mining sentiment from comments and content is relatively simple:

Sentiment := <Holder, Target, Polarity, Auxiliary>

Basically the breakdown is into four elements:

  • Holder: who expresses the sentiment
  • Target: what/whom the sentiment is expressed to
  • Polarity: the nature of the sentiment (positive, negative, or neutral)
  • Auxiliary: strength, summary, confidence, time

They can be summarized as Who, Whom, What and How. Within a G+ profile it is quite possible for Google to build a comment graph that addresses Target and Polarity in relation to a specific profile. Across the wider web the task becomes exponentially harder.
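To make the four-element breakdown concrete, here is a minimal sketch of how the Holder, Target, Polarity and Auxiliary tuple might be represented and then grouped by target, which is roughly what a comment graph for a single profile or product would need. The class names, field defaults and the `polarity_by_target` helper are illustrative assumptions, not anything Google has published.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Iterable, Optional


class Polarity(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class Auxiliary:
    # Hypothetical fields mirroring the list above: strength, summary, confidence, time.
    strength: float = 1.0
    summary: str = ""
    confidence: float = 1.0
    time: Optional[str] = None


@dataclass(frozen=True)
class Sentiment:
    holder: str          # Who expresses the sentiment
    target: str          # Whom (or what) it is expressed about
    polarity: Polarity   # What the sentiment is
    auxiliary: Auxiliary = field(default_factory=Auxiliary)  # How it is expressed


def polarity_by_target(sentiments: Iterable[Sentiment]) -> Dict[str, Counter]:
    """Count polarities per target (an assumed, simplified 'comment graph')."""
    summary: Dict[str, Counter] = defaultdict(Counter)
    for s in sentiments:
        summary[s.target][s.polarity] += 1
    return summary


# Made-up comments about a made-up product.
comments = [
    Sentiment("alice", "Camera X", Polarity.POSITIVE),
    Sentiment("bob", "Camera X", Polarity.NEGATIVE),
    Sentiment("carol", "Camera X", Polarity.POSITIVE),
]
graph = polarity_by_target(comments)
```

Filling in the tuple is the hard part; once comments are in this shape, answering “what do people think of Camera X?” is a simple lookup on `graph["Camera X"]`.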

The basic building blocks of sentiment analysis

Sentiment and Emotion

At the heart of the sentiment analysis research model lies an old philosophical debate: are emotions subject to calculation? If the answer is no, then the Turing Test, proposed by British mathematician Alan Turing in the 1950s, does not mean very much. It also changes specific aspects of what we consider to be the development of Artificial Intelligence, the real future of which might lie in what is now being called affective computing (the detection and expression of emotions).

There are, however, practical applications that are far more immediate. The first of these is veracity, arguably the hardest of the 4Vs of semantic search (Volume, Velocity, Variety and Veracity) to handle. Sentiment analysis can become one of the defining elements used to determine the truthfulness (or not) of a fast-changing piece of information by mining the relational connections of profiles across social networks. Simplified, this is a little like someone in your place of work checking to see who agrees with whom, and over what, in order to separate fact from fiction in a rumor about lay-offs. If the story has come from the cleaning staff, has no other corroborative elements and everyone else disagrees, the volume and velocity of the conversation do not matter: it is most probably false. But if it started in the mailroom and human resources have become involved with neutral comments, there may be a higher element of truth to it.
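The workplace rumor example can be sketched as a credibility-weighted average of stances. This is only a toy model under stated assumptions: the weights, the three-valued stance and the `veracity_score` function are all invented here for illustration and do not describe how any search engine actually scores veracity.

```python
from typing import List, Tuple

# (credibility weight, stance): stance is +1 agree, 0 neutral, -1 disagree.
# Both the weights and the stance encoding are hypothetical.
Reaction = Tuple[float, int]


def veracity_score(reactions: List[Reaction]) -> float:
    """Weighted mean stance in [-1, 1]; normalising removes the effect of sheer volume."""
    total_weight = sum(weight for weight, _ in reactions)
    if total_weight == 0:
        return 0.0
    return sum(weight * stance for weight, stance in reactions) / total_weight


# Rumor A: one low-weight source asserts it, five colleagues disagree.
rumor_a = [(0.2, +1)] + [(1.0, -1)] * 5
# Rumor B: a mid-weight source asserts it, a high-weight source stays neutral.
rumor_b = [(0.5, +1), (2.0, 0)]

score_a = veracity_score(rumor_a)
score_b = veracity_score(rumor_b)
```

Because the score is normalised by total weight, repeating the same conversation ten times over leaves it unchanged, which mirrors the point that volume and velocity alone do not make a rumor true.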

If this sounds ‘noisy’ and ‘messy’ it’s because it is. Sentiment mining can work at scale but it is an imprecise science at best for the exact same reasons that the Turing Test may be unimportant.

Can We Be Emotionally Manipulated via Social Media?

At the heart of the fears (and objections) regarding sentiment analysis lies the fear that we can, somehow, be manipulated. Facebook very recently used Linguistic Inquiry and Word Count (LIWC), a tool designed to mine sentiment, to carry out a large-scale emotion-manipulation experiment in its network without its user base’s knowledge. The study was published with an editorial comment of concern signed by the editor-in-chief of PNAS (the journal in which it appeared).

Predictably, it created a furor for reasons Gideon Rosenblatt outlined in his blog, The Vital Edge. Finn Arup Nielsen, a Danish university researcher, weighed in on this with a piece that verifies, with some caveats, that emotions can be mined from text, that emotional states can be predicted and that an element of manipulation can, as a result, take place.

Sentiment Analysis and Prediction in Search

Google does use its Prediction API to create sentiment analysis models. The company was also recently granted a patent to create suggestions for automated reactions in a social network setting - what the BBC called Google’s Social Media Robot.

The Research Foundation Of The State University Of New York has been awarded a patent for “Large-scale Sentiment Analysis”. The power of semantic technologies to change the way we use data was signalled by Google’s blogpost on its Prediction API which was titled “every app a smart app”.

Microsoft is also entering the fray with its Oslo project designed to pull in data from across all its products and make useful recommendations.

The message from all these players entering the market is that, ethical considerations raised by the Facebook experiment aside, sentiment mining is only going to escalate.

The Practicalities of Branding

Comments now really matter. Whether expressed in Tweets, reviews across the web, microblogs or social media networks, user-generated comments, which in the past were used to add volume but not much else, now matter the same way as they would in a real-world setting. As a matter of fact, when viewed against Google’s recently leaked Human Raters Quality Guidelines, it becomes clear that Reputation and Authority (two out of the three elements that raters look for) are driven, to a large extent, by comments.

If you are a brand (or even if you’re not) you should be looking at creating the kind of content that generates positive comments and sentiment. You should then be looking to identify these positive comments (perhaps by using a semantic analysis tool like NOD3x) and amplifying them through engagement and further interaction.

You should be making the Who, Whom, What and How approach key to your marketing and that will drive your branding and your visibility in search.



External Links

Aspect-based Opinion Mining from Online Reviews
Sentiment Analysis and Opinion Mining 
Opinion Mining and Search
Bing Liu research papers
Researchers team up with Google to explore data