Designing Serendipity in Semantic Search

From a certain perspective it is oxymoronic to even consider engineering serendipity (the occurrence and development of events by chance in a happy or beneficial way) in search. By its very definition, the moment you engineer “chance occurrences” they stop being chance occurrences and become, instead, part of the engineered algorithms of ‘ordinary’ search.

This shows that the argument about serendipity is deep and has both a philosophical and a practical angle. It is also intrinsically linked to semantic search and, as it happens, will also affect your marketing.

Now, I confess to an affinity for this particular subject because, from an engineering point of view, it combines in search everything that fascinates us: the ability of machines to think like humans; our ability to code machines to understand what delights us as humans; and the ability of search to pull in data in a way that is not only highly contextual in terms of a search query but also relevant from a very human point of view. You could say, as a matter of fact, that programming serendipitous discovery is akin to the Turing test of search. If search algorithms successfully pass that test every time, we’re into the realm of intelligent search.

The reason I am writing about it here, however, is that beyond the philosophical implications and engineering challenges that serendipity in search presents us with, it also offers opportunities in terms of visibility across the web and marketing.

It all has to do, of course, with you finding your target audience. Some of that audience you know where to find and you know who it includes. But there is always a missing segment of that audience that you cannot reach because you:

• May not have the budget to target it cost-effectively
• Suspect that it’s so widely scattered that it’s unrealistic to try and target it
• May not know how to find it
• Did not know it existed

Many of those in that missing segment of your target audience will be unaware of your existence or may not know that you have what they are looking for.

Squaring that exact circle is what serendipitous discovery in search is designed to do. This is a unique feature of semantic search.

Programmed Serendipity is a Semantic Search Issue

In pre-Hummingbird days, serendipity in search was provided by the usual stumbling as we went from one little blue link to another, one page in search to another, checking each page for context and relevance.

Clearly, serendipity is not just about adding random information to a search page in the hope that some of it will resonate with the end user, who will then find it “delightful”. That’s just adding noise and spam, not value. In experiments carried out by both Yahoo and Microsoft Research, the experimenters found that “Syntactic but not semantic connections introduce unwanted noise and errors”.

This is something Google understands well. Google search is now fully semantic. When an algorithm checks context and relevance, there is a risk that it will spin in ever-tightening circles, narrowing the search horizon down in each case until, eventually, search becomes a blinkered, tunnel-vision affair. You might argue that it would then satisfy, perhaps, our need for a quick answer to a pressing problem, but it would hardly do justice to our ability to ‘see’ our world through the information we uncover. In plain English, search could become efficient but dull.

Researchers working at Microsoft Research tested to see if that were true and discovered that personalization in search results tends to increase serendipity. In particular they mentioned how “Collaborative filtering systems identify interesting content by matching individuals with other, similar, individuals.” In Google search Google+ is the ultimate “collaborative filtering system”. The ability to develop and then define a personal network of contacts that allows content to surface that has a strong feel of serendipity is directly attributed to what Yahoo researchers call the “entity network”.
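As a rough illustration of what “matching individuals with other, similar, individuals” can mean in practice, here is a minimal sketch of user-based collaborative filtering. It is not how Google, Yahoo or Microsoft implement it; the names (jaccard, recommend), the sample people and their interests are all made up for the example, and similarity is assumed to be simple overlap between the things two people have liked.

```python
from collections import defaultdict


def jaccard(a: set, b: set) -> float:
    """Overlap between two users' liked items, from 0.0 (none) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target: str, likes: dict[str, set[str]], top_n: int = 3) -> list[str]:
    """Suggest items the target has not seen, weighted by how similar their fans are."""
    seen = likes[target]
    scores: dict[str, float] = defaultdict(float)
    for user, items in likes.items():
        if user == target:
            continue
        similarity = jaccard(seen, items)
        if similarity == 0.0:
            continue  # people with nothing in common contribute nothing
        for item in items - seen:
            scores[item] += similarity
    # Highest score first; ties broken alphabetically so the output is stable.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]


# Hypothetical data: who has engaged with which topics.
likes = {
    "ana": {"semantic search", "ontologies", "jazz"},
    "ben": {"semantic search", "ontologies", "typography"},
    "cho": {"jazz", "sailing"},
    "dev": {"gardening"},
}

print(recommend("ana", likes))  # ['typography', 'sailing']
```

Neither suggestion matches anything "ana" has searched for, yet both come from people she overlaps with, which is the sense in which personalization can widen the search horizon rather than narrow it.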

Basically, successfully engineering serendipity in search requires that, beyond the precise answers that search delivers, it also delivers a subset drawn on entirely different ranking parameters, the success of which depends on whether the results are:

• Surprising (how unexpected they are)
• Useful (how related they are to the searcher’s original intent)
• Empowering (do they contribute to the domain expertise of the searcher)

These three become the strict requirements of a serendipitous find. As a Microsoft Research paper on the subject put it: “When a user with no a priori intentions interacts with a node of information and acquires useful information, then serendipitous information retrieval occurs. The success of serendipitous discovery is not just the find itself, but being able or willing to do something with it, so that users get more insight and can enhance the domain expertise.”
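One way to picture those three requirements as “different ranking parameters” is the sketch below. It is purely illustrative: the 0-to-1 scores, the floor of 0.3, the geometric mean and the sample results are assumptions made for the example, not anything published by a search engine. The one idea it tries to capture is that the requirements are strict, so a result that fails any one of them is dropped no matter how well it does on the others.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    surprising: float  # 0..1, how unexpected the result is
    useful: float      # 0..1, how related it is to the searcher's original intent
    empowering: float  # 0..1, how much it adds to the searcher's domain expertise


def serendipity_score(result: Result, floor: float = 0.3) -> float:
    """Return 0.0 if any requirement falls below the floor, else the geometric mean."""
    parts = (result.surprising, result.useful, result.empowering)
    if min(parts) < floor:
        return 0.0  # strict: all three requirements must be met
    return (parts[0] * parts[1] * parts[2]) ** (1 / 3)


def serendipitous_subset(results: list[Result], k: int = 2) -> list[str]:
    """Pick the k best-scoring results that pass all three requirements."""
    scored = [(serendipity_score(r), r) for r in results]
    kept = [(score, r) for score, r in scored if score > 0.0]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [r.title for _, r in kept[:k]]


# Hypothetical candidates for a single query.
candidates = [
    Result("Exact answer page", surprising=0.1, useful=0.95, empowering=0.4),
    Result("Random viral clip", surprising=0.9, useful=0.05, empowering=0.1),
    Result("Adjacent case study", surprising=0.7, useful=0.6, empowering=0.8),
    Result("Related primer", surprising=0.4, useful=0.8, empowering=0.5),
]

print(serendipitous_subset(candidates))  # ['Adjacent case study', 'Related primer']
```

The exact answer is still what ordinary ranking would put first; it simply does not qualify as a serendipitous find because it is not surprising. The random clip is surprising but not useful, which is the “noise and spam” case described earlier.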

While there are many different ways to try and program serendipity in search they all have some common characteristics and it is in that commonality that we begin to understand the governing mechanics of serendipity.

Here’s what makes serendipity work:

Enhanced metadata – not just about the subject topic but also its popularity, importance and context. The appeal it has for certain people and their engagement patterns with it. Detailed sentiment analysis plays a key role here.

Entity networks – Ontologies (the grouping of items and topics by subject), data-rich content and identity-defining online activities all help hyper-contextualize content and personal connections and create high-trust mini-networks around specific interests or topics. These then become the first port of call of search when it comes to looking for material that will help broaden the search horizon but can still meet the three strict requirements of a serendipitous find. For Google, Google+ is probably one of the easiest ways it has to develop high-trust entity networks around shared interests. This is also why personalization in search, rather than narrowing down the search horizon, actually broadens it, enriching the available search content by the particular likes and affinities of those we are strongly connected to in our personal networks.

Temporal Characteristics – Recency in content, trendiness (sometimes referred to as topicality) and novelty (or uniqueness) are all characteristics that serendipitous programming looks for in contextual content it is about to surface.

Quality – This is perhaps the least surprising characteristic as Google has been insistent about quality for a long time. Content that is of real value to the end user has the best chance of surfacing as a serendipitous find in search.

Proving the point of serendipity in Google+ (and Google in general) is a video that was surfaced by +Martin Shervington where Google Chairman Eric Schmidt talks about serendipity, amongst other things, at Zeitgeist Europe, 2013:


What You Need to Know

Great as all this may be to know, its real value comes when we go past the geek factor and actually ask the question: “How can I take advantage of serendipitous discovery in search to surface my content to a wider audience and find people who are usually hard to reach?”

Here is what you need to do:

Have a data-dense presence on the web that makes it easy for Google to extract entities from your content. Make sure, for instance, that your website, your Google+ presence and your Twitter account are easily mined from your online activities. Ensure that content you share is surrounded by enough high-trust data (links to trustworthy sources, citations of authoritative websites, etc.) for Google to feel that it’s an important piece of information that must be included in search.

Deliver value in every item of content you create that helps bolster your online identity (see point above).

Create wide networks of online contacts that you use to find content (people you follow on Google+, Facebook and Twitter, for instance).

Generate engagement in your content through the social web. Commenting, sharing and ‘smaller’ interactions such as Re-Tweets, Likes and +1s all have an impact that becomes part of the “social proof” of a piece of content.

Add context – don’t just create (or curate) great content. Add the kind of deep, detailed information that helps it link to trends, establishes its freshness and allows the end user to establish a sense of its value in relation to the time of its publication.

Create content that resonates at an emotional level with your audience. Comments, their frequency, the type of comment (positive or negative) and its length, who made it, when and in what context are all signals that serendipitous discovery in search can use to help serve your content to a searcher.

Bottom line: content that’s created only for marketing, that fails to support and enhance your digital identity, that ignores the emotional touchpoints and needs of its prospective audience and that has no impact on or relation to anything that’s happening across the web (or even across our social networks and affects us) is unlikely to do very much for you in terms of search, branding or reputation. Follow the six points above, however, and you will find your business getting attention from the most unexpected quarters at almost negligible cost to you.



