Semantic search, word vectors and concept associations

Language is difficult for computers to understand because words are both representational and relative. Each word has a definition that represents something specific: “wooden,” for example, names the material a boat is made of. But a word’s meaning also evolves with cultural dynamics, which give it a relative value that only becomes apparent in relation to other words, as in the qualitative judgment: “The actor gave us his most wooden performance to date.”

Google first started using vectors to denote the meaning of words relative to their effect on other words back in 2014, and used the technique successfully in image search. Since then it has extended the technique to cover concepts, sentence structures and thoughts.

To understand all this a little better, consider that a single word vector is made up of 300-500 numbers that represent a word’s meaning in context as it relates to other words. This differs from hard-coded logic rules that provide a dictionary definition of a word, and it is where machine learning, neural networks and AI come in: because the relationships described by each vector contain no hard-coded values, those values fluctuate, evolve and change according to context, usage, cultural significance and so on.
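To make the idea of "meaning as it relates to other words" concrete, here is a minimal sketch in Python. It assumes tiny, hand-made four-number vectors (real word vectors are learned from text and are far longer), and the names TOY_VECTORS, cosine_similarity and most_similar are hypothetical, chosen only for illustration. It shows one common way of comparing two word vectors, cosine similarity, and is not a description of how Google's systems work.

```python
import math

# Hand-made toy vectors, purely illustrative. Real word vectors are
# learned from large amounts of text and have hundreds of dimensions.
TOY_VECTORS = {
    "wooden": [0.9, 0.1, 0.0, 0.2],
    "timber": [0.8, 0.2, 0.1, 0.1],
    "stiff":  [0.3, 0.1, 0.8, 0.3],
    "banana": [0.0, 0.9, 0.1, 0.0],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors):
    """Rank every other word by its similarity to `word`, closest first."""
    target = vectors[word]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for other, score in most_similar("wooden", TOY_VECTORS):
        print(f"wooden vs {other}: {score:.2f}")
```

With these made-up numbers, “timber” comes out closest to “wooden,” “stiff” second and “banana” last. Nothing in the code defines any of the words; the ranking falls out purely from where each vector sits relative to the others.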

The principle is scalable and transferable to practically anything: sentiment analysis, document processing, conversations on social media platforms, translation between languages, image processing and so on. I’ve written before that search is at the heart of Artificial Intelligence (AI) breakthroughs. It’s important to reiterate that, because semantic search takes a nuanced approach to real-world queries that are governed by the fluidity of the Big Data attributes of Volume, Velocity, Variety and Veracity, so contextual comprehension of language queries is key. Word vectors look at the attributes of words, which means that they are better able to model natural language and understand its fluidity.

Things are advanced enough now for Google to have launched Semantic Experiences, a website where, in a slightly gamified environment, visitors can choose to “talk to books,” mining the interrelated ideas, concepts and thoughts across volumes stored in Google’s database, or play Semantris, a Tetris-like game of word blocks and their associated meanings.

What Does All This Mean?

Cool as all this may sound to search buffs and machine learning enthusiasts, the real value comes when we understand its impact on ranking websites and surfacing their content in response to search queries. I’ve said a few times in the past that chasing search engines is a self-defeating activity. That is now truer than ever.

Search engine developments are not so much about understanding and ranking a website as about understanding and ranking a website in direct relation to searcher queries and searcher intent. Context and intent are now key in your content creation. That means when you create content you need to take into account the context in which it will appear and the intent of those searching for it. A website selling “Butter knives” that appears in a search where someone is looking for the kind of knife that will chop a branch off a tree is a waste of time and resources, and that will affect its ranking because it affects the searcher experience.
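The butter knife example can be sketched in a few lines of Python. This is a toy under loud assumptions: the three-number word vectors are invented by hand, a piece of text is represented simply by averaging the vectors of the words it contains, and the names TOY_WORD_VECTORS, embed and rank_pages are hypothetical. Real search engines use far richer signals; the point is only that a query and a page can share a word (“knife”) and still sit far apart in meaning.

```python
import math

# Invented toy vectors; the three numbers loosely stand for
# "kitchen", "garden" and "cutting". Not real learned embeddings.
TOY_WORD_VECTORS = {
    "butter":  [0.9, 0.0, 0.1],
    "knife":   [0.5, 0.2, 0.9],
    "spread":  [0.8, 0.0, 0.2],
    "toast":   [0.9, 0.0, 0.0],
    "branch":  [0.0, 0.9, 0.3],
    "tree":    [0.0, 1.0, 0.1],
    "saw":     [0.1, 0.6, 0.9],
    "pruning": [0.0, 0.9, 0.6],
    "cut":     [0.3, 0.3, 0.9],
}


def embed(text):
    """Average the vectors of the known words in `text` (unknown words are skipped)."""
    words = [w for w in text.lower().split() if w in TOY_WORD_VECTORS]
    if not words:
        raise ValueError("text contains no known words")
    dims = len(TOY_WORD_VECTORS[words[0]])
    return [sum(TOY_WORD_VECTORS[w][i] for w in words) / len(words) for i in range(dims)]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def rank_pages(query, pages):
    """Order page titles by how close their text is to the query, best match first."""
    query_vec = embed(query)
    scored = [(title, cosine_similarity(query_vec, embed(text))) for title, text in pages.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    pages = {
        "Butter knives": "butter knife spread toast",
        "Pruning saws": "pruning saw cut tree branch",
    }
    for title, score in rank_pages("knife to cut a tree branch", pages):
        print(f"{title}: {score:.2f}")
```

With these invented numbers the pruning saw page scores well above the butter knife page, even though only the butter knife page contains the word “knife.” Matching on intent and context, rather than on the keyword, is what puts the right page first.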

I understand that thinking about search engines is always easier than thinking about people. Search engine developments are always easier to understand and better defined in their intent. People’s intent is messy and subject to change and, frequently, when people search for what a website sells, they have no idea how to actually look for it.

The increased use of vectors in modelling concepts as well as in modelling words will go some way towards addressing that imbalance. But webmasters and website owners are still on the hook to provide:

  • Clarity of purpose of their business
  • Their unique selling proposition
  • Their unique online style and identity
  • A memorable end-user experience

These are the basic, minimum requirements for actually turning website visitors into paying customers. Without a structured content creation strategy in place, and a means of promoting your content on social media networks, your content creation efforts will be just another dot in a canvas of dots that fail to add up to a coherent picture.
