Semantic Search Terms You Need to Know

Mention the word “semantic” and things get weird. For a little more confusion, add an “s” at the end and throw in the word “technology”, and by the end of the sentence no one is quite sure what we are talking about.

The following terms are a must-know. 

Arcs – The lines or pathways in a semantic graph that connect one node to another. An arc represents the relationship between the two points it joins. Arcs are also called Edges (see definition below).

Attributes – The inherent qualities of a person or an object that are characteristic of it. For example, the fact that wood floats and the fact that stones sink are attributes of wood and stone respectively. In semantic search, the collection of attributes that goes into characterizing something is language independent and constitutes its unique identity.

Concept Search – In semantic search, concept search is when the information retrieved and presented is mapped onto the concept represented by the search query rather than onto the words used in the query itself. Concept search makes use of a number of distinct signals, including the ideas contained in the search query, the ideas contained in the web documents presented, search-user history, global search query patterns and even location-awareness and device-awareness.

Domain – Domain, or domain authority, is the gradual increase in expertise that is acquired, in search, through the accumulation of signals. These can include search-user behavior, click-through rate (CTR) scores, social engagement, sentiment, mentions, citations and links, amongst others. Domain expertise is desirable because it increases the likelihood of visibility in search when concepts that map directly to your expertise are expressed in a search query.

Edges – This is the technical name for the connections established between a person or a thing and its attributes, and between a person or a thing and other persons or things. Edges basically govern the relationships between everything that is mapped in semantic search. There are many complex ways of determining the quality of an edge and its importance (usually called edge weighting), and each method ascribes a value that determines the importance of the person or thing being mapped. It is important to remember that edge values between persons and things are not fixed and can change as the relationships established between them evolve. Edges are also called arcs.
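
To make the idea of changing edge values concrete, here is a minimal Python sketch. The entity names, the numbers and the update_edge helper are all hypothetical, chosen only for illustration; this is not how any search engine actually weights its edges.

```python
# Hypothetical edge weights between entities; names and numbers are illustrative only.
edge_weights = {
    ("Harry Potter", "J.K. Rowling"): 0.9,
    ("Harry Potter", "Hogwarts"): 0.8,
}


def update_edge(weights, source, target, delta):
    """Adjust the weight of an edge, creating it at 0.0 if it does not exist yet."""
    key = (source, target)
    weights[key] = round(weights.get(key, 0.0) + delta, 2)
    return weights[key]


# A relationship strengthens as new signals accumulate ...
update_edge(edge_weights, "Harry Potter", "Hogwarts", 0.1)
# ... and a new edge appears when a new relationship is established.
update_edge(edge_weights, "Harry Potter", "Quidditch", 0.3)

for (source, target), weight in edge_weights.items():
    print(source, "->", target, weight)
```

The point of the sketch is simply that an edge is a pair of things plus a value, and that the value can be revised over time.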

Entity – An entity is a person or a thing that has a number of established attributes. While it is tempting to think that only real persons and things are entities, this is not true. Harry Potter (a fictional character) has established attributes that characterize him and make him instantly recognizable, so he is also an entity. The dictionary definition of an entity as something that exists or is in being is therefore not entirely accurate when it comes to semantic search. An entity is a readily verifiable person or thing that has a number of established attributes that characterize it.

Entity Extraction – Semantic search involves the extraction of entities (i.e. structured data) from the largely unstructured information that exists across the web. There are many ways to extract entities from web pages. This is covered in the Introduction to Semantic Search video.

Graph – In mathematics, graphs are structures used to model the connections between pairs of items across large (or even effectively unbounded) data sets. Within semantic search, graphs allow the relative importance of pieces of data to surface in connection with other pieces of data.
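
As a rough illustration of how a graph lets relative importance surface, the Python sketch below builds a tiny graph from pairs of connected items and ranks each item by its number of connections. The items are made up, and counting connections is a deliberately crude stand-in for the far more complex measures a real search engine would use.

```python
from collections import defaultdict

# Illustrative pairs of connected items; each pair is one connection in the graph.
connections = [
    ("wood", "floats"),
    ("wood", "tree"),
    ("wood", "furniture"),
    ("stone", "sinks"),
    ("tree", "forest"),
]

# Build an undirected adjacency map: item -> set of directly connected items.
graph = defaultdict(set)
for a, b in connections:
    graph[a].add(b)
    graph[b].add(a)

# Rank items by how many connections they have, highest first (ties broken by name).
ranking = sorted(graph, key=lambda item: (-len(graph[item]), item))

for item in ranking:
    print(item, len(graph[item]))
```

Here "wood" ends up at the top of the ranking only because it takes part in the most connections in this toy data set.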

Hierarchy – Within a semantic network, arcs and nodes have a taxonomic hierarchy that determines the relational values that are ascribed to them. In other words, the position of a node (or arc) in the semantic hierarchy is a first evaluation of its importance.
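
One simple way to picture position in a hierarchy is as a tree in which every category points to its parent. The sketch below is a toy example under that assumption; the categories and the ancestors and depth helpers are hypothetical and do not come from any real taxonomy.

```python
# Illustrative taxonomy: each category points to its parent (None marks the root).
parent_of = {
    "thing": None,
    "creative work": "thing",
    "book": "creative work",
    "novel": "book",
    "person": "thing",
}


def ancestors(category):
    """Return the chain of broader categories, from nearest parent up to the root."""
    chain = []
    current = parent_of[category]
    while current is not None:
        chain.append(current)
        current = parent_of[current]
    return chain


def depth(category):
    """Number of steps from the category up to the root."""
    return len(ancestors(category))


print(ancestors("novel"))
print(depth("person"))
```

Depth is only one possible reading of "position"; it is used here because it is the easiest to compute and to reason about.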

Linked Data – In semantic search terminology, Linked Data is the term used to describe a method of exposing and connecting data on the web from different sources. The pre-semantic search web uses hypertext links that allow people to move from one document to another. Linked data applies a machine-readable layer to any web document, which allows information to be linked and surfaced not just for humans but also for machines.
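
One common way to add such a machine-readable layer is JSON-LD markup using the schema.org vocabulary. The Python sketch below generates a small JSON-LD description of a person; the name and URL are placeholders, and the choice of JSON-LD is an assumption made for this example rather than the only way to publish linked data.

```python
import json

# Hypothetical description of a person; the name and URL are placeholders.
person = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Jane Example",
    "jobTitle": "Author",
    "sameAs": ["https://example.com/profiles/jane-example"],
}

# Serialize to JSON-LD; the output would sit inside a
# <script type="application/ld+json"> element in the page's HTML.
json_ld = json.dumps(person, indent=2)
print(json_ld)
```

The "sameAs" property is what does the linking: it tells a machine that this description and the resource at the other URL refer to the same entity.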

Machine Learning – A type of artificial intelligence (AI) programming that allows computers to learn from new data they encounter without being explicitly programmed to do so. Machine learning is behind semantic search in its different guises (mobile search, image search, Google Hummingbird on desktop search, etc.).

Natural Language Processing – Also known as NLP for short, natural language processing is where artificial intelligence and linguistics meet. When you speak to your Android phone, Siri or Cortana, you are experiencing the results of NLP research. Natural language processing is at work wherever people interact with computers through everyday language.

Neural Networks – These are computer systems that are modelled on the human brain and nervous system. They are used to estimate or approximate functions that depend on a large number of inputs whose values are largely unknown. Neural networks are marked by their ability to recognize patterns and display adaptive behavior.

NLP – Natural Language Processing (see definition above). 

Nodes – A node represents a concept or an entity in a semantic network. Nodes are the points that arcs (edges) connect.

Ontology – A term borrowed from philosophy, where it refers to the study of existence and being. Applied to semantic search and the semantic web, it refers to the grouping of specific content into collections of data that are united by nature or have other similarities. This makes the assessment of Domain Authority a little easier. It also allows the creation of Hierarchies (see definition above) and helps in Entity Extraction.

Semantic – The word semantic refers to meaning, usually applied to language, but it is also applied to logic and, in computer science, to the inferences that can be drawn from the metadata and relational connections between sets of data or even individual nodes. 

Semantic Networks – A semantic network, also called a frame network, is a network that represents semantic relations between concepts or Entities. Semantic networks are used to create forms of representational knowledge like Google’s Knowledge Graph.

Semantic Search – Search that uses relational data and entity extraction, together with searcher intent and context, to deliver more relevant results from the indexable dataspace of the web.

Semantic Technology – The name applied to any technology that uses an encoding process whereby meaning is stored separately from data and content. This enables a fluidity in searches and system operations that was not possible before semantic search came along.

Semantic Web – The idea of the semantic web was set out by the father of the web, Tim Berners-Lee, in a 2001 Scientific American article (co-written with James Hendler and Ora Lassila), where he expressed a vision of a web geared to the end user in a more intuitive, intelligent way. TBL’s vision required the user-actioned implementation of structured data. While this is happening, the semantic web we are experiencing right now is the version of the web stored on Google’s servers and those of other search engines.

Structured Data – The name given to any data set where the nodes within it are linked by edges. Structured data is stored in relational databases. It is differentiated from unstructured data, where each piece of data exists without any significant relational connection to any other. There is also a type of data that exists in an in-between state, called semi-structured data.

Taxonomy – In computer science, taxonomy is the discipline concerned with classification, especially of collections of ontologies.

Triples – A triple is a data entity composed of subject-predicate-object, which helps create the relational connections that define entities (e.g. “David is an author” or “the knowledge graph is a relational database”). In the first example “David” is the subject, “is an” is the predicate and “author” is the object.
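
The subject-predicate-object shape is easy to see in code. The Python sketch below stores a few triples and looks up everything recorded about one subject. The Triple class and the about helper are hypothetical names invented for this example; real triple stores offer much richer query languages.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: a subject, a predicate and an object (named obj here)."""
    subject: str
    predicate: str
    obj: str


# Illustrative statements, echoing the examples used in this glossary.
triples = [
    Triple("David", "is an", "author"),
    Triple("the knowledge graph", "is a", "relational database"),
    Triple("Harry Potter", "is a", "fictional character"),
]


def about(subject):
    """Return every (predicate, object) pair recorded for a subject."""
    return [(t.predicate, t.obj) for t in triples if t.subject == subject]


print(about("David"))
```

Each new triple about the same subject adds one more relational connection, which is how a collection of simple statements gradually comes to define an entity.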

Tuples – In semantic search and computer science, a tuple is an ordered, finite list of elements. Tuples are part of the Entity Extraction (see definition above) process that Google employs in its indexing of the web.
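
Python has a built-in tuple type that shows what “ordered” and “finite” mean in practice. The (entity, attribute, value) layout below is an assumption chosen for illustration, not a description of what Google stores.

```python
# An ordered, finite list of elements; the (entity, attribute, value) layout is illustrative.
fact = ("wood", "behaviour in water", "floats")

entity, attribute, value = fact          # unpack the elements by position
reordered = (fact[2], fact[1], fact[0])  # same elements, different order

print(len(fact))          # finite: the tuple has a fixed number of elements
print(fact == reordered)  # order matters, so these two tuples are not equal
```

Because the position of each element carries meaning, swapping two elements produces a different tuple even though the contents are the same.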

Vocabularies – In semantic search, vocabularies define the concepts and relationships used to describe and represent an area of study. Schema.org and FOAF (Friend of a Friend) are examples of such vocabularies. RDF (Resource Description Framework) and RDFa (Resource Description Framework in Attributes) are, respectively, the data model and a markup syntax commonly used to express them.


You can download a free copy of Google Semantic Search here.