In 1929, Hungarian author Frigyes Karinthy proposed the concept of six degrees of separation: “that any two individuals could be connected through at most five acquaintances”. This idea grew from a belief that the modern world was “shrinking” due to advances in travel and communication, which meant that friendship networks were expanding. And this was decades before Facebook!
Academics, from mathematicians to psychologists, have explored this hypothesis in relation to the mechanics of social networks. Some, like Michael Gurevich and Manfred Kochen, predicted three degrees of separation across the US population. Others cited isolated indigenous populations to invalidate the strictest interpretation of this theory. Whatever its proponents and objectors argue, the basic premise illustrates that we are far more globally interconnected than we realise.
So what does this have to do with graphs? Well, if we as billions of human beings are so interconnected, then our terabytes of data are too – and in ways you might not even consider. Swipe your card at Pick n Pay or Woolworths and your data is indirectly connected to that of millions of other customers across the country. Perhaps to Sibongile in Johannesburg and Pieter in Port Elizabeth, who bought the same special, or Ninia in Cape Town, who likes the same cheese.
But, as a company, how do you exploit this information to identify opportunities and insights, mitigate risk and, ultimately, maximise profit? If you’re running on legacy tech, that is a challenge and a half. All of the extra lines of code and JOIN tables required to create this level of inter-relatedness in the tabular structure of relational database management systems make it almost impossible. Unless, of course, you want to heavily impact operational processing power, significantly increase expenditure and receive the information days or weeks after it would have been useful.
Cross-referencing tables of information also requires an understanding of which of the many elements need to be connected and how. This alone takes time and effort, plus it increases the complexity of your systems, which means more can go wrong. Then consider the fact that your business and user needs evolve, so the elements you wish to focus on will also change. Relational databases are not well suited to rapid adjustments. NoSQL databases help but won’t solve the problem either, because their disconnected construction makes it more difficult to harvest connected data properly. The issues just keep mounting up. However, there is a simple and effective solution: graph database technology.
What is a graph?
In 1735, Leonhard Euler (1707–83) presented his solution to the Königsberg bridges problem (to find a route over the city’s seven bridges, crossing each only once), proving that no such route exists. This was the birth of modern graph theory and highlighted the need for suitable techniques of analysis. Nearly three centuries later, these techniques are widely available and, according to Gartner, “Finding relationships in combinations of diverse data, using graph techniques at scale, will form the foundation of modern data and analytics.”
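Euler's argument can be checked in a few lines. The sketch below assumes the usual textbook layout of Königsberg (four land masses joined by seven bridges); the labels A to D and the function names are illustrative, not taken from any library. It relies on the standard result that a connected graph has a route using every edge exactly once only if it has zero or two nodes with an odd number of edges.

```python
from collections import Counter

# The seven bridges as edges between four land masses.
# Labels are illustrative: A is the central island, B and C are the
# two river banks, D is the eastern land mass.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def odd_degree_nodes(edges):
    """Return the nodes touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)


def has_euler_path(edges):
    """Parity test for a route that uses every edge exactly once.

    Assumes the graph is connected; connectivity is not checked here.
    """
    return len(odd_degree_nodes(edges)) in (0, 2)


print(odd_degree_nodes(BRIDGES))
print(has_euler_path(BRIDGES))
```

All four land masses have an odd number of bridges, so the check fails, which matches Euler's conclusion.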
A graph is composed of two elements: nodes and edges. Nodes represent entities, e.g. a person, place or object. Edges, or relationships, show how two entities are related. Together, these elements form a network of linkages that illustrates associations between all of the data in a graph. To enrich context and refine the data model further, nodes can be assigned defining attributes as key-value properties, labelled with roles or types and grouped into categories or sets. Edges can also have properties and can be single, directed (a source node pointing to a target node), self-linking (e.g. a fraud suspect depositing company funds into his own account) or multiple (to show more than one relationship between two entities).
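To make these terms concrete, here is a minimal in-memory sketch of a property graph. It is not how any particular graph database stores data; the class names, field names and sample records are all hypothetical and chosen only to mirror the vocabulary above (labels, key-value properties, directed, self-linking and multiple edges).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    labels: set = field(default_factory=set)         # roles or types
    properties: dict = field(default_factory=dict)   # key-value attributes


@dataclass
class Edge:
    source: str      # directed: source points to target
    target: str
    rel_type: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, labels=(), **properties):
        self.nodes[node_id] = Node(node_id, set(labels), dict(properties))

    def add_edge(self, source, target, rel_type, **properties):
        # Both end points must already exist in the graph.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("edge refers to a node that does not exist")
        self.edges.append(Edge(source, target, rel_type, dict(properties)))

    def edges_between(self, source, target):
        return [e for e in self.edges
                if e.source == source and e.target == target]


# Hypothetical sample data.
g = PropertyGraph()
g.add_node("sibongile", labels=["Person", "Customer"], city="Johannesburg")
g.add_node("acct-1", labels=["Account"])

# Multiple edges between the same two nodes.
g.add_edge("sibongile", "acct-1", "OWNS")
g.add_edge("sibongile", "acct-1", "MANAGES", since=2019)

# A self-linking edge: funds moved from an account back into itself.
g.add_edge("acct-1", "acct-1", "TRANSFER", amount=500)
```

Calling `g.edges_between("sibongile", "acct-1")` returns both the `OWNS` and `MANAGES` edges, while `g.edges_between("acct-1", "acct-1")` returns the self-link.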
New nodes and linkages can be added without compromising your existing network or migrating your data. So, essentially, graphs allow you to manipulate and weight data elements and their relationships to reflect their importance in your unique business environment. For example, if your company sells coffee but makes more profit on tea, you can choose whether to assign more significance to the higher profit or the higher quantity purchases.
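The coffee and tea example can be sketched as a choice of weighting function applied to the same purchase edges. The figures, customer names and function names below are invented for illustration; the point is only that the weighting is a business decision layered on top of unchanged data.

```python
# Hypothetical purchase edges: (customer, product, quantity, profit_per_unit)
PURCHASES = [
    ("sibongile", "coffee", 10, 2.0),
    ("pieter", "coffee", 6, 2.0),
    ("sibongile", "tea", 8, 6.0),
]


def weight_by_quantity(quantity, profit_per_unit):
    """Treat every unit sold as equally important."""
    return float(quantity)


def weight_by_profit(quantity, profit_per_unit):
    """Treat profit earned as the measure of importance."""
    return quantity * profit_per_unit


def product_weights(purchases, weight_fn):
    """Sum the chosen edge weight per product."""
    totals = {}
    for _customer, product, quantity, profit in purchases:
        totals[product] = totals.get(product, 0.0) + weight_fn(quantity, profit)
    return totals


by_quantity = product_weights(PURCHASES, weight_by_quantity)
by_profit = product_weights(PURCHASES, weight_by_profit)

# A new purchase is just one more edge; existing records are untouched.
extended = PURCHASES + [("ninia", "tea", 2, 6.0)]
by_profit_extended = product_weights(extended, weight_by_profit)
```

With these made-up numbers, coffee carries more weight by quantity and tea carries more weight by profit, so the same graph supports either view.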
The power of relationships
Graph technology prioritises the relationship over individual entities with a core rule of no “broken links”. This means that an existing relationship always points to an existing end point. Think of it like a dot-to-dot drawing. Each dot by itself is essentially irrelevant. It only provides meaning when it is connected to another one to bring the picture into being. This connections-first approach results in more expressive yet simpler data models that are agile and scalable whilst maintaining consistent performance. So you don’t have to plan for each eventuality, exception or expansion because the model is easily adaptable to your changing requirements.
When data is stored as relationships, the connections are already embedded. This makes it quicker and easier to process any number of analysis queries, whether simple, deep or complex, in real time.
Whilst all that sounds promising, it doesn’t truly convey the magnitude of graph’s potential. Graph is to databases what quantum is to physics. It revolutionises the way we see our data world, opens up new avenues of perceiving and analysing information and blows the cause-and-effect, mechanical view right out of the water. But don’t get me wrong, relational databases still make a vital contribution. It’s just that graph takes the experience to a whole new level, micro and macro, because everything is inter-related.
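As a rough illustration of why stored connections make such queries straightforward, the sketch below counts the hops between two shoppers with a breadth-first search, echoing the degrees of separation from the opening. The purchase data is invented (Thabo and the product names are hypothetical), and a plain dictionary stands in for a real graph database.

```python
from collections import deque

# Hypothetical customer-to-product purchase links.
EDGES = [
    ("Sibongile", "weekly special"),
    ("Pieter", "weekly special"),
    ("Pieter", "cheese"),
    ("Ninia", "cheese"),
    ("Thabo", "rooibos"),
]


def build_adjacency(edges):
    """Store each link on both of its end points."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def hops_between(adjacency, start, goal):
    """Return the fewest links from start to goal, or None if unconnected."""
    if start not in adjacency or goal not in adjacency:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, distance = queue.popleft()
        if node == goal:
            return distance
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


ADJACENCY = build_adjacency(EDGES)
print(hops_between(ADJACENCY, "Sibongile", "Pieter"))
print(hops_between(ADJACENCY, "Sibongile", "Ninia"))
print(hops_between(ADJACENCY, "Sibongile", "Thabo"))
```

Each step simply follows links already attached to the current node, so no lookup table has to be joined along the way. In this toy data Sibongile reaches Pieter in two hops (via the shared special), Ninia in four, and Thabo not at all.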
Harnessing the power
Gartner identified graph technology as one of the top 10 data and analytics technology trends for 2020, forecasting that: “By 2023, graph technologies will facilitate rapid contextualization for decision making in 30% of organizations worldwide.” Its capacity to explore exponentially growing dynamic data and mine it for hidden relationships completely outstrips traditional analytics.
Let’s just take a look at a few well-known uses:
- Recommendation engines: The ability to associate and classify relationships between any two entities (people, products, purchases, etc.) makes graph perfectly suited to the complex comparisons required for accurate, personalised and real-time recommendations. In addition, it can provide you with a Customer 360° view to empower your customer services, improve company-user relations and enhance customer experience.
- Master Data Management: Traditional siloed storage systems can lead to misinformation and miscommunication between departments, suppliers and customers. Using graph technology, you can aggregate data from all sources into one central database, control access and manage distribution whilst maintaining quality and persistence.
- Fraud detection and risk management: Relational databases struggle to process real-time link analysis across large or complex datasets. Graphs do this with ease, are agile and scalable and have minimal operational risk. They find implicit correlations that provide early warnings of suspicious behaviour and minimise false positives.
- Context-aware services: These services react and adapt to changing circumstances (e.g. location, user history) to provide the user with relevant real-time information, e.g. traffic updates, restaurants, tourist attractions. The ability of graphs to connect any two nodes creates a multitude of applications for this, from contextual marketing to health alerts (e.g. COVID-19 awareness and high pollen count notifications) and diagnostics to crime prevention.
- Network management: Any network (telecoms, IT, power grids, supply chains) benefits from a unified view of operations, especially one that has built up over time, been modified or merged, is misaligned or has multiple vendors using different systems. Such histories result in complex communication, maintenance and feedback structures and siloed data. Using a graph database, managers can catalogue assets, visualise deployment, predict maintenance, ensure end-to-end redundancy (e.g. if a unit is down, are there alternative routes and how are users impacted) and have an overall view of even the most complex business domains.
- Situational awareness: This can be summarised as perceive, understand, plan. Graphs can map a myriad of environmental elements (such as the weather or traffic) in real-time to determine optimal times for specific events (e.g. deployment of outdoor structures or conventions), plan traffic routes (e.g. for logistics or taxi fleets) and mitigate risk (e.g. trawlers avoiding storms or piracy zones).
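To give a feel for the first use in the list, recommendation engines, here is a deliberately simple sketch. It scores products a customer has not yet bought by how many purchases they share with the customers who did buy them. The baskets are invented and the scoring rule is one assumption among many; production recommenders are considerably more sophisticated.

```python
from collections import Counter

# Hypothetical baskets: customer -> set of products bought.
BASKETS = {
    "Sibongile": {"coffee", "rusks", "cheese"},
    "Pieter": {"coffee", "rusks", "rooibos"},
    "Ninia": {"cheese", "crackers"},
}


def recommend(baskets, customer, top_n=3):
    """Suggest products bought by customers with overlapping baskets."""
    own = baskets[customer]
    scores = Counter()
    for other, items in baskets.items():
        if other == customer:
            continue
        overlap = len(own & items)   # shared purchases link the two customers
        if overlap == 0:
            continue
        for item in items - own:     # only suggest what is not yet bought
            scores[item] += overlap
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _score in ranked[:top_n]]


print(recommend(BASKETS, "Sibongile"))
print(recommend(BASKETS, "Ninia"))
```

Here Sibongile is offered rooibos ahead of crackers because she shares two purchases with Pieter and only one with Ninia.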
Then there are the leading-edge applications like tracking infections, semantic search and symbolic AI. The options are, quite literally, endless.
Graph³
Locstat is to graph what Stephen Hawking is to physics. That is, we’ve taken it to a completely new level of intelligence. Our unified data analytics platform combines graph database technology with key components like event processing, a strong rules engine and machine learning to create LightWeaver® – an exceptionally potent big and fast data solution. Important steps such as feature engineering and the essential human intelligence element ensure the platform is operationally ready for your unique business requirements. So whatever your data intelligence needs are – Locstat has the answer.
Harness the power of graph!