A typical German car manufacturer assembles roughly 10 cars per minute on average. Each of these cars consists of tens of thousands of mechanical, electronic and software components produced and supplied by a few thousand companies. Each individual component has its own lengthy life cycle, starting with the definition of requirements and functions, through development, testing and production, to field usage and service. In addition, each component has various manifestations through, e.g., versions, release states, region-specific and time-dependent certifications, changes in production processes, prices and my favourite: compatibility rules.
In fact, the chances that you have ever seen two exactly identical cars in your life are fairly low, independent of your age. The Audi A6 (C8 series) alone has 10³³ possible configurations (a 1 followed by 33 zeros). And it’s not just the automotive industry that is facing this level of immense product complexity. Aerospace, mechanical engineering and rail transport industries have very similar challenges:
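To get a feel for where a number like 10³³ comes from: the configuration space grows multiplicatively with each independent option group. The figures below are invented for illustration, not Audi's real option catalogue, but they show how quickly the space explodes.

```python
import math

# Illustrative only: if a car line offered 110 independent option groups
# with just 2 choices each, the configuration space would already
# exceed 10^33. (Real catalogues mix groups of very different sizes.)
option_groups = 110
choices_per_group = 2

configurations = choices_per_group ** option_groups

print(f"{configurations:.2e}")      # order of magnitude of the space
print(math.log10(configurations))   # ~33 decimal digits of choice
```

The point is not the exact numbers but the multiplication: every additional independent option group multiplies, not adds to, the count of buildable variants.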
There are 10⁹⁰ possible configurations of a Siemens Railway System.
Due to mass production and strong dynamics, the automotive industry is one of the most complex manufacturing-related industries in terms of product data management. Compound this with the fact that most car manufacturers offer their clients more than one model. All models are strongly intertwined, as they share parts, delivery trucks and manufacturing lines.
So much data. So many systems. So many users. So many questions. Unfortunately, so many different and incomplete answers to the same questions. You can’t blame OEMs for drowning in an ocean of uninterpretable data.
But the data is there. Somewhere. In some format. And there is hope that one day we will have full transparency over everything.
But is it even realistic? Will there be a day when a homologation manager will be able to open a browser and ask:
“Show me all parts used in Model A and Model D, used for production in one of the plants in Poland,
that have a chrome coating and are not certified for Europe starting from 2023.” ?
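Once the relevant data is actually integrated, this kind of multi-attribute question is just a filter over it. Here is a minimal sketch in plain Python with an invented toy dataset; the field names (`models`, `plant_country`, `coating`, `certified_regions`, `cert_valid_from`) are assumptions for illustration, not any OEM's real schema.

```python
from dataclasses import dataclass

# Toy stand-in for integrated product data; all values are invented.
@dataclass(frozen=True)
class Part:
    part_id: str
    models: frozenset
    plant_country: str
    coating: str
    certified_regions: frozenset
    cert_valid_from: int

parts = [
    Part("P-100", frozenset({"Model A", "Model D"}), "Poland", "chrome",
         frozenset({"Asia"}), 2023),
    Part("P-200", frozenset({"Model A"}), "Poland", "chrome",
         frozenset({"Europe"}), 2023),
    Part("P-300", frozenset({"Model A", "Model D"}), "Germany", "chrome",
         frozenset({"Asia"}), 2023),
]

def homologation_query(parts):
    # Parts used in both Model A and Model D, produced in Poland,
    # chrome-coated, and not certified for Europe from 2023 onwards.
    return [
        p.part_id for p in parts
        if {"Model A", "Model D"} <= p.models
        and p.plant_country == "Poland"
        and p.coating == "chrome"
        and not ("Europe" in p.certified_regions
                 and p.cert_valid_from <= 2023)
    ]

print(homologation_query(parts))  # ['P-100']
```

The hard part in practice is not the filter itself but getting all of these attributes, which live in different systems, into one interpretable model in the first place.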
One might argue we could build a system with traditional technology that would answer similar questions. Well, we could, but what if this time a product developer wants to know from the same system:
“What happens to downstream processes in production, supply chain and
services if I change the coating of part X from chrome to nickel?”.
First, this question is much harder to answer with any traditional technology. You would require full integration of the downstream processes making up the product life cycle of the respective part. Second, this question is fundamentally different from the first one. It requires a different schema, a different set of attributes, and information from many different systems, without even elaborating on the problems of data quality within and integration between systems.
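Structurally, an impact question like this is a graph traversal: follow dependency edges from the changed part to everything downstream of it. A minimal sketch with invented edges and node names (no real process model is implied):

```python
from collections import deque

# Hypothetical dependency edges: an entry "X: [Y, ...]" means Y sits
# directly downstream of X. All identifiers are invented.
downstream = {
    "part:X":                 ["process:coating-line-3", "bom:assembly-7"],
    "process:coating-line-3": ["plant:poznan"],
    "bom:assembly-7":         ["supplier:contract-42", "service:repair-kit-9"],
    "plant:poznan":           [],
    "supplier:contract-42":   [],
    "service:repair-kit-9":   [],
}

def impacted(start, edges):
    """Breadth-first traversal collecting everything downstream of start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(sorted(impacted("part:X", downstream)))
```

In a relational setup each hop of this traversal is typically a join across a different table (or a different system); in a graph, following edges of arbitrary type and depth is the native operation.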
Relational technologies struggle to resolve any of these problems.
Automotive OEMs and their suppliers run into a considerable number of problems, especially in terms of data management and governance. Here are a few examples:
At the same time, insufficient integration and interpretability of data, as well as slow analysis of complex data structures, make data analytics difficult, especially if the requirements change or the search space is unknown. Here are a few examples of calculations that are complicated to execute with traditional systems:
These are just a few examples, but you can basically add anything that requires accurate, interpretable and integrated data in dynamic environments.
A “graphy” nature. Tons of data. Many systems. Many data formats. Dynamic environment. Lots of unstructured and raw data. Despite all the complexities and problems we have just discussed, the good news is there is a solution — Knowledge Graphs.
The concept is simple: A semantic layer representing the business logic of the company integrates all source data models and acts as a translation layer between users and the ocean of data: a Knowledge Graph.
Semantic knowledge models put every bit of data into context, align the meta models of the different source systems, and act as a basis for extracting information from unstructured data such as text in order to make the data interpretable.
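The core mechanic of that translation layer can be sketched in a few lines: data becomes subject-predicate-object triples, and the semantic layer maps each source system's local field names onto shared ontology terms. The record fields, mapping and `ex:` terms below are all invented for illustration.

```python
# Two source systems describing the same part with different vocabulary.
# All field names and URIs are hypothetical.
plm_record = {"teilenummer": "P-100", "beschichtung": "chrome"}
erp_record = {"material_no": "P-100", "surface": "chrome"}

# The semantic layer: local field names aligned to one shared term.
field_to_term = {
    "teilenummer": "ex:partNumber", "material_no": "ex:partNumber",
    "beschichtung": "ex:coating",   "surface":     "ex:coating",
}

def to_triples(record):
    """Lift a source record into subject-predicate-object triples."""
    part_no = record.get("teilenummer") or record.get("material_no")
    subject = f"ex:part/{part_no}"
    return {(subject, field_to_term[k], v) for k, v in record.items()}

# Both records collapse into the same two triples - one integrated view.
graph = to_triples(plm_record) | to_triples(erp_record)
print(graph)
```

Because both systems map to the same terms, a question phrased against the ontology (`ex:coating`) works regardless of which silo the answer originally lived in.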
Let’s decompose the term “Semantic Knowledge Model”:
There is a fundamental difference between semantic RDF knowledge graphs and labelled property graphs (LPGs) with respect to the problem described above. While LPGs have the advantage of being very easy to model due to a simplified graph schema, they lack features that are crucial to data management, such as separation of data model and meta model, standardisation and more. This perpetuates the culture of building a new silo for every new problem. A few features that differentiate the two types of graphs:
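One of those differences, the separation (yet co-existence) of data model and meta model, can be sketched in a few lines. In RDF, the schema itself is made of triples in the same graph as the instance data, so class hierarchies can be queried and evolved at runtime. The terms below loosely follow RDFS idioms but are simplified and illustrative.

```python
# In an RDF-style graph the meta model lives alongside the data:
# class hierarchy statements are triples too. Terms are illustrative.
triples = {
    # meta model (schema expressed as data)
    ("ex:ChromePart", "rdfs:subClassOf", "ex:CoatedPart"),
    ("ex:CoatedPart", "rdfs:subClassOf", "ex:Part"),
    # instance data
    ("ex:P-100", "rdf:type", "ex:ChromePart"),
}

def types_of(entity, triples):
    """Direct type plus everything reachable via rdfs:subClassOf."""
    found = {o for s, p, o in triples if s == entity and p == "rdf:type"}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p == "rdfs:subClassOf" and s in found and o not in found:
                found.add(o)
                changed = True
    return found

print(types_of("ex:P-100", triples))
# {'ex:ChromePart', 'ex:CoatedPart', 'ex:Part'}
```

A query for "all coated parts" finds `ex:P-100` without that fact ever being stated explicitly; in an LPG, that kind of inference over the schema is not a built-in notion.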
A dashboard built from data integrated from three sources:
… to answer a question like:
“Which suppliers/contracts are most critical to take care of based on the number of different parts supplied,
number of configurations using these parts and production numbers of these configurations?”.
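Once supplier, part and configuration data sit in one graph, a criticality question like this reduces to linking and aggregating across those relationships. A minimal sketch with invented data, using total production volume of affected configurations as a crude criticality proxy:

```python
from collections import Counter

# Invented supplier->part and part->configuration links for illustration.
supplies = [("S1", "P-100"), ("S1", "P-200"), ("S2", "P-300")]
used_in = [("P-100", "C-1"), ("P-100", "C-2"), ("P-200", "C-2"),
           ("P-300", "C-3")]
production = {"C-1": 50_000, "C-2": 120_000, "C-3": 8_000}

def criticality(supplies, used_in, production):
    """Rank suppliers by the production volume of configurations that
    use their parts (a deliberately crude criticality proxy)."""
    part_to_configs = {}
    for part, config in used_in:
        part_to_configs.setdefault(part, set()).add(config)
    score = Counter()
    for supplier, part in supplies:
        for config in part_to_configs.get(part, ()):
            score[supplier] += production[config]
    return score.most_common()

print(criticality(supplies, used_in, production))
# [('S1', 290000), ('S2', 8000)]
```

A real scoring model would weight in more signals (single-sourcing, lead times, contract terms), but the shape of the computation, joins across three relationship types followed by an aggregation, stays the same.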
Some other out-of-the-box functions you can use in Anzo (see below) once you have turned your data into a graph:
Anzo is a complete knowledge graph platform built on a high-performance graph database engine, called AnzoGraph, that uses an in-memory MPP processing paradigm to execute queries against datasets extremely quickly, enabling agile data integration, transformation, and analytics at enterprise scale. Anzo leverages standards including W3C’s RDF, OWL, SKOS, and SPARQL to combine knowledge graphs of metadata and data which can be powerfully explored, transformed and analyzed, while also ensuring open data interoperability and easy integration with other systems. Anzo is an open overlay platform that allows users to assemble knowledge graphs against the underlying data resources without displacing or disrupting existing processes or platforms. Anzo integrates with enterprise metadata, governance, security controls and policies, and includes APIs for lights-out integration into other processes.