At Cambridge Semantics we use the W3C semantic web standards to create conceptual canonical data models, in particular using the Web Ontology Language, OWL. The conceptual models are declarative and express information in the way that the domain expert or business user understands it – usually as a series of interlinked concepts and properties. Unlike most traditional technologies, these conceptual models are independent of how data is stored and provide an abstraction, sometimes called “a semantic layer”, for our Anzo software.
The result is unprecedented flexibility.
Because they are independent of storage system constraints, the conceptual models can reflect different versions of the truth if necessary – whether evolved over time, or in how different groups of users understand different concepts while sharing only what is common between them. OWL names each concept uniquely while allowing multiple human-readable names, or labels, for the same concept – including labels in different languages. Using W3C open data standards ensures that the conceptual models can encode a common vocabulary, allowing different parts of an enterprise, or members of an information supply chain, to talk jointly about and share their data far more easily.
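The separation between a concept's unique name and its many human-readable labels can be sketched in a few lines of plain Python. The IRI and the labels below are illustrative, not from a real ontology; in OWL the labels would be `rdfs:label` annotations with language tags.

```python
# Hypothetical IRI uniquely naming one concept, regardless of what people call it.
CUSTOMER = "http://example.com/ontology#Customer"

# rdfs:label-style annotations: (concept IRI, label text, language tag).
# Two English labels show different user groups naming the same concept.
labels = [
    (CUSTOMER, "Customer", "en"),
    (CUSTOMER, "Kunde", "de"),
    (CUSTOMER, "Client", "fr"),
    (CUSTOMER, "Account Holder", "en"),
]

def labels_for(iri, lang):
    """Return every label for a concept in the requested language."""
    return [text for concept, text, tag in labels if concept == iri and tag == lang]

print(labels_for(CUSTOMER, "en"))  # ['Customer', 'Account Holder']
```

Because the concept is identified by its IRI rather than by any one label, tools and users can each see the name they prefer while agreeing on exactly which concept is meant.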
The conceptual models can encode property restrictions and controlled vocabularies, and can be annotated with information useful to users, ETL processes, query and form building tools, downstream data consumer applications, validation logic and so on. Again, all of this is expressed declaratively, independent of the storage systems and information consuming/producing applications, yet reusable by any of them.
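As a rough illustration of that idea – and emphatically not Anzo's actual API – the sketch below keeps property restrictions and a controlled vocabulary in one declarative model that any component (a form builder, an ETL job, a validator) could consult. All names and rules are invented for the example.

```python
# Declarative model fragment: restrictions live in data, not in application code.
model = {
    "Customer": {
        "properties": {
            "email":  {"datatype": "string", "required": True},
            "status": {"datatype": "string",
                       "controlled_vocabulary": ["active", "inactive", "pending"]},
        }
    }
}

def validate(instance, concept):
    """Check an instance dict against the concept's declarative restrictions."""
    errors = []
    for name, spec in model[concept]["properties"].items():
        value = instance.get(name)
        if spec.get("required") and value is None:
            errors.append(f"missing required property: {name}")
        vocab = spec.get("controlled_vocabulary")
        if value is not None and vocab and value not in vocab:
            errors.append(f"{name}: {value!r} not in controlled vocabulary")
    return errors

print(validate({"email": "a@example.com", "status": "archived"}, "Customer"))
```

Because the rules are data rather than code, changing the controlled vocabulary in the model immediately changes the behaviour of every component that reads it.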
The conceptual models themselves are often based on, or can import elements of, a growing set of existing industry or domain models expressed in OWL or other representations. Because they are standards based, the models can easily be shared for reuse by partners, customers, vendors or anyone else who would like to align their information conceptually.
The OWL ontology language facilitates the tagging of data instances from multiple data sources with their meanings, forming a single integrated view of that data built on a multitude of simple factual statements called RDF triples. RDF is an open, standards-based, graph-oriented data representation in which objects, or graph nodes, have properties – some holding data values and others pointing to further nodes in the graph. The graph model is an intuitive one for humans, who tend to think by association between objects and their properties. For most of us it is far easier to traverse a series of interlinked concepts to figure out what data we have or need and how it is related than to reason about, say, the interlinked table structures of a relational database schema.
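The triple-and-traversal idea can be shown in plain Python without any RDF library. Each statement is a (subject, predicate, object) tuple; following a property is just stepping along a graph edge. The IRIs and data are illustrative only.

```python
# A tiny RDF-style graph: each tuple is one simple factual statement.
triples = [
    ("ex:alice",   "ex:placedOrder", "ex:order42"),
    ("ex:order42", "ex:contains",    "ex:widget"),
    ("ex:widget",  "ex:name",        "Large Widget"),  # literal value
    ("ex:alice",   "ex:name",        "Alice"),         # literal value
]

def objects(subject, predicate):
    """Follow one property from a node to its values or neighbouring nodes."""
    return [o for s, p, o in triples if s == subject and p == predicate]

# Traverse interlinked concepts: which product names appear in Alice's orders?
names = [name
         for order in objects("ex:alice", "ex:placedOrder")
         for product in objects(order, "ex:contains")
         for name in objects(product, "ex:name")]
print(names)  # ['Large Widget']
```

Each hop in the comprehension mirrors a question a user would ask in words ("her orders, their contents, those products' names"), which is what makes the graph model feel natural compared with joining tables.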
Once ontologies are established, these conceptual models can easily be operationalized. Together with middleware and tooling software that supports the family of W3C standards designed to “play nice together”, the models can drive and underpin nearly every aspect of a system. Here are some examples supported in our own Anzo software:
The Anzo software is entirely driven by standards-based conceptual models. It includes many software components, all designed to work together and driven by the same conceptual models:
The reason the Anzo approach is so flexible and dynamic is that it takes a holistic view of these software components: although each provides a different function, all of them coordinate as a cohesive system using the common understanding provided by the shared conceptual model. In the traditional world, each of these components would be a separate piece part, often provided by a different vendor, requiring a system integrator to configure or program whatever is necessary to tie them into a single system.
In Anzo, a change to the conceptual model, or the creation of a new model – as the business changes, is better understood, or a new need develops – is reflected everywhere immediately, and repairs to dashboards and ETL maps can quickly be effected to reflect the new reality. Indeed, the old reality can often be left to co-exist in the same system if there are downstream applications that still rely on it. Often it will be the end users themselves who make these changes, since access to data has been so simplified through the use of the conceptual models.
Contrast this conceptual semantic layer approach with traditionally built systems, where every alteration requires people skilled in understanding the interactions of all the different piece-part components of a solution – components that share no abstracted model – to modify the logic used to glue those parts together: generally a long and costly business that soaks up the greatest proportion of the overall IT spend.