Why Data Fabrics Need Semantics and Graph Data Models
Even with advances in data management and analytic technologies, companies struggle to balance the robust analytics that business users demand for insight against the challenges IT faces in managing massive quantities of complex, distributed data. The data fabric helps the data-driven business meet both needs.
Forrester’s Noel Yuhanna has published an update to “The Forrester Wave™: Big Data Fabric, Q2 2018”. Noel’s new report, “Big Data Fabric 2.0 Drives Data Democratization”, recommends that data-driven businesses make a big data fabric part of their data strategy to minimize time and effort spent ingesting, integrating, curating and securing data insights.
In our view, the report reinforces the idea that a semantic layer and graph engine, like those powering Anzo®, Cambridge Semantics’ data discovery and integration platform, are key to reducing data management complexity and accelerating data democratization.
In this blog post, we explore how and why semantics and graph data models are necessary when implementing a data fabric architecture.
Semantics and graph data models reduce time-to-answer
The disruptive, digital transformation enabled by the data fabric is predicated on using all of one’s data. The data fabric accelerates time-to-answer by making all of an enterprise’s data connected and reusable. It helps organizations put their data into the hands of business users, on-demand, so that it is available for the complex analytics that drive transformation. A modern discovery and integration layer at the top of the data fabric is essential, and most powerful when based on semantics and graph data models.
More so than other approaches, the graph data model is well-suited to integrating and connecting data. By definition, graphs model the connections between entities and treat those relationships as first-class parts of the data. These connections provide context and meaning to the data. When applied at the discovery and integration layer, more of these connections can be made faster to serve business needs.
Semantics, based on open W3C standards, enable these connections by presenting the data in the business terms or language that users are accustomed to, as opposed to the often cryptic, unintelligible naming of data typically found in the underlying databases. This common business-oriented model enables users to understand and use vast collections of complex, siloed data to build valuable, blended, analytics-ready data products, as needed.
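To make the idea of a business-oriented model concrete, here is a minimal Python sketch. It is only an illustration: the column names (`CUST_NM`, `ACCT_OPN_DT`, `RGN_CD`, `X_FLG`), the business terms, and the `to_triples` helper are all hypothetical, and a real semantic layer would rely on W3C standards such as RDF and OWL rather than a plain dictionary.

```python
# Hypothetical mapping from cryptic physical column names to business terms.
SEMANTIC_MAP = {
    "CUST_NM": "customer name",
    "ACCT_OPN_DT": "account opened on",
    "RGN_CD": "sales region",
}


def to_triples(subject_id, row, semantic_map=SEMANTIC_MAP):
    """Turn one raw record into (subject, business term, value) triples.

    Columns without a business term in the map are skipped.
    """
    return [
        (subject_id, semantic_map[column], value)
        for column, value in row.items()
        if column in semantic_map
    ]


# A made-up row as it might appear in an underlying database table.
raw_row = {
    "CUST_NM": "Acme Corp",
    "ACCT_OPN_DT": "2017-03-01",
    "RGN_CD": "NE",
    "X_FLG": "1",  # no business meaning defined, so it is left out
}

triples = to_triples("customer:42", raw_row)
for triple in triples:
    print(triple)
```

A user reading the resulting triples sees "customer name" and "sales region" rather than `CUST_NM` and `RGN_CD`, which is the kind of translation the semantic layer is meant to provide.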
This type of business-driven, on-demand data discovery and preparation is a game changer for organizations accustomed to relying on (and often, waiting on) IT to provide highly reusable access to enterprise data. If business users have on-demand access to all of an enterprise’s data and can quickly create personalized data products, they will do so more often, making data-driven decision-making an organizational norm. Before long, this culture shift will result in increased operational efficiencies, new revenue streams, an elevated customer experience, and more.
Semantics and graph data models simplify integration of complex, siloed data
Enterprise data is rarely homogeneous, centrally located, or static. Rather, it is complex, unstructured, and siloed. It is also constantly changing: new data sources are added continuously and new use cases are generated. To be reusable, enterprise data must be made easy to find, simple to understand, and meaningful to business users, when, where, and how they want it.
The inherent flexibility of graph data models accommodates the numerous and changing connections between entity types across data sources. Complex sources of data can be added on the fly with minimal effect on performance or operations. Additionally, when overlaid with semantics, the graph data model adopts a business context that makes it easier to blend data around a topic of interest, regardless of its source or structure. Related data sets from siloed sources can be easily linked and combined in blended data products.
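The flexibility described above can be sketched in a few lines of Python. This is a toy in-memory model under stated assumptions, not a description of how any particular product stores data: the graph is a plain set of (subject, predicate, object) triples, and the source names, identifiers, and the `add_source` and `about` helpers are hypothetical.

```python
# The graph is just a set of (subject, predicate, object) triples.
graph = set()


def add_source(graph, triples):
    """Add a new source's facts; no existing structure has to change."""
    graph.update(triples)


def about(graph, entity):
    """Return every fact that mentions the entity as subject or object."""
    return sorted(
        ((s, p, o) for s, p, o in graph if s == entity or o == entity),
        key=str,
    )


# Two hypothetical silos that happen to share the identifier "customer:42".
crm = [
    ("customer:42", "name", "Acme Corp"),
    ("customer:42", "located in", "region:NE"),
]
billing = [
    ("invoice:7", "billed to", "customer:42"),
    ("invoice:7", "amount", 1200),
]

add_source(graph, crm)
add_source(graph, billing)  # added later, with a different shape

blended = about(graph, "customer:42")
for fact in blended:
    print(fact)
```

Because both sources refer to the same entity, the billing facts link to the CRM facts as soon as they are added, without a schema migration. In practice, agreeing on shared identifiers across silos is the hard part, and this sketch simply assumes it has been done.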
Semantics and graph data models enable users to extract value from data
As part of the data fabric, semantics and graph models provide business users with a highly granular map of all enterprise data. Every single datapoint of an organization - down to the most detailed atomic level - can be captured, mapped, and queried through semantic concepts in a graph model. Business users intuitively traverse this map to access previously siloed data.
With such a granular level of access and understanding, business users can quickly find answers to known and unanticipated questions and expose connections between related data. For organizations limited to traditional relational databases, this degree of data discovery is impractical, if not impossible, given the number of joins required and the scope of the underlying SQL queries. However, with automated, massively parallel processing (MPP) in-memory graph queries, it is both possible and practical at enterprise scale.
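As a rough illustration of why connection-finding suits a graph, the Python sketch below follows relationships hop by hop with a breadth-first search. It is a single-threaded toy, not an MPP engine, and the entities, relationships, and the `find_path` helper are hypothetical. In a relational schema, each hop here would typically correspond to another join.

```python
from collections import defaultdict, deque

# Hypothetical directed relationships between entities.
EDGES = [
    ("customer:42", "placed", "order:9"),
    ("order:9", "contains", "product:widget"),
    ("product:widget", "supplied by", "supplier:acme-parts"),
    ("supplier:acme-parts", "located in", "region:NE"),
]


def build_index(edges):
    """Index outgoing edges by subject for fast neighbor lookup."""
    index = defaultdict(list)
    for subject, predicate, obj in edges:
        index[subject].append((predicate, obj))
    return index


def find_path(edges, start, goal):
    """Breadth-first search; returns the list of edges from start to goal."""
    index = build_index(edges)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for predicate, neighbor in index[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [(node, predicate, neighbor)]))
    return None  # no connection found


path = find_path(EDGES, "customer:42", "region:NE")
for hop in path:
    print(hop)
```

The search does not need to know in advance how many hops separate the two entities, which is what makes unanticipated questions awkward to express as a fixed chain of joins.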
Semantics and graph data models make data accessible to more users
Business needs will vary across levels of access, data maturity, cleanliness, and standardization. Users need to spend more time using data in analytics, and less on preparation. They will want a common data model to enable greater transparency and meaning, clearer communication, and increased reuse and collaboration around data assets across projects and teams. Finally, this expanded group of data consumers wants to trust its data more through better lineage, metadata, and business context.
Unlike relational models, graph data models overlaid with semantics have “built-in” meaning. They use easy-to-understand business concepts and context to present data to users and enable them to organize and group the data in meaningful ways. They also shelter users from the complexity and obscurity of how particular data sets are actually stored and formatted at the physical level, which accelerates their use of the data. Data at any point in the raw-to-ready continuum can be made available to users, with user access controls defined and enforced so that the right data is shown to the right users. The business context provided by the combination of graph data models and semantics reduces organizational dependency on IT and makes the data accessible to more users, including those of limited technical ability or analytic expertise, further reinforcing a data-driven culture.
The explosion in enterprise data sources, silos, systems, and workflows has created too much chaos for IT and business users. As the data supply continues to expand unabated and organizations become more data-driven, this challenge will only become more complex. A data fabric with discovery and integration enabled by graph data models and semantics addresses both business users’ demand for data access and robust analytics and IT’s need to simplify data integration and data management.