The Smart Data Blog

The Triple Crown of Graph Databases: Anzo Graph Query Engine

Posted by Patrick Wall on May 4, 2016 2:00:00 PM

All of the possibilities of big data analytics, semantic graph databases, and Smart Data Lakes™ have been realized with the emergence of Anzo Graph Query Engine (AGQE).

This massively parallel, distributed query engine uses in-memory processing of semantically tagged data in graph databases to revolutionize the power of analytics. It provides peerless analytic querying capabilities in terms of scope, scale, and speed, especially when compared to conventional relational database methods.

Scope

Relational databases are hampered by rigid data modeling and schema constraints, which constrict the types of data that can be loaded into them and require the schema to exist before data can be written. Semantic graph databases, the underlying repositories of Smart Data Lakes, shatter that paradigm by storing data as billions of interrelated facts in a data network. These facts are linked by an evolving semantic model that readily accommodates new data sources (including facts extracted from text) and new entity types, producing a single graph of all enterprise data.

AGQE capitalizes on the breadth of data in these databases in two ways. First, it enables users to issue much broader ad hoc join queries than they easily could with relational technologies, since there are more data types and sources to access simultaneously. Second, it captures parallels and relationships between disparate data that users might not be aware of, greatly enriching the data's analytic value and making it easier to represent far richer models of reality.
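To make the idea of facts and ad hoc joins concrete, here is a minimal sketch in Python. It is not AGQE's API or data model: the facts, the identifiers such as `customer:42`, and the `match` and `query` helpers are all hypothetical, and the join is a naive in-memory loop chosen for readability. It only illustrates how facts from two sources can sit in one graph without a predeclared schema, and how a query joins them on shared variables.

```python
# Toy illustration only: facts as (subject, predicate, object) triples.
# All names and data are hypothetical; this is not AGQE's API.

crm_facts = [
    ("customer:42", "name", "Acme Corp"),
    ("customer:42", "locatedIn", "city:Boston"),
]
support_facts = [
    ("ticket:7", "raisedBy", "customer:42"),
    ("ticket:7", "severity", "high"),
]

# A new source is added by adding its facts; no table schema is declared first.
graph = set(crm_facts) | set(support_facts)


def match(graph, pattern, binding):
    """Yield extensions of `binding` for every triple that matches `pattern`.

    Pattern terms that start with '?' are variables; all others are constants.
    """
    for triple in graph:
        new = dict(binding)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield new


def query(graph, patterns):
    """Join a list of triple patterns on their shared variables."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b for old in bindings for b in match(graph, pattern, old)]
    return bindings


# Ad hoc join across both sources: where are the customers with severe tickets?
results = query(graph, [
    ("?ticket", "severity", "high"),
    ("?ticket", "raisedBy", "?customer"),
    ("?customer", "locatedIn", "?city"),
])
```

The query spans the CRM and support facts without either source having been modeled with the other in mind, which is the property the paragraph above describes.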

Scale

AGQE processes enormous quantities of data, amounting to billions of semantic facts (triples) on the largest big data sets. It scales by querying these data simultaneously across many compute nodes. AGQE’s in-memory capabilities enable a self-service environment even for lay end users, who can browse huge portions of their enterprise data interactively. Such scale, combined with the detailed results produced by the contextualized linking of the overarching semantic model, is difficult to match with other technologies.
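The general pattern of querying partitioned data simultaneously can be sketched as follows. This is an assumption-laden illustration, not a description of how AGQE is implemented: threads stand in for compute nodes, the round-robin `partition` function and the `scan` and `parallel_scan` helpers are hypothetical, and a real engine would distribute far more than a simple predicate filter.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(triples, n_nodes):
    """Spread triples round-robin over n_nodes in-memory partitions."""
    parts = [[] for _ in range(n_nodes)]
    for i, triple in enumerate(triples):
        parts[i % n_nodes].append(triple)
    return parts


def scan(part, predicate):
    """Work done by one (simulated) node: filter its own partition."""
    return [t for t in part if t[1] == predicate]


def parallel_scan(triples, predicate, n_nodes=4):
    """Run the same scan on every partition at once and merge the results."""
    parts = partition(triples, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        # Threads are a stand-in for separate compute nodes (an assumption).
        chunks = list(pool.map(scan, parts, [predicate] * n_nodes))
    return [t for chunk in chunks for t in chunk]
```

Each simulated node touches only its own share of the facts, so adding nodes shrinks the work per node; the merge step at the end is what a coordinator would do with the partial results.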

Speed

AGQE’s speed advantages are realized in two ways. The first is in preparing data, adding to it over time, and adapting to change. The modeling and schema concerns of relational methods require long periods of data preparation, which are exacerbated when incorporating new sources. Graph databases expedite this process because the modeling work is done once, upfront, and the evolving semantic model readily incorporates new sources, providing a huge degree of flexibility. The second is in query processing itself: the engine can process millions of facts per second to deliver results nearly instantaneously. Flexibility is paramount here too, as no special indexes, reorganization, or optimization of the data is necessary to achieve extraordinary interactive performance for the complex ad hoc join queries issued by users discovering, exploring, and finally analyzing their data.

The Winner

The scope, scale, and speed of AGQE’s analytics exceed those of other technologies, including relational ones. Its ability to traverse all enterprise data for insight at the pace of modern business represents the triumph of technology over the constraints of the typical traditional stack.

To learn more, watch our on-demand webinar recording "Enterprise Analytics at Scale Using Graph Database".

Topics: Graph, Analytics, Smart Data Lake