The Smart Data Blog

Cambridge Semantics Shatters Previous Record of Loading and Querying ‘Trillion Triples’ by 100X - So What?

Posted by Marty Loughlin on Dec 20, 2016 4:37:00 PM

"So what?" you might say. Another hyperbole-fueled headline in tech is hardly a notable event. To answer, let's start with what we did.

Cambridge Semantics Inc. (CSI) recently completed the Lehigh University Benchmark (LUBM) at one-trillion-triple scale with our massively parallel, in-memory graph database, the Anzo Graph Query Engine (AGQE).

For reference, one trillion triples is equivalent to roughly 143 facts for each of the 7 billion people on earth.

We executed the benchmark, which involved loading and querying the data, on the Google Cloud Platform (GCP), and we completed it more than 100 times faster than the previous record set by Oracle in September 2014.

Why is this important?

  1. Open semantic web technologies have offered powerful solutions for data integration, management and discovery analytics for over a decade. The challenge with these solutions has been that they didn't scale to enterprise data volumes. Today, with AGQE running on big data platforms like GCP, this challenge is solved. It is now possible to interactively query very large and diverse data sets, well beyond what most enterprises need today. We can also benefit from the many years of development in tooling like the Anzo Smart Data Lake that natively supports AGQE.

  2. The previous benchmark took over 200 hours to run. CSI completed it in under two hours with AGQE. This ability to rapidly load and query very large data sets allows us to take advantage of elastic cloud capacity. We can spin up instances on-demand and only pay for what we use. For example, this benchmark cost less than $1,000 to run. The potential cost savings are enormous.
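To put the figures in point 2 side by side, here is a minimal back-of-envelope sketch in Python. It uses only the round numbers quoted in this post (over 200 hours, under two hours, less than $1,000, one trillion triples) and treats them as bounds rather than exact measurements. The function names are made up for illustration and are not part of AGQE or the benchmark.

```python
# Back-of-envelope check of the figures quoted in this post.
# The inputs are the post's round bounds, not exact measurements.

TRIPLES_LOADED = 10**12        # "one trillion triples"
PREVIOUS_RUN_HOURS = 200       # previous record: "over 200 hours"
AGQE_RUN_HOURS = 2             # this run: "under two hours"
RUN_COST_USD = 1_000           # this run: "less than $1,000"


def speedup(previous_hours: float, new_hours: float) -> float:
    """Return how many times faster the new run is than the previous one."""
    return previous_hours / new_hours


def cost_per_billion_triples(cost_usd: float, triples: int) -> float:
    """Return the cost in USD per one billion triples."""
    return cost_usd / (triples / 10**9)


if __name__ == "__main__":
    ratio = speedup(PREVIOUS_RUN_HOURS, AGQE_RUN_HOURS)
    unit_cost = cost_per_billion_triples(RUN_COST_USD, TRIPLES_LOADED)
    print(f"Speedup: at least {ratio:.0f}x")
    print(f"Cost: at most ${unit_cost:.2f} per billion triples")
```

Because the previous run took more than 200 hours and ours took less than two, the actual speedup is above the 100x this computes. Likewise, a total cost under $1,000 works out to under one dollar per billion triples loaded and queried.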

For a copy of the benchmark report and to learn more about how you can harness the power of AGQE in your enterprise, download the whitepaper "TRILLION-TRIPLES BENCHMARKING" here.


This post was originally published on LinkedIn.

Topics: Big Data, Anzo