The Smart Data Blog

The Smart Data Lake

Posted by Marty Loughlin on Jun 9, 2015 10:31:53 AM

The Data Lake promises to transform enterprise data management and analytics by providing ubiquitous access to all enterprise data. Unlike traditional data warehouses that are often inflexible, brittle and expensive, the data lake accommodates any type of data and stores it cheaply, in very large volumes, on commodity hardware.

The point is to increase the business value of enterprise data by making it more accessible to business users. While the data lake offers some compelling benefits, it also introduces a new set of challenges:

  • Data is frequently stored in the lake in its untransformed source format. While this preserves the data for maximum reuse, it makes it difficult to give business users a good understanding of what data is available and what it means.
  • Most data lake solutions lack maturity, so you need to piece together tools from many vendors to create business solutions. You also need very specialized, expensive resources to get value from your data.
  • Important enterprise data management practices (security, quality, provenance) are difficult to implement and enforce.



The Anzo Smart Data Lake® (ASDL) architecture from Cambridge Semantics offers a revolutionary new way to address these challenges in a single, horizontally-scalable, high-performance platform. It delivers all the powerful Anzo capabilities: flexible dashboards, pragmatic modeling and semantic data integration, at big data scale.

Key features of the Smart Data Lake include:

  • Business-friendly semantic models (enterprise knowledge graphs) that describe all of your data in common business terms
  • Consistent, analyst-ready tool sets for data ingestion, transformation and consumption
  • Rapid data ingestion into Hadoop HDFS from any source, structured or unstructured
  • Linking and contextualization of data across very large and diverse data sets
  • Business user self-service data catalog
  • In-memory analytics accelerator supporting interactive queries on big data sets
  • Model-driven data transformation using Apache Spark, with no coding required
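
To make the idea of a business-friendly semantic model more concrete, here is a minimal sketch in plain Python. It is not Anzo's API or data format. The model contents, the `describe` function and the field names are all hypothetical, chosen only to show how a mapping from raw source fields to business terms can describe data without altering the stored record.

```python
from typing import Dict, List, Tuple

# Hypothetical model: maps raw source column names to business terms.
# These names are illustrative assumptions, not taken from any product.
CUSTOMER_MODEL: Dict[str, str] = {
    "cust_id": "Customer ID",
    "fnm": "First Name",
    "acct_bal_usd": "Account Balance (USD)",
}

Triple = Tuple[str, str, str]


def describe(record: Dict[str, str], model: Dict[str, str], key: str) -> List[Triple]:
    """Turn one raw record into (subject, business term, value) triples.

    Fields the model does not cover are skipped, so the raw record can
    stay in the lake untouched while users see only the named terms.
    """
    subject = f"customer/{record[key]}"
    return [
        (subject, model[field], value)
        for field, value in record.items()
        if field in model
    ]


# A raw record as it might arrive from a source system, including a
# field ("x_tmp") that the model deliberately does not describe.
raw = {"cust_id": "42", "fnm": "Ada", "acct_bal_usd": "1050.00", "x_tmp": "n/a"}
triples = describe(raw, CUSTOMER_MODEL, key="cust_id")
for triple in triples:
    print(triple)
```

Because the mapping is data rather than code, changing how a field is presented means editing the model, not the transformation logic, which is the general appeal of a model-driven approach.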

The Anzo Smart Data Lake combines a single, consistent tool set with business-friendly semantic models so that flexible, sophisticated big data solutions can be developed in record time. Business use cases include clinical trial data management in pharma and data governance in financial services.

To learn more about Smart Data Lakes, download our whitepaper on the Anzo Smart Data Lake.


Topics: Data Management, Data Lake, Smart Data Lake