The Smart Data Blog

Data Integration up to 90% Faster and Cheaper

Posted by Marty Loughlin on Nov 18, 2014 5:51:00 PM

Data integration is one of those necessary evils that is part of almost every project: M&A, system consolidation, customer onboarding and regulatory reporting all require data to be moved, transformed or combined.

The traditional approach to data integration is to have experts in the source and target systems create a map that describes how every field in the source is transformed and moved to the appropriate field in the target. This map, often captured in Excel, is then handed off to an IT team to code the ETL job that does the work. That code is in turn handed off to yet more teams for testing and, ultimately, deployment. The process is time-consuming, error-prone and expensive.

A powerful new approach to addressing this challenge involves using semantic web technology as the "data glue" to guide integration and dramatically simplify the process. There are several key components to this approach:

  • Using semantic models to describe data in standard business terms (e.g., FIBO, CDISC or an existing enterprise model)
  • Mapping source and target data to the semantic model instead of directly from source to target
  • Combining these maps as needed to create end-to-end semantic descriptions of ETL jobs
  • Automatically generating ETL code from the semantic descriptions for leading ETL tools (e.g., Informatica and Pentaho)
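
To make the steps above concrete, here is a minimal Python sketch of the idea, not of any particular product. Every field name, concept name and table name is invented for illustration, and the generated SQL statement is only a stand-in for the job definitions a real ETL tool would need.

```python
# Hypothetical illustration: all field, concept and table names are invented.

# Source map: source field -> concept in the semantic model
source_to_model = {
    "CUST_NM": "customer.name",
    "CUST_DOB": "customer.birthDate",
    "ACCT_BAL": "account.balance",
}

# Target map: concept in the semantic model -> target field
model_to_target = {
    "customer.name": "full_name",
    "customer.birthDate": "date_of_birth",
    "account.balance": "balance_amount",
}


def compose_maps(source_to_model, model_to_target):
    """Join the two maps on the shared concept: source field -> target field."""
    return {
        src: model_to_target[concept]
        for src, concept in source_to_model.items()
        if concept in model_to_target
    }


def generate_sql(field_map, source_table, target_table):
    """Emit a simple INSERT ... SELECT as a stand-in for generated ETL code."""
    target_cols = ", ".join(field_map.values())
    source_cols = ", ".join(field_map.keys())
    return (
        f"INSERT INTO {target_table} ({target_cols}) "
        f"SELECT {source_cols} FROM {source_table};"
    )


end_to_end = compose_maps(source_to_model, model_to_target)
sql = generate_sql(end_to_end, "crm_customers", "warehouse_customers")
print(sql)
```

Neither map mentions the other system. Each one only knows about the shared model, which is what allows the same source map to be paired with a different target map later.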

There are significant benefits to this approach:

  • Data integration can be done by business analysts with minimal IT involvement
  • Adding a new source or target only requires an expert in that system to map it to the common model, as all maps are reusable
  • The time and cost to do an integration project can be reduced by up to 90%
  • Projects can be repurposed to a new ETL tool with the click of a mouse
  • The semantic model that describes the data, sources, maps and transformations is always up-to-date and can be queried for data meaning and lineage
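
The reuse benefit in the list above comes down to simple counting. The sketch below assumes, purely for illustration, that every source feeds every target. It counts maps only and is not meant to reproduce the 90% figure.

```python
def direct_maps(n_sources, n_targets):
    """Point-to-point integration: one map per source/target pair."""
    return n_sources * n_targets


def hub_maps(n_sources, n_targets):
    """Common-model integration: one map per system, to or from the model."""
    return n_sources + n_targets


# Illustrative numbers only: 5 sources and 4 targets.
print(direct_maps(5, 4))  # 20
print(hub_maps(5, 4))     # 9

# Cost of adding a sixth source under each approach.
print(direct_maps(6, 4) - direct_maps(5, 4))  # 4 new maps, one per target
print(hub_maps(6, 4) - hub_maps(5, 4))        # 1 new map, to the common model
```

The gap widens as systems are added, since the direct count grows with the product of sources and targets while the common-model count grows with their sum.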

Check out Anzo Smart Data Integration from Cambridge Semantics to see this approach in action.

Topics: Smart Data, Data Management, Data Integration, Big Data