The Smart Data Blog

Interested in Apache Spark without the Coding?

Posted by Marty Loughlin on Jun 22, 2015 11:08:20 AM

Many Hadoop users, seeking higher performance and a better analytics engine, are turning to Apache Spark for data transformation (ELT) on HDFS. While Spark offers many advantages, you still need Scala or Java programmers to create your jobs.

Cambridge Semantics is adding Spark support to our Anzo Smart Data Integration (ASDI) platform. This will allow business analysts to automatically generate Spark jobs from business-friendly source/target mappings, without any coding. We are interested in your feedback on this approach as well as input on any “must-have” features.

Today, Anzo Smart Data Integration (ASDI) makes the data integration process faster and cheaper by eliminating the need for coding and leveraging reusable business assets such as common data models and source/target maps.

Key capabilities include:

  • Enabling business analysts to capture reusable data maps (including complex transformations) from common models to a wide variety of source and target formats
  • Combining the maps to create end-to-end data mapping and transformation jobs
  • Automatically generating ETL for industry-standard platforms (Informatica, Pentaho, ...)
  • Providing search tools for data lineage 
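
To make the idea of combining reusable maps more concrete, here is a minimal sketch in plain Python. It is not how ASDI works internally, and it does not use Spark. The map structure, the field names (CUST_NM, BAL, customer_name and so on) and the helper functions apply_map and run_job are all hypothetical, chosen only to illustrate a source-to-common-model map chained with a common-model-to-target map.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical map structure: target field -> (source field, transform function).
FieldMap = Dict[str, Tuple[str, Callable]]


def apply_map(record: dict, field_map: FieldMap) -> dict:
    """Build a target record by applying each field's transform to its source value."""
    return {
        target: transform(record[source])
        for target, (source, transform) in field_map.items()
    }


def run_job(records: List[dict], maps: List[FieldMap]) -> List[dict]:
    """Chain several maps in order to form one end-to-end transformation."""
    output = []
    for record in records:
        for field_map in maps:
            record = apply_map(record, field_map)
        output.append(record)
    return output


# Hypothetical map from a raw source layout to a common model.
source_to_common: FieldMap = {
    "customer_name": ("CUST_NM", str.strip),
    "balance_usd": ("BAL", float),
}

# Hypothetical map from the common model to a target layout.
common_to_target: FieldMap = {
    "name": ("customer_name", str.title),
    "balance_cents": ("balance_usd", lambda usd: round(usd * 100)),
}

rows = [{"CUST_NM": "  ada lovelace ", "BAL": "12.50"}]
result = run_job(rows, [source_to_common, common_to_target])
print(result)
```

Because each map is plain data, the same source_to_common map could be reused with a different target map, which is the kind of reuse the capabilities above describe.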

You can see a short intro video to ASDI here: http://www.cambridgesemantics.com/products/anzo-smart-data-integration

The proposed enhancements to ASDI will enable Apache Spark jobs for ELT to be automatically generated from the same high-level mappings and common models currently used for ETL. This will allow business analysts to rapidly create and execute big data analytics and transformation jobs directly from business-friendly domain models without any coding.

We are very interested in hearing your feedback on the value of this approach to your organization. Please send comments to marty@cambridgesemantics.com

Topics: Smart Data, Data Integration, Big Data, Spark, Hadoop