The Smart Data Blog

Enabling Data Scientists by Reducing the Burden of IT

Posted by Kirk Newell on Aug 7, 2016 2:00:00 PM


The cliché is well known by now: data scientists spend the majority of their time simply preparing data for analytics, having inherited work from IT teams that traditionally took months to return even simple query results.


But not if they use semantics. A number of semantic technologies directly reduce the time and effort required for the basic data management staples of data preparation, data discovery, and analytics.

Today, these technologies can substantially accelerate data management from initial ingestion to analytic insight, enabling data scientists to focus on building solutions to business problems while reducing the data backlog of IT departments.

Preparing Data
Data preparation is the data management term for the tedious work of data cleansing, transformation, and integration that consumes the time of IT teams and data scientists. Smart data technologies lighten this burden in several ways. Inclusive ontologies (semantic models) quickly adjust to incorporate new data and requirements so that all data adheres to uniform standards. Data governance and data quality principles can also be modeled and mapped to business glossaries, so that data cleansing follows the same agreed definitions. These models can also generate code for transformation, a vital prerequisite for loading applications. Automating preparation in this way speeds integration efforts, freeing data scientists to explore the implications of application or analytics results.
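
To make the idea of a shared model concrete, here is a minimal sketch in Python. It is not how any particular semantic product works; it simply assumes two hypothetical source systems ("crm" and "billing") whose field names are mapped to invented shared terms such as customerName and birthDate, so that records from both end up in one uniform shape.

```python
# Hypothetical mapping from source-specific field names to shared terms.
# The source names and term names are invented for illustration only.
FIELD_TO_TERM = {
    "crm": {"cust_nm": "customerName", "dob": "birthDate"},
    "billing": {"CustomerFullName": "customerName", "BirthDt": "birthDate"},
}


def normalize(source, record):
    """Rename a record's fields to the shared terms.

    Fields with no mapping are kept under the "unmapped" key so that
    nothing is silently dropped.
    """
    mapping = FIELD_TO_TERM[source]
    out = {}
    unmapped = {}
    for field, value in record.items():
        if field in mapping:
            out[mapping[field]] = value
        else:
            unmapped[field] = value
    if unmapped:
        out["unmapped"] = unmapped
    return out


crm_row = normalize("crm", {"cust_nm": "Ada Lovelace", "dob": "1815-12-10"})
billing_row = normalize(
    "billing",
    {"CustomerFullName": "Ada Lovelace", "BirthDt": "1815-12-10", "Plan": "gold"},
)
```

Both rows now expose customerName and birthDate regardless of where they came from. Supporting a new source means adding one more mapping entry rather than rewriting the transformation logic, which is the kind of adjustment the paragraph above attributes to ontologies.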

Discovering Data
Data discovery is the process of determining which data sets are relevant to a specific business problem. Semantic graphs assist ontologies in this endeavor by connecting all data in a single framework. The underlying RDF system is designed to home in on relationships between data elements, considering various attributes and metadata as they pertain to each node. In this environment, semantic graphs can determine the contextual relevance between data elements, which is crucial for timely, appropriate data discovery. When deployed in departmental or enterprise-wide semantic data lakes, these graphs facilitate the discovery of a host of relationships and context that might otherwise be missed. This framework substantially eases the workloads of data scientists while reducing time to action.
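
As a rough illustration of relationship-based discovery, the sketch below stores a handful of subject-predicate-object triples in plain Python and walks the graph outward from one node. The triples, node names, and the related helper are all hypothetical, and a real RDF store would be queried with SPARQL rather than hand-written traversal code.

```python
from collections import defaultdict, deque

# Hypothetical (subject, predicate, object) triples; all names are invented.
TRIPLES = [
    ("dataset:sales", "describes", "entity:Customer"),
    ("dataset:support_tickets", "describes", "entity:Customer"),
    ("dataset:sales", "ownedBy", "dept:Finance"),
    ("dataset:web_logs", "describes", "entity:Session"),
    ("entity:Session", "belongsTo", "entity:Customer"),
]


def build_index(triples):
    """Map each node to its (predicate, neighbor) pairs, ignoring direction."""
    neighbors = defaultdict(set)
    for subject, predicate, obj in triples:
        neighbors[subject].add((predicate, obj))
        neighbors[obj].add((predicate, subject))
    return neighbors


def related(start, triples, max_hops=2):
    """Return {node: hop_count} for nodes within max_hops of start."""
    neighbors = build_index(triples)
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for _, other in neighbors[node]:
            if other not in seen:
                seen[other] = seen[node] + 1
                queue.append(other)
    del seen[start]
    return seen


hits = related("entity:Customer", TRIPLES)
datasets = sorted(node for node in hits if node.startswith("dataset:"))
```

Starting from the Customer entity, the walk finds the sales and support ticket data sets directly and reaches the web logs through the Session entity. That second, indirect link is the sort of relationship that might otherwise be missed.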

Easier, Faster, Better
Semantic technologies are an enabler for data scientists. They tame and accelerate the necessities of data preparation, and do the same for data discovery. Data-driven action (analytics or application operations) becomes easier, faster, and better with semantics, helping data scientists do their jobs while reducing the wait for IT teams to assist.

To learn more about semantic technology, watch the on-demand webinar "Semantic Graph Databases: The Evolution of Relational Databases".


Topics: Smart Data, Semantics