Join Our Webcast on June 13: Accelerating Value From Your Azure Data Lake

June 5, 2018

Through 2018, 90% of deployed data lakes will be rendered useless as they’re overwhelmed with information assets captured for uncertain use cases, according to Gartner.1 This is despite growth from pure-play Hadoop vendors like Hortonworks and Cloudera. Join our webcast to learn key steps for accelerating value, based on what we have learned from numerous customers that use Self-Service Data Prep solutions in Azure and other cloud environments.

Why is it so Difficult to Get Value Out of the Data Lake?

Without giving away the entire webcast, a number of issues contribute to these challenges. I will highlight a few of them here; join our webcast to learn how to overcome them:

  • The premise of the data lake requires a new data “lifestyle.” Our traditional Enterprise Data Warehouse (EDW) was modeled, designed, and built with specific predefined questions in mind: we knew the question and then built a dataset to answer it. Easy enough. The data lake, on the other hand, is more about collecting data – any kind of data – and then seeing what you can answer with it.
  • Traditional thinking and technologies will not help you much. I refer to it as a “lifestyle” because it requires different thinking, different tools, and in some cases, different people. In this new lifestyle, self-service and empowerment of data analysts, data scientists, and power users is a must. We cannot gate exploratory analytics behind costly, scarce IT resources.
  • Designing for the pilot vs. designing for success. The open source world of Hadoop is a wonderful playground with all kinds of tools for different things. While it is a great place to start, these point tools are often extremely complex and require technical skills you do not have, or do not have enough of. They also often lack the enterprise characteristics you need for success in production: governance, security, and data lineage. Success is not generating the insight; success is when your insight informs every person’s behavior or improves every application.

Still Confused About the Business Value of the Data Lake?

Data is a strategic asset for business, but most of us treat it as a siloed insight that informs one person or one team. It has to touch everyone! The challenge with data, as outlined above, is that 80% of the effort can be spent finding, shaping, and cleaning the data for analytics, data science, or maybe a new app.

Self-Service Data Prep for Microsoft Azure

Moving to the cloud is a given for most companies today, and Microsoft Azure provides a robust, enterprise-ready environment to run your apps and analytical workloads. Through our partnership with Microsoft, Data Prep has brought our industry-leading Self-Service Data Preparation Solution to the Azure Marketplace to help you get up and running quickly and scale elastically on demand when you need it.

1 Gartner, “Derive Value from Data Lakes Using Analytics Design Patterns,” Svetlana Sicular, Joao Tapadinhas, Cindi Howson, 26 September 2017.

About the author
DataRobot