Data Reliability Engineering: Tackling the Data Quality Problem

By Torq Pagdin, Director, Technology (Data Engineering), Hotels.Com

As the business world relies more and more on machine learning (ML), the accuracy of the underlying data that ML models are trained on has become far more important.

It is no longer acceptable to have ‘mostly’ useful data; even the smallest amount of bad data can cause inaccuracies in predictive analytics.

As data engineers, we bear the brunt of any criticism, and rightly so: data scientists often bemoan the fact that much of their time is spent cleaning up data rather than building the models they are trained to build. We are the first link in a long chain, and the world of data engineering has to embrace this responsibility.

Most failures seem to go like this:

• Production Support is alerted to a failure in the middle of the night
• They apply a ‘band-aid’ fix to get the application running again
• The next day, they inform the dev team who own the code so that options can be assessed
• The dev team then plan the reprocessing of bad data to stop users from exploding
• A permanent fix is suggested, estimated, and then put on the backlog (often never to be seen again)

Another problem that arises with bad data quality is that feature development teams often have to spend multiple days within a sprint trying to get to the bottom of failures.
This means that published roadmap items get pushed further and further back, making the teams less efficient and causing frustration or mistrust among the stakeholders.

So, what can we do about it?

Step forward the Data Reliability Engineering team!

Data Reliability Engineering (DRE) is what you get when you treat data operations as a software engineering problem. Under the DRE philosophy, Data Reliability Engineers are 20 per cent operators and 80 per cent developers, and they sit outside the feature teams, independent of them.

This is not about being a production support team, but about being a talented and experienced development team that specialises in data pipelines across multiple technical disciplines.
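
As a rough illustration of what treating data operations as a software engineering problem can look like, the sketch below validates records as they pass through a pipeline step, quarantines the failures instead of stopping the whole run, and lets a fix be replayed over the quarantined records. It is only a minimal sketch under stated assumptions: the names `validate`, `reprocess`, `CHECKS`, and the booking fields are hypothetical and are not taken from any tool the author describes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, object]
Check = Callable[[Record], bool]


@dataclass
class ValidationResult:
    good: List[Record] = field(default_factory=list)
    # Each quarantined entry keeps the record and the names of the checks it failed.
    quarantined: List[Tuple[Record, List[str]]] = field(default_factory=list)


def validate(records: Iterable[Record], checks: Dict[str, Check]) -> ValidationResult:
    """Split records into good and quarantined instead of failing the whole run."""
    result = ValidationResult()
    for record in records:
        failed = [name for name, check in checks.items() if not check(record)]
        if failed:
            result.quarantined.append((record, failed))
        else:
            result.good.append(record)
    return result


def reprocess(result: ValidationResult, fix: Callable[[Record], Record],
              checks: Dict[str, Check]) -> ValidationResult:
    """Apply a fix to every quarantined record, then validate the repaired records."""
    repaired = [fix(record) for record, _ in result.quarantined]
    return validate(repaired, checks)


# Hypothetical checks for an illustrative booking feed (an assumption, not from the article).
CHECKS: Dict[str, Check] = {
    "has_booking_id": lambda r: bool(r.get("booking_id")),
    "price_is_positive": lambda r: isinstance(r.get("price"), (int, float)) and r["price"] > 0,
}

records = [
    {"booking_id": "b1", "price": 120.0},
    {"booking_id": "", "price": 80.0},
    {"booking_id": "b3", "price": -5},
]

first_pass = validate(records, CHECKS)
# Replay a hypothetical fix (sign error on price) over the quarantined records only.
second_pass = reprocess(first_pass, lambda r: {**r, "price": abs(r["price"])}, CHECKS)
```

Here the good record flows on untouched, the sign fix rescues one quarantined record on the second pass, and the record with the missing booking ID stays quarantined for a human to look at. Keeping the failed check names alongside each record is a design choice that makes the eventual message to users easier to write.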

The 6-step mission of DRE is:

1. To apply engineering practices to identify and correct data pipeline failures
2. To use specialist knowledge to analyse pipelines for weaknesses and potential failure points, and to fix them
3. To determine better ways of coping with failures, along with increasing automation of reprocessing functionality
4. To work with pipeline developers to advise on potential data quality (DQ) issues with new designs
5. To utilise and contribute to open-source DQ software products
6. To improve the ‘first to know rate’ for DQ issues

So, the DRE team own the failure, the fix, and the message out to users. They can call in feature-team developers if specialist knowledge is required but aim to handle as much as possible in-house, thus freeing feature teams to continue with their roadmap.
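
The article does not define the ‘first to know rate’ from step 6. One plausible reading, assumed here, is the share of DQ incidents that the DRE team detected before any data consumer reported them. The sketch below computes it under that assumption; the `Incident` class and its field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass
class Incident:
    detected_by_dre_at: Optional[datetime]   # None if DRE monitoring never caught it
    reported_by_user_at: Optional[datetime]  # None if no data consumer reported it


def first_to_know_rate(incidents: Sequence[Incident]) -> float:
    """Fraction of incidents the DRE team knew about before any user report (assumed definition)."""
    if not incidents:
        raise ValueError("no incidents to measure")
    dre_first = 0
    for incident in incidents:
        if incident.detected_by_dre_at is None:
            continue  # users found it, or nobody did: not a DRE 'first'
        if (incident.reported_by_user_at is None
                or incident.detected_by_dre_at < incident.reported_by_user_at):
            dre_first += 1
    return dre_first / len(incidents)


incidents = [
    Incident(datetime(2019, 5, 1, 2, 0), datetime(2019, 5, 1, 9, 0)),   # DRE first
    Incident(datetime(2019, 5, 2, 11, 0), datetime(2019, 5, 2, 8, 0)),  # user first
    Incident(datetime(2019, 5, 3, 4, 0), None),                         # DRE only
    Incident(None, datetime(2019, 5, 4, 10, 0)),                        # user only
]
rate = first_to_know_rate(incidents)  # 2 of 4 incidents -> 0.5
```

Tracking this number over time gives the team a simple way to show whether monitoring is getting ahead of user complaints.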

OK, great... but does that mean the feature teams throw data quality responsibilities over the fence to DRE? Certainly not! Each team still has responsibility for its pipeline, and DQ should be a core element of the architecture and design. The DRE team work with both feature development and Product teams to make sure that DQ is included in designs and estimates. They are also part of the sign-off process for QA/UAT: no DRE sign-off means no move to Production.
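
The sign-off rule can be pictured as a simple release gate. This is only a sketch of the idea, assuming a release record with one flag per approval; `ReleaseCandidate` and `promotion_blockers` are hypothetical names, not part of any deployment tool mentioned in the article.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReleaseCandidate:
    pipeline: str
    qa_passed: bool
    uat_passed: bool
    dre_signed_off: bool


def promotion_blockers(candidate: ReleaseCandidate) -> List[str]:
    """Return the approvals still missing; an empty list means the release may proceed."""
    blockers = []
    if not candidate.qa_passed:
        blockers.append("QA")
    if not candidate.uat_passed:
        blockers.append("UAT")
    if not candidate.dre_signed_off:
        blockers.append("DRE sign-off")
    return blockers


def can_move_to_production(candidate: ReleaseCandidate) -> bool:
    return not promotion_blockers(candidate)


# QA and UAT are green, but without DRE sign-off the release stays blocked.
candidate = ReleaseCandidate("bookings_daily", qa_passed=True, uat_passed=True, dre_signed_off=False)
blocked_by = promotion_blockers(candidate)
```

Returning the list of missing approvals, rather than a bare yes or no, tells the feature team exactly who still needs to be involved.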

So, is DRE the complete solution to all Data Quality problems? Unfortunately not. Bad data issues will always occur, because edge cases for data in particular are so hard to predict. However, having a dedicated engineering team for DQ shines a light on issues and provides transparency to stakeholders and data consumers, building trust among the data engineers, scientists, and analysts who depend on the accuracy of their data.

Copyright © 2019 CIOApplicationsEurope. All rights reserved.
This content is copyright protected

However, if you would like to share the information in this article, you may use the link below:

https://devops.cioapplicationseurope.com/cxoinsights/data-reliability-engineering-tackling-the-data-quality-problem-nid-1485.html?utm_source=google&utm_campaign=cioapplicationseurope_topslider