Andreas Bartsch, Head of Service Delivery at PBT Group
The emergence of the data lake as a repository for massive amounts of unstructured data is changing how organisations approach analysis, enabling them to use those insights to develop more customised solutions and deliver an enhanced customer experience. Beyond that, it has also changed the skills required in data teams.
Unstructured data introduces a new layer of complexity, and fully exploring its potential requires new technologies. This is where data scientists play a critical role, making sense of the information at hand and using sophisticated tools to explore Big Data. The technologies therefore become an enabler, helping them find that diamond in the rough and build a use case based on their insights.
This has also driven the evolution of the data engineer and data scientist roles. These specialists must understand how unstructured data is stored and be able to interrogate it through different mechanisms. They must also be capable of applying different design and modelling techniques to extract the required value from Big Data.
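To make the idea of interrogating stored unstructured data concrete, here is a minimal sketch of the schema-on-read approach often used against a data lake's raw zone. The record format, field names, and sample values are illustrative assumptions, not taken from any specific platform.

```python
import json

# Hypothetical sample of semi-structured claim records as they might land
# in a data lake's raw zone (JSON Lines); field names are illustrative.
raw_records = [
    '{"claim_id": "C-100", "amount": 2500.0, "notes": "windscreen damage"}',
    '{"claim_id": "C-101", "amount": 900.0}',  # optional fields may be absent
    '{"claim_id": "C-102", "amount": 4100.0, "notes": "hail damage"}',
]

def extract_claims(lines):
    """Apply a schema on read: parse each raw record and pull out only the
    fields this use case needs, tolerating missing optional attributes."""
    for line in lines:
        record = json.loads(line)
        yield {
            "claim_id": record["claim_id"],
            "amount": record["amount"],
            "notes": record.get("notes", ""),  # default when absent
        }

claims = list(extract_claims(raw_records))
total = sum(c["amount"] for c in claims)
print(total)  # 7500.0
```

The design point is that no schema is imposed when the data is written; each use case decides at read time which fields matter, which is what makes the same raw store usable across different cloud platforms.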
Clearly, the role of the data specialist is evolving. And yet, it remains essential to be technology-agnostic, as data lakes can be found everywhere, whether in AWS, Google Cloud, Azure, or another cloud environment. These specialists must grasp how data lakes work regardless of the cloud platform and extract the value the organisation needs to identify new opportunities.
Driving the edge
The frequency of data updates has also increased exponentially. Previously, data sources such as claims, underwriting, and financial records were downloaded overnight to provide users with the previous day's statistics. But as technology has advanced, so too have the requirements. Data consumers can now receive updates on a near real-time basis, making for much more efficient decision-making.
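The shift described above can be sketched as the difference between computing a figure once per day over yesterday's events and maintaining it incrementally as each event arrives. The event payloads below are illustrative assumptions.

```python
def batch_total(events):
    """Overnight-style batch: process the previous day's events in one pass,
    so the figure is only as fresh as the last download."""
    return sum(e["value"] for e in events)

class RunningTotal:
    """Near-real-time style: update the aggregate as each event arrives,
    so decision-makers see the current figure rather than yesterday's."""
    def __init__(self):
        self.total = 0.0

    def on_event(self, event):
        self.total += event["value"]
        return self.total

# Illustrative event stream (e.g. claim amounts flowing in during the day).
events = [{"value": 120.0}, {"value": 75.5}, {"value": 300.0}]

print(batch_total(events))       # one answer, available after the batch run

live = RunningTotal()
for e in events:
    latest = live.on_event(e)    # an up-to-date answer after every event
print(latest)
```

Both paths end at the same number; the difference is when it becomes available, which is exactly what near-real-time delivery changes for the consumer.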
The Internet of Things (IoT) and edge computing have further accelerated these data flows, pushing data through even more frequently. Thanks to the growth of 5G connectivity, the mechanisms are falling into place to enable real-time updates while managing an influx of devices that previously could not generate data.
The role of the DataOps team in this regard should not be underestimated. Its focus areas include test-driven development, automated testing, and quicker release management. Previously, this was a manually intensive process performed by specialists, but the focus has shifted to automating as much of it as possible.
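A minimal sketch of the kind of automated data-quality check such a pipeline might run on every release, replacing a manual spot check. The rules (unique identifiers, positive premiums) and sample rows are illustrative assumptions.

```python
def check_rows(rows):
    """Run simple data-quality rules over a batch of records and return a
    list of human-readable failures; an empty list means the batch passes."""
    failures = []
    seen_ids = set()
    for i, row in enumerate(rows):
        policy_id = row.get("policy_id")
        if policy_id in seen_ids:
            failures.append(f"row {i}: duplicate policy_id {policy_id}")
        seen_ids.add(policy_id)
        if row.get("premium", 0) <= 0:
            failures.append(f"row {i}: premium must be positive")
    return failures

# A clean batch and a deliberately broken one, for illustration.
good = [{"policy_id": "P1", "premium": 100}, {"policy_id": "P2", "premium": 80}]
bad = [{"policy_id": "P1", "premium": 100}, {"policy_id": "P1", "premium": -5}]

print(check_rows(good))       # []  -> release can proceed
print(check_rows(bad))        # two failures -> release is blocked
```

Wired into a release pipeline, a non-empty failure list would fail the build automatically, which is the sense in which DataOps removes the human from routine verification.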
The goal is to reduce human intervention and free up those skilled resources to concentrate on the data analysis and engineering functions. This creates an environment where DevOps has evolved into DataOps: similar to the former, but dedicated to the data side of things.
Ultimately, even though the way of working with data is changing thanks to new technologies and trends, the underlying principles remain the same. Data specialists who understand this, and who can apply their knowledge in technology-agnostic ways, will be the ones who thrive in a digitally driven environment.