How To Reduce Pressure On Your Data Teams

Aug 19, 2024 • Wannes Rosiers

Data demand grows, pressuring small teams. Shift to focused data product teams and use portals to stay efficient and avoid data siloes.

In August 2016, BARC published the results of a global survey on Data-Driven Decision-Making in Business. The results are astonishing: only 22% of executives rely mostly on information for decision making, with an additional 25% expected to do so in the future. More recently, Forbes reported that data-driven companies are 23 times more likely to top their competitors, referring to a 10-year-old study from McKinsey.

Companies must have understood this. An S&P Global study from 2022 indicated that 70% of organizations base most to nearly all strategic decisions on data, outperforming the 2016 predictions. It’s safe to say that the number of data-driven decisions is rising rapidly and has not yet reached its limit.

More data-driven decisions equal more data demand

When more decisions are backed by data, more data needs to be made available in a usable format. When more people discuss the same insights, this readily available data needs to be shared with more people. And when the data-driven decision-making process is fully automated, that process needs to be monitored, which again creates new data to be discussed. We have entered a continuous loop in which the amount of data, the number of users, and the number of decisions to be made keep increasing. As only one of those three has a strict upper bound, and that bound is far from being reached, we can quite comfortably say that we have not yet reached the end state of data.

Data teams are already drowning, while a new wave of data and stakeholders is heading towards them — Photo by Tim Marshall on Unsplash

As more data-driven decisions increase both the amount of data and the number of stakeholders, they also increase the demand on the data team. Data teams have more data to process and more stakeholders to serve, and we have not yet reached the end state. Will current data teams be able to effectively serve all these data demands?

And more demand brings more pressure

Currently, on average 1 to 5% of employees at tech scale-ups are data people, and it is my personal opinion that companies are capping their data efforts at this level. As the number of data workers stabilizes while demand keeps rising, the pressure on data teams rises as well. More stakeholders are standing at the desk of the data team every morning to ask the following questions:

Why has my report not yet been updated?
There must be something wrong with this report, I had way more sales yesterday…
I need this field to be added to my report, and I need it yesterday!

In the afternoon planning session of the data team, the questions are quite different, and more along the lines of:

We have bought a new tool, and its data should be made available for some insights. Could this be put on the backlog?
Yet another company has updated their website with a recommendation algorithm. We can’t stay behind any longer, so we have bought a tool and want you to maintain it…
I have read about a new method, this GenAI thing, I need it now!

As more and more work is requested from the data team, and the use of more tools heavily increases the team's cognitive load, the team starts drowning. Typically, maintenance, performance and cost optimizations are the first things to be forgotten, which leads to the data team being seen as a cost department rather than a value driver. Instead of receiving more investment to get the foundations right, the team is told to cut costs. As letting go of someone is often easier than to stop paying for a certain tool, this only increases the pressure on the team even more.

How to prevent your data team from drowning?

I stated above that letting go of data people increases the pressure on those who stay. You might conclude that hiring additional people would lower the pressure. But is this true? There is such a thing as an ideal team size. A great summary has been provided on Leadingboat: the optimum team size is one where the team is neither too small to complete the work nor so large that it wastes time and effort on coordinating and engaging everyone. In Scrum frameworks, this team size is often given as something between 3 and 9 people, while scientific research puts it somewhere close to 5.

Hence, hiring more people into the same team would not make it more productive. Adding more teams in parallel might, but again you must not increase the coordination effort too heavily. This means that each team should be able to perform its tasks independently and not rely on others to complete them. Just like we have seen product teams rising in software development, this has led to the emergence of data product teams. A data product team is responsible for the end-to-end delivery of services (ingestion, consumption, discovery, observability, etc.) required by the data product.

Data product teams take end-to-end responsibility for the full lifecycle. Image by the author and Kris Peeters, depicting the capabilities of the Data Product Portal

The solution is not to hire more people; the solution is to provide focus to the people that you have. One single data engineer should no longer be responsible for checking infra failures in the morning, restarting data pipelines at 10am, adding some fields to the marketing report at 11, fixing the sales figures at noon and attending the planning session with a focus on a recommendation algorithm in the afternoon. Making sure that these responsibilities land in a team, and hence with people who thoroughly understand the business they are working for, is what will get your team back up to speed.

But wait… What about data siloes?

One of the biggest reasons to set up a central data team has been to avoid data siloes. A team topology with multiple data product teams, each focusing on its own data product and probably its own business domain, risks introducing these siloes anyway. To mitigate this risk, it is important to maintain a proper overview of which data products exist and where they live. This is one of the main reasons that we have launched the Data Product Portal.

This portal — which has recently been open sourced — aims to provide this overview, simplify processes and access requests, and automate implementations.
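
To make the idea of such an overview concrete, here is a minimal sketch in Python of an in-memory registry that records which data products exist, who owns them and where they live. It is purely illustrative and is not the Data Product Portal's actual data model or API: the class names, field names, team names and storage locations are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """One registry entry. All field names are hypothetical."""
    name: str
    owner_team: str
    domain: str
    location: str  # where the data lives, e.g. a bucket or schema
    consumers: set[str] = field(default_factory=set)


class Registry:
    """Toy overview of which data products exist and where they live."""

    def __init__(self) -> None:
        self._products: dict[str, DataProduct] = {}

    def register(self, product: DataProduct) -> None:
        # Refuse duplicates so every product name maps to exactly one owner.
        if product.name in self._products:
            raise ValueError(f"data product already registered: {product.name}")
        self._products[product.name] = product

    def grant_access(self, product_name: str, team: str) -> None:
        # Record that another team is allowed to consume this product.
        self._products[product_name].consumers.add(team)

    def by_domain(self, domain: str) -> list[str]:
        return sorted(p.name for p in self._products.values() if p.domain == domain)

    def potential_silos(self) -> list[str]:
        # A product that nobody outside the owning team consumes may be a silo.
        return sorted(
            p.name
            for p in self._products.values()
            if not (p.consumers - {p.owner_team})
        )


# Example data is invented for illustration only.
registry = Registry()
registry.register(
    DataProduct("sales_daily", "sales-data", "sales", "s3://example-bucket/sales_daily")
)
registry.register(
    DataProduct("web_recommendations", "marketing-data", "marketing", "s3://example-bucket/web_recs")
)
registry.grant_access("sales_daily", "marketing-data")

print(registry.potential_silos())  # ['web_recommendations']
```

Even a toy registry like this shows why a shared overview matters: as soon as every team registers its products in one place, a data product that no other team consumes becomes visible instead of staying hidden inside a single domain.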

If you want to learn more about the portal, watch this talk with one of our first users.