How digital waste is polluting the planet
The threat of Dark Data, and Redundant, Obsolete or Trivial Data
Data growth across the planet is accelerating at a far higher rate than preventative sustainability measures can cope with, contributing to the already significant and growing data carbon footprints of organisations.
In a recent OECD-OPSI article, Professors Tom Jackson, Ian Hodgkinson and Lisa Jackson highlight the dangers of assuming that digitalisation and digital data are carbon neutral, when they do in fact carry a CO2 footprint. As organisations generate, process and store data, they place huge demands on energy usage to power storage facilities such as data centres. Digital data and digital transformations are critical to driving decarbonisation solutions, and we observe a growing trend towards investing in digital solutions in pursuit of net zero. For example, in 2022 Climate Tech investments represented more than a quarter of every venture dollar invested, according to PwC. However, there is another side to the relationship between digital and decarbonisation: the impact that digital data and bad digital practices can have on the planet.
A triple threat
In their article, Dark Data is Killing the Planet, Professors Jackson and Hodgkinson illustrate the environmental cost of the dark data firms generate, i.e. data they collect, process and store but only ever use once or never at all. The volume of dark data has grown as digital transformations have advanced, with organisations seeking to store all data without considering the consequences for the environment. Creating and storing Redundant, Obsolete or Trivial (ROT) data is another common trend, and one that creates a huge additional drain on energy supply, both by drawing on the grid and by diverting renewable energy supplies. To illustrate the magnitude of these wasteful digital practices, a recent article in the Journal of Business Strategy finds that as much as 55% of data stored by organisations may be dark data, while Forbes suggests ROT data in organisations may be as high as 33%. While there may be some overlap between ROT data and dark data, it is crucial to highlight the potential worst-case scenario: if the two categories did not overlap at all, a staggering 88% of data stored by an organisation could be irrelevant. The consequence of storing this data is a significant increase in the organisation's data CO2 footprint.
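The worst-case figure above is simple addition, and the effect of overlap can be made explicit. The following is a minimal sketch only: the function name `irrelevant_share_bounds` is hypothetical, and it assumes both estimates are percentages of the same total stored data, which the cited sources may not strictly guarantee.

```python
def irrelevant_share_bounds(dark_pct, rot_pct):
    """Return (lower, upper) bounds on the share of stored data that is
    dark, ROT, or both, given each share as a percentage of the total.

    Assumption (not from the article): both percentages refer to the
    same body of stored data.
    """
    for value in (dark_pct, rot_pct):
        if not 0 <= value <= 100:
            raise ValueError("percentages must be between 0 and 100")
    # Full overlap: the smaller category sits entirely inside the larger one.
    lower = max(dark_pct, rot_pct)
    # No overlap: the categories simply add up, capped at 100%.
    upper = min(100, dark_pct + rot_pct)
    return lower, upper


if __name__ == "__main__":
    # Figures quoted in the article: 55% dark data, 33% ROT data.
    print(irrelevant_share_bounds(55, 33))  # (55, 88)
```

Under these assumptions, the share of irrelevant data lies somewhere between 55% and 88%, with 88% being the no-overlap extreme.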
PUE Explained
In data centres, a widely used metric for measuring energy efficiency is Power Usage Effectiveness (PUE). Data centres need a range of auxiliary services, including cooling, to support the main ‘work’ of the IT systems. PUE captures the size of this ‘overhead’ by expressing the total energy used by the facility as a ratio to the energy used to power the computing equipment. A PUE of 1.0 would mean all energy goes to the computing equipment, so the closer the number is to 1.0, the more energy efficient the data centre is.
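As a rough illustration of the ratio, the sketch below computes PUE from two energy readings. The function names and the example figures are hypothetical and chosen only to make the arithmetic easy to follow; real measurements depend on where and how the energy is metered.

```python
def pue(total_facility_kwh, it_equipment_kwh):
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be less than IT energy")
    return total_facility_kwh / it_equipment_kwh


def overhead_kwh(total_facility_kwh, it_equipment_kwh):
    """Energy spent on cooling, power distribution and other auxiliary services."""
    return total_facility_kwh - it_equipment_kwh


if __name__ == "__main__":
    # Hypothetical readings: 2,000 kWh drawn by the whole facility,
    # of which 1,000 kWh reached the computing equipment.
    print(pue(2000.0, 1000.0))           # 2.0
    print(overhead_kwh(2000.0, 1000.0))  # 1000.0
```

In this example, every kilowatt-hour of computing is matched by another kilowatt-hour of overhead, which is what a PUE of 2.0 means.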
According to a recent report, today’s data centres are processing more than sixteen times as much internet traffic, and house ten times the computing power, while using nearly the same amount of energy as in 2010. The vast majority of the estimated 16,000 medium-sized data centres in Europe are operated by businesses, local and national governments, and public sector organisations. Many of these data centres are wasting energy. 451 Research calculated that if the 16,000 data centres of 100kW or more across Europe were to achieve PUEs of 1.4, the energy savings would be dramatic. Reducing the PUE of each data centre from above 2.0 down to 1.4 would cumulatively save an estimated 11 terawatt-hours of power annually. To put that into context, this is approximately the amount of energy consumed by a city the size of Hamburg in a year.
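To see why a lower PUE translates into savings, the sketch below compares total facility energy at two PUE values for the same computing workload. It is an illustration under stated assumptions, not a reconstruction of the 451 Research calculation: the function names and the 1,000 kWh IT load are hypothetical, and the IT load is assumed to stay constant while only the overhead shrinks.

```python
def facility_energy_kwh(it_equipment_kwh, pue):
    """Total facility energy implied by an IT load and a PUE value."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_equipment_kwh * pue


def saving_fraction(pue_before, pue_after):
    """Fraction of total facility energy saved by lowering PUE,
    assuming the IT load itself does not change."""
    if pue_before < 1.0 or pue_after < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return (pue_before - pue_after) / pue_before


if __name__ == "__main__":
    it_load = 1000.0  # hypothetical IT energy in kWh
    before = facility_energy_kwh(it_load, 2.0)
    after = facility_energy_kwh(it_load, 1.4)
    print(round(before - after, 1), "kWh saved")
    print(round(saving_fraction(2.0, 1.4) * 100, 1), "% of facility energy")
```

Under these assumptions, moving from a PUE of 2.0 to 1.4 cuts total facility energy by roughly 30% without reducing the computing work done.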
Consequently, many data centre operators are now signatories of the Climate Neutral Data Centre Pact, a self-regulated initiative launched in 2021 by CISPE and EUDCA in collaboration with the European Commission. Under this initiative, data centre operators and trade associations agree to make data centres climate neutral by 2030.
How common is dark data and ROT data?
The growth of ROT and dark data usually stems from two main causes: the fear of deleting something that might be needed later, and poor housekeeping in keeping track of what data is used, where it is stored, and whether it has been duplicated. We have seen instances on premises where 80% of primary business data, petabytes (thousands of terabytes) in size, has not been accessed in years but has been consuming power to keep it available. Organisations have been grappling with the problem of truly understanding their data and what to do with it for a long time, but there are solutions out there to help.
The net zero data journey
There is a need for a mindset change in how digital data is viewed by organisations, recognising there is an environmental cost to how data is used, processed, and stored. Collaboration between academics, data scientists, researchers, policymakers, and organisations is key to fostering responsible data practices. By sharing knowledge on methods for energy efficiency, establishing best practices, and promoting transparency about CO2 emissions, we can harness the potential of data while minimising its environmental impact.
Amazon Web Services, for example, is helping organisations improve their sustainability in the cloud by increasing understanding of the impact of their data and highlighting positive actions that can be taken, such as deleting data that is not needed and moving rarely accessed data to “colder” storage tiers that typically consume less energy.
There is a range of free and publicly available tools to assist organisations on their net zero data journey. For example, Loughborough University and The London Data Company have developed easy-to-use forecasting tools and scorecards to evaluate the CO2 cost of the proposed data requirements of new projects: the digital decarb toolkit and the LDCo CO2 Calculator.
Open-source tools play a vital role in empowering organizations to make informed decisions regarding data creation, utilization, storage, and sharing. These tools serve as valuable resources in prompting thoughtful considerations for emerging technologies like Edge Analytics, while also fostering diverse conversations around data usage and practices such as Federated Analytics.
By leveraging the tools available, organizations gain access to proactive measures that enable them to assess potential environmental harm before embarking on new projects. This proactive approach is essential in minimizing unnecessary actions that can harm the environment, and it accelerates progress toward achieving net-zero goals.