
Navigating the Data Jungle in ESG and Sustainability Reporting

Written by Mateusz Panek | Jul 16, 2024 7:26:33 AM

With the rigorous reporting obligations under the Corporate Sustainability Reporting Directive (CSRD), enterprises face a myriad of challenges in sustainability reporting that are rooted in data collection and management. How they handle sustainability data ultimately affects the accuracy and credibility of their reports. In this first part of our series on data management for ESG and sustainability reporting, we explain the common pitfalls that enterprises should avoid when handling or collecting data.

The most common pitfalls that we’ve observed among the organizations we’ve worked with are their dependence on files that cannot support complex analysis or modern data management infrastructure, and their manual, error-prone processes for updating and managing sustainability data. Additionally, most of the data is scattered across various sources, which leads to miscommunication between teams and often results in duplicate data collection efforts.


Relying on flat files for sustainability reporting

In a nutshell, flat files are simple data files that contain records without structured relationships. These include .csv, .txt, and .tsv files. Unlike databases that can enforce data types, constraints, and relationships among data, flat files typically organize data in a plain text format where each line represents a single record. 
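To make the contrast concrete, here is a minimal sketch of what "enforcing data types and constraints" can look like. It uses Python's built-in `sqlite3` module; the table name, column names, and sample rows are hypothetical and chosen only for illustration. The same rows sit unchallenged in a .csv file, while the database table rejects the invalid ones.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file content: the second row has a negative reading and
# the third has a non-numeric one. A plain .csv file stores both silently.
RAW_CSV = """facility,month,kwh
Plant A,2024-01,1200.5
Plant B,2024-01,-300
Plant C,2024-01,n/a
"""


def parse_reading(text):
    """Return a float when the text is numeric; otherwise pass it through unchanged."""
    try:
        return float(text)
    except ValueError:
        return text


def load_with_constraints(raw_csv):
    """Insert rows into a constrained table; return (accepted, rejected) facility names."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE energy_usage (
            facility TEXT NOT NULL,
            month    TEXT NOT NULL,
            kwh      REAL NOT NULL CHECK (typeof(kwh) = 'real' AND kwh >= 0),
            PRIMARY KEY (facility, month)
        )
        """
    )
    accepted, rejected = [], []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        try:
            conn.execute(
                "INSERT INTO energy_usage VALUES (?, ?, ?)",
                (row["facility"], row["month"], parse_reading(row["kwh"])),
            )
            accepted.append(row["facility"])
        except sqlite3.IntegrityError:
            # The CHECK, NOT NULL, or PRIMARY KEY constraint refused the row.
            rejected.append(row["facility"])
    conn.close()
    return accepted, rejected


accepted, rejected = load_with_constraints(RAW_CSV)
print("accepted:", accepted)
print("rejected:", rejected)
```

In this sketch only the Plant A row is stored; the negative and non-numeric readings are rejected at load time instead of surfacing later in a report.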

For example, electricity usage or fuel consumption data might be stored in flat files where each line provides information on daily or monthly metrics for different facilities or departments. Similarly, emissions data, including CO2 and other greenhouse gases (GHG), is often recorded in this format to manage company emissions reporting. Waste management data, such as amounts of recycled materials and waste generated, are also typically stored in flat files. This format is similarly used for recording sustainability metrics within the supply chain, like supplier sustainability scores or data concerning the sourcing of raw materials. 

Flat files, while simple and initially easy to use, pose substantial challenges when it comes to managing complex data required for compliant sustainability reporting. They lack capabilities for dynamic updates, complex querying, data relationships, and quality checks that are crucial for accurate sustainability analysis and reporting. Regular audits and strategic technology upgrades, by contrast, enable more informed decision-making based on reliable, timely, and accessible data.

For example, a CPG company managing its freight operations might use flat files to store and analyze data relevant to CO2eq emissions. Each file could record detailed information such as the type of fuel used by each vehicle in its fleet (diesel, gasoline, or electric), the distance traveled per trip, and the load carried by each vehicle. Additionally, these files could track the GHG emissions produced by each vehicle per trip, calculated based on the fuel type and distance covered.
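As a rough illustration of the per-trip calculation described above, the sketch below multiplies distance by a fuel-specific emission factor and totals the result per vehicle. The column names, sample trips, and especially the emission factors are assumptions made for this example; the factors are placeholders, not published values, and a real calculation would take them from an authoritative source.

```python
import csv
import io
from collections import defaultdict

# Placeholder emission factors in kg CO2eq per km. Illustrative only:
# these are NOT real-world factors.
EMISSION_FACTORS_KG_PER_KM = {"diesel": 0.9, "gasoline": 0.8, "electric": 0.1}

# Hypothetical flat-file content with one line per trip.
TRIPS_CSV = """vehicle_id,fuel_type,distance_km,load_tonnes
TRUCK-1,diesel,100,12
TRUCK-1,diesel,50,8
VAN-7,gasoline,40,1
VAN-9,electric,60,1
"""


def emissions_per_vehicle(raw_csv):
    """Sum estimated kg CO2eq per vehicle from fuel type and distance."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(raw_csv)):
        fuel = row["fuel_type"].strip().lower()
        factor = EMISSION_FACTORS_KG_PER_KM[fuel]  # KeyError flags an unknown fuel
        totals[row["vehicle_id"]] += float(row["distance_km"]) * factor
    return dict(totals)


for vehicle, kg in emissions_per_vehicle(TRIPS_CSV).items():
    print(f"{vehicle}: {kg:.1f} kg CO2eq")
```

Even this small example depends on a second dataset (the factors) that must stay in sync with the trip records, which is the kind of relationship a flat file cannot enforce on its own.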

The complex relationships and dynamic nature of CO2eq emissions data in a CPG company’s freight operations are something that flat files cannot handle efficiently. These files lack the capability to manage large volumes of interconnected data, struggle with real-time processing, and do not support the advanced analytical tools necessary for comprehensive emissions management and reporting. Flat files also have minimal security features and lack robust compliance mechanisms, making them unsuitable for handling sensitive, regulation-driven data.

These are the challenges that we helped a multinational beverage company address in 2022. The company previously relied on manually updated flat files, which bogged down its sustainability efforts. After transitioning to a modern data platform, it can now automatically extract raw GHG emissions data from flat files, transform it in a centralized platform, and use it in real-time analysis and customized reports.

Figure 1. A sample screenshot of a dashboard showing how sustainability-related data on a company’s transportation operations can be standardized and visualized through a modern business intelligence tool

 

Additionally, companies should implement ways to transform their data to address the limitations of flat files. Structuring, standardizing, and aggregating data, for example, make it easier to use or manage large volumes of data. Transforming data into formats that support analytics enables timelier decision-making. Cleaning data from multiple sources reduces redundancy and errors, which improves reliability. 

| Type of Sustainability Data Commonly Stored as a Flat File | Data Transformation Approach |
| --- | --- |
| Energy consumption data | Standardize and aggregate data into a structured format, enabling time-series analysis and real-time monitoring for identifying patterns and optimizing energy usage. |
| GHG emissions data | Clean and integrate disparate data sources to create a unified dataset and to identify emission trends and areas for reduction. |
| GHG emissions data | Consolidate data for time-series analysis, enabling assessment of reduction efforts and compliance with regulatory requirements. |
| Water usage statistics | Structure data for analysis, continuous monitoring, trend analysis, and effective water conservation strategies. |
| Supply chain sustainability scores | Integrate and standardize supplier data, enhancing the ability to perform deep analysis and compare sustainability practices across the suppliers’ supply chains. |
| Waste management records | Organize data into a consistent structure to enable detailed and compliant tracking, analysis, and reporting including circularity assessment. |
| Corporate carbon footprint | Integrate data points for comprehensive Scope 1, 2, and 3 carbon accounting across all business units. |
| Product life cycle data | Transform life cycle data, enabling complex queries and analysis to optimize product sustainability. |
| Water quality monitoring data | Include timestamp and geolocation data for real-time pollution monitoring and regulatory compliance analysis. |
| Diversity and inclusion metrics | Integrate and standardize employee demographic data for detailed analysis of diversity trends and the effectiveness of inclusion initiatives. |
| Financial data on sustainability investments | Aggregate financial data related to sustainability initiatives, enabling ROI analysis and financial performance reporting in the context of sustainability goals and, for instance, EU Taxonomy. |

Table 1. A table of sample sustainability data commonly stored as a flat file (e.g., .csv, .txt, .tsv) and how they can be transformed for compliant sustainability reporting
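The "standardize and aggregate" approach listed for energy consumption data can be sketched in a few lines. This is a simplified example under stated assumptions: the column names, the two date formats, and the sample readings are hypothetical, and a production pipeline would handle many more units and formats.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

UNIT_TO_KWH = {"kwh": 1.0, "mwh": 1000.0}  # 1 MWh = 1,000 kWh
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")    # formats assumed to occur in the files

# Hypothetical readings with mixed units, date formats, and letter casing.
RAW_READINGS = """facility,date,value,unit
Plant A,2024-01-05,1200,kWh
Plant A,17/01/2024,1.5,MWh
Plant B,2024-01-09,800,kwh
Plant A,2024-02-01,900,kWh
"""


def parse_date(text):
    """Try each known date format in turn and fail loudly on anything else."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def monthly_kwh(raw_csv):
    """Convert every reading to kWh and total it per (facility, month)."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(raw_csv)):
        kwh = float(row["value"]) * UNIT_TO_KWH[row["unit"].strip().lower()]
        month = parse_date(row["date"].strip()).strftime("%Y-%m")
        totals[(row["facility"].strip(), month)] += kwh
    return dict(totals)


for (facility, month), kwh in sorted(monthly_kwh(RAW_READINGS).items()):
    print(f"{facility} {month}: {kwh:.0f} kWh")
```

Once readings share one unit and one date format, they can be grouped by month, which is the starting point for the time-series analysis mentioned in Table 1.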

 

Using unreliable Excel workbooks stored in collaboration platforms

Many organizations find Excel files on collaboration platforms like SharePoint to be convenient and familiar tools. However, they have limitations in managing complex and large-scale sustainability data:

| Disadvantages of Using Excel Files for Sustainability Reporting | Why Issues Arise | How To Address the Limitations |
| --- | --- | --- |
| Data integrity issues | Excel files can be accessed and modified by multiple users simultaneously, leading to conflicting changes and data corruption. | Implement a centralized database that ensures data integrity and supports real-time updates. |
| Scalability limitations | Excel is not designed to handle very large datasets efficiently, leading to slow loading times, performance issues, and increased chances of file corruption. | Transition to a centralized cloud-based database that can handle large datasets and scale with growing data needs. |
| Lack of advanced analytics | Excel’s basic analytical tools are insufficient for performing complex analyses such as predictive modeling, trend analysis, and real-time data processing. | Adopt advanced analytics platforms (e.g., Power BI, Tableau, Domo) that offer robust analytical tools and data visualization capabilities. |
| Security concerns | Excel files on SharePoint can be vulnerable to unauthorized access, data breaches, and lack of proper security measures like encryption and access controls. | Use platforms with strong security features, including access controls, encryption, and audit trails to protect sensitive data. |
| Version control | Managing multiple versions of Excel files can lead to confusion and errors, making it difficult to determine the most accurate and up-to-date data. | Implement a version control system that tracks changes and maintains a single source of truth. |
| Manual errors | Manual data entry is prone to human error, resulting in typos, incorrect entries, and inaccuracies in sustainability reports. | Employ automation tools to minimize manual data entry and reduce errors, increasing data accuracy and efficiency. |
| Limited collaboration | SharePoint allows for basic file sharing but lacks robust collaboration features like real-time editing, commenting, and workflow management. | Use collaboration platforms that support real-time editing, commenting, and workflow management to improve teamwork and communication. |

Table 2. A summary of some of the disadvantages of using Excel-based files for sustainability reporting and how they can be addressed

 

To address these challenges, organizations should transition to more advanced data management systems. Implementing a centralized database can ensure data integrity, support real-time updates, and scale efficiently with growing data needs. Advanced analytics platforms provide real-time processing capabilities.

These were the challenges that one of the world’s leading food and CPG companies faced. The company has extensive data on its environmental impact from production, transportation, and other supply chain activities. However, their reliance on Excel files led to issues and limitations that hindered effective and compliant reporting.

In 2023, we partnered with the company to transition from Excel files to more advanced data management systems. We moved their data to a centralized platform, added advanced analytics and data visualization capabilities, and automated the process of updating information from their multiple data sources. By centralizing the automatic collection and management of sustainability data, their leaders can see which facilities or business operations are contributing most to their environmental footprint. In turn, they can better invest in energy-efficient technologies or minimize consumption, which are then reflected in their sustainability reports.

 

Overestimating the quality of data in HR systems

Sustainability extends beyond environmental issues. The CSRD, for instance, explicitly requires companies to report on their social sustainability, which includes workforce diversity, labor practices, and community engagement. 

It’s quite common for companies to assume that their HR data is more accurate and reliable than it really is. HR systems often encompass a wide range of data, from employee demographics to detailed labor practices. There are several reasons why the quality of this data might be overestimated:

Assuming completeness: Companies often presume that their HR systems capture all necessary data. However, important details, especially related to temporary or subcontracted workers, might be missing. This incomplete data can lead to gaps in reporting workforce diversity or labor standards.

Ignoring inconsistencies: In large companies, especially those operating internationally, HR data can vary significantly between regions or departments. This inconsistency might be overlooked, leading to reports that aren’t truly reflective of the company.

Outdated information: HR data can quickly become outdated, and using stale records can skew sustainability reports.

Manual errors: Data in HR systems is typically entered manually, which is susceptible to human error. Mistakes in data entry can lead to significant inaccuracies in sustainability reporting.

Granularity issues: Sometimes, the data captured isn’t detailed enough for specific ESG metrics or regulatory requirements, such as exact hours of community service or in-depth employee engagement. This lack of granularity can impede the depth and specificity of sustainability reports.


Addressing these challenges involves a mix of better data management practices and technology adoption:

Conduct regular data audits. Routine checks for data completeness, consistency, and accuracy can help ensure that the data remains reliable and up to date.

Standardize data entry. By standardizing how data is collected and entered across all regions and departments, companies can reduce inconsistencies and improve the uniformity of the data collected.

Automate data collection. Implementing automated data collection tools can help minimize the risks associated with manual data entry.

Integrate data systems. Using data integration tools to consolidate data from various HR systems into a centralized database can provide a more holistic view.

Upgrade HR systems. Enhancing HR systems to capture more detailed data relevant to ESG/sustainability metrics can greatly improve the specificity and utility of sustainability reports.

Utilize analytics. Using advanced analytics and data validation techniques can help identify and correct anomalies.
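The first recommendation above, regular data audits, can start as a small script. The sketch below checks hypothetical HR records for completeness, consistency, and staleness. The field names, the allowed contract types, and the 365-day threshold are assumptions made for illustration and would differ in a real HR system.

```python
from datetime import date

# Assumed schema and rules, for illustration only.
REQUIRED_FIELDS = ("employee_id", "region", "contract_type", "gender", "last_updated")
ALLOWED_CONTRACT_TYPES = {"permanent", "temporary", "subcontracted"}
MAX_AGE_DAYS = 365


def audit_records(records, today):
    """Return a list of (employee_id, issue, detail) findings."""
    findings = []
    for rec in records:
        emp = rec.get("employee_id") or "<unknown>"

        # Completeness: every required field must be present and non-empty.
        missing = [f for f in REQUIRED_FIELDS if not rec.get(f)]
        if missing:
            findings.append((emp, "incomplete", ", ".join(missing)))

        # Consistency: contract types must use one shared vocabulary.
        contract = rec.get("contract_type")
        if contract and contract.strip().lower() not in ALLOWED_CONTRACT_TYPES:
            findings.append((emp, "inconsistent", contract))

        # Freshness: flag records not updated within the threshold.
        updated = rec.get("last_updated")
        if updated and (today - updated).days > MAX_AGE_DAYS:
            findings.append((emp, "outdated", updated.isoformat()))
    return findings


SAMPLE_RECORDS = [
    {"employee_id": "E1", "region": "EU", "contract_type": "permanent",
     "gender": "F", "last_updated": date(2024, 6, 1)},
    {"employee_id": "E2", "region": "APAC", "contract_type": "Temp",
     "gender": None, "last_updated": date(2024, 5, 20)},
    {"employee_id": "E3", "region": "EU", "contract_type": "subcontracted",
     "gender": "M", "last_updated": date(2022, 1, 1)},
]

for finding in audit_records(SAMPLE_RECORDS, today=date(2024, 7, 16)):
    print(finding)
```

A script like this does not fix the data, but running it on a schedule shows where the gaps are before they reach a report.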

| Sustainability-Related Data in HR Systems | Examples of Data Collected | Impact on Sustainability Reporting |
| --- | --- | --- |
| Employee demographics and diversity | Gender, age, ethnicity, disability status | Essential for reporting on diversity and inclusion initiatives and aligning with CSRD requirements for social sustainability |
| Labor practices and human rights | Conditions of employment, incidents of labor rights violations | Provides crucial insights into the company’s adherence to international labor standards and human rights |
| Employee engagement and well-being | Engagement scores, health and well-being metrics | Reflects company culture and employee satisfaction |
| Training and development | Sustainability training participation rates, effectiveness of training programs | Indicates investment in employee development and capacity building, aligning with the CSRD’s focus on long-term value creation |
| Equal pay and compensation | Gender pay gap data, compensation equity | Important for assessing fair compensation practices |
| Turnover and retention rates | Reasons for leaving, retention strategies | Offers insights into workforce stability and effectiveness of HR policies |
| Employee safety and accident data | Workplace accidents, safety training compliance | Vital for reporting on health and safety standards, which are directly related to social and governance aspects under the CSRD |
| Community involvement and CSR | Volunteer hours, number of CSR programs, CSR program participation | Demonstrates the company’s engagement with local communities |
| Employee mobility and travel | Data on business travel, commuting patterns, remote work adoption | Helps measure the environmental impact of employee transportation and supports initiatives aimed at reducing travel-related emissions |
| Labor practices and ethics | Employee feedback on ethical practices, labor disputes | Provides insights into labor standards compliance and ethical business conduct |
| Supplier diversity | Supplier diversity metrics, percentage of spending with diverse suppliers | Reports the sustainability of the company’s supply chain and commitment to supporting diverse and inclusive economic practices |
| Corporate governance | Board diversity metrics, governance practices, compliance with ethical standards | Aligns with CSRD requirements for governance transparency and ethical business conduct |
| Antibribery and anticorruption measures | Training completion rates, incidents of noncompliance, whistleblower reports | Reports on the company’s ethical practices and legal compliance |

Table 3. Examples of data collected in HR systems that can be used for sustainability reporting aligned with regulations like the CSRD


Manually processing data for ESG/sustainability reporting

All the pitfalls mentioned earlier hinge on the most common mistake that enterprises make: manually processing data. While traditional and familiar, manual processing introduces a range of challenges that can significantly undermine the accuracy and reliability of sustainability reports.


| Issues in Manual Processes | Adverse Impact on Sustainability Reporting | Sample Scenarios |
| --- | --- | --- |
| Error-prone processes | Inaccurate energy consumption data can lead to incorrect assessments of energy reduction strategies and potential noncompliance with energy efficiency standards. | An energy manager manually records electricity and gas usage figures from paper bills into a spreadsheet. Errors such as misreading figures or inputting them into incorrect columns occur. |
| Inconsistency | Difficulty in aggregating data accurately complicates the assessment of overall water conservation efforts. | Different facilities within the same company manually track and report their water usage without a standardized measurement protocol, leading to variations in reporting intervals and methods. |
| Time delays | Delays in processing and reporting data from HR systems could lead to missed opportunities for timely interventions and improvements in diversity initiatives and employee development programs. | HR teams manually compile data on workforce diversity and training programs quarterly by collating information from various departmental reports. |
| Scalability | As the supply chain expands, manual processing becomes more cumbersome and error-prone, reducing the reliability of data used to evaluate supplier compliance with sustainability standards. | A multinational corporation relies on manual compilation of supply chain sustainability audits from hundreds of suppliers around the globe. |
| Data fragmentation | Fragmented data leads to challenges in obtaining a holistic view of sustainability efforts, making comprehensive reporting and analysis difficult. | Different departments use separate files, systems, and platforms to record sustainability data, such as waste management and carbon emissions, without a unified approach. |
| Lack of real-time data | Delays in data availability prevent timely decision-making and responsiveness to sustainability challenges or regulatory demands. | Manual data processes delay the availability of current data, such as real-time energy usage or emissions monitoring. |
| Poor data security | Increased risk of data breaches, which can lead to legal and reputational damage, especially if sensitive information is mishandled. | Manual handling of sensitive sustainability data, such as employee personal information or proprietary environmental impact data, without strict security protocols. |
| Limited audit trails | Without clear audit trails, it’s harder to verify data accuracy or identify errors, complicating compliance checks and historical data analysis. | Manual processes often lack comprehensive logs of data entries and changes, making it difficult to trace the origins of data or understand modifications. |
| Resource-intensive work | High resource use for data handling can lead to inefficiencies and increased costs, detracting from investment in actual sustainability initiatives. | Manual data collection and entry require significant human resources, diverting staff from other valuable activities. |
| Data validation | Inaccuracies might go unnoticed until advanced stages of reporting, potentially leading to flawed strategies or noncompliance issues. | Manual checking of data for errors or inconsistencies is prone to oversight, especially with large datasets. |

Table 4. An overview of how manual processes adversely affect sustainability reporting

 

Addressing these challenges involves strategic shifts:

Implementing automation: Automating the collection of data directly from source points, such as sensors for energy monitoring, can drastically reduce human error and accelerate the process.

Standardizing protocols: Implementing uniform data entry and updating protocols across all departments and regions ensures consistency.

Investing in advanced systems: Deploying modern data management platforms that support large volumes of data and real-time processing enhances the reliability and overall quality of sustainability reports.

Continuous training and audits: Regular training sessions for employees on the importance of data accuracy, paired with periodic audits, help maintain high standards of data quality and promote a culture focused on accuracy and attention to detail.
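One gap listed in Table 4, limited audit trails, is easy to illustrate. The sketch below is a hypothetical, in-memory example in which every change to a reported metric is logged with who made it, when, and why. The class and field names are invented for this example; a modern data platform would typically provide this kind of logging out of the box.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditedMetric:
    """A reported value that remembers every change made to it."""
    name: str
    value: float
    history: list = field(default_factory=list)

    def update(self, new_value, changed_by, reason):
        # Record the change before applying it so the old value is preserved.
        self.history.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "changed_by": changed_by,
            "old_value": self.value,
            "new_value": new_value,
            "reason": reason,
        })
        self.value = new_value


water_use = AuditedMetric(name="water_m3_plant_a_2024_01", value=100.0)
water_use.update(120.0, changed_by="analyst", reason="meter re-read")

for entry in water_use.history:
    print(entry["changed_by"], entry["old_value"], "->", entry["new_value"], entry["reason"])
```

With a log like this, a reviewer can trace a reported figure back to each correction, which a manually edited spreadsheet rarely allows.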

Figure 2. A sample workflow/process that Lingaro employs in data assessments for sustainability reporting


Manual work is the main challenge that most of the CPG companies we’ve worked with had to overcome before they could even start creating their reports. It’s also why our projects with these enterprises start with an assessment and road map for verifying and addressing gaps in their data management processes.

For instance, in 2024, we collaborated with one of the world’s leading food manufacturing companies to improve their sustainability reporting. Our initial step was to conduct a comprehensive assessment and develop a detailed road map to ensure that their sustainability reports were both extensive and compliant with regulatory standards. 

This assessment was an end-to-end program data evaluation aligned with the Data Management Body of Knowledge (DMBOK) framework. With our help in navigating their data jungle, they were able to pave smarter, faster, and better ways to manage data that also align with industry standards, best practices, and regulatory requirements.

In the second part of our series, we will explore crucial sources or types of data that companies overlook in their sustainability reporting, and what organizations can do to derive actionable insights from them.