Part 2: Read by robots, admired by … anyone?
What it means when your ESG report is simply machine-readable evidence

The most regular reader of a modern ESG report may no longer be a human being. It may be a large language model (LLM).
That is the shift organisations now need to understand. The long sustainability PDF is still being produced, approved, designed, uploaded and announced. But increasingly, its most consequential readership is not sitting with a coffee, reading the chair’s statement. It’s an LLM scanning, chunking, extracting, comparing and scoring.
Research in natural language processing (NLP), financial engineering and ESG extraction points clearly in this direction. As sustainability reports have become more common, standardised and dense, human readers have increasingly bypassed them. Institutional investors, analysts, ratings agencies and data platforms now use NLP, AI extraction tools and large language models to parse disclosures at scale.
Systems such as ESGReveal show what this new form of reading looks like. They are not simply searching PDFs for keywords. They combine ESG taxonomies, report preprocessing, vector databases, retrieval-augmented generation and LLM agents to convert unstructured narrative into structured data.
That is a very different reader from the one companies used to imagine.
These systems are typically built in three layers. The first layer is the metadata or taxonomy module: the “expert brain” that tells the AI what counts as a valid ESG metric, which standards matter, what units to expect and how to interpret context. The second is report preprocessing: cleaning and chunking awkward corporate PDFs, with their tables, columns, footnotes and inconsistent layouts. The third is the LLM and RAG layer, in which an agent retrieves relevant fragments and maps them into structured outputs.
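To make those three layers concrete, here is a minimal Python sketch. It is an illustration only, not ESGReveal's code: the taxonomy entries, the function names and the sample report are all invented, and simple keyword overlap plus a regular expression stand in for the vector database and the LLM agent a real system would use.

```python
import re

# Layer 1: a toy taxonomy (hypothetical entries). It tells the pipeline
# which metrics count and which unit to expect for each.
TAXONOMY = {
    "scope1_emissions": {"keywords": {"scope", "1", "emissions"}, "unit": "tCO2e"},
    "water_withdrawal": {"keywords": {"water", "withdrawal"}, "unit": "m3"},
}


# Layer 2: preprocessing. Real systems parse PDFs, tables and footnotes;
# this sketch simply splits plain text into sentence-sized chunks.
def chunk_report(text):
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def tokens(text):
    """Lower-cased alphanumeric tokens of a chunk."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


# Layer 3: retrieval and extraction. Keyword overlap stands in for vector
# search, and a regular expression stands in for the LLM agent.
def retrieve(chunks, keywords):
    return max(chunks, key=lambda chunk: len(tokens(chunk) & keywords))


def extract_value(chunk, unit):
    match = re.search(r"([\d,]+(?:\.\d+)?)\s*" + re.escape(unit), chunk)
    return float(match.group(1).replace(",", "")) if match else None


def run_pipeline(report_text):
    """Turn narrative text into one structured record per taxonomy metric."""
    chunks = chunk_report(report_text)
    results = {}
    for metric, spec in TAXONOMY.items():
        source = retrieve(chunks, spec["keywords"])
        results[metric] = {
            "value": extract_value(source, spec["unit"]),
            "unit": spec["unit"],
            "source": source,
        }
    return results


# Invented sample text, for illustration only.
SAMPLE_REPORT = (
    "Our people are our strength. "
    "Scope 1 emissions were 12,400 tCO2e in 2023. "
    "Water withdrawal fell to 3,100 m3."
)

if __name__ == "__main__":
    for metric, record in run_pipeline(SAMPLE_REPORT).items():
        print(metric, record["value"], record["unit"])
```

Even in this toy version, the sentence about people contributes nothing to the output. Only the chunks that carry a recognisable metric and unit survive as data.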
In simple terms, the machine is doing what the human reader no longer does. It is going deep into the report.
The performance is already strong enough to matter. The ESGReveal research reports accuracy of 76.9% for quantitative ESG data extraction and 83.7% for qualitative disclosure analysis. That is not perfect. Systems still struggle with messy tables, changing units, footnotes and inconsistent disclosure. But it is good enough to change the economics of attention. It is good enough for investors, analysts and researchers to interrogate corporate ESG claims at scale.
And interrogation is the point.
The deeper goal is not just efficiency. It is risk detection. Financial engineering literature is increasingly focused on the gap between narrative and numbers. If a CEO’s letter is full of confident sustainability language but extracted data shows flat or worsening emissions, the machine can flag the discrepancy. If the story and the metrics do not match, the organisation becomes a greenwashing risk.
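A narrative-versus-numbers check of this kind can be sketched in a few lines of Python. This is a deliberately crude illustration under stated assumptions: the word list, the thresholds and the function names are hypothetical, and production systems would use far richer language models and data than a word count and a two-point trend.

```python
import re

# Hypothetical lexicon of confident sustainability language (an assumption,
# not a published word list).
CONFIDENT_TERMS = {
    "leading", "committed", "transformative",
    "ambitious", "significant", "progress",
}


def tone_score(letter):
    """Share of words in the letter drawn from the confident lexicon."""
    words = re.findall(r"[a-z]+", letter.lower())
    if not words:
        return 0.0
    return sum(word in CONFIDENT_TERMS for word in words) / len(words)


def emissions_change(series):
    """Relative change from the first to the last reported year."""
    return (series[-1] - series[0]) / series[0]


def flag_discrepancy(letter, series, tone_threshold=0.05, change_threshold=0.0):
    """Flag when confident language sits alongside flat or rising emissions.

    Both thresholds are illustrative defaults, not calibrated values.
    """
    confident = tone_score(letter) >= tone_threshold
    not_improving = emissions_change(series) >= change_threshold
    return confident and not_improving


if __name__ == "__main__":
    letter = "We are committed to ambitious and transformative progress on climate."
    print(flag_discrepancy(letter, [100.0, 101.0, 103.0]))  # rising emissions
    print(flag_discrepancy(letter, [100.0, 90.0, 80.0]))    # falling emissions
```

The same letter is flagged in the first case and not in the second. The wording has not changed; the data behind it has.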
The newest frontier is multi-source verification. AI systems are not confined to the company’s own PDF. They can compare a sustainability claim against earnings call transcripts, regulatory databases, web sources and environmental enforcement records. The report is no longer read in isolation; it is cross-examined.
That creates a strategic communication challenge for any organisation that wants to be evaluated on more than its numbers.
The old approach was to publish a report and hope stakeholders would find the meaning. The new reality demands a two-pronged approach.
First, organisations must service the data hunger of the machines. ESG data has to be clear, structured, consistent, accessible and interrogable. Metrics should be traceable. Units should be stable. Methodologies should be explicit. Claims should connect visibly to evidence. If progress is real, it should not be buried inside narrative clutter, irregular tables or ambiguous language.
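One way to picture "traceable, stable and explicit" is as a structured record that travels with every metric. The Python sketch below is an assumption about what such a record might contain, not a reporting standard: the field names and the checks are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricRecord:
    """One disclosed data point plus the context a machine needs to trust it.

    All field names are hypothetical.
    """
    name: str
    year: int
    value: float
    unit: str
    methodology: str  # how the figure was calculated
    source: str       # where in the report the evidence sits


def check_records(records):
    """Return a list of plain-language problems found in the records."""
    problems = []
    first_units = {}
    for record in records:
        label = f"{record.name} {record.year}"
        if not record.methodology.strip():
            problems.append(f"{label}: missing methodology")
        if not record.source.strip():
            problems.append(f"{label}: missing source")
        # Units should stay stable across years for the same metric.
        expected = first_units.setdefault(record.name, record.unit)
        if record.unit != expected:
            problems.append(f"{label}: unit {record.unit} differs from {expected}")
    return problems


if __name__ == "__main__":
    # Invented sample records, for illustration only.
    records = [
        MetricRecord("scope1_emissions", 2022, 13.1, "ktCO2e", "GHG Protocol", "p. 41"),
        MetricRecord("scope1_emissions", 2023, 12400.0, "tCO2e", "", "p. 43"),
    ]
    for problem in check_records(records):
        print(problem)
```

The second record would trip two checks: the methodology is missing and the unit has changed from one year to the next, which is exactly the kind of inconsistency that makes extraction unreliable.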
Second, organisations must engage the human audience in a completely different way. The machine can extract the number. It cannot make people care.
If an organisation wants to be appreciated for its ambitions, efforts, struggles, breakthroughs and progress, it needs human storytelling. Not corporate varnish but real stories, specific stories, stories that show the work behind the claim.
That’s where the disciplines of film, television and social media become strategically important. The best storytellers know how to hold attention, create relevance, reveal tension and make complexity memorable. For ESG communication, that is not decoration. It is translation.
The future of ESG communication is therefore bifurcated: one layer built for interrogation by machines, another built for belief among people.
The organisations that understand this will be more legible to analysts and more meaningful to stakeholders. The insight is that the report is becoming the evidence base; it is not the engagement strategy.
Sources:
Feng Li, "Annual report readability, current earnings, and earnings persistence," Journal of Accounting and Economics (Elsevier), Volume 45, Issues 2–3, pp. 221-247, August 2008.
Dr. Jürgen Schanz et al., "Who actually reads annual reports? An empirical study of digital corporate reporting," WU Vienna (Vienna University of Economics and Business) & nexxar Digital Reporting Network, 2020/2021.
Financial Reporting Council (FRC) Financial Reporting Lab, "Digital Present: Current Trends in Digital Reporting and Corporate Portal Readership Frameworks," UK Governance & Academic Corporate Research Series, 2020.
Z. Zou et al., "ESGReveal: An LLM-based approach for extracting structured data from ESG reports," On arXiv / Computational Sustainability Ecosystem & Journal of Cleaner Production Frameworks, 2023.
PwC, "PwC Global Investor Survey: Interrogating the ESG Data Gap," PwC Global Data Insights / Academic Accounting Review Series, late 2023.
The Palladium Group, "The Green Communication Dilemma: Global Consumer Insights on Corporate Sustainability Reporting," Global Strategy & Sentiment Audits, 2023.
"ESGLens: An LLM-Based RAG Framework for Interactive ESG Report Analysis and Score Prediction," On arXiv, Cornell, 2026.












