What if your sustainability team could finally think strategically?

The premise

Sustainability teams are drowning in operations

The people who should be designing a company's transition to a low-carbon economy are instead buried in spreadsheets, chasing supplier data, and reconciling emission factors at 11pm.

CSRD. SEC Climate Disclosure. ISSB. GRI. CDP. The regulatory surface area keeps expanding, and headcount stays flat. Every new framework means more data to collect, more narratives to write, more edge cases to interpret. Most sustainability teams are running a Fortune 500 programme with the tooling of a startup — spreadsheets, email chains, and heroic individual effort.

We think there's a different model. Not AI as a feature bolted on to existing workflows. AI as the foundation — an operating environment where specialised agents handle the operational drudgery, and the sustainability team focuses on what only humans can do: setting strategy, driving reductions, and telling the company's sustainability story.

The feel

Not a dashboard.
An operating system.

The core insight from projects like Mercury OS isn't aesthetic — it's structural. When your team's primary collaborators are AI agents working across dozens of domains simultaneously, the traditional SaaS model breaks down. You don't need a sidebar with eight sections. You need an adaptive environment that surfaces the right context at the right moment, scales from oversight to detail fluidly, and treats agent output as a first-class interaction pattern.

An operating system built for a new kind of collaboration — where your team works alongside agents and the interface adapts to how they collaborate.

This means rethinking the basics. Navigation becomes intent-driven, not menu-driven. Status is ambient, not buried in tabs. Review and approval are native interactions, not afterthoughts. The whole thing is designed around a single question: what does a sustainability team need when they're directing a fleet of autonomous specialists?

⬡ Concept 01

Agent oversight

"Checking in on the class." — Inspired by the organisational model in Gas Town: your agents have been working in the background. Here's what they did, how confident they are, and what needs the team's attention.

The platform maintains a team of specialised agents — a data ingestion agent, a calculation engine, an anomaly detector, a regulation monitor, a report drafter, a stakeholder researcher. Each has a presence the team can inspect. The interface adapts to show the right level of detail: a high-level stream when scanning, full context when drilling in.

Agent activity — today

Live

⊕

Data Ingestion Agent

Processed Q4 activity data from 3 new suppliers. Normalised units, staged for calculation.

6:42 AM · auto

Ingested 2,847 line items across Fujian Packaging Co., Rhine Logistics GmbH, and Cargill Oils APAC. Converted all energy data to kWh (was reported in GJ by Rhine). Flagged 12 rows with missing facility codes — used best-match from prior quarter. All data staged for the Calculation Engine.

◈

Calculation Engine

Completed Scope 3 Category 1 calculations for Q4 batch. 94% coverage achieved.

7:03 AM · auto

Applied DEFRA 2025 emission factors for upstream goods. Used supplier-specific emission factors where available (18 of 23 suppliers). The remaining 5 used average-data factors. Total Category 1: 148,200 tCO₂e ± 3.2%.

△

Anomaly Detector

Flagged unusual spike in Scope 3 Cat. 4 — upstream transportation up 47% QoQ in APAC.

7:08 AM · flagged for review

Category 4 (Upstream Transportation) for Asia-Pacific shows 47% increase vs Q3. Root cause analysis suggests temporary air freight substitution during October port congestion. Cross-referenced with logistics provider invoices — pattern is consistent. Recommend: accept data, note one-time event in report narrative. Confidence: 89%.

◎

Regulation Monitor

No new regulatory changes detected. CSRD delegated acts stable. Next check: 6 hours.

5:00 AM · auto

Scanned EFRAG, SEC EDGAR, ISSB, and CDP sources. No material updates since last check. Current watch items: pending CSRD sector-specific standards for consumer goods (expected Q2 2026), SEC climate disclosure stay order status. All reporting obligations unchanged.

Any agent's entry can be opened to drill into its work. The key idea: the team maintains oversight without micromanagement. The agents do the work. The team checks in when it suits them.
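To make the feed above a little more concrete, here is a minimal sketch of two of the steps it describes: normalising energy reported in GJ to kWh, and flagging a category whose quarter-over-quarter change looks unusual. It is an illustration under stated assumptions, not the platform's implementation. The function names, the `Flag` record, and the 25% review threshold are all hypothetical; the Q3 and Q4 figures are placeholders chosen only to reproduce the 47% increase in the example. The one fixed fact is the conversion itself: 1 kWh is 3.6 MJ, so 1 GJ is roughly 277.78 kWh.

```python
from dataclasses import dataclass

# 1 GJ = 1,000,000 kJ and 1 kWh = 3,600 kJ.
KWH_PER_GJ = 1_000_000 / 3_600


def gj_to_kwh(gigajoules: float) -> float:
    """Convert an energy quantity from gigajoules to kilowatt-hours."""
    return gigajoules * KWH_PER_GJ


def qoq_change(previous: float, current: float) -> float:
    """Fractional quarter-over-quarter change, e.g. 0.47 for +47%."""
    if previous == 0:
        raise ValueError("previous quarter must be non-zero")
    return (current - previous) / previous


@dataclass
class Flag:
    category: str
    change: float       # fractional change versus the prior quarter
    needs_review: bool  # True when the change exceeds the threshold


def check_category(category: str, previous: float, current: float,
                   threshold: float = 0.25) -> Flag:
    """Flag a category when its absolute QoQ change exceeds `threshold`.

    The 25% default is an assumed value for illustration only.
    """
    change = qoq_change(previous, current)
    return Flag(category, change, abs(change) > threshold)


# Placeholder emissions totals (tCO2e) that yield a 47% increase.
flag = check_category("Scope 3 Cat. 4 (APAC)", previous=1_000.0, current=1_470.0)
print(f"{flag.category}: {flag.change:+.0%}, needs review: {flag.needs_review}")
```

A fixed threshold is the simplest possible rule. A real detector would more plausibly compare against seasonal history and then cross-reference supporting evidence, as the feed entry describes with logistics invoices.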

⬡ Concept 02

Review queues

"Insights that need a human." — Agents work autonomously, but they know what they don't know. When they hit something that needs judgement, they surface it — with full context and a recommendation.

These aren't buried in a queue you have to remember to check. The OS surfaces them based on urgency and relevance. Each one tells you: what happened, why it matters, what the agent thinks, and how confident it is. You approve, reject, or ask for more context. The interface is built for this loop — it's the core interaction between you and your agents.

Anomaly · Confidence: 89%

Scope 3 Cat. 4 spike — APAC upstream transportation

Upstream transportation emissions increased 47% quarter-over-quarter in the Asia-Pacific region. The anomaly detector traced this to a temporary shift from sea to air freight during October port congestion. Agent recommends accepting data as accurate and noting the one-time event in the CSRD report narrative.

Missing data · Confidence: 72%

3 key suppliers haven't submitted Q4 activity data

Yangtze Chemical Corp., Nordic Freight AS, and Braskem SA have not responded to automated data requests. Agent has sent two follow-ups. Recommends using spend-based estimates until actuals arrive — estimated accuracy within ±15% based on historical patterns.

Ready for review · Confidence: 94%

Draft section: Climate-related risks & opportunities

The report drafter has completed the TCFD-aligned risks and opportunities section for the CSRD report. Incorporates latest scenario analysis, physical risk assessment, and transition risk mapping. 4,200 words with 23 citations to source data.
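One way to picture the review loop is as a small data structure plus a decision step. The sketch below is hypothetical: `ReviewItem`, `Decision`, `triage`, and `decide` are illustrative names, and ordering by lowest confidence first is an assumed stand-in for the "urgency and relevance" ranking, which this document does not specify. The three items and their confidence values are taken from the cards above; the recommendation text for the draft section is a placeholder.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    MORE_CONTEXT = "more_context"


@dataclass
class ReviewItem:
    kind: str            # e.g. "Anomaly" or "Missing data"
    title: str           # what happened
    recommendation: str  # what the agent thinks should be done
    confidence: float    # agent's stated confidence, between 0 and 1
    decision: Decision = Decision.PENDING


def triage(items: list[ReviewItem]) -> list[ReviewItem]:
    """Return undecided items, least confident first (assumed heuristic)."""
    pending = [item for item in items if item.decision is Decision.PENDING]
    return sorted(pending, key=lambda item: item.confidence)


def decide(item: ReviewItem, decision: Decision) -> ReviewItem:
    """Record a human decision: approve, reject, or ask for more context."""
    if decision is Decision.PENDING:
        raise ValueError("a decision must be approve, reject, or more context")
    item.decision = decision
    return item


queue = [
    ReviewItem("Anomaly",
               "Scope 3 Cat. 4 spike, APAC upstream transportation",
               "Accept data and note the one-time event in the report",
               0.89),
    ReviewItem("Missing data",
               "3 key suppliers haven't submitted Q4 activity data",
               "Use spend-based estimates until actuals arrive",
               0.72),
    ReviewItem("Ready for review",
               "Draft section: Climate-related risks & opportunities",
               "Draft ready for human review",  # placeholder text
               0.94),
]

for item in triage(queue):
    print(f"{item.confidence:.0%}  {item.kind}: {item.title}")
```

Calling `decide(item, Decision.APPROVED)` removes that item from the next `triage` result, which mirrors the approve, reject, or ask-for-context loop described above.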

⬡ Concept 03

Simulation & backtesting

"The trading floor for reductions." — Inspired by Composer.trade. Build reduction strategies, model their impact, compare scenarios, then deploy the best one and let agents monitor the results.

What if you switched your APAC shipping from air to sea freight? What's the emissions impact? What does it cost? How does it align with the IMO's decarbonisation trajectory? What about grid greening assumptions — does the maths change if Vietnam hits its renewable targets?

The team describes the initiative. The system models it. They compare, adjust, and commit. Then agents go collect real data to confirm or disprove the hypothesis. A continuous loop between strategy and evidence.

Scenario: APAC air → sea freight transition

With sea freight transition

Current trajectory

IMO 2050 target

Annual reduction

-12,400 tCO₂e

Net cost impact

-$2.1M /yr

Lead time impact

+8–14 days

The magic is the feedback loop. The team commits to the experiment, and agents start monitoring actual shipping data against the model. If reality diverges from projection, they'll know — and they'll know why.
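The feedback loop can be sketched as a comparison between what the model projected and what agents later observe. This is an illustration only: the `Scenario` class, the linear month-by-month accrual, the 10% tolerance, and the observed figure are all assumptions made for the example. Only the 12,400 tCO₂e annual reduction comes from the scenario above.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    annual_reduction_tco2e: float  # projected reduction per year

    def expected_to_date(self, months_elapsed: int) -> float:
        """Projected reduction so far, assuming it accrues evenly by month."""
        if months_elapsed <= 0:
            raise ValueError("months_elapsed must be positive")
        return self.annual_reduction_tco2e * months_elapsed / 12


def divergence(scenario: Scenario, observed_tco2e: float,
               months_elapsed: int) -> float:
    """Fractional gap between observed and projected reduction.

    Negative values mean reality is behind the projection.
    """
    expected = scenario.expected_to_date(months_elapsed)
    return (observed_tco2e - expected) / expected


def is_off_track(scenario: Scenario, observed_tco2e: float,
                 months_elapsed: int, tolerance: float = 0.10) -> bool:
    """True when the gap exceeds `tolerance` (an assumed 10% default)."""
    return abs(divergence(scenario, observed_tco2e, months_elapsed)) > tolerance


sea_freight = Scenario("APAC air to sea freight transition", 12_400.0)

# Hypothetical observation: 2,500 tCO2e avoided after three months.
gap = divergence(sea_freight, observed_tco2e=2_500.0, months_elapsed=3)
print(f"Gap vs projection: {gap:+.1%}, "
      f"off track: {is_off_track(sea_freight, 2_500.0, 3)}")
```

Knowing why reality diverged is the harder half of the loop. That would need the agents to attribute the gap to specific inputs, such as shipping volumes or mode mix, which this sketch does not attempt.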

⬡ Concept 04

Qualitative research

"The automated researcher." — Inspired by Listen Labs. CSRD double materiality assessments require qualitative input from across the business. What if an agent could gather it for you?

The research agent gets dispatched with context — "I need input from regional facility managers on physical climate risks for our CSRD double materiality assessment." It crafts contextual questions, sends them asynchronously, follows up on vague answers, and synthesises everything into themes with citations. Every claim traces back to a source.

Dispatched

Questions sent to 12 regional facility managers

12

stakeholders contacted

Collecting

Responses received and follow-ups sent

8 / 12

responses collected

Synthesising

Key themes identified from responses

3

themes extracted

Gaps

Follow-up questions queued

2

follow-ups pending

Draft ready

Report section with full provenance

in progress

What used to take weeks of scheduling, chasing, and manual synthesis becomes a process you dispatch and check in on. The output isn't just a summary — it's report-ready prose with citations back to every interview and document.

⬡ Concept 05

The always-current report

Reporting isn't a separate activity. It's an emergent property of the system. Because agents are continuously ingesting, calculating, and gathering, the report is always up to date.

CSRD Annual Report 2025 · 87%

Governance & strategy · Complete

Scope 1 & 2 emissions · Complete

Scope 3 value chain · Complete

Climate risks & opportunities · In review

Double materiality assessment · In review

Targets & transition plan · Complete

Scope 3 methodology notes · Pending

When a regulation changes, the monitor flags it. The drafter updates the affected sections. The team reviews. The operating system keeps the whole pipeline connected — agents feed into the report continuously, and the team curates and approves.
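The pipeline in the paragraph above can be pictured as a status map that a regulatory change pushes back into review. The sketch below is hypothetical: the `Status` enum, the function names, and in particular the mapping from a regulatory topic to affected sections are assumptions made for illustration. The section names and their current statuses are taken from the report view above.

```python
from enum import Enum


class Status(Enum):
    COMPLETE = "Complete"
    IN_REVIEW = "In review"
    PENDING = "Pending"


report = {
    "Governance & strategy": Status.COMPLETE,
    "Scope 1 & 2 emissions": Status.COMPLETE,
    "Scope 3 value chain": Status.COMPLETE,
    "Climate risks & opportunities": Status.IN_REVIEW,
    "Double materiality assessment": Status.IN_REVIEW,
    "Targets & transition plan": Status.COMPLETE,
    "Scope 3 methodology notes": Status.PENDING,
}

# Hypothetical mapping from a regulatory topic to the sections it touches.
AFFECTED_BY = {
    "CSRD sector-specific standards": [
        "Scope 3 value chain",
        "Targets & transition plan",
    ],
}


def apply_regulation_change(report: dict[str, Status], topic: str,
                            affected_by: dict[str, list[str]]) -> dict[str, Status]:
    """Return a copy of the report with affected sections back in review.

    Models the flow: monitor flags a change, drafter updates the
    affected sections, and the team reviews them again.
    """
    updated = dict(report)
    for section in affected_by.get(topic, []):
        if section in updated:
            updated[section] = Status.IN_REVIEW
    return updated


def awaiting_review(report: dict[str, Status]) -> list[str]:
    """Names of the sections the team still needs to review."""
    return [name for name, status in report.items() if status is Status.IN_REVIEW]


print(awaiting_review(report))
updated = apply_regulation_change(report, "CSRD sector-specific standards", AFFECTED_BY)
print(awaiting_review(updated))
```

Returning a new dictionary rather than mutating the report keeps the prior state available, which is one simple way to preserve an audit trail of what changed and when.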

Putting it together

Sarah's Monday morning

Sarah Chen is VP of Sustainability at Meridian Consumer Goods — a Fortune 500 CPG company with global operations and upcoming CSRD obligations. Here's what her Monday looks like when her team is backed by an AI-first platform.

7:45 AM — Overnight activity

Sarah opens her laptop. The platform shows her what her agent team did overnight: data from 3 new suppliers was ingested and calculated, Scope 3 Category 1 is now 94% complete, and one anomaly was flagged. Everything she needs to know, surfaced by priority.

7:50 AM — The anomaly

She taps the flag. APAC upstream transportation is up 47%. The agent already investigated — it was October port congestion causing a temporary air freight spike. It recommends accepting the data and noting the one-time event. Sarah approves in two taps. Three minutes, done.

8:00 AM — A strategic question

The air freight spike sparks an idea: what if Meridian permanently shifted APAC shipping to sea freight? She opens the simulation space, builds the scenario, and sees the projection: a 12,400 tCO₂e annual reduction, $2.1M in net annual savings, and 8–14 days of added lead time. She overlays the IMO trajectory and saves it for Thursday's board presentation.

8:15 AM — Research check-in

She checks on the research agent gathering input from facility managers for the double materiality assessment. Eight of twelve responses are in. Three themes synthesised. Two follow-up questions queued. She approves the follow-ups and moves on.

8:25 AM — Report glance

The CSRD report is at 87%. Two sections need review. She flags them for her analyst to pick up after lunch. By 8:30, she's done with operational triage and is preparing for her board presentation on science-based targets, the strategic work that actually drives reductions.

Forty-five minutes. That's the operational overhead before the sustainability team's day becomes entirely strategic.

What this means

Not replacing teams. Empowering them.

None of the components here are science fiction. Automated data ingestion, intelligent anomaly detection, simulation-based strategy, AI-conducted research, always-current reporting — all within reach of current technology.

The hard part isn't capability. It's design. It's imagining the right operating system — the right interaction model between a sustainability team and a fleet of agents. An interface built for oversight at scale. Strategic where others are operational. Adaptive where others are rigid.

This is a first sketch of that imagination.

Watershed · Vision · 2026