Energy Regulation · Australia
May 2026 · 8 min read

Turning a Decade of AER Regulatory History into a Queryable Submission Intelligence Layer

Ausgrid · DNSP serving greater Sydney, NSW

10 yrs

of AER regulatory history indexed with full provenance

12

regulatory entity types extracted and cross-linked

~20%

estimated reduction in submission preparation effort

Ausgrid's regulatory affairs team faced a structural memory problem: two complete AER regulatory cycles, covering 10 years of objections, justifications, and accepted and rejected line items, were locked in thousands of PDF pages, with no systematic way to surface patterns or answer specific questions without days of manual search. Semantica ingested the full public corpus, built a temporal knowledge graph that tracks how positions evolved across cycles, and wired a hybrid retrieval layer that answers cross-cycle questions in under 3 seconds with exact source citations. Teams preparing the 2029–34 submission now start with a complete institutional memory rather than a blank page.

01

About

Ausgrid is a Distribution Network Service Provider (DNSP) serving greater Sydney and the Hunter Valley, with more than 1.7 million customers across one of Australia's most complex electricity distribution networks. As a regulated monopoly under the National Electricity Law, its allowed revenue is determined every five years by the Australian Energy Regulator (AER) through a formal, multi-stage process. Two complete cycles now exist in the public record: 2019–24 and 2024–29. The 2029–34 cycle is the next horizon. Each determination involves billions of dollars in capital expenditure and operational cost allowances, reviewed by a regulator that publishes its reasoning in forensic detail and references its own prior decisions explicitly.

02

The Problem

Each AER regulatory cycle produces thousands of pages across a structured document hierarchy: Regulatory Proposals, AER Issues Papers, AER Draft Determinations, Ausgrid Responses, Expert Witness Reports, Alternative Control Services determinations, and Final Determinations. Two complete cycles, 2019–24 and 2024–29, already exist in full. They contain every objection the AER has raised, every cost category Ausgrid has defended, every accepted and rejected justification, and every negotiated outcome. The AER references this history explicitly when assessing new proposals. Ausgrid's submission teams had no systematic way to do the same.

  • 18–24 months per regulatory cycle: the most time-intensive and highest-stakes recurring workflow in the business
  • Thousands of pages across two completed cycles: no single analyst has read them in full, and no tool has indexed them with cross-cycle linkage
  • 12 distinct regulatory entity types (including Revenue Allowances, Capex Programs, Opex Categories, RAB values, Community Outcomes, AER Objections, AG Responses, and Expert Witness positions) scattered across unlinked PDFs
  • Zero systematic institutional memory: when senior regulatory affairs staff turn over between cycles, critical negotiating context walks out the door
  • Pattern blindness: identifying which arguments the AER has consistently accepted or rejected requires weeks of manual comparison across thousands of pages
  • 2–4 hours of manual document search to answer a single factual question about a prior cycle determination: too slow for a live submission process

The AER does not forget. It holds institutional memory going back decades. Submission teams who start each 18–24 month cycle from scratch are preparing arguments against a regulator with perfect recall, without an intelligence layer of their own.

03

The Solution

Semantica ingested the complete public corpus of AER and Ausgrid regulatory documents across both cycles, parsing PDFs with structure-aware chunking that preserves the regulatory document hierarchy rather than breaking at arbitrary character limits. The output is a temporal knowledge graph in Neo4j where every extracted entity is linked to its source document, part, section heading, and page number, and where cross-cycle evolution is tracked through typed evolves_to edges. A hybrid retrieval layer combining Pinecone semantic search and Neo4j graph traversal answers plain-English questions with exact source citations in under 3 seconds. A two-stage LLM pipeline, in which step-back query decomposition is followed by grounded synthesis with 7 specialist regulatory agent tools, surfaces patterns that would otherwise require days of manual review.

System pipeline

  1. Ingest

    The full public document corpus across both cycles is pulled: AER Issues Papers, Draft Determinations, Final Determinations, Ausgrid Regulatory Proposals, all Responses, and Expert Witness submissions.

  2. Parse

    StructuralChunker splits each document at section boundaries, preserving the regulatory hierarchy (Regulatory Proposal, Part, Chapter, Section, Paragraph) so no context is broken by chunking.

  3. Extract

    NERExtractor identifies 12 entity types across every document: Revenue Allowances, Capex Programs, Opex Categories, RAB values, Community Outcomes, AER Objections, AG Responses, Expert Witness positions, and more.

  4. Annotate

    TemporalAnnotator adds typed evolves_to edges between matched entities across cycles, making it possible to ask how AER's position on a cost category changed from the 2019–24 cycle to the 2024–29 cycle.

  5. Index

    ProvenanceManager links every extracted entity to its source document identifier, part number, section heading, and page, so every answer generated by the system is citable by a human reviewer.

  6. Query

    A hybrid retrieval layer (Pinecone semantic search and Neo4j graph traversal) with a two-stage LLM pipeline answers plain-English questions with full source citations in under 3 seconds.
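
The case study does not say how the semantic and graph results are merged in the Query step. As a rough illustration only, the sketch below uses reciprocal rank fusion, a common way to combine two ranked lists. The function name, the chunk identifiers, and the constant k=60 are assumptions for this example, not part of Semantica, Pinecone, or Neo4j.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of chunk ids into one ranking.

    Each id scores 1 / (k + rank) per list it appears in, so ids that
    rank well in both the semantic and the graph results rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))

# Hypothetical results: one list from vector search, one from graph traversal.
semantic_hits = ["chunk-a", "chunk-b", "chunk-c"]
graph_hits = ["chunk-b", "chunk-d"]

fused = reciprocal_rank_fusion([semantic_hits, graph_hits])
print(fused)
```

Here chunk-b comes first because it is the only chunk returned by both retrievers. A production system could equally weight one retriever more heavily or rerank with an LLM.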

Example query

Query: How did AER's position on vegetation management costs evolve from 2019–24 to 2024–29?

// The evolves_to relationship type matches the edges written by TemporalAnnotator
MATCH (e1:Expense {category: "Vegetation Management", cycle: "2019-24"})
      -[:evolves_to]->(e2:Expense {category: "Vegetation Management", cycle: "2024-29"})
MATCH (e1)-[:AER_OBJECTION]->(o1:Objection)
MATCH (e2)-[:AER_OBJECTION]->(o2:Objection)
RETURN
  e1.allowance            AS allowance_2019_24,
  o1.reason               AS aer_objection_2019,
  o1.source_page          AS objection_page_2019,
  e2.allowance            AS allowance_2024_29,
  o2.reason               AS aer_objection_2024,
  o2.source_page          AS objection_page_2024,
  e2.allowance - e1.allowance AS delta_allowance

// Returns: position history, objection text, source page refs, and allowance delta
// Execution time:  < 3 seconds
// Previously:      2–4 hours of manual search across two cycles of documents

04

Semantica Modules

StructuralChunker

Splits regulatory PDFs at section boundaries rather than character limits, preserving the document hierarchy that gives AER determinations their meaning. A chunk never spans two sections.
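
To make the idea of section-boundary chunking concrete, here is a minimal sketch. It assumes headings look like a dotted section number followed by a title; the regex, the `chunk_by_section` function, and the sample text are all invented for illustration and are not the StructuralChunker API.

```python
import re
from dataclasses import dataclass

# Assumed heading shape: "4.2 Vegetation management". Real documents may
# need a stricter pattern, since any line starting with a number matches.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+(\S.*)$")

@dataclass
class Chunk:
    section: str  # dotted section number, "" for text before the first heading
    title: str
    text: str

def chunk_by_section(document: str) -> list[Chunk]:
    """Split text at heading lines so that no chunk spans two sections."""
    chunks: list[Chunk] = []
    section, title, body = "", "", []
    for line in document.splitlines():
        match = HEADING.match(line.strip())
        if match:
            # Close the section in progress before starting the next one.
            if body or section:
                chunks.append(Chunk(section, title, "\n".join(body).strip()))
            section, title, body = match.group(1), match.group(2), []
        else:
            body.append(line)
    if body or section:
        chunks.append(Chunk(section, title, "\n".join(body).strip()))
    return chunks

# Made-up sample text, not taken from any AER document.
SAMPLE = """Overview text before any heading.
4 Operating expenditure
Opex is assessed against a base-step-trend model.
4.2 Vegetation management
The proposed allowance reflects contract renewals.
"""

chunks = chunk_by_section(SAMPLE)
for chunk in chunks:
    print(repr(chunk.section), chunk.title)
```

Because each chunk carries its section number and title, a downstream citation can point at a section rather than an arbitrary character offset.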

NERExtractor

Identifies 12 regulatory entity types across every document in the corpus: Revenue Allowances, Capex Programs, Opex Categories, RAB values, Community Outcomes, AER Objections, AG Responses, Expert Witness positions, and more.

TemporalAnnotator

Adds typed evolves_to edges between matched entities across regulatory cycles, enabling cross-cycle traversal queries such as how AER's position on a specific cost category changed between the two determinations.
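
The matching rule that TemporalAnnotator uses is not described here. The sketch below shows one simple possibility, assuming entities are matched across cycles by type and a normalised name; the `Entity` class, `link_cycles` function, and sample entities are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    entity_type: str  # e.g. "Opex Category"
    name: str         # e.g. "Vegetation Management"
    cycle: str        # e.g. "2019-24"

def normalise(name: str) -> str:
    """Lower-case and collapse whitespace so trivial variants still match."""
    return " ".join(name.lower().split())

def link_cycles(entities, earlier, later):
    """Return (source, "evolves_to", target) triples between two cycles.

    Assumption: two entities are the same thing if they share a type and
    a normalised name. A real matcher would likely need fuzzier logic.
    """
    later_index = {
        (e.entity_type, normalise(e.name)): e
        for e in entities
        if e.cycle == later
    }
    edges = []
    for e in entities:
        if e.cycle != earlier:
            continue
        target = later_index.get((e.entity_type, normalise(e.name)))
        if target is not None:
            edges.append((e, "evolves_to", target))
    return edges

# Placeholder entities, not extracted from real determinations.
entities = [
    Entity("Opex Category", "Vegetation Management", "2019-24"),
    Entity("Opex Category", "vegetation  management", "2024-29"),
    Entity("Capex Program", "Placeholder Program A", "2019-24"),
]

edges = link_cycles(entities, "2019-24", "2024-29")
print(len(edges), edges[0][1])
```

Only the vegetation management category appears in both cycles, so it is the only entity that receives an evolves_to edge.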

ProvenanceManager

Links every extracted entity to its source document, part number, section heading, and page. Every answer is citable. No claim is generated without a verifiable reference.
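
One way to picture the "no claim without a reference" rule is a provenance record attached to each claim, with uncited claims filtered out before synthesis. The sketch below is a hypothetical data model built from the four fields named above; the class names, `citable` function, and sample values are assumptions, not the ProvenanceManager interface.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Provenance:
    document_id: str
    part: str
    section_heading: str
    page: int

    def citation(self) -> str:
        """Render the reference in a form a human reviewer can check."""
        return (
            f"{self.document_id}, Part {self.part}, "
            f"'{self.section_heading}', p. {self.page}"
        )

@dataclass(frozen=True)
class Claim:
    text: str
    provenance: Optional[Provenance]

def citable(claims):
    """Keep only claims that carry a source reference."""
    return [c for c in claims if c.provenance is not None]

# Placeholder values; no real document identifiers are used here.
source = Provenance("DOC-PLACEHOLDER", "B", "Operating expenditure", 12)
claims = [
    Claim("Placeholder finding", source),
    Claim("Unsupported statement", None),
]

kept = citable(claims)
print(kept[0].provenance.citation())
```

Dropping unsupported claims at this stage means the synthesis step only ever sees statements that can be traced back to a page.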

GraphBuilder

Assembles the temporal knowledge graph in Neo4j with full cross-cycle relationship traversal. Both the 2019–24 and 2024–29 cycles exist as connected layers in the same graph.

PolicyEngine

Evaluates the current regulatory proposal draft against AER compliance rules and prior accepted positions, flagging deviations before they reach the AER's formal review stage.

05

Results

| Task | Before | After Semantica |
| --- | --- | --- |
| Locate AER's position on a specific cost category | 2–4 hours of manual search across both cycle document sets | Under 3 seconds, with exact source citations and page numbers |
| Cross-cycle pattern and delta analysis | Days of manual comparison across thousands of pages by a senior analyst | Single graph traversal query returning allowance history, objection reasons, and delta |
| Onboarding a new regulatory analyst to the full cycle history | 6–12 months to build contextual understanding from scratch | Days: the full institutional memory is queryable and citable from day one |
| Submission team preparation for a new cycle | Rediscovering prior cycle outcomes from scratch each 18-month cycle | Build the 2029–34 proposal on top of indexed, queryable prior history |
| Surfacing recurring AER objection patterns | Not systematically possible, required weeks of manual review to approximate | Automatic: the temporal graph surfaces pattern recurrence across both cycles |
| Estimated regulatory preparation effort saving | Baseline: 18–24 month cycle with full archaeology of prior documents | ~20% reduction in submission preparation effort across the cycle |

06

What the system can now answer

Query examples

01 What did AER say about vegetation management costs in the 2019–24 Final Determination? Include the exact section and page.
02 Which Capex programs were accepted in 2024–29 that were rejected in 2019–24, and what changed in the justification?
03 How has AER's stated position on demand forecasting methodology evolved across both regulatory cycles?
04 List every instance where Ausgrid's expert witness position differed from AER's preferred position, and the outcome.
05 Which cost categories have attracted AER scrutiny in both cycles, ranked by frequency of objection?
06 Show every accepted Opex justification from 2024–29 that could serve as precedent in the 2029–34 proposal.
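
Question 05 is essentially a count-and-filter over objection records. The sketch below shows that logic in plain Python, assuming each objection has already been reduced to a (cost category, cycle) pair; the function name and the sample rows are placeholders, not real AER findings.

```python
from collections import defaultdict

def recurring_objections(objections):
    """Rank cost categories that drew objections in more than one cycle.

    `objections` is an iterable of (cost_category, cycle) pairs. The
    result is a list of (cost_category, objection_count) tuples, most
    frequently objected-to first.
    """
    counts = defaultdict(int)
    cycles = defaultdict(set)
    for category, cycle in objections:
        counts[category] += 1
        cycles[category].add(cycle)
    recurring = [c for c in counts if len(cycles[c]) >= 2]
    return sorted(
        ((c, counts[c]) for c in recurring),
        key=lambda item: (-item[1], item[0]),
    )

# Placeholder rows for illustration only.
rows = [
    ("Category A", "2019-24"),
    ("Category A", "2024-29"),
    ("Category A", "2024-29"),
    ("Category B", "2019-24"),
    ("Category C", "2019-24"),
    ("Category C", "2024-29"),
]

ranked = recurring_objections(rows)
print(ranked)
```

Category B is excluded because it drew an objection in only one cycle. In the graph itself the same result would come from an aggregation query rather than application code.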
07

Who It Helps

Regulatory Submission Teams

Build the 2029–34 proposal on an indexed foundation of prior cycle history. Stop rediscovering what AER has accepted and rejected, and start from a complete institutional picture.

Regulatory Strategy Leads

Identify which AER objection patterns recur across cycles and prepare counter-arguments in advance, informed by the full historical record of what the regulator has and has not accepted.

Expert Witness Consultants

Brief in minutes on AER's documented position history before preparing expert reports. No six-week document review just to establish the baseline before expert analysis begins.

New Regulatory Analysts

Access full institutional memory from day one, without depending on the one person who was in the room during the 2019–24 determination and remembers the negotiating context.

08

Conclusion

Ausgrid's regulatory submission process is one of the highest-stakes recurring workflows in Australian energy infrastructure. The team preparing the 2029–34 submission will include people who were not in the room when the 2019–24 objections were negotiated. Without a systematic intelligence layer, institutional knowledge depreciates with every staff change and accumulates nowhere. Semantica does not replace the regulatory affairs team. It gives them memory that does not walk out the door: a queryable, provenance-backed institutional record that makes every prior cycle finding accessible, citable, and usable in the preparation of the next. For any regulated DNSP approaching an AER cycle, the question is no longer whether that institutional memory matters. It is whether it will be there when you need it.

Temporal Knowledge Graph · Hybrid RAG · Policy Engine · Provenance · Regulatory AI · Neo4j · Pinecone
