AI search across regulator complaints and oversight data
A government oversight agency engaged us to deploy AI search across its published material and oversight data. The audience was journalists, the legal community, citizens, and the agency's own staff. Typical questions in production traffic include "How many complaints were upheld in 2024?", "What did this regulator say about issue X?", and "Show me the latest investigation summary on Y."
Background
Standard search across the agency's website returned long lists of partly relevant documents and forced users to read each one to find the answer they wanted. Generative AI answers over the same content were poor without grounding in retrieved context: hallucinated specifics, no citations, and no way for the user to trust or verify the result.
The brief was to deploy a citation-grounded AI search experience over the agency's own corpus: annual reports, media releases, investigation summaries, and the Power BI dashboards that publish complaints and oversight data. No data was to leave the agency's boundary.
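To make "citation-grounded" concrete, here is a minimal sketch of one check such a system might run before showing an answer: every bracketed citation marker must resolve to a passage that was actually retrieved. The `[n]` marker format, the function names, and the sample sources are assumptions for illustration, not the agency's pipeline or Onyx's API. The check confirms only that citations resolve, not that the cited passage supports the claim.

```python
import re

# Assumed citation format: bracketed integers such as [1] or [2].
CITATION = re.compile(r"\[(\d+)\]")

def check_citations(answer, sources):
    """Split the ids cited in `answer` into known and unknown source ids."""
    cited = {int(m) for m in CITATION.findall(answer)}
    known = sorted(cited & set(sources))
    unknown = sorted(cited - set(sources))
    return known, unknown

def is_grounded(answer, sources):
    """True only if the answer cites at least one source and none are unknown."""
    known, unknown = check_citations(answer, sources)
    return bool(known) and not unknown

# Hypothetical retrieved passages, keyed by citation id.
sources = {1: "Annual report 2023-24, p. 14", 2: "Media release, March 2024"}

print(is_grounded("The complaint was upheld [1].", sources))  # prints True
print(is_grounded("The complaint was upheld [3].", sources))  # prints False
print(is_grounded("The complaint was upheld.", sources))      # prints False
```

An answer that fails the check can be regenerated or withheld instead of being shown with a citation the user cannot follow.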
Solution
We deployed Onyx (the open-source enterprise RAG platform) and extended it with the connectors the agency's content stack required, most notably Power BI dashboard scraping for the structured oversight data. We also implemented our TableRAG pathway so that computational questions across the dashboards, such as complaint totals by category and year, get computed answers rather than retrieval guesses.
PDF extraction was tuned for the agency's annual-report layout, with cross-page table merging and handling for the structure of investigation summaries. The deployment runs on Fly.io and scales to zero between requests, with eight specialised Onyx workers split out from the standard monolithic worker to keep idle cost negligible.
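The difference between a computed answer and a retrieval guess can be sketched as follows, assuming the dashboard data has already been extracted into rows. The table name, columns, and figures are hypothetical, and the SQL is written by hand where a production pathway would derive it from the user's question; this illustrates the general idea, not the TableRAG implementation itself.

```python
import sqlite3

# Hypothetical rows standing in for data extracted from a dashboard:
# (year, category, outcome, total)
ROWS = [
    (2023, "Billing", "upheld", 40),
    (2023, "Billing", "not upheld", 60),
    (2024, "Billing", "upheld", 55),
    (2024, "Service", "upheld", 30),
    (2024, "Service", "not upheld", 45),
]

def load_table(rows):
    """Load the extracted rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE complaints "
        "(year INTEGER, category TEXT, outcome TEXT, total INTEGER)"
    )
    conn.executemany("INSERT INTO complaints VALUES (?, ?, ?, ?)", rows)
    return conn

def upheld_total(conn, year):
    """Compute the number of upheld complaints for a year by aggregation."""
    row = conn.execute(
        "SELECT COALESCE(SUM(total), 0) FROM complaints "
        "WHERE year = ? AND outcome = 'upheld'",
        (year,),
    ).fetchone()
    return row[0]

conn = load_table(ROWS)
print(upheld_total(conn, 2024))  # prints 85
```

Because the figure comes from an aggregation over the rows, it stays correct even when no single document states the total.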
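Cross-page table merging can be sketched with a simple heuristic: a table fragment that starts on the page immediately after the previous fragment ends, and repeats the same header, is treated as a continuation. The `TableFragment` and `MergedTable` types, the heuristic, and the sample data are assumptions for illustration; the tuning applied in this deployment is not described here and may differ.

```python
from dataclasses import dataclass

@dataclass
class TableFragment:
    page: int        # page the fragment was extracted from
    header: tuple    # column names as extracted
    rows: list       # data rows, each a list of cell strings

@dataclass
class MergedTable:
    first_page: int
    last_page: int
    header: tuple
    rows: list

def merge_cross_page(fragments):
    """Merge fragments that continue the previous table onto the next page."""
    merged = []
    for frag in sorted(fragments, key=lambda f: f.page):
        prev = merged[-1] if merged else None
        continues = (
            prev is not None
            and frag.page == prev.last_page + 1   # directly follows
            and frag.header == prev.header        # same columns repeated
        )
        if continues:
            prev.rows.extend(frag.rows)
            prev.last_page = frag.page
        else:
            merged.append(
                MergedTable(frag.page, frag.page, frag.header, list(frag.rows))
            )
    return merged

# Hypothetical fragments: one table split over pages 12-13, another on page 20.
fragments = [
    TableFragment(12, ("Category", "Received", "Upheld"), [["Billing", "100", "40"]]),
    TableFragment(13, ("Category", "Received", "Upheld"), [["Service", "75", "30"]]),
    TableFragment(20, ("Year", "Investigations"), [["2024", "9"]]),
]
tables = merge_cross_page(fragments)
print(len(tables), tables[0].first_page, tables[0].last_page)  # prints 2 12 13
```

A real layout would likely need further rules, for example continuation pages that omit the header row, which this sketch does not handle.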
Outcome
End users now ask natural-language questions and get cited, defensible answers drawn from the agency's own published material, including computed answers across the Power BI data, which previously couldn't be queried this way at all. The deployment remains in production under the agency's data-sovereignty controls; no data leaves the agency's boundary.
The client is anonymised in this case study at their request. We're happy to discuss the engagement in detail under NDA.