ATO’s 800-Developer AI Coding Assistant Pilot: COBOL Translation, Privacy Demands, and the Vendor Battle

The Australian Taxation Office is preparing to issue a request for tender for an enterprise-grade AI coding assistant that will serve roughly 800 in-house developers—a move that could set a precedent for how government agencies adopt generative AI in highly regulated, legacy-heavy environments. The tender, first reported by iTnews, seeks a Software-as-a-Service solution deeply integrated with Microsoft’s Visual Studio and Azure DevOps ecosystems, with explicit capabilities for COBOL-to-modern-language translation and a non-negotiable requirement that processed code never be used to train the vendor’s models.

The procurement signals a strategic shift: the ATO wants to free its engineering workforce from “trivial tasks” like boilerplate code, routine refactoring, and test script generation, redirecting talent toward application security, test-case planning, and legacy system maintenance. But the request also opens a window into the practical challenges of injecting AI into a sprawling, multi-technology IT estate that still includes mission-critical COBOL assets.

What the ATO’s Tender Asks For

The statement of requirements, detailed in the tender documents, envisions a coding assistant that becomes a natural part of a developer’s daily workflow. Highlights include:

Real-time code suggestions and completions inside Visual Studio 2019, Visual Studio 2022, and Visual Studio Code.
Bug detection and automated fix proposals, with the ability to refactor across a multi-technology stack.
Automated generation of unit and integration test cases and scripts.
Tight integration with Azure DevOps pipelines and Git repositories, turning AI into a participant in pull-request reviews and CI feedback loops.
Legacy language translation, moving COBOL code into modern languages while preserving business logic and integration patterns.
Strict data handling rules: all code processed by the assistant must not be stored or used for model training, and logging/telemetry must comply with government privacy and security standards.

These aren’t pie-in-the-sky requests. They mirror the enterprise features that major vendors like Microsoft, Amazon Web Services, and IBM have been rolling out for their coding assistants. The twist is the combination of a public-sector accountability framework, the need to work within Australia’s regulatory environment, and the sheer complexity of the ATO’s codebase—spanning multiple repositories, languages, and decades-old mainframe applications.

Why 800 Developers, and Why Now?

The ATO is not starting from zero. An Australian National Audit Office (ANAO) performance audit released last year showed the agency already operates dozens of AI models and is experimenting with large multimodal models to understand taxpayer documents—images and all. The coding assistant fits into a broader enterprise AI strategy that has been building governance structures but, as the ANAO noted, still needs to mature formal monitoring and deployment controls.

For an organization of this scale, deploying an AI assistant isn’t a simple IDE plugin install. It’s an enterprise integration project touching code quality, security, procurement, and developer upskilling. The 800 core developers sit at the heart of tax administration—if productivity gains materialize, the public benefit could be significant; if things go wrong, the blast radius is wide.

The Legacy Code Elephant: COBOL Translation Is Not a Magic Wand

The tender’s most eyebrow-raising requirement is the ability to translate legacy COBOL into modern languages. It’s a noble goal, but practitioners know automatic translation is fraught with peril. Community discussions highlight that generic LLM-based assistants can churn out syntactically correct Java or C# but often miss crucial mainframe integration nuances: database access patterns, CICS/TPX interactions, error handling, and stateful services.

Comparisons between generalist assistants and purpose-built tools, such as IBM’s watsonx Code Assistant for Z and specialist mainframe migration products, suggest that purpose-built converters do better when full business-logic fidelity is required. The ATO’s use of “translate” in its tender may be aspirational; savvy observers recommend treating the capability as “assisted modernization”: a human-in-the-loop process where AI scaffolding is validated by subject matter experts and tested via automated equivalence checks.
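To make the idea of an automated equivalence check concrete, here is a minimal Python sketch. It is an illustration only, not anything described in the tender: the levy rule, the function names (`legacy_reference`, `translated_candidate`, `find_mismatches`), and the test inputs are all hypothetical. It assumes the legacy routine truncates to whole cents while the translated version rounds half up, the kind of small numeric divergence a syntactically correct translation can introduce.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP
from typing import Callable, Iterable, List, Tuple

CENT = Decimal("0.01")
RATE = Decimal("0.02")  # hypothetical 2% levy, for illustration only


def legacy_reference(amount: Decimal) -> Decimal:
    """Stand-in for recorded legacy output; assumes truncation to whole cents."""
    return (amount * RATE).quantize(CENT, rounding=ROUND_DOWN)


def translated_candidate(amount: Decimal) -> Decimal:
    """Hypothetical translated routine that rounds half up instead of truncating."""
    return (amount * RATE).quantize(CENT, rounding=ROUND_HALF_UP)


def find_mismatches(
    reference: Callable[[Decimal], Decimal],
    candidate: Callable[[Decimal], Decimal],
    inputs: Iterable[Decimal],
) -> List[Tuple[Decimal, Decimal, Decimal]]:
    """Run both implementations on the same inputs and collect any differences."""
    found = []
    for value in inputs:
        expected = reference(value)
        actual = candidate(value)
        if expected != actual:
            found.append((value, expected, actual))
    return found


cases = [Decimal("100.00"), Decimal("100.25"), Decimal("99.75"), Decimal("0.00")]
mismatches = find_mismatches(legacy_reference, translated_candidate, cases)
for value, expected, actual in mismatches:
    print(f"input={value} legacy={expected} translated={actual}")
```

In practice the reference side would more plausibly be outputs captured from the running mainframe job rather than a re-implementation, and the inputs would be chosen with subject matter experts to cover boundary cases.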

Privacy and the No-Training Mandate

The requirement that processed code never be used to train the vendor’s models is a line in the sand. Government code can contain operational secrets, patterns that reveal system architecture, or even embedded personally identifiable information (PII). Public-sector procurement globally is converging on the same demand: vendor models must not ingest customer data.

How can that be guaranteed? The tender will force bidders to demonstrate technical controls (ephemeral compute, no persistent storage of code snippets, audited logging) and back them with binding contractual commitments. The ATO’s own AI governance maturity will be tested here: it must verify promises through audits and penetration testing, not just trust a vendor’s marketing sheet.

Integration with Microsoft’s Ecosystem: Blessing or Lock-In?

The ATO explicitly names Visual Studio 2019/2022, VS Code, Azure DevOps, and Git repositories. That plays directly into Microsoft’s Copilot/IntelliCode stack, but also leaves room for others that support these tools (AWS CodeWhisperer, for instance, has a Visual Studio extension). The deep integration can accelerate time-to-value, but it also raises vendor lock-in concerns. If the ATO builds custom connectors, fine-tunes prompts, and trains developers on a single vendor’s assistant, switching costs could be high.

The tender process will need to weigh portability and interoperability. A cloud-agnostic, self-hosted solution might offer more flexibility, but it would require additional engineering effort to integrate with the ATO’s existing CI/CD pipelines. The evaluation panel will have to decide whether convenience today is worth potential friction tomorrow.

Community and Industry Perspectives

In online discussions among Australian IT professionals, the ATO’s move is seen as both brave and risky. Several points emerged:

Productivity expectations must be grounded. Private-sector studies show AI assistants can boost developer output, but the gains vary wildly. In a compliance-heavy environment, the first few months may see more overhead as teams learn where the assistant excels and where it hallucinates.
Skills evolution is unavoidable. Developers won’t be replaced, but their day-to-day work will shift. Senior staff may spend more time on architecture and security reviews; junior staff will need training on prompt engineering and critical validation of AI outputs.
Governance can’t be an afterthought. The ANAO already flagged the ATO’s need for stronger AI monitoring. Adding a tool that can inject code into repositories demands new approval gates, audit trails linking AI suggestions to PRs, and continuous oversight to catch drift in suggestion quality or security risks.
The vendor landscape is complex. Microsoft’s Copilot is the default fit, but IBM’s mainframe-specialized tools might win the COBOL use case. AWS CodeWhisperer emphasizes security scanning. Niche players offer on-premises deployment for strict data sovereignty—which might be essential for parts of the ATO’s codebase.
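The approval gates and audit trails raised in the governance point could look something like the following Python sketch. Everything here is an assumption made for illustration: the `PullRequest` and `AiSuggestionRecord` types, the field names, and the one-approval threshold are hypothetical and do not reflect the ATO’s pipelines or any Azure DevOps API.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class AiSuggestionRecord:
    """Audit entry linking one accepted AI suggestion to a pull request."""
    pr_id: int
    file_path: str
    suggestion_sha256: str  # hash only, so the audit log does not duplicate source code
    accepted_by: str


@dataclass
class PullRequest:
    pr_id: int
    human_approvals: int
    static_analysis_passed: bool
    security_scan_passed: bool
    ai_records: List[AiSuggestionRecord] = field(default_factory=list)


def record_suggestion(pr: PullRequest, file_path: str,
                      suggestion_text: str, accepted_by: str) -> AiSuggestionRecord:
    """Attach an audit record for an accepted suggestion to the pull request."""
    digest = hashlib.sha256(suggestion_text.encode("utf-8")).hexdigest()
    record = AiSuggestionRecord(pr.pr_id, file_path, digest, accepted_by)
    pr.ai_records.append(record)
    return record


def merge_blockers(pr: PullRequest) -> List[str]:
    """Return the reasons a pull request may not merge; an empty list means it may."""
    blockers = []
    if pr.human_approvals < 1:
        blockers.append("needs at least one human approval")
    if not pr.static_analysis_passed:
        blockers.append("static analysis has not passed")
    if not pr.security_scan_passed:
        blockers.append("security scan has not passed")
    return blockers


pr = PullRequest(pr_id=101, human_approvals=0,
                 static_analysis_passed=True, security_scan_passed=False)
record_suggestion(pr, "src/levy.py", "return amount * rate", accepted_by="dev.a")
print(merge_blockers(pr))
```

Storing a hash rather than the suggestion text is one possible design choice: it lets an auditor confirm which suggestion was accepted without the audit log becoming a second copy of the codebase.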

How the Pilot Could Succeed—and How It Could Fail

Success will hinge on execution, not just tool choice. Practical recommendations drawn from forum analysis and the ANAO report include:

Start small. Begin with a subset of teams, non-production repositories, and synthetic stand-ins for sensitive data. Measure cycle time, defect rates, and developer satisfaction before extending.
Enforce CI/CD guardrails. AI can create pull requests, but nothing merges without human review, static analysis, and security scanning. Treat the assistant as a contributor, not a committer.
Treat legacy translation as an assisted workflow. Combine AI-generated scaffolding with SME review and automated equivalence tests. Set realistic timelines for COBOL modernization.
Demand verifiable privacy. Require private-model hosting or on-premises proxies, and audit the vendor’s no-training commitment with penetration testing.
Build a performance dashboard. Track metrics like PR cycle time, AI suggestion acceptance rate, regression test pass rates, and production incidents linked to AI-influenced code.
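As a rough illustration of the dashboard recommendation, the Python sketch below computes three of the suggested metrics from per-PR records. The `PrSample` type, its fields, and the sample figures are invented for the example; a real dashboard would draw on data exported from the agency’s own tooling.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Dict, List


@dataclass
class PrSample:
    """One merged pull request, with hypothetical fields for illustration."""
    opened: datetime
    merged: datetime
    suggestions_shown: int
    suggestions_accepted: int
    regression_tests_passed: bool


def summarize(samples: List[PrSample]) -> Dict[str, float]:
    """Compute median PR cycle time, suggestion acceptance rate, and regression pass rate."""
    if not samples:
        raise ValueError("at least one sample is required")
    cycle_hours = [(s.merged - s.opened).total_seconds() / 3600 for s in samples]
    shown = sum(s.suggestions_shown for s in samples)
    accepted = sum(s.suggestions_accepted for s in samples)
    passed = sum(1 for s in samples if s.regression_tests_passed)
    return {
        "median_cycle_hours": median(cycle_hours),
        "acceptance_rate": accepted / shown if shown else 0.0,
        "regression_pass_rate": passed / len(samples),
    }


samples = [
    PrSample(datetime(2025, 1, 6, 9), datetime(2025, 1, 6, 15), 10, 4, True),
    PrSample(datetime(2025, 1, 7, 9), datetime(2025, 1, 8, 9), 20, 5, False),
    PrSample(datetime(2025, 1, 8, 10), datetime(2025, 1, 8, 22), 10, 3, True),
]
summary = summarize(samples)
print(summary)
```

The median is used for cycle time here because a few long-running PRs would otherwise dominate an average; that is a judgment call rather than a requirement.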

Failure modes are equally clear: over-trusting translation quality without human review, neglecting governance until an incident occurs, or accepting a vendor’s privacy claims without independent verification. Each could derail both the project and public confidence.

Broader Policy Ripples for Government IT

The ATO’s tender is a bellwether. If the pilot works, expect other Australian agencies—and public-sector organizations worldwide—to follow. That raises policy questions that Canberra (and peers) must address:

How should government procurement rules define acceptable data residency, training bans, and audit rights for AI tools?
What regulatory frameworks are needed for AI-generated code in legally consequential systems?
How can agencies maintain transparency and public trust when algorithms contribute to software that processes citizen data?

The ATO’s careful, privacy-first approach sets a baseline. But the real test will come when the first production incident is traced back to an AI suggestion. The governance structures being built now will determine whether the agency can respond swiftly and credibly.

The Vendor Horse Race

With the tender yet to close, it’s worth sizing up the likely contenders:

Vendor	Key Strengths	Cautions for ATO Fit
Microsoft Copilot/IntelliCode	Deep Visual Studio/Azure DevOps integration; enterprise compliance controls	High lock-in; COBOL translation not a core strength
AWS CodeWhisperer	Strong security scanning; VS Code and limited Visual Studio support	Less mature .NET/Visual Studio 2019 integration; no mainframe specialism
IBM watsonx Code Assistant for Z	Best-in-class mainframe modernization; faithful COBOL→Java conversions	May require IBM ecosystem familiarity; integration with Azure DevOps needs evaluation
Boutique/on-prem options	Data sovereignty; custom deployment	Smaller user base; may lack polished IDE plugins

No single vendor excels in every dimension. The ATO will likely run a proof-of-concept with representative legacy code to test translation quality and will probe privacy guarantees through technical demonstrations.

Conclusion

The ATO’s AI coding assistant procurement is a landmark move for enterprise AI in the public sector. By demanding deep integration with existing tools, a hard privacy line, and COBOL modernization help, it is charting a path that many large organizations will watch closely. If executed with conservative pilots, rigorous governance, and an eye toward vendor flexibility, the project could unlock meaningful productivity gains without compromising the integrity of critical tax systems. The next 12–18 months will show whether the agency can turn vendor capability into durable, auditable developer productivity—and set the standard for government AI adoption.