
Uber Migrates 75K Test Classes From JUnit 4 to 5

💡 Uber engineers used automated code transformation to migrate over 75,000 test classes and 1.25 million lines of code from JUnit 4 to JUnit 5, noting that generative AI produced inconsistent results.

Uber has completed one of the largest known automated test migrations in the software industry, converting more than 75,000 test classes and 1.25 million lines of code from JUnit 4 to JUnit 5 inside a single Java monorepo. The effort relied on deterministic code transformation and orchestration tooling rather than generative AI, which engineers found produced inconsistent results at scale.

The migration underscores a growing trend in enterprise engineering: even as AI coding assistants gain traction, rule-based automated refactoring remains the more reliable choice for large-scale, safety-critical code transformations.

Key Takeaways

  • Uber migrated 75,000+ test classes and 1.25 million+ lines of code from JUnit 4 to JUnit 5
  • The migration targeted a single Java monorepo integrated with Bazel, which lacks native JUnit 5 support
  • Generative AI was evaluated but rejected due to inconsistent output on custom test cases
  • Deterministic transformation tools ensured consistency across the entire codebase
  • JUnit 4 has been in maintenance mode since 2021, making the upgrade essential to reduce technical debt
  • Engineers Anshuman Mishra and Kaushik Vejju led the migration strategy

Why Uber Needed to Leave JUnit 4 Behind

JUnit 4 entered maintenance mode in 2021, meaning it receives only critical bug fixes and no new feature development. For a company operating at Uber's scale — with hundreds of thousands of tests running continuously — staying on a legacy framework creates compounding technical debt.

JUnit 5 introduces a fundamentally different architecture. It is split into the JUnit Platform, which discovers and launches tests, and pluggable test engines that run on it. The Jupiter engine provides the modern API for writing tests, with significantly improved support for parameterized tests, nested test classes, and a unified extension model, while the Vintage engine keeps existing JUnit 3 and JUnit 4 tests running on the same platform.

For Uber, the practical implications were clear. Continuing with JUnit 4 meant forgoing access to new testing capabilities that could improve developer productivity and test reliability. The longer the migration was delayed, the more expensive it would become as the codebase continued to grow.

The Scale Problem: 75,000 Classes in a Bazel Monorepo

Uber's Java monorepo is not a typical codebase. It contains hundreds of thousands of test targets managed by Bazel, Google's open-source build system. Bazel is known for its speed and reproducibility, but it does not provide native support for JUnit 5 — a significant obstacle that had to be addressed before migration could begin.

Engineers first needed to build compatibility layers between Bazel and the JUnit 5 runtime. This involved creating custom test runners and build rule modifications that allowed JUnit 5 tests to execute within Bazel's existing infrastructure without disrupting the continuous integration pipeline.

The sheer volume of 75,000 test classes made manual migration impractical. At 50 classes per day, a single engineer would need 1,500 days (75,000 / 50), which is more than 4 years of uninterrupted work. Automation was not just preferable; it was the only viable path forward.

Why Generative AI Fell Short

One of the most notable aspects of Uber's migration is what the team chose not to use. Engineers evaluated generative AI tools for the transformation but found the results unreliable, particularly for custom test cases that deviated from standard patterns.

Anshuman Mishra and Kaushik Vejju, the Uber engineers who led the project, explained that at this scale, deterministic transformation tools are essential for ensuring consistency. A generative model might correctly transform 95% of test classes but introduce subtle errors in the remaining 5%. Across 75,000 classes, that is 3,750 potentially broken test classes.

The inconsistency problem is especially dangerous in test code. Unlike application code, where a bug might produce a visible error, a flawed test can silently pass while no longer validating the behavior it was designed to check. This creates a false sense of security that undermines the entire purpose of the test suite.

Instead, Uber relied on rule-based code transformation tools that apply predictable, repeatable changes. These tools parse the abstract syntax tree (AST) of each test class, identify JUnit 4 patterns — such as @Test annotations from org.junit, @Before and @After lifecycle methods, and assertion imports — and mechanically replace them with their JUnit 5 equivalents.
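
To make the idea of predictable, repeatable rules concrete, here is a minimal sketch in plain Java. It is not Uber's tooling, and it deliberately swaps AST rewriting for an ordered table of regular-expression rules so that it stays self-contained. The class name JUnit5RewriteSketch and the rule table are assumptions made for illustration only.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch: an ordered, deterministic rule table (regex-based, not AST-based). */
final class JUnit5RewriteSketch {

    // Each row is {regex for a JUnit 4 pattern, JUnit 5 replacement}.
    // The \b word boundary keeps "@Before" from matching "@BeforeClass" or "@BeforeEach".
    private static final String[][] RULES = {
        {"import org\\.junit\\.BeforeClass;", "import org.junit.jupiter.api.BeforeAll;"},
        {"import org\\.junit\\.AfterClass;", "import org.junit.jupiter.api.AfterAll;"},
        {"import org\\.junit\\.Before;", "import org.junit.jupiter.api.BeforeEach;"},
        {"import org\\.junit\\.After;", "import org.junit.jupiter.api.AfterEach;"},
        {"import org\\.junit\\.Test;", "import org.junit.jupiter.api.Test;"},
        // Assumption: only assertions that also exist in Jupiter's Assertions are imported.
        {"import static org\\.junit\\.Assert\\.", "import static org.junit.jupiter.api.Assertions."},
        {"@BeforeClass\\b", "@BeforeAll"},
        {"@AfterClass\\b", "@AfterAll"},
        {"@Before\\b", "@BeforeEach"},
        {"@After\\b", "@AfterEach"},
    };

    /** Applies every rule in order; the same input always yields the same output. */
    static String migrate(String source) {
        String result = source;
        for (String[] rule : RULES) {
            result = Pattern.compile(rule[0]).matcher(result).replaceAll(rule[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        String junit4 = "import org.junit.Before;\n"
            + "class CartTest {\n"
            + "  @Before public void setUp() {}\n"
            + "}\n";
        System.out.println(migrate(junit4));
    }
}
```

Because every rule is a fixed mapping, running the sketch a second time on its own output changes nothing, which is the property that makes this style of tool auditable. A production tool would work on the parsed syntax tree instead, so that matches inside comments or string literals are not rewritten by accident.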

Inside the Migration: Key Technical Changes

The JUnit 4 to JUnit 5 migration involves more than swapping import statements. Several fundamental API changes require careful handling:

  • Annotations: @Before and @After become @BeforeEach and @AfterEach; @BeforeClass and @AfterClass become @BeforeAll and @AfterAll
  • Assertions: The assertion methods move from org.junit.Assert to org.junit.jupiter.api.Assertions, and the optional failure message moves from the first argument to the last
  • Expected exceptions: The @Test(expected = Exception.class) pattern is replaced by assertThrows()
  • Rules and runners: JUnit 4's @Rule and @RunWith mechanisms are replaced by the JUnit 5 Extension model
  • Parameterized tests: JUnit 4's @RunWith(Parameterized.class) gives way to @ParameterizedTest combined with source annotations such as @ValueSource, @CsvSource, and @MethodSource
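
The change in failure-message position is a good example of why these rewrites need care. The sketch below is an illustration rather than anything taken from Uber's tooling: it handles only assertEquals calls whose first argument is a string literal and whose other two arguments are simple expressions. The class name AssertionMessageReorderSketch is hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch: move a JUnit 4 failure message from first to last argument. */
final class AssertionMessageReorderSketch {

    // Matches assertEquals("message", expected, actual) where the two value
    // arguments contain no commas or parentheses. Anything more complex is
    // left untouched so that it can be handled by hand.
    private static final Pattern MESSAGE_FIRST = Pattern.compile(
        "assertEquals\\(\\s*(\"(?:[^\"\\\\]|\\\\.)*\")\\s*,\\s*([^,()]+?)\\s*,\\s*([^,()]+?)\\s*\\)");

    /**
     * Assumes the input is still JUnit 4 code. Running this on already-migrated
     * code that compares string literals would reorder the arguments again.
     */
    static String moveMessageLast(String source) {
        return MESSAGE_FIRST.matcher(source).replaceAll("assertEquals($2, $3, $1)");
    }

    public static void main(String[] args) {
        System.out.println(moveMessageLast("assertEquals(\"totals differ\", expected, actual);"));
    }
}
```

The limitation noted in the comment is the reason a text-level rule is not enough in practice: deciding whether a leading string is a message or a compared value requires knowing which assertion overload is being called, and that information comes from parsing and type resolution.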

Each of these transformations has edge cases. For example, custom TestRule implementations in JUnit 4 have no direct equivalent in JUnit 5 and may require rewriting as extensions. Uber's tooling had to identify these cases and either transform them automatically or flag them for manual review.
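
A transform-or-flag decision of this kind can be sketched as a simple triage step. The markers and the policy below are assumptions chosen for illustration; the article does not describe which patterns Uber's tooling treated as automatic and which it sent to reviewers. The class name MigrationTriageSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch: decide whether a test class needs manual review. */
final class MigrationTriageSketch {

    // Each row is {regex marker, human-readable reason}. Assumed policy:
    // custom rules and custom runners are not rewritten automatically.
    private static final String[][] MANUAL_MARKERS = {
        {"\\bimplements\\s+TestRule\\b", "custom TestRule implementation"},
        {"@RunWith\\b", "custom runner via @RunWith"},
    };

    /** Returns the reasons a file needs review; an empty list means "safe to automate". */
    static List<String> reasonsForManualReview(String source) {
        List<String> reasons = new ArrayList<>();
        for (String[] marker : MANUAL_MARKERS) {
            if (Pattern.compile(marker[0]).matcher(source).find()) {
                reasons.add(marker[1]);
            }
        }
        return reasons;
    }

    public static void main(String[] args) {
        String source = "class RetryRule implements TestRule { }";
        System.out.println(reasonsForManualReview(source));
    }
}
```

Returning reasons rather than a bare yes or no keeps the output auditable, since a reviewer can see why a given class was held back.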

Orchestrating Changes at Monorepo Scale

Applying transformations to 75,000 files is only half the challenge. The other half is orchestration — coordinating the rollout so that the migration does not break the build for hundreds of engineering teams simultaneously.

Uber's approach involved batching changes and validating each batch through the existing CI/CD pipeline before merging. This incremental strategy allowed engineers to catch transformation errors early and iterate on the tooling without risking widespread build failures.
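
The batch-then-validate loop can be sketched in a few lines. This is an illustration under stated assumptions, not a description of Uber's orchestration system: the ciPasses predicate stands in for a real CI run, and the stop-at-first-failure policy and the class name BatchedRolloutSketch are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch: roll out migrated files in validated batches. */
final class BatchedRolloutSketch {

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    /**
     * Merges batches in order and stops at the first batch the validator rejects.
     * Returns the number of items merged before the stop.
     */
    static <T> int rollOut(List<T> items, int batchSize, Predicate<List<T>> ciPasses) {
        int merged = 0;
        for (List<T> batch : partition(items, batchSize)) {
            if (!ciPasses.test(batch)) {
                break; // fix the tooling, then resume from this batch
            }
            merged += batch.size();
        }
        return merged;
    }
}
```

Stopping at the first rejected batch limits the damage from a faulty rule to one batch, at the cost of pausing the rollout until the rule is fixed. A real system might instead skip the failing batch and continue.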

The monorepo structure actually provided an advantage here. Because all code lives in a single repository, the migration team could make atomic changes across dependent modules and verify cross-cutting impacts immediately. In a multi-repo setup, the same migration would require coordinating changes across dozens or hundreds of separate repositories with independent build systems.

Industry Context: The Rise of Automated Refactoring

Uber's migration fits into a broader industry movement toward automated large-scale code changes. Google has long used tools like Rosie and ClangMR to perform codebase-wide refactoring. OpenRewrite, an open-source automated refactoring framework, has gained significant traction for exactly this type of migration, offering pre-built recipes for JUnit 4 to JUnit 5 conversions.

The decision to favor deterministic tooling over generative AI also reflects a maturing understanding of where AI coding tools excel and where they struggle. AI assistants like GitHub Copilot and Amazon CodeWhisperer are effective for code generation and completion tasks, but large-scale mechanical transformations — where consistency matters more than creativity — remain better served by traditional AST-based tools.

This does not diminish the value of AI in software engineering. Rather, it highlights the importance of choosing the right tool for the job. Generative AI and deterministic refactoring serve complementary roles in modern development workflows.

What This Means for Engineering Teams

Uber's migration offers several lessons for organizations facing similar technical debt:

  • Do not delay framework migrations — the cost grows linearly (or worse) with codebase size
  • Evaluate AI tools critically — generative models may not deliver the consistency required for safety-critical transformations
  • Invest in deterministic tooling — rule-based transformers are more predictable and auditable
  • Use incremental rollouts — batch changes and validate through CI/CD to minimize risk
  • Leverage monorepo advantages — single-repository architectures simplify cross-cutting changes

For teams still running JUnit 4, the message is clear: migration is not optional. With the framework receiving no new features and the ecosystem increasingly building on JUnit 5, the gap will only widen.

Looking Ahead: Post-Migration Benefits

With the migration complete, Uber's engineering teams can now leverage JUnit 5's modern capabilities across their entire Java test suite. The Extension model enables cleaner test lifecycle management. Parameterized tests with @MethodSource and @CsvSource reduce boilerplate. Dynamic tests, generated at runtime by @TestFactory methods, make it possible to build test cases from data that is only known when the suite runs.

Perhaps more importantly, the tooling and processes Uber built for this migration are reusable. Future framework upgrades — whether for testing libraries, dependency injection frameworks, or language version migrations — can follow the same playbook of deterministic transformation plus orchestrated rollout.

The project also adds to the growing body of evidence that while generative AI is transforming software development, it has not yet replaced the need for precise, rule-based engineering tools. As codebases continue to grow and technical debt accumulates, the ability to perform reliable automated refactoring at scale will remain a critical capability for engineering organizations worldwide.