Arista Networks Speeds Up sonic-mgmt Testing: 2024 Hackathon Winner Spotlight

December 2, 2024

The 2024 SONiC Virtual Hackathon brought together some of the brightest minds in the SONiC community to tackle key challenges and develop innovative solutions. This year, four exceptional projects stood out as winners, each addressing unique aspects of SONiC development and usability. Today, we’re highlighting the Most Impactful Award winner: Parallelizing and Reusing sonic-mgmt Test Runs, presented by Arista Networks.

Tackling the Testing Bottleneck: An Introduction to the Project

A full sonic-mgmt test run can take anywhere from a dozen to over a hundred hours, a significant bottleneck for SONiC developers. To address this, the Arista Networks team devised a way to parallelize and reuse test runs. By breaking the test suite into smaller collections of tests that can run in parallel and later be recombined, they aim to cut testing time to single-digit hours, shortening development timelines.

The Inspiration Behind the Innovation

The team’s motivation stemmed from the challenges of verifying new SONiC images effectively. Hardware and time constraints made it difficult to keep pace with upstream code updates, especially when some test runs for specific SKUs could take over three days. The need for a faster, more efficient testing process was clear.

A Walkthrough of the Solution

The Arista Networks team implemented a three-stage process:

  1. Scanning Tests: Analyzing the entire sonic-mgmt test suite to identify which test files are applicable to specific topologies.
  2. Running Subsets in Parallel: Dividing the tests into subsets and running them across multiple SKUs in parallel.
  3. Recombining Results: Merging the subset results into a complete test run for a given SKU and topology.

This division of tests allows for significant time savings. For example, tests marked with the "any" topology marker only need to be run once, and their results apply across all SKUs and topologies. Even in their initial efforts, the team achieved a 20% reduction in test time.
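To make the reuse idea concrete, the sketch below plans which (test file, SKU, topology) runs are actually needed. It is an illustration under stated assumptions: plan_runs is a hypothetical helper, the SKU names are placeholders, and the only rule it encodes is the one described above, namely that a test marked "any" runs once and its result is reused everywhere.

```python
def plan_runs(test_topologies, targets):
    """Return the (test_file, sku, topology) runs needed to cover all targets.

    test_topologies: {test_file: set of topology names it is marked with}
    targets: list of (sku, topology) pairs that need a complete result
    """
    if not targets:
        return []
    runs = []
    for test_file, topologies in sorted(test_topologies.items()):
        if "any" in topologies:
            # Run once on the first target; the result is reused for the rest.
            first_sku, first_topology = targets[0]
            runs.append((test_file, first_sku, first_topology))
            continue
        for sku, topology in targets:
            if topology in topologies:
                runs.append((test_file, sku, topology))
    return runs


# Placeholder SKUs and markers, for illustration only.
test_topologies = {"test_x.py": {"any"}, "test_y.py": {"t0"}}
targets = [("sku-a", "t0"), ("sku-b", "t0"), ("sku-a", "t1")]
runs = plan_runs(test_topologies, targets)
```

In this toy input, running every applicable test on every target would take five executions (three for test_x.py, two for test_y.py), while the plan needs three.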

Overcoming Challenges

The biggest hurdle was the lengthy nature of sonic-mgmt test runs, which could have consumed the entire hackathon timeline. To address this, the team leveraged stored results from previous test runs as sample input data, enabling them to focus on implementing and validating their solution within the limited timeframe.

The Impact on SONiC Developers and Users

While the project may have limited direct impact on end users, its implications for developers and vendors are profound:

  • For Developers: Faster testing cycles allow for more frequent and efficient code validation, boosting development velocity.
  • For Vendors: Testing multiple SKUs simultaneously becomes feasible, reducing the time required for validation from days to mere hours. This translates to more frequent updates and higher-quality SONiC images.

Ultimately, the project indirectly enhances the experience for end users by improving the quality and reliability of SONiC software.

Key Takeaways and Future Plans

The hackathon highlighted the importance of test classification, ensuring that tests are correctly marked as SKU-specific, topology-dependent, or universally applicable. The team also identified opportunities to refine their solution further:

  • Transitioning from a per-test-file to a per-test-case approach for more precise parallelization.
  • Introducing additional test markers to improve classification and facilitate better reuse of results.

These enhancements could unlock even greater efficiencies in the testing process, making SONiC development more robust and agile.