SONiC Mentorship Spotlight: Meghana Ambalathingal on Reducing Test Flakiness and Runtime in sonic-swss

The SONiC Mentorship Program pairs new SONiC contributors with mentors to tackle real-world technical challenges and strengthen the open networking community. In this spotlight, we speak with Meghana Ambalathingal, a recent UC Berkeley graduate, about her work on improving test speed, reliability, and determinism in the sonic-swss repository by converting Python-based tests into fast, in-process C++ unit tests.

About the Mentee

Hi, I’m Meghana! I recently graduated from UC Berkeley with a major in Computer Science. I joined the SONiC Mentorship Program to contribute to a real open-source networking platform and to learn how large systems are tested at scale.

Q: What project did you work on, and why is it important to SONiC?

Topic: Converting selected Python tests (pytest + DVS) into C++ unit tests using GoogleTest/GoogleMock inside the sonic-swss repo.

Goal: Make tests faster, more deterministic, and less dependent on external services when we’re validating orchestration logic.

Problem: Full DVS/ASIC_DB tests are excellent for end-to-end coverage, but they can be slow and occasionally flaky. Many behaviors can be verified in-process if we replace external dependencies with mocks and assert on public state (tables, counters, mock SAI calls).

Q: What were your main technical contributions?

I translated targeted pytests into GoogleTest/GoogleMock suites that:

Seed minimal DB state (e.g., ports, interfaces, neighbors)
Drive specific orch code paths by enqueueing inputs and running doTask()/timers directly
Assert on outcomes in SONiC tables and on calls captured by a mock SAI layer
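
The three steps above can be sketched in plain C++. This is only an illustration under stated assumptions: FakeSai, FakeTable, and ToyRouteOrch are hypothetical stand-ins invented for this example, not the real sonic-swss classes or the SAI API, and a real suite would use GoogleTest/GoogleMock assertions rather than a bool-returning function.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not the real sonic-swss or SAI types.
using FieldValues = std::map<std::string, std::string>;
using FakeTable = std::map<std::string, FieldValues>;  // plays the role of a state table

// Records calls instead of programming hardware (plays the role of a SAI mock).
struct FakeSai {
    std::vector<std::string> calls;
    void createRoute(const std::string& prefix) { calls.push_back("create:" + prefix); }
    void removeRoute(const std::string& prefix) { calls.push_back("remove:" + prefix); }
};

struct RouteOp {
    std::string op;      // "SET" or "DEL"
    std::string prefix;  // e.g. "2.2.2.0/24"
};

// Toy orch: queued operations are processed only when doTask() is called.
class ToyRouteOrch {
public:
    ToyRouteOrch(FakeSai& sai, FakeTable& state) : sai_(sai), state_(state) {}

    void enqueue(RouteOp op) { pending_.push_back(std::move(op)); }

    void doTask() {
        while (!pending_.empty()) {
            RouteOp op = pending_.front();
            pending_.pop_front();
            assert(op.op == "SET" || op.op == "DEL");
            if (op.op == "SET") {
                sai_.createRoute(op.prefix);
                state_[op.prefix]["state"] = "ok";
            } else {
                sai_.removeRoute(op.prefix);
                state_.erase(op.prefix);
            }
        }
    }

private:
    FakeSai& sai_;
    FakeTable& state_;
    std::deque<RouteOp> pending_;
};

// Seed, drive, assert. Returns true when every check holds.
bool runAddRemoveSketch() {
    FakeSai sai;
    FakeTable state;  // seeded state would go here; empty is enough for this toy
    ToyRouteOrch orch(sai, state);

    orch.enqueue({"SET", "2.2.2.0/24"});
    orch.doTask();  // drive the code path directly, no daemon or sleep
    bool added = state.count("2.2.2.0/24") == 1 &&
                 state["2.2.2.0/24"]["state"] == "ok";

    orch.enqueue({"DEL", "2.2.2.0/24"});
    orch.doTask();
    bool removed = state.count("2.2.2.0/24") == 0;

    std::vector<std::string> expected = {"create:2.2.2.0/24", "remove:2.2.2.0/24"};
    return added && removed && sai.calls == expected;
}
```

Because the fake SAI layer only records calls, the check at the end can compare the full call sequence, which is the kind of observable outcome the list above describes.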

Example Merge Request

What I did:
Added a single test, RouteOrch_AddRemoveIPv4_And_DefaultRoute_State, that covers adding/removing 2.2.2.0/24 and the default route. I wrote small helpers to read the route state from STATE_DB and to wait for ok/na.
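
A "wait for ok/na" helper of the kind mentioned above might look roughly like the following. This is a hypothetical sketch, not the helper from the actual change: the name waitForState, its parameters, and the idea of passing the reader and the stepping function as callbacks are all assumptions made for illustration. The point is that the wait is bounded by a step count rather than by wall-clock time.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper: advance the system one step at a time until readState()
// returns the expected value, giving up after maxSteps steps.
// readState might wrap a table lookup; step might call doTask() or tick a timer.
bool waitForState(const std::function<std::string()>& readState,
                  const std::function<void()>& step,
                  const std::string& expected,
                  int maxSteps) {
    assert(maxSteps >= 0);
    for (int i = 0; i < maxSteps; ++i) {
        if (readState() == expected) {
            return true;
        }
        step();
    }
    // One last read after the final step.
    return readState() == expected;
}
```

With a step bound instead of a timeout, a failing test fails after the same number of steps on every run, which keeps the failure reproducible.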

Why:
It mirrors the original pytest intent but fits a mock unit-test harness that pre-seeds ports, interfaces, and neighbors in SetUp(). It stays fast and deterministic by verifying behavior via SAI mocks and DB checks instead of relying on DVS/ASIC_DB.

How verified:
The test runs in milliseconds and asserts that RouteOrch writes the expected fields and that the mock SAI saw the expected add/remove calls.

Technical Stack:

GoogleTest/GoogleMock, SONiC table helpers for APPL_DB / STATE_DB / CONFIG_DB, and small SAI mock hooks. Timers/consumers are ticked directly (no sleeps) to avoid flakiness.
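
To illustrate what ticking a timer directly means, here is a minimal sketch of a manually advanced timer. ManualTimer and countExpiries are hypothetical names invented for this example; sonic-swss has its own timer and consumer classes, which this does not attempt to reproduce.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical manual timer: it fires only when the test advances virtual time,
// so no real sleeping or wall-clock dependence is involved.
class ManualTimer {
public:
    ManualTimer(int intervalMs, std::function<void()> onExpire)
        : intervalMs_(intervalMs), onExpire_(std::move(onExpire)) {
        assert(intervalMs_ > 0);
    }

    // Advance virtual time and fire the callback once per full interval elapsed.
    void advance(int ms) {
        elapsedMs_ += ms;
        while (elapsedMs_ >= intervalMs_) {
            elapsedMs_ -= intervalMs_;
            onExpire_();
        }
    }

private:
    int intervalMs_;
    int elapsedMs_ = 0;
    std::function<void()> onExpire_;
};

// Counts how many times a timer with the given interval fires
// while the listed time steps are applied in order.
int countExpiries(int intervalMs, const std::vector<int>& steps) {
    int fired = 0;
    ManualTimer timer(intervalMs, [&fired] { ++fired; });
    for (int ms : steps) {
        timer.advance(ms);
    }
    return fired;
}
```

For example, countExpiries(1000, {400, 400, 400}) yields 1: the timer fires once when the accumulated time crosses 1000 ms, and the test never waits in real time.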

Q: Why did you choose to use mocks?

Speed: Everything runs in-process, with no containers or external daemons to spin up.
Determinism: Fewer timing races; results are repeatable.
Isolation: We test the orchestration logic itself and check what it outputs (DB rows, SAI calls) without depending on a full system.

Q: What challenges did you face, and what did you learn?

Onboarding to C++ after Python

Most of my testing background was Python. Moving to C++/GoogleTest meant learning the C++ build flow (headers vs. sources), handling link errors, and being careful with types and const-correctness. Treating warnings as hints and iterating in small steps helped a lot.

Understanding sonic-swss architecture

To mock well, I first mapped which orch was under test (e.g., RouteOrch), which tables it read/wrote (CONFIG_DB → APPL_DB → STATE_DB), and what SAI actions should fire (add/remove route). Then I mocked only the boundary I needed and asserted on observable outcomes.

Translating pytests to GTest

Pytests often validated outcomes indirectly via DVS/ASIC_DB. I rewrote those checks as: enqueue inputs → call doTask()/tick timers → assert on table fields and recorded SAI calls. For unordered field/value checks, I compared sets so tests wouldn’t fail on key ordering.
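
The order-insensitive comparison can be sketched as follows. The names FieldValue, sameFieldValues, and exampleCheck are assumptions for this illustration; in a GoogleMock-based suite, a matcher such as UnorderedElementsAreArray is another way to express the same idea.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// A field/value pair as it might be read back from a table (hypothetical alias).
using FieldValue = std::pair<std::string, std::string>;

// Compare two field/value lists while ignoring order.
// Note: converting to std::set also ignores duplicate pairs.
bool sameFieldValues(const std::vector<FieldValue>& actual,
                     const std::vector<FieldValue>& expected) {
    std::set<FieldValue> a(actual.begin(), actual.end());
    std::set<FieldValue> e(expected.begin(), expected.end());
    return a == e;
}

// Example: the same pairs in a different order compare equal.
void exampleCheck() {
    std::vector<FieldValue> actual = {{"nexthop", "10.0.0.1"}, {"ifname", "Ethernet0"}};
    std::vector<FieldValue> expected = {{"ifname", "Ethernet0"}, {"nexthop", "10.0.0.1"}};
    assert(sameFieldValues(actual, expected));
}
```

The field names and values here are made up for the example; only the comparison technique is the point.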

Debugging in a VM using build logs

Interactive debugging in the VM was tough. I leaned heavily on build logs (compiler errors, link failures, test output) to iterate: fix one error, rebuild a narrow target, re-run, repeat. This forced a disciplined loop of minimal changes and fast feedback that closely matched CI conditions.

Q: What impact has this mentorship had, and what are your next steps?

Today:
The converted tests run much faster and are more reliable, giving maintainers quick feedback on orchestration behavior.

For contributors:
It’s easier to iterate: write a small unit test, run it in milliseconds, and get a clear failure pointing to a missing field or unexpected call.

Next:

Apply the same pattern to more orch modules
Factor out small, reusable helpers (state waits, enqueue ops, SAI expectations)
Keep a healthy split: fast unit tests plus targeted integration/system tests for balanced coverage

Q: Is there anyone you’d like to acknowledge?

Huge thanks to my mentors Prabhat Aravind and Prince Sunny for guidance throughout this project. I learned a ton about practical C++, the sonic-swss testing model, and how to write fast, stable tests. I’m excited to keep contributing!