Case Study - UncommonGoods

How UncommonGoods stopped spending 50% of their QA time on Selenium

A conversation between

Sneha Sivakumar

CEO of Spur

Solomon Ademuwagun

QA Manager, UncommonGoods

COMPANY

UncommonGoods is an online retailer known for thoughtfully curated, design-forward products, connecting customers with unique goods from independent makers around the world.

INDUSTRY

E-Commerce ($222 Million)

COMPANY SIZE

51–200

FOUNDED

1999

50%

Reduction in release time after adopting Spur.

90%+

Test accuracy achieved in weeks vs. months with Selenium.

$300K

Saved in QA costs while optimizing the process.

The Problem

Half of every working day went to maintaining the infrastructure that was supposed to test the product.

UncommonGoods has been selling unique, thoughtfully curated goods from independent makers since 1999. Their e-commerce site is the product, and keeping it working reliably across checkout, browsing, and discovery flows is what QA exists to do.

But before Spur, QA at UncommonGoods wasn't really doing that. They had 150 tests built on Selenium, a DevOps dependency to keep the infrastructure running, and an offshore support arrangement just to maintain what they had. Around 50% of the QA team's time was going to maintenance, keeping the test suite alive, not running it productively.

"You're not spending 50% of your time doing maintenance anymore… you're spending maybe 1%, and boom, you run it."

The tests that did exist were brittle. Any UI change could break them. Complex releases required weeks of QA preparation and coordination just to reach a starting point. And despite all of that effort, site reliability was landing at 89–92%, below industry standards for a retailer that depends entirely on its website. Bugs were still reaching production. QA was a bottleneck without being a safety net.

The Solution

Maintenance became 1% of the job and coverage actually improved.

The shift from Selenium to Spur wasn't just a tool swap; it was a rethink of the whole testing approach. UncommonGoods consolidated 150 redundant, overlapping, brittle Selenium tests into around 30 dynamic, adaptive Spur tests: fewer tests, covering more ground, with virtually no maintenance overhead.

What made that possible is the difference in how Spur works. Spur's agents navigate like real users and adapt to UI changes automatically rather than breaking when a selector shifts. Writing a test means describing what you want to verify in plain language, not maintaining a fragile script. The infrastructure dependency disappeared entirely.
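The difference between a selector-bound script and a check written against what the user sees can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: it is not Spur's implementation (this case study does not describe Spur's internals) and it does not use the real Selenium API. The `Element` class, the two page versions, and both lookup functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Hypothetical stand-in for a page element."""
    tag: str
    element_id: str
    text: str


def find_by_id(page, element_id):
    """Selector-style lookup: succeeds only if the exact id still exists."""
    for el in page:
        if el.element_id == element_id:
            return el
    return None


def find_by_intent(page, tag, label):
    """Intent-style lookup: match on what a user sees (tag plus visible text)."""
    wanted = label.strip().lower()
    for el in page:
        if el.tag == tag and wanted in el.text.strip().lower():
            return el
    return None


# Hypothetical page before a redesign.
page_v1 = [
    Element("button", "btn-checkout", "Checkout"),
    Element("a", "nav-gifts", "Gifts"),
]

# Same page after a redesign: the button's id and label wording changed.
page_v2 = [
    Element("button", "cta-primary-2", "Proceed to Checkout"),
    Element("a", "nav-gifts", "Gifts"),
]

id_v1 = find_by_id(page_v1, "btn-checkout")
id_v2 = find_by_id(page_v2, "btn-checkout")          # selector no longer matches
intent_v1 = find_by_intent(page_v1, "button", "checkout")
intent_v2 = find_by_intent(page_v2, "button", "checkout")

print("id lookup survives redesign:", id_v2 is not None)
print("intent lookup survives redesign:", intent_v2 is not None)
```

In this toy setup the id-based lookup stops finding the checkout button once the id changes, while the lookup keyed on the visible label still finds it. That is the kind of breakage that turns every UI change into a maintenance task for selector-bound tests.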

Within weeks, UncommonGoods reached 90%+ test accuracy, a benchmark that took months to achieve with Selenium. Regression moved from a once-per-release event to something the team could run multiple times per week with minimal overhead.

"The more you use Spur, the smarter it gets. The smarter it gets, the faster you can write tests and find bugs."

Crucial Moment

A complex release that would have taken weeks of QA was automated in 1–2 days.

This is the number Solomon comes back to most: a specific release that previously required weeks of QA preparation was handled by Spur in one to two days. That's roughly 10 business days saved on a single release. For a retailer where time to deploy directly affects revenue, that's not an operational improvement; it's a strategic one.

"Time is money, and that's the strength of Spur."

Spur also started surfacing clusters of bugs in checkout, the highest-stakes flow on any e-commerce site, that were previously reaching production. Site reliability climbed from 89–92% to 95–98%.

"That's above industry standards… a pretty good indicator of how good Spur is."

The Shift

With maintenance gone, QA became what it was always supposed to be.

The 50% of time that used to go to Selenium maintenance didn't disappear; it got redirected. With regression running reliably and automatically, Solomon's team shifted to the work that actually requires human judgment:

Edge case and exploratory testing, the scenarios no automated suite will think to try
Expanding automation coverage into new areas of the product
Evaluating internal tools for further automation opportunities
Working toward a further 25–40% reduction in manual QA

"It's allowed employees to focus on what they're really good at instead of just busy work."

The longer-term goal is catching blockers earlier, in development, not at release. That's the shift from QA as a release gate to QA as a development accelerator.

"If we can catch blockers early… that's the whole ball game."

Critical e-commerce flows across 30+ regions

Every regional price, discount rule, and product variant is automatically tested before your sale goes live, with no manual spot-checking required.

Hundreds of partner landing pages

Ensuring that every audience coming from podcasts, newsletters, and other partnerships lands on a page that is on-brand and error-free.

Staging and production environments

Running tests in staging for high confidence before launch, then validating again on production as a final safety net.

Key Insights

UncommonGoods didn't just replace a tool. They replaced a way of working, one where half the job was keeping the test infrastructure alive, with one where tests run themselves and the team focuses on what actually takes judgment. 150 tests became 30. Maintenance became 1%. Site reliability crossed industry benchmarks. That's what happens when QA stops being a burden and starts being a system.