Waitless

Python · Selenium · PyPI · Automation · Testing

The Challenge

UI automation tests suffer from intermittent 'flaky' failures because interactions execute while the UI is still changing. Traditional solutions like time.sleep() are too slow, WebDriverWait only checks one element, and retry decorators mask rather than solve the problem.

The Solution

Built intelligent stability detection using browser-side JavaScript instrumentation. The library monitors DOM mutations (MutationObserver), network requests (XHR/fetch interception), CSS animations, and layout shifts to determine when a page is truly ready for interaction. It integrates with Selenium via a single line of code, with zero test rewrites required.

  • ✓ DOM Stability Detection
  • ✓ Network Idle Monitoring
  • ✓ Animation Completion Tracking
  • ✓ One-Line Integration
  • ✓ Detailed Diagnostics
  • ✓ Zero External Dependencies

Waitless: Case Study

Eliminating Flaky UI Automation Tests Through Intelligent Stability Detection


Executive Summary

Waitless is a Python library that eliminates flaky UI automation test failures by replacing arbitrary waits and sleeps with intelligent stability detection. Through browser-side JavaScript instrumentation, it monitors DOM mutations, network requests, CSS animations, and layout shifts to determine when a page is truly ready for interaction. The library integrates with Selenium via a one-line change, requiring zero modifications to existing test code. This approach reduces test flakiness by addressing the root cause (racing against incomplete UI state) rather than masking it with arbitrary delays.


Problem

The Original Situation

UI automation tests in large test suites suffer from intermittent failures that pass on retry but fail unpredictably. These "flaky tests" occur because test interactions (clicks, typing, assertions) execute while the UI is still changing.

What Was Broken

Test run 1: ✗ ElementClickInterceptedException
Test run 2: ✓ Pass
Test run 3: ✓ Pass
Test run 4: ✗ StaleElementReferenceException
Test run 5: ✓ Pass

Common failure modes included:

| Failure Type | Root Cause |
| --- | --- |
| ElementClickInterceptedException | Overlay/modal still animating |
| StaleElementReferenceException | DOM rebuilt by React/Vue/Angular |
| ElementNotInteractableException | Element not yet visible/enabled |
| Wrong element clicked | Layout shift moved target element |

Risks Caused

  1. Wasted CI time - Re-running flaky tests wastes compute resources
  2. Lost developer trust - Teams ignore test failures assuming flakiness
  3. Missed regressions - Real bugs hidden among noise
  4. Slow feedback loops - Adding arbitrary sleeps slows test execution

Why Existing Approaches Were Insufficient

| Approach | Limitation |
| --- | --- |
| time.sleep(2) | Arbitrary delay: either too short (still fails) or too long (slows suite) |
| WebDriverWait with expected_conditions | Only checks ONE element condition, misses page-wide state |
| Retry decorators | Masks the problem, doesn't solve it; still uses CI time on retries |
| Playwright auto-wait | Framework-specific; doesn't help Selenium users |

None of these approaches addressed the fundamental question: "Is the entire page stable and ready for interaction?"


Challenges

Technical Challenges

  1. Defining "stability" - No standard definition exists. What signals indicate a page is ready?

  2. Cross-domain monitoring - JavaScript instrumentation must intercept:

    • DOM mutations (MutationObserver)
    • Network requests (XHR and fetch interception)
    • CSS animations/transitions (event listeners)
    • Layout changes (ResizeObserver, position tracking)
  3. Re-injection after navigation - Single-page apps may destroy instrumentation on route changes

  4. Thread safety - Selenium tests may run across multiple threads

  5. No external dependencies - Library must work without additional pip packages
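One way to approach the thread-safety challenge is to give every driver its own engine and to guard the lookup with a lock. The sketch below is an assumption about how that could be done with the standard library only; `EngineRegistry`, `engine_for`, and `FakeDriver` are hypothetical names, not part of Waitless.

```python
import threading
import weakref


class EngineRegistry:
    """Hands out one engine per driver; safe to call from several threads."""

    def __init__(self, factory):
        self._factory = factory                      # builds an engine for a driver
        self._lock = threading.Lock()
        self._engines = weakref.WeakKeyDictionary()  # entries vanish with the driver

    def engine_for(self, driver):
        # Serialize check-then-create so two threads cannot both create an engine.
        with self._lock:
            engine = self._engines.get(driver)
            if engine is None:
                engine = self._factory(driver)
                self._engines[driver] = engine
            return engine


class FakeDriver:
    """Stand-in for a Selenium driver so the sketch runs without a browser."""


registry = EngineRegistry(factory=lambda driver: object())
driver = FakeDriver()
seen = []
threads = [
    threading.Thread(target=lambda: seen.append(registry.engine_for(driver)))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len({id(e) for e in seen}))  # 1: every thread received the same engine
```

Keying on the driver object keeps engines isolated between parallel test sessions, while the weak references avoid keeping closed drivers alive.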

Operational Challenges

  1. Zero test rewrites - Must integrate without modifying hundreds of existing tests

  2. No performance degradation - Cannot add significant overhead to test execution

  3. CI compatibility - Must work in headless environments without special setup

Hidden Complexities

  1. Infinite animations - Some apps have perpetual spinners that never "stabilize"

  2. Background network traffic - Analytics, WebSockets, long-polling never become "idle"

  3. Wrapped element identity - Wrapped elements behave like WebElements, but isinstance(element, WebElement) returns False, which can break code that type-checks elements
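The wrapped-element identity issue can be reproduced with a few lines. This is a minimal sketch that uses a stand-in class instead of Selenium's WebElement so it runs without a browser; the `wrapped` property is a hypothetical escape hatch, not a documented Waitless attribute.

```python
class FakeWebElement:
    """Stand-in for selenium's WebElement."""

    tag_name = "button"

    def click(self):
        return "clicked"


class StabilizedWebElement:
    """Hypothetical wrapper that delegates unknown attributes to the element."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # Only called when normal lookup fails, so anything the wrapper
        # does not define falls through to the wrapped element.
        return getattr(self._element, name)

    @property
    def wrapped(self):
        """Return the original element for code that needs the real type."""
        return self._element


element = StabilizedWebElement(FakeWebElement())
print(element.tag_name)                             # delegated attribute access
print(isinstance(element, FakeWebElement))          # False: different type
print(isinstance(element.wrapped, FakeWebElement))  # True
```

Delegation makes the wrapper usable wherever duck typing is enough; only explicit type checks notice the difference.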


Solution

Design Approach

The solution uses a layered architecture with clear separation of concerns:

┌──────────────────────────────────────────────────────────────┐
│                      Public API                              │
│    stabilize() / unstabilize() / wait_for_stability()        │
├──────────────────────────────────────────────────────────────┤
│                  Selenium Integration Layer                  │
│    StabilizedWebDriver / StabilizedWebElement                │
├──────────────────────────────────────────────────────────────┤
│                   Stabilization Engine                       │
│    Polling, timeout handling, signal evaluation              │
├──────────────────────────────────────────────────────────────┤
│                 JavaScript Instrumentation                   │
│    MutationObserver, fetch/XHR intercept, animation events   │
└──────────────────────────────────────────────────────────────┘

Step-by-Step Implementation

1. Define Stability Signals

Created a signal-based system with mandatory and optional indicators:

| Signal | Type | Threshold | Mandatory |
| --- | --- | --- | --- |
| DOM Mutations | MutationObserver | 100 ms quiet period | Yes |
| Network Requests | XHR/fetch count | 0 pending | Yes |
| CSS Animations | Event listeners | 0 active | Configurable |
| Layout Shifts | Position tracking | <1px movement | Strict mode |
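The signal table can be read as a small decision function. The sketch below is one possible encoding, not the library's actual code: `Snapshot`, `blocking_signals`, and the `check_animations` and `strict` flags are hypothetical names, and the thresholds are taken from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical snapshot of the browser-side counters."""

    pending_requests: int
    ms_since_last_mutation: float
    active_animations: int
    max_layout_shift_px: float


def blocking_signals(snap, *, check_animations=True, strict=False):
    """Return the names of the signals that are not yet stable."""
    blockers = []
    if snap.ms_since_last_mutation < 100:  # DOM: 100 ms quiet period (mandatory)
        blockers.append("dom")
    if snap.pending_requests > 0:          # network: 0 pending (mandatory)
        blockers.append("network")
    if check_animations and snap.active_animations > 0:  # configurable
        blockers.append("animations")
    if strict and snap.max_layout_shift_px >= 1:          # strict mode only
        blockers.append("layout")
    return blockers


snap = Snapshot(pending_requests=0, ms_since_last_mutation=250.0,
                active_animations=1, max_layout_shift_px=0.0)
print(blocking_signals(snap))                          # ['animations']
print(blocking_signals(snap, check_animations=False))  # []
```

Returning the list of blockers rather than a single boolean keeps enough information to explain later why a page was not considered stable.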

2. Build JavaScript Instrumentation

Injected script creates a window.__waitless__ object that:

  • Intercepts fetch() and XMLHttpRequest to count pending requests
  • Registers MutationObserver on document root
  • Listens for animationstart/end and transitionstart/end events
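  • Tracks element positions for layout stability

window.__waitless__ = {
    pendingRequests: 0,            // in-flight XHR/fetch requests
    lastMutationTime: Date.now(),  // timestamp of the most recent DOM mutation
    activeAnimations: 0,           // running CSS animations/transitions

    // Simplified check: no pending requests and a 100 ms DOM quiet period.
    isStable() {
        if (this.pendingRequests > 0) return false;
        if (Date.now() - this.lastMutationTime < 100) return false;
        return true;
    }
};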
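Since the package keeps its JavaScript as templates in a Python module, the interception logic might be stored as a string like the one below. This is a reduced sketch under stated assumptions: it covers only fetch() counting and the MutationObserver, the `installed` guard is a hypothetical addition to make repeated injection harmless, and the real script would also need XHR, animation, and layout handling.

```python
INSTRUMENTATION_JS = """
(function () {
  // Skip if already installed, so re-injection is harmless.
  if (window.__waitless__ && window.__waitless__.installed) { return; }
  var state = window.__waitless__ = {
    installed: true,
    pendingRequests: 0,
    lastMutationTime: Date.now()
  };

  // Count in-flight fetch() calls; decrement when the promise settles.
  var originalFetch = window.fetch;
  if (originalFetch) {
    window.fetch = function () {
      state.pendingRequests += 1;
      var done = function () { state.pendingRequests -= 1; };
      var promise = originalFetch.apply(this, arguments);
      promise.then(done, done);
      return promise;
    };
  }

  // Record the time of the most recent DOM mutation.
  new MutationObserver(function () {
    state.lastMutationTime = Date.now();
  }).observe(document.documentElement, {
    childList: true, subtree: true, attributes: true, characterData: true
  });
})();
"""

print(len(INSTRUMENTATION_JS.strip().splitlines()), "lines of JavaScript")
```

The wrapper returns the original promise unchanged, so page code sees the same fetch() behavior while the counter is maintained on the side.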

3. Create Stabilization Engine

Python engine that:

  • Injects JavaScript via execute_script()
  • Polls browser for stability status
  • Evaluates signals against configured thresholds
  • Re-validates instrumentation before each check (handles navigation)
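The engine responsibilities listed above can be sketched as a single polling loop. The version below is illustrative rather than the library's implementation: it relies only on `execute_script()`, which Selenium drivers provide, while `StabilityTimeout`, `PROBE_JS`, `CHECK_JS`, the default timings, and the `FakeDriver` test double are assumptions.

```python
import time

PROBE_JS = "return !!window.__waitless__;"
CHECK_JS = "return window.__waitless__.isStable();"


class StabilityTimeout(Exception):
    """Hypothetical exception for a page that never settles."""


def wait_for_stability(driver, install_js, timeout=10.0, poll_interval=0.05):
    """Poll the browser until isStable() is true or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        # A navigation can wipe window.__waitless__, so re-check before each poll.
        if not driver.execute_script(PROBE_JS):
            driver.execute_script(install_js)
        if driver.execute_script(CHECK_JS):
            return
        if time.monotonic() >= deadline:
            raise StabilityTimeout(f"page not stable after {timeout}s")
        time.sleep(poll_interval)


class FakeDriver:
    """Test double: reports stable once isStable() was called often enough."""

    def __init__(self, stable_after=3):
        self.installed = False
        self.checks = 0
        self.stable_after = stable_after

    def execute_script(self, js):
        if js == PROBE_JS:
            return self.installed
        if js == CHECK_JS:
            self.checks += 1
            return self.checks >= self.stable_after
        self.installed = True  # any other script is treated as the injection
        return None


driver = FakeDriver()
wait_for_stability(driver, "/* instrumentation script */", timeout=1.0, poll_interval=0.0)
print(driver.installed, driver.checks)  # True 3
```

Using a monotonic clock for the deadline avoids surprises when the system clock is adjusted during a long CI run.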

4. Implement Safe Wrapper Pattern

Instead of monkey-patching Selenium (risky), used wrapper pattern:

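class StabilizedWebElement:
    def __init__(self, element, engine):  # constructor shape assumed for illustration
        self._element = element  # the wrapped Selenium WebElement
        self._engine = engine    # the stabilization engine

    def click(self):
        self._engine.wait_for_stability()  # Auto-wait before interacting
        return self._element.click()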

This approach:

  • Doesn't modify Selenium internals
  • Easy to undo with unstabilize()
  • Lower risk of breaking on Selenium upgrades
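To illustrate how wrapping and unwrapping could fit together, here is a sketch with stand-in driver and engine classes. In the library, `stabilize()` is called with the driver alone; the extra `engine` argument, `CountingEngine`, and the fake classes exist only so the example runs without a browser.

```python
class CountingEngine:
    """Stub engine that only records how often it was asked to wait."""

    def __init__(self):
        self.waits = 0

    def wait_for_stability(self):
        self.waits += 1


class StabilizedWebElement:
    def __init__(self, element, engine):
        self._element = element
        self._engine = engine

    def click(self):
        self._engine.wait_for_stability()  # wait before every interaction
        return self._element.click()


class StabilizedWebDriver:
    def __init__(self, driver, engine):
        self._driver = driver
        self._engine = engine

    def find_element(self, *args, **kwargs):
        # Hand back wrapped elements so their interactions wait as well.
        element = self._driver.find_element(*args, **kwargs)
        return StabilizedWebElement(element, self._engine)

    def __getattr__(self, name):
        return getattr(self._driver, name)  # everything else passes through


def stabilize(driver, engine):
    return StabilizedWebDriver(driver, engine)


def unstabilize(driver):
    # Undo is trivial because the original driver was never modified.
    return driver._driver if isinstance(driver, StabilizedWebDriver) else driver


class FakeElement:
    def click(self):
        return "clicked"


class FakeDriver:
    title = "Example"

    def find_element(self, by, value):
        return FakeElement()


engine = CountingEngine()
raw = FakeDriver()
driver = stabilize(raw, engine)
driver.find_element("id", "button").click()
print(engine.waits, driver.title, unstabilize(driver) is raw)  # 1 Example True
```

Because the wrapper holds a reference instead of patching methods, removing it restores exactly the object the test started with.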

5. Add Diagnostic Reporting

Created waitless doctor CLI that explains WHY stability wasn't reached:

╔══════════════════════════════════════════════════════╗
║            WAITLESS STABILITY REPORT                 ║
╠══════════════════════════════════════════════════════╣
║ BLOCKING FACTORS:                                    ║
║   ⚠ NETWORK: 2 request(s) still pending              ║
║   → GET /api/users (started 2.3s ago)                ║
╠══════════════════════════════════════════════════════╣
║ SUGGESTIONS:                                         ║
║   1. Set network_idle_threshold=2 for background     ║
║      traffic                                         ║
╚══════════════════════════════════════════════════════╝

Tools & Technologies Used

| Component | Technology |
| --- | --- |
| Language | Python 3.9+ |
| Browser Integration | Selenium WebDriver |
| Browser Instrumentation | Vanilla JavaScript (injected) |
| Configuration | Python dataclasses |
| CLI | argparse (stdlib) |
| External Dependencies | None |
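A dataclass-based configuration with validation might look like the following. Only the class name `StabilizationConfig` and the `network_idle_threshold` option appear in this case study; the other field names and all defaults are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StabilizationConfig:
    timeout: float = 10.0            # seconds before giving up (assumed default)
    dom_quiet_ms: int = 100          # DOM quiet period in milliseconds
    network_idle_threshold: int = 0  # pending requests tolerated as "idle"
    check_animations: bool = True    # animations are a configurable signal
    strict: bool = False             # strict mode adds the layout-shift check

    def __post_init__(self):
        # Reject values that could never produce a meaningful wait.
        if self.timeout <= 0:
            raise ValueError("timeout must be positive")
        if self.dom_quiet_ms < 0:
            raise ValueError("dom_quiet_ms must not be negative")
        if self.network_idle_threshold < 0:
            raise ValueError("network_idle_threshold must not be negative")


relaxed = StabilizationConfig(network_idle_threshold=2)
print(relaxed.network_idle_threshold)  # 2
```

Validating in `__post_init__` surfaces a bad setting when the configuration is created instead of in the middle of a test run.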

Package Structure

waitless/
├── __init__.py           # Public API exports
├── __main__.py           # CLI entry point
├── config.py             # StabilizationConfig dataclass
├── engine.py             # Core polling/evaluation logic
├── exceptions.py         # Custom exception types
├── instrumentation.py    # JavaScript code templates
├── selenium_integration.py  # Wrapper classes
├── signals.py            # Signal definitions
└── diagnostics.py        # Report generation
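With argparse, a CLI entry point offering a `doctor` subcommand could be wired up roughly as follows. The subcommand name comes from this case study; the `url` argument, the `--timeout` option, and the printed message are hypothetical placeholders for the real diagnostic run.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="waitless")
    commands = parser.add_subparsers(dest="command", required=True)
    doctor = commands.add_parser("doctor", help="explain why a page is not stable")
    doctor.add_argument("url", help="page to diagnose (assumed argument)")
    doctor.add_argument("--timeout", type=float, default=10.0,
                        help="seconds to wait for stability (assumed option)")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    if args.command == "doctor":
        # Placeholder: a real run would open the page and print a report.
        print(f"would diagnose {args.url} (timeout {args.timeout}s)")
    return 0


main(["doctor", "https://example.com"])
```

Accepting `argv` as a parameter keeps the entry point testable without touching `sys.argv`.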

Outcome/Impact

Quantified Improvements

| Metric | Before | After | Improvement |
| --- | --- | --- | --- |
| Integration effort | Hours of test rewrites | 1 line of code | ~99% reduction |
| Arbitrary sleeps in tests | Multiple per test | Zero | Eliminated |
| False flaky failures | Common | Rare | Deterministic behavior |
| Diagnostic clarity | "Element not found" | Full stability report | Actionable insights |

Test Code Transformation

Before (brittle):

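import time

from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver.get("https://example.com")
time.sleep(2)  # Hope this is enough?
WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.ID, "button"))
)
driver.find_element(By.ID, "button").click()
time.sleep(1)  # Wait for AJAX?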

After (stable):

from selenium.webdriver.common.by import By
from waitless import stabilize

driver = stabilize(driver)  # One-time setup
driver.get("https://example.com")
driver.find_element(By.ID, "button").click()  # Just works

Long-Term Benefits

  1. Reduced CI costs - Fewer flaky re-runs
  2. Faster test execution - No arbitrary sleeps
  3. Improved debugging - Clear diagnostics when issues occur
  4. Framework independence - Core engine can extend to Playwright
  5. Knowledge capture - Stability definitions codified, not tribal knowledge

Key Files

| File | Purpose |
| --- | --- |
| config.py | Configuration with validation |
| engine.py | Core stabilization engine |
| instrumentation.py | JavaScript browser monitoring |
| selenium_integration.py | Wrapper pattern implementation |
| diagnostics.py | Report generation |
| README.md | Documentation |

Summary

Waitless solves the pervasive problem of flaky UI tests by replacing time-based waits with intelligent stability detection. Through browser-side JavaScript instrumentation monitoring DOM mutations, network requests, and animations, it determines when a page is truly ready for interaction. The library integrates via a single line of code (stabilize(driver)), requires zero external dependencies, and provides detailed diagnostics when issues occur. This transforms brittle, timing-dependent tests into deterministic, stable automation that works reliably in both local and CI environments.
