From Plain Text to Beautiful Timelines: Launching the Visual Flight Recorder

June 27, 2026 3 min read

Debugging AI coding agents shouldn't feel like parsing endless flat text dumps. When we launched Agent Blackbox, our goal was to build a local-first flight recorder for autonomous software, one that lets developers track, audit, and diagnose runs privately. Today, we're taking a big step forward in usability and developer experience with the release of the Visual Flight Recorder.

By adding the `export-html` subcommand (with the clean `visual` alias) to the CLI, Agent Blackbox can now compile verbose terminal logs, background cron outputs, and gateway histories into a self-contained, interactive, dark-themed HTML postmortem dashboard.

The Pain: The Terminal Cognitive Load

AI coding agents are notoriously non-deterministic. They don't just execute static pipelines; they make decisions, fetch web pages, write code, and run tests. When an agent gets caught in an infinite tool-use loop or fails to respond, scanning plain-text terminal logs to find the single hanging call or unhandled exception is incredibly tedious.

A flat log file cannot show you correlation. It cannot easily group which user prompt triggered which outbound flush, nor can it visually distinguish a minor network warning from a critical, system-blocking write crash. The Visual Flight Recorder changes this entirely.

Inside the Visual Flight Recorder

The visual report converts raw timeline telemetry into a structured, highly legible dashboard. Here is what we engineered into the dark-themed HTML report:

Interactive Timeline Filters: Quick-filter buttons (All, Inbounds, Outbounds, Errors, Gateway) combined with a live fuzzy text search bar to isolate the relevant log events.
Message Correlation Trace: Groups incoming user prompts and outgoing agent flushes into distinct, readable chat bubbles, displaying precise execution latency (e.g., '⚡ Responded with 14.5s latency').
At-a-Glance Metrics Cards: Highlights key transaction stats such as Total Inbound Messages, Outbound Flushes, Platform Error Counts, and Cache Invalidations.
Pragmatic Action Banners: Color-coded status banners displaying a definitive diagnosis of the run (e.g., green for `response_flushed`, orange for `agent_busy_or_response_pending`, and red for failure modes) along with actionable recommendations.
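To make the Message Correlation Trace idea concrete, here is a minimal sketch of how inbound prompts might be paired with the outbound flush that follows them, and how the latency figure could be derived. The `LogEvent` record, the `direction` field, and the `correlate` function are hypothetical names chosen for illustration; the actual Agent Blackbox event schema and pairing logic may differ.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class LogEvent:
    """Hypothetical event record; the real schema may differ."""
    timestamp: datetime
    direction: str  # "inbound" (user prompt) or "outbound" (agent flush)
    message: str


def correlate(events: list[LogEvent]) -> list[dict]:
    """Pair each inbound prompt with the next outbound flush after it."""
    traces = []
    pending = None  # latest inbound event still waiting for a response
    for evt in sorted(events, key=lambda e: e.timestamp):
        if evt.direction == "inbound":
            if pending is not None:
                # The previous prompt never received a flush.
                traces.append({"prompt": pending.message,
                               "response": None, "latency_s": None})
            pending = evt
        elif evt.direction == "outbound" and pending is not None:
            latency = (evt.timestamp - pending.timestamp).total_seconds()
            traces.append({"prompt": pending.message,
                           "response": evt.message, "latency_s": latency})
            pending = None
    if pending is not None:
        traces.append({"prompt": pending.message,
                       "response": None, "latency_s": None})
    return traces


# Made-up sample data: one answered prompt, one still pending.
sample_events = [
    LogEvent(datetime(2026, 6, 27, 10, 0, 0), "inbound", "fix the failing test"),
    LogEvent(datetime(2026, 6, 27, 10, 0, 14, 500000), "outbound", "patched the test"),
    LogEvent(datetime(2026, 6, 27, 10, 5, 0), "inbound", "now run the suite"),
]
```

With this sample data, the first trace carries a latency of 14.5 seconds, and the second has no response, which is the kind of run a pending or busy banner would flag.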

Built for Security and Portability

In keeping with Agent Blackbox's local-first, secure design, the Visual Flight Recorder makes no network requests and builds the report entirely in memory. It produces a self-contained, portable HTML file that you can share with your team or save locally, without pulling external CSS or JavaScript libraries (like Bootstrap, Tailwind, or jQuery) from CDNs.
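One way to sanity-check the "self-contained" property on any generated report is to scan it for `src` or `href` attributes that point at a remote host. The sketch below uses only the standard library; `ExternalRefFinder` and `find_external_refs` are hypothetical helpers written for this post, not part of the Agent Blackbox CLI.

```python
from html.parser import HTMLParser


class ExternalRefFinder(HTMLParser):
    """Collect src/href attribute values that point at a remote host."""

    def __init__(self) -> None:
        super().__init__()
        self.external_refs: list[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                # Conservative: plain <a href> links are flagged as well,
                # even though they only load when clicked.
                if value.lower().startswith(("http://", "https://", "//")):
                    self.external_refs.append(value)


def find_external_refs(document: str) -> list[str]:
    """Return every remote URL referenced from a src or href attribute."""
    finder = ExternalRefFinder()
    finder.feed(document)
    finder.close()
    return finder.external_refs


# Made-up inputs: an inline-only page and one that loads a CDN script.
self_contained = (
    "<html><head><style>body{color:#eee}</style></head>"
    "<body><script>var x = 1;</script></body></html>"
)
with_cdn = (
    '<html><head><script src="https://cdn.example.com/lib.js"></script>'
    "</head></html>"
)
```

This check is deliberately limited: it does not look inside CSS for `url(...)` or `@import`, so an empty result is a useful signal rather than a proof.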

Additionally, because agents deal with untrusted web content and executable code, the HTML compiler applies strict XSS prevention. Every log line is escaped with Python's standard `html.escape()` utility before rendering. If an agent parses a page containing a `<script>` tag, the tag is shown in the report as literal text instead of being executed by the browser.
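As a small illustration of the escaping step, the snippet below runs `html.escape()` on a log line that carries script markup copied from a scraped page. The sample string is made up; only the standard-library call is taken from the description above.

```python
import html

# Made-up log line containing markup an agent might have scraped.
untrusted_line = '<script>alert("pwned")</script>'

# html.escape() replaces &, <, > and (by default) both quote characters.
escaped = html.escape(untrusted_line)
print(escaped)
# → &lt;script&gt;alert(&quot;pwned&quot;)&lt;/script&gt;
```

Because quotes are escaped by default, the same call is also safe for values placed inside double-quoted HTML attributes.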

How to Run it Locally

To compile a safe, sample incident and view the visual report instantly, run the built-in demo suite:

python agent_doctor.py demo --out exports/demo.html

To compile live postmortems of your local gateway activity over the last 3 hours, run:

python agent_doctor.py export-html --minutes 180 --out exports/postmortem.html

Or use the shorter `visual` alias, which takes the same options:

python agent_doctor.py visual --minutes 180 --out exports/postmortem.html

Once generated, double-click `exports/postmortem.html` to open the interactive report in your browser.
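For readers curious how a subcommand and its alias can share one implementation, here is a minimal `argparse` sketch. It is an assumption about how such a CLI could be wired, not the actual contents of `agent_doctor.py`; the default for `--minutes` and the `handler_name` attribute are invented for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser where `visual` is an alias of `export-html`."""
    parser = argparse.ArgumentParser(prog="agent_doctor.py")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # `aliases` makes both names resolve to the same subparser.
    export = subparsers.add_parser("export-html", aliases=["visual"])
    export.add_argument("--minutes", type=int, default=60,
                        help="look-back window in minutes (default is assumed)")
    export.add_argument("--out", required=True,
                        help="path of the HTML file to write")
    # `command` holds the name as typed, so store a stable handler name too.
    export.set_defaults(handler_name="export_html")
    return parser


args = build_parser().parse_args(
    ["visual", "--minutes", "180", "--out", "exports/postmortem.html"]
)
```

Storing a fixed `handler_name` avoids branching on whichever spelling the user typed.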

Zero-Dependency Python Architecture

Under the hood, the rendering engine uses no third-party packages. The HTML is assembled with Python's standard string formatting (f-strings and `str.format()`), which keeps the tool lightweight and easy to install:

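import html

# Event and html_template are assumed to be defined elsewhere in the module.
def html_postmortem(events: list[Event], diagnosis: dict, platform: str) -> str:
    """Render the timeline events and the diagnosis into one HTML string."""
    status = diagnosis.get("status", "unknown")
    # CSS class names use hyphens, e.g. "response_flushed" -> "response-flushed"
    status_class = status.replace("_", "-")

    # Render the timeline item by item, escaping every dynamic value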
    timeline_items_html = []
    for evt in events:
        evt_type = evt.event_type
        severity = evt.severity
        ts_str = evt.timestamp.strftime("%Y-%m-%d %H:%M:%S") if evt.timestamp else ""
        
        item_html = f'''
        <div class="timeline-item" data-type="{html.escape(evt_type)}" data-severity="{html.escape(severity)}">
            <div class="timeline-dot"></div>
            <div class="timeline-content">
                <span class="event-time">{html.escape(ts_str)}</span>
                <span class="event-badge badge-{html.escape(severity.lower())}">{html.escape(severity)}</span>
                <span class="event-type">{html.escape(evt_type)}</span>
                <p class="event-message">{html.escape(evt.message)}</p>
            </div>
        </div>
        '''
        timeline_items_html.append(item_html)
        
    return html_template.format(
        status=status,
        status_class=status_class,
        timeline_html="".join(timeline_items_html),
        # ... and other template fields
    )
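
One detail worth noting when a whole HTML page is filled in with `str.format()`: literal braces in inline CSS or JavaScript must be doubled, otherwise they are parsed as placeholders. The miniature template below is a hypothetical stand-in for `html_template`, which is not shown in this post; it keeps only three of the fields used above.

```python
import html

# Hypothetical miniature template. CSS braces are written as {{ and }} so
# that str.format() emits them literally and fills only the named fields.
MINI_TEMPLATE = """<!DOCTYPE html>
<html>
<head>
<style>
  body {{ background: #111; color: #eee; }}
  .banner.response-flushed {{ background: #1b5e20; }}
</style>
</head>
<body>
<div class="banner {status_class}">{status}</div>
<div class="timeline">{timeline_html}</div>
</body>
</html>"""


def render(status: str, timeline_html: str) -> str:
    """Fill the template; timeline_html is assumed to be escaped already."""
    return MINI_TEMPLATE.format(
        status=html.escape(status),
        status_class=html.escape(status.replace("_", "-")),
        timeline_html=timeline_html,
    )


page = render("response_flushed", "<p>ok</p>")
```

In this sketch the status values are escaped as well, a cautious choice in case a diagnosis string ever originates from log content.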

Upgrade Your Observability Layer

Visualizing non-deterministic agent executions shouldn't require bloated cloud subscriptions. Keep your data local, keep your runs private, and let the Visual Flight Recorder bring pristine clarity to your autonomous systems.

About the Author

Dhiraj Das | Senior Automation Consultant | 10+ years building test automation that actually works. He transforms flaky, slow regression suites into reliable CI pipelines—designing self-healing frameworks that don't just run tests, but understand them.

Creator of many open-source tools solving what traditional automation can't: waitless (flaky tests), sb-stealth-wrapper (bot detection), selenium-teleport (state persistence), selenium-chatbot-test (AI chatbot testing), lumos-shadowdom (Shadow DOM), and visual-guard (visual regression).
