Announcing pytest-mockllm v0.2.1: "True Fidelity"

December 22, 2025

What's New in v0.2.1

  • True Async & Await: Native coroutines for OpenAI, Anthropic, Gemini, and LangChain
  • Pro Tokenizers: tiktoken integration for >99% token accuracy
  • PII Redaction: Automatic scrubbing of API keys before cassette storage
  • Chaos Engineering: Simulate rate limits, timeouts, and network jitter
  • Python 3.14 Ready: Among the first AI testing tools to officially support and verify Python 3.14

We are thrilled to announce the release of pytest-mockllm v0.2.1, codenamed "True Fidelity".

This release is a complete technical overhaul designed to make LLM testing as robust as the systems you're building. Developers can now test complex asynchronous AI workflows with behavior that closely mirrors production environments.

The Challenge We Solved

When we first released pytest-mockllm, our async support was a "best-effort" wrapper around synchronous mocks. While this worked for simple cases, it failed in production-grade environments where developers used:

  • Complex coroutine orchestration: Real async workflows with multiple awaits
  • Asynchronous generators: Streaming and async invocation via LangChain's `astream` and `ainvoke`
  • Strict type checking: MyPy compatibility requirements
  • Enterprise security: VCR-style recordings risking API key leaks

True Async & Await

We've rewritten our core mocks from the ground up to support real asynchronous patterns. No more fake awaitables: pytest-mockllm now provides native coroutines and async iterators for OpenAI, Anthropic, Gemini, and LangChain.

Every provider mock now implements native `async def` methods that return real coroutines. This ensures that `await` calls behave exactly as they do with real SDKs.

Code
import pytest
from openai import AsyncOpenAI
from pytest_mockllm import mock_openai

@pytest.mark.asyncio
async def test_async_completion():
    with mock_openai() as mock:
        mock.set_response("Hello from pytest-mockllm!")

        # The key is never used; mock_openai intercepts the request
        client = AsyncOpenAI(api_key="test")

        # Real async/await - no fake wrappers
        response = await client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": "Hi"}]
        )

        assert response.choices[0].message.content == "Hello from pytest-mockllm!"
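
The same machinery covers streaming. As a sketch, an async streaming test might look like the following, assuming the mock honors the SDK's `stream=True` flag and yields chunk objects the way the real client does:

Code
@pytest.mark.asyncio
async def test_async_streaming():
    with mock_openai() as mock:
        mock.set_response("streamed reply")
        client = AsyncOpenAI(api_key="test")  # key unused; calls are intercepted

        # With stream=True the SDK returns an async iterator of chunks
        stream = await client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": "Hi"}],
            stream=True,
        )
        chunks = [chunk async for chunk in stream]
        assert chunks  # each chunk mirrors the SDK's ChatCompletionChunk shape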

Pro Tokenizers (tiktoken)

Standard character-based token estimation is often off by 20-30%. By integrating `tiktoken` (OpenAI) and custom heuristics (Anthropic), we brought our accuracy to >99% for standard models.

This allows developers to write precise assertions on usage and cost, which is critical for prompt-window testing and budget limits.

Real Accuracy
Token counts now match exactly what you'd see in your OpenAI dashboard.
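
As a sketch of the kind of assertion this enables, the test below computes the expected count with tiktoken directly and compares it to the mocked usage field; it assumes mock_openai also intercepts the synchronous client and populates `response.usage` from tiktoken, as described above:

Code
import tiktoken
from openai import OpenAI
from pytest_mockllm import mock_openai

def expected_tokens(text: str, model: str = "gpt-4") -> int:
    # tiktoken resolves gpt-4 to its cl100k_base encoding
    return len(tiktoken.encoding_for_model(model).encode(text))

def test_completion_token_count():
    with mock_openai() as mock:
        reply = "Hello from pytest-mockllm!"
        mock.set_response(reply)
        client = OpenAI(api_key="test")  # key unused; calls are intercepted
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": "Hi"}],
        )
        # Assumes the mock derives usage from tiktoken, per the notes above
        assert response.usage.completion_tokens == expected_tokens(reply)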

ROI Dashboard

Run your tests and see your savings! Every session now ends with a professional terminal summary showing exactly how many tokens you avoided paying for.

Code
═══════════════════════════════════════════════════════
   pytest-mockllm ROI Summary
═══════════════════════════════════════════════════════
   Tests Run:        47
   API Calls Mocked: 312
   Tokens Saved:     847,291
   Estimated Cost:   $12.71 (at GPT-4 pricing)
═══════════════════════════════════════════════════════

PII Redaction by Default

Security should never be an afterthought. We implemented a `PIIRedactor` that automatically scrubs sensitive data before a cassette is ever written to disk, so recordings never contain live credentials. Out of the box it targets the following (an illustrative sketch of the scrubbing follows below):

  • `api_key` and `sk-...` strings
  • `Authorization: Bearer ...` headers
  • Sensitive parameters in request bodies

Enterprise Ready
Teams can now safely share VCR cassettes across repositories without exposing credentials.
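
For intuition, here is an illustrative, simplified version of that scrubbing; the patterns and names below are our own sketch, not the shipped `PIIRedactor`'s exact rules:

Code
import re

# Illustrative patterns only; the shipped PIIRedactor's rules may differ
REDACTIONS = [
    (re.compile(r"sk-[A-Za-z0-9]{16,}"), "sk-REDACTED"),                    # OpenAI-style keys
    (re.compile(r"(Authorization:\s*Bearer\s+)\S+", re.I), r"\1REDACTED"),  # auth headers
    (re.compile(r'("api_key"\s*:\s*")[^"]+'), r"\1REDACTED"),               # JSON bodies

def scrub(text: str) -> str:
    """Replace credential-like substrings before a cassette hits disk."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text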

Chaos Engineering for LLMs

The real world is messy. Our new chaos tools let you simulate rate limits, timeouts, network jitter, and random API refusals so you can verify that your retry logic and fallback systems hold up. In the example below, `call_with_retry` is a minimal stand-in for your application's own retry wrapper.

Code
from pytest_mockllm import mock_openai, chaos
import openai

def call_with_retry(prompt: str, attempts: int = 3) -> str:
    # Stand-in for your retry logic; assumes the simulated rate limit
    # surfaces as openai.RateLimitError
    client = openai.OpenAI(api_key="test")
    for _ in range(attempts):
        try:
            response = client.chat.completions.create(
                model="gpt-4", messages=[{"role": "user", "content": prompt}])
            return response.choices[0].message.content
        except openai.RateLimitError:
            continue
    raise RuntimeError("all retries exhausted")

def test_retry_logic():
    with mock_openai() as mock:
        # Simulate a rate limit on the first 2 calls, then succeed
        mock.add_chaos(chaos.rate_limit(times=2))
        mock.set_response("Success after retry!")
        response = call_with_retry(prompt="Hello")
        assert response == "Success after retry!"
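
Rate limits are only one failure mode. By analogy, a timeout drill could look like the sketch below; note that `chaos.timeout` and its signature are our illustration of the timeout simulation this release advertises, not a documented call:

Code
import openai
import pytest
from pytest_mockllm import mock_openai, chaos

def test_timeout_surfaces():
    with mock_openai() as mock:
        # Hypothetical helper, named by analogy with chaos.rate_limit above
        mock.add_chaos(chaos.timeout(times=1))

        client = openai.OpenAI(api_key="test")  # key unused; calls are intercepted
        with pytest.raises(openai.APITimeoutError):
            client.chat.completions.create(
                model="gpt-4",
                messages=[{"role": "user", "content": "Hi"}],
            )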

The First to Python 3.14

We are proud to be one of the first AI testing tools to officially support and verify compatibility with Python 3.14. We are building for the future, today.

Outcomes

  • Zero Flakiness: True async support eliminated `TypeError` and "coroutine not awaited" bugs in CI
  • Enterprise Ready: Secure recording allows teams to share cassettes without security risk
  • Future Proof: Full verification against Python 3.14 ensures the library is ready for the next decade of AI development

Get Started

Code
pip install -U pytest-mockllm
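
Then a minimal first test, mirroring the async example above but with the synchronous client (the API key is a placeholder, since every call is intercepted):

Code
from openai import OpenAI
from pytest_mockllm import mock_openai

def test_first_mock():
    with mock_openai() as mock:
        mock.set_response("It works!")
        client = OpenAI(api_key="test")  # key unused; calls are intercepted
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": "ping"}],
        )
        assert response.choices[0].message.content == "It works!"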

About the Author

Dhiraj Das | Senior Automation Consultant | 10+ years building test automation that actually works. He transforms flaky, slow regression suites into reliable CI pipelines, designing self-healing frameworks that don't just run tests, but understand them.

Creator of many open-source tools solving what traditional automation can't: waitless (flaky tests), sb-stealth-wrapper (bot detection), selenium-teleport (state persistence), selenium-chatbot-test (AI chatbot testing), lumos-shadowdom (Shadow DOM), and visual-guard (visual regression).
