Choosing the Right Data Structure in Python for Automation Projects

November 28, 2025

Visualizing the core Python data structures.

In automation engineering—whether you’re testing web apps, mobile apps, APIs, or databases—your code must be fast, clean, and predictable. One of the biggest factors influencing that is your choice of data structures.

Choosing the right structure is not academic—it directly impacts:

How clean your automation code looks
How fast it executes
How easy it is to maintain or scale
How effectively you store or access test data
How stable and readable your page objects, API utilities, and verifications become

This post outlines the Python data structures most useful to automation developers and gives a real-world example for each use case.

1. List – Best for Ordered Collections

Use lists when you have ordered data that you need to iterate over. Typical uses include storing multiple web elements, maintaining ordered test steps, holding parsed API responses, and tracking device lists in mobile automation.

Code

# Scenario: extract all product names from a category page
from selenium.webdriver.common.by import By

# `driver` is an already-initialised Selenium WebDriver
product_elements = driver.find_elements(By.CSS_SELECTOR, ".product-name")
product_names = [elem.text for elem in product_elements]

assert "iPhone 15" in product_names
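Because a list preserves order, it also suits checks where sequence matters, such as confirming that a "sort by price" control worked. The following is a minimal sketch, assuming the prices have already been scraped into a list of strings; the sample values and the helper name `is_sorted_ascending` are hypothetical.

```python
def is_sorted_ascending(values):
    """Return True if every item is <= the item that follows it."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Hypothetical price strings, in the order they appear on the page.
raw_prices = ["$199.00", "$249.50", "$999.99"]

# Strip the currency symbol and convert to numbers, keeping page order.
prices = [float(text.lstrip("$")) for text in raw_prices]

assert is_sorted_ascending(prices), "Prices are not sorted low to high"
```

The same helper works for any comparable values, for example dates or names.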

2. Dictionary – Best for Key–Value Mappings

Use dictionaries when data must be accessed by keys, not positions. They are a common choice in automation frameworks for configs, test data, and cookies.

Code

# 2.1 API headers / payloads (`token` is assumed to be defined earlier)
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {token}",
}

# 2.2 Page object element locators
locators = {
    "username": (By.ID, "user"),
    "password": (By.ID, "pass"),
    "login_btn": (By.ID, "login"),
}
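Dictionaries also make it easy to layer per-test values over shared defaults. The sketch below is one possible approach and not a prescribed pattern; `DEFAULT_HEADERS`, `build_headers`, and the token value are hypothetical names chosen for illustration.

```python
# Hypothetical defaults shared by every request in the suite.
DEFAULT_HEADERS = {
    "Content-Type": "application/json",
    "Accept": "application/json",
}


def build_headers(token=None, overrides=None):
    """Merge the defaults with an optional token and per-test overrides."""
    headers = dict(DEFAULT_HEADERS)  # copy so the defaults are never mutated
    if token:
        headers["Authorization"] = f"Bearer {token}"
    headers.update(overrides or {})
    return headers


headers = build_headers(token="abc123", overrides={"Accept": "text/csv"})

print(headers["Accept"])          # text/csv
print(headers.get("X-Trace-Id"))  # None, because .get() avoids a KeyError
```

Copying the defaults before updating keeps one test's overrides from leaking into the next.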

3. Set – Best for Unique Collections

Use sets when you need unique values, need to find differences or intersections, and order does NOT matter.

Code

# 3.1 Checking for duplicate values in a dropdown
# (`select` is a selenium.webdriver.support.ui.Select instance)
options = [o.text for o in select.options]
assert len(options) == len(set(options)), "Duplicate values found"

# 3.2 API response validation: unique IDs
ids = [item["id"] for item in response.json()]
assert len(ids) == len(set(ids)), "Duplicate IDs found in API response"
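Set differences and intersections are handy when comparing what a page shows against what the spec expects. This is a small sketch with hypothetical dropdown values; in a real test, `actual` would be built from the page.

```python
# Hypothetical expected and actual values for a "Country" dropdown.
expected = {"India", "Germany", "Japan", "Brazil"}
actual = {"India", "Germany", "Japan", "Canada"}

missing = expected - actual      # in the spec but not on the page
unexpected = actual - expected   # on the page but not in the spec
common = expected & actual       # present in both

print(sorted(missing))     # ['Brazil']
print(sorted(unexpected))  # ['Canada']
print(sorted(common))      # ['Germany', 'India', 'Japan']
```

Reporting `missing` and `unexpected` separately gives a more useful failure message than a plain equality check.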

4. Tuple – Best for Fixed Values

Use tuples when you want an immutable, fixed-size object. Selenium locators are conventionally written as (By, value) tuples.

Code

# 4.1 Selenium locators
username = (By.ID, "username")

# 4.2 Mobile tap coordinates
start_point = (100, 500)
end_point = (300, 500)
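Two properties make tuples a good fit here: they unpack cleanly into named values, and they cannot be modified after creation. The sketch below reuses the coordinates above; the `tap_targets` mapping is a hypothetical illustration.

```python
start_point = (100, 500)
end_point = (300, 500)

# Unpack each tuple into named coordinates.
start_x, start_y = start_point
end_x, end_y = end_point

swipe_distance = end_x - start_x
print(swipe_distance)  # 200

# Tuples are immutable, so an accidental edit fails loudly.
try:
    start_point[0] = 0
except TypeError as exc:
    print(f"Cannot modify a tuple: {exc}")

# Tuples of hashable items can also serve as dictionary keys.
tap_targets = {start_point: "swipe start", end_point: "swipe end"}
print(tap_targets[(300, 500)])  # swipe end
```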

5. NamedTuple / Dataclass – Best for Structured Data

Use a NamedTuple or dataclass when you need structured, readable objects without hand-writing class boilerplate such as __init__ and __repr__.

Code

from dataclasses import asdict, dataclass

@dataclass
class UserData:
    username: str
    email: str
    role: str

# Then use it in tests:
user = UserData("dhiraj", "d@example.com", "admin")
payload = asdict(user)  # documented way to convert a dataclass to a dict
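For records that should never change after creation, `typing.NamedTuple` is a lighter alternative. This is a minimal sketch; the `Device` class and its field names are hypothetical.

```python
from typing import NamedTuple


class Device(NamedTuple):
    """Immutable record describing a test device (hypothetical fields)."""
    name: str
    platform: str
    os_version: str


device = Device("Pixel_8", "Android", "14")

print(device.platform)  # Android

# A NamedTuple still unpacks like a plain tuple.
name, platform, os_version = device

# _asdict() converts the record to a mapping, e.g. for a request payload.
payload = dict(device._asdict())
```

As a rough guide, a NamedTuple fits read-only records, while a dataclass fits objects whose fields a test needs to modify.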

6. Queue / Deque – Best for High Throughput

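Use a deque for job scheduling or test orchestration where items are consumed from the front: deque.popleft() is O(1), whereas list.pop(0) is O(n). For parallel testing across threads, the standard library's queue.Queue adds blocking and task tracking on top of the same first-in, first-out behaviour.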

Code

from collections import deque

devices = deque(["Pixel_8", "iPhone_14", "Samsung_S23"])

# Process devices in first-in, first-out order until the deque is empty
while devices:
    device = devices.popleft()
    run_test_on_device(device)
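When several workers pull from the same pool of devices, `queue.Queue` from the standard library is designed for that kind of sharing between threads. The sketch below assumes two worker threads and uses a hypothetical stand-in for `run_test_on_device`; a real framework would likely add timeouts and error handling.

```python
import queue
import threading


def run_test_on_device(device):
    """Hypothetical stand-in for a real test run."""
    return f"{device}: passed"


device_queue = queue.Queue()
for name in ["Pixel_8", "iPhone_14", "Samsung_S23"]:
    device_queue.put(name)

results = []
results_lock = threading.Lock()


def worker():
    """Pull devices until the queue is empty, recording each outcome."""
    while True:
        try:
            device = device_queue.get_nowait()
        except queue.Empty:
            return  # no work left for this worker
        outcome = run_test_on_device(device)
        with results_lock:
            results.append(outcome)
        device_queue.task_done()


threads = [threading.Thread(target=worker) for _ in range(2)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(len(results))  # 3
```

The order of `results` depends on thread scheduling, so tests should compare it as a set or sort it first.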

7. DefaultDict / Counter – Best for Aggregations

Use Counter to tally values and defaultdict to group them. Both suit API response statistics, log analysis, and error-frequency reports.

Code

from collections import Counter

codes = [resp.status_code for resp in responses]
summary = Counter(codes)

print(summary)
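Where Counter tallies values, `defaultdict` groups them. The following sketch groups failing status codes by endpoint; the `(endpoint, status_code)` pairs are hypothetical sample data.

```python
from collections import defaultdict

# Hypothetical (endpoint, status_code) pairs collected during a test run.
results = [
    ("/login", 200),
    ("/orders", 500),
    ("/login", 401),
    ("/orders", 500),
    ("/profile", 200),
]

failures_by_endpoint = defaultdict(list)
for endpoint, status in results:
    if status >= 400:
        # A missing key starts as an empty list, so no existence check is needed.
        failures_by_endpoint[endpoint].append(status)

print(dict(failures_by_endpoint))
# {'/orders': [500, 500], '/login': [401]}
```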

8. Pandas DataFrame – Best for Large Test Data Sets

Use a DataFrame when dealing with large test data sets loaded from Excel files, DB dumps, or CSV inputs.

Code

import pandas as pd

df = pd.read_csv("testdata.csv")

# itertuples() is generally faster than iterrows() and preserves column dtypes
for row in df.itertuples(index=False):
    test_login(row.username, row.password)

🧩 Data Structure Decision Matrix

A quick guide to choosing the right data structure.

🎯 Final Thoughts

Efficient automation isn’t only about writing scripts—it’s about writing maintainable, scalable, and high-performance test code. Choosing the right data structure reduces bugs, makes your framework faster, and improves readability.

About the Author

Dhiraj Das | Senior Automation Consultant | 10+ years building test automation that actually works. He transforms flaky, slow regression suites into reliable CI pipelines—designing self-healing frameworks that don't just run tests, but understand them.

Creator of many open-source tools solving what traditional automation can't: waitless (flaky tests), sb-stealth-wrapper (bot detection), selenium-teleport (state persistence), selenium-chatbot-test (AI chatbot testing), lumos-shadowdom (Shadow DOM), and visual-guard (visual regression).

January 12, 2026