Testinfra as Verification Infrastructure for Ansible Roles

Abstract

Ansible is excellent at converging infrastructure, but Ansible playbooks are a poor fit for large verification suites. Verification code needs fast feedback, crisp assertions, parameterization, good failure locality, and a test runner that encourages small independent checks. This paper summarizes practical lessons from replacing an Ansible assertion playbook with a pytest-testinfra suite for an Ubuntu 26.04 container-host role. The resulting suite validates production host state directly: kernel modules, sysctls, PAM limits, systemd service state, transparent hugepage mode, swap state, and locale configuration.

The core finding is simple: testinfra is most useful when treated as infrastructure testing, not as a Python wrapper around shell scripts. Native testinfra modules such as host.sysctl, host.file, host.service, and host.package produce clearer tests than hand-rolled remote commands. Performance then comes from pytest-level parallelism and SSH connection reuse rather than from batching unrelated assertions into opaque scripts.

Context

The role under test configures Ubuntu 26.04 virtual machines intended to run container workloads. The desired host contract includes:

  • Container kernel modules loaded;
  • Runtime sysctl values set;
  • Persistent sysctl configuration present;
  • High PAM limits for file descriptors and processes;
  • Transparent hugepages configured as madvise through a systemd one-shot;
  • Swap disabled at runtime and absent from /etc/fstab;
  • UTF-8 locale package, generated locale, and default locale configured.

The initial verification approach used an Ansible playbook with assert tasks. That was serviceable for a small number of checks, but the structure degraded quickly as coverage grew. Every assertion required task plumbing, registered variables, loops, slurp, filters, and custom failure messages. The test file looked like another configuration playbook rather than a verification specification.

Why Ansible Assertions Did Not Scale

Ansible verification has three recurring problems for this use case.

First, the unit of execution is too heavy. A single logical check often becomes multiple tasks: collect state, transform state, assert state. That expands noise around simple facts such as “this sysctl equals this value.”

Second, Ansible’s data access and filter syntax is awkward for test code. Even basic structures can produce surprising failures. For example, a dictionary key named items or values collides with the dict method of the same name, so dot notation returns the method rather than the stored value.

Third, performance encourages the wrong abstraction. Reading 38 sysctls one task at a time is slow, so the natural optimization is to batch remote reads in a Python snippet. That makes the playbook faster, but it also hides individual checks behind a custom collector.

These problems aren't signs that Ansible is bad. They're signs that convergence and verification are different jobs.

Testinfra Design Principles

The replacement suite followed four principles.

1. Test observable host state

The tests don't inspect Ansible task internals. They validate the deployed machine:

assert host.file("/sys/module/overlay").is_directory
assert str(host.sysctl("vm.swappiness")) == "10"
assert host.service("transparent-hugepages-madvise.service").is_enabled

This keeps the suite useful as production deployment verification. A host passes when the system state is correct, regardless of exactly how the role produced it.

2. Use native testinfra modules heavily

The best testinfra tests read like infrastructure assertions. Native modules encode common system concepts:

  • host.sysctl(name) for kernel parameters;
  • host.file(path) for file existence, content, and permissions;
  • host.service(name) for systemd service state;
  • host.package(name) for package installation;
  • host.check_output(command) for the few cases where no native abstraction exists.

This avoids reinventing a remote execution framework inside the tests.

3. Keep tests self-contained

The expected values are hardcoded in the test suite. This is intentional. Verification should be an independent contract, not a mirror that imports the role’s variables and repeats whatever the implementation currently says. If both the role and the tests read from the same source of truth, the tests can pass while the intended behavior silently changes.

4. Prefer small parameterized assertions

Instead of one giant “all sysctls match” test, the suite uses pytest parameterization. Each sysctl, module, and PAM limit becomes its own test case. This improves failure locality and works well with parallel execution.

Performance Findings

The first testinfra implementation passed but was slower than expected because it ran many checks serially over SSH. Two optimizations mattered.

Parallel execution with pytest-xdist

pytest-xdist distributes tests across worker processes:

pytest -n 8 --dist=worksteal

For a suite with many independent parameterized checks, this is a natural fit. The sysctl and file-content checks are independent and read-only, so parallelism is safe. In practice, moving to eight workers reduced runtime from roughly a minute and a half to around 36–40 seconds in this environment.

SSH connection reuse with ControlPersist

Testinfra’s SSH backend supports persistent connections through controlpersist. With the Ansible inventory backend, the host specification can pass this option through:

--hosts='ansible://all?controlpersist=300&timeout=30'

Connection reuse avoids paying a full SSH handshake for every command. It's especially important when tests are intentionally small and many.
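
When the host specification is assembled in code rather than typed on the command line, the query options can be built with the standard library. The helper name build_hosts_spec is hypothetical; only the controlpersist and timeout option names and the ansible:// scheme come from the text above.

```python
from urllib.parse import urlencode


def build_hosts_spec(pattern="all", controlpersist=300, timeout=30):
    """Return a testinfra host spec for the Ansible backend (hypothetical helper)."""
    # controlpersist: seconds the SSH master connection is kept open for reuse.
    # timeout: SSH connection timeout in seconds.
    query = urlencode({"controlpersist": controlpersist, "timeout": timeout})
    return f"ansible://{pattern}?{query}"


print(build_hosts_spec())
```

The resulting string is the same value that would be passed to --hosts.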

The final Makefile target combined both optimizations:

TESTINFRA_HOSTS ?= ansible://all?controlpersist=300&timeout=30
TESTINFRA_WORKERS ?= 8

test:
	ANSIBLE_HOST_KEY_CHECKING=False $(TESTINFRA_PYTEST) \
		-n $(TESTINFRA_WORKERS) --dist=worksteal \
		--hosts='$(TESTINFRA_HOSTS)' \
		--ansible-inventory=$(ANSIBLE_INVENTORY) \
		$(ANSIBLE_ROLE_DIR)/tests

Practical Test Structure

A maintainable testinfra file should group expected values near the tests that use them. For example, kernel module constants live beside the kernel module test, sysctl constants live beside sysctl tests, and locale constants live beside locale tests. This avoids a large global constants block that readers must cross-reference.

A representative sysctl pattern is:

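import pytest

# Expected runtime values, hardcoded as the independent contract.
EXPECTED_SYSCTLS = {
    "net.ipv4.ip_forward": "1",
    "vm.swappiness": "10",
}

# One test case per sysctl; list() makes the parameter set explicit.
@pytest.mark.parametrize("name, expected", list(EXPECTED_SYSCTLS.items()))
def test_runtime_sysctl_value(host, name, expected):
    # Normalize whitespace so tab-separated values compare cleanly.
    assert " ".join(str(host.sysctl(name)).split()) == expected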

The whitespace normalization handles sysctls such as net.ipv4.ip_local_port_range, which may be returned with tabs rather than spaces.
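
The normalization can be checked in isolation. The sketch below pulls it into a helper (normalize_sysctl is a name chosen here for illustration) and applies it to a tab-separated value and to an integer. The port range numbers are illustrative, not the role's expected values.

```python
def normalize_sysctl(value):
    """Collapse any run of whitespace in a sysctl value to a single space."""
    return " ".join(str(value).split())


# Illustrative raw values: a tab-separated range and an integer.
assert normalize_sysctl("32768\t60999") == "32768 60999"
assert normalize_sysctl(10) == "10"
```

The str() call matters because a sysctl value is not guaranteed to arrive as a string, so the comparison against a hardcoded string stays uniform.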

Persistent file checks can use host.file(...).contains(...), accepting that the regex syntax is grep-style rather than Python-style. That small constraint is worth it because the assertion executes through testinfra’s file abstraction rather than custom remote parsing.

Lessons Learned

  1. Don't port Ansible tasks line-for-line. A testinfra suite should assert system facts, not reproduce playbook mechanics.
  2. Use testinfra modules before reaching for shell. Shell commands should be the fallback, not the default.
  3. Self-contained expected values catch unintended drift. Importing role vars into tests weakens the tests as an independent contract.
  4. Many small tests are better than fewer opaque tests. Pytest parameterization and xdist make small tests practical.
  5. Remote verification needs connection strategy. Without SSH reuse and parallelism, clean tests can feel slow enough to tempt bad batching.
  6. Production verification is different from role idempotence. Ansible lint and idempotence checks say the role is well-formed; testinfra says the host is actually ready.

Conclusion

Testinfra is a strong fit for production verification of Ansible-managed hosts when used idiomatically. The most important shift is conceptual: treat the test suite as an independent specification of deployed host state. Hardcode the contract, use native modules, keep assertions small, and let pytest provide parameterization and parallel execution.

For infrastructure roles that configure real operating-system behavior, this approach produces tests that are easier to read, faster to run, and more trustworthy than large Ansible assertion playbooks.