Linux Disk Benchmarking on VPS: A Practical Research Brief

Date: 2026-06-01

Bottom Line

Evaluating disk performance on a Linux VPS requires measuring the right thing. Sequential throughput and high-queue-depth IOPS are the numbers providers advertise, but they're poor predictors of how databases, container orchestrators, and web servers actually feel. The critical metric is p99.9 latency at queue depth 1, with special attention to fsync/fdatasync latency for any workload that cares about durability. A budget VPS can show 300 MB/s sequential writes and 20,000 IOPS while delivering 30–100 ms fsyncs that make etcd, PostgreSQL, and kubelet housekeeping collapse. This brief provides a reproducible, distro-agnostic methodology for benchmarking VPS disks on Ubuntu 26.04, Ubuntu 24.04, and Debian 13, together with the tuning context needed to interpret results.

Key Findings

  • Finding: dd and fio measure different things. dd if=/dev/zero of=/tmp/test bs=1M count=100 tests sequential buffered writes. dd oflag=dsync adds per-write sync overhead but still doesn't expose random I/O or latency distributions. fio is required for meaningful random-I/O and latency-percentile data.
  • Finding: fio's reported clat (completion latency) doesn't include fsync/fdatasync time. A job with fsync=1 or fdatasync=1 can show sub-millisecond clat while the actual per-operation wall-clock time is tens of milliseconds. To measure synchronous-write latency, read the separate fsync/fdatasync/sync_file_range latency section that fio 3.5 and later print for such jobs, use --sync=1 (O_SYNC) so the flush is part of each write, or measure fsync externally.
  • Finding: etcd and similar consensus systems need fdatasync p99 under 10 ms as an absolute ceiling, and under 1 ms for healthy operation. Our fresh Contabo VPS averaged 37 ms concurrent fsync on /var/lib/etcd with spikes to 86 ms - an order of magnitude too slow.
  • Finding: ext4 mount options commit=30 and discard on Ubuntu cloud images increase the window of unwritten data and can add synchronous TRIM overhead. These are baked into the image, not introduced by configuration roles.
  • Finding: Writeback storms on large-RAM hosts are a timing problem, not a raw disk-speed problem. Default vm.dirty_ratio=20 on a 12 GB host is ~2.4 GB of dirty data; on a 256 GB host it's ~51 GB. Moving to byte-based thresholds (vm.dirty_background_bytes, vm.dirty_bytes) smooths flushing and reduces latency spikes.

Background

Why disk benchmarking on VPS is different

A VPS disk isn't a physical device. It's a virtual block device backed by a hypervisor-level storage layer that may be:

  • A local SSD or NVMe drive shared among many tenants
  • A network-attached block volume (Ceph, LINSTOR, Longhorn, etc.)
  • An oversubscribed RAID array with volatile write caches

Each layer introduces its own queuing, caching, and contention behavior. The guest OS sees a block device (/dev/sda) with an I/O scheduler and a filesystem. The actual latency depends on the host's storage stack and neighbor activity. This is why a single dd run can show "good" throughput while a sustained fsync-heavy workload collapses.

The metrics that matter

Metric | Definition | Why it matters
Throughput | MB/s read or written | Large file transfers, backups, video streaming
IOPS | I/O operations per second | Database random access, VM boot storms
Latency (avg) | Mean time per operation | Rough indicator; easily skewed by outliers
Latency (p99.9) | 99.9th percentile completion time | Tail latency = what users feel; critical for SLAs
Queue depth (QD) | In-flight I/O requests | High QD increases throughput at the cost of latency
fsync latency | Time to flush data + metadata to stable storage | Databases, etcd, WALs, any durability guarantee

For typical VPS workloads (web server, database, single-node Kubernetes), the bottleneck is QD1 latency, not QD32 throughput. A PHP request waiting on MySQL doesn't issue 32 parallel disk requests; it issues one and blocks.

Current State

Industry consensus

  • Google Cloud, Oracle Cloud, and Red Hat officially recommend fio over dd for disk benchmarking. Google Cloud explicitly warns that dd uses "a very low I/O queue depth" and "might not accurately test disk performance."
  • etcd documentation states typical fdatasync latency is ~10 ms for spinning disks and <1 ms for SSDs. The etcd runbook flags etcd_disk_wal_fsync_duration_seconds p99 > 10 ms as a problem.
  • VPSBenchmarks (independent test aggregator) grades Contabo VPS disk performance at E (lowest tier) for Cloud VPS 10/20 plans and C–D for dedicated-vCPU VDS plans, confirming that the "NVMe" label doesn't guarantee NVMe-class latency.
  • Hayden James' VPS-Disk-Latency-Bench (2026) popularized a QD1-focused fio script with pass/fail thresholds on p99.9 latency: <0.3 ms = excellent NVMe, <0.5 ms = acceptable, ≥0.5 ms = poor/throttled.

Our empirical findings (Contabo VPS, fresh Ubuntu 26.04)

Test | Result
Sequential write (cached) | 273 MB/s → 1.3 GB/s
Sequential write (direct I/O) | 65–807 MB/s
Single fsync on /var/lib/etcd | 6.5–50 ms (avg ~28 ms)
Concurrent fsync (4 threads) | 11–86 ms (avg 37 ms)
iostat disk util under idle | 0–17%
vmstat context switches | 100–120/s

These numbers confirm that the disk is not broken - it delivers reasonable throughput - but it's unsuitable for fsync-heavy control-plane workloads. When we previously ran Kubernetes on this host, the 500 ms fsync spikes were caused by a retry storm: etcd, kubelet, a crashlooping metrics-server, and API-server cache resyncs were all contending for the same slow disk queue.

Technical or Implementation Details

Reproducible benchmark suite

All commands work on Ubuntu 24.04/26.04 and Debian 13 with packages from default repositories.

1. Install tools

apt update && apt install -y fio sysstat bc

2. Quick sequential throughput baseline (dd)

# Buffered sequential write (tests page cache + disk)
dd if=/dev/zero of=/tmp/seq_test bs=1M count=1000

# Direct sequential write (bypasses page cache)
dd if=/dev/zero of=/tmp/seq_test bs=1M count=1000 oflag=direct
rm /tmp/seq_test

# Synchronous sequential write (per-write sync)
dd if=/dev/zero of=/tmp/seq_test bs=1M count=100 oflag=dsync
rm /tmp/seq_test

Caveat: dd doesn't test random I/O or report latency distributions.

3. Random I/O latency at QD1 (the "what users feel" test)

fio --name=randrw_qd1 \
  --directory=/var/tmp \
  --size=2G \
  --ioengine=libaio \
  --iodepth=1 \
  --numjobs=1 \
  --rw=randrw \
  --rwmixread=70 \
  --bs=4k \
  --direct=1 \
  --runtime=60 \
  --time_based \
  --group_reporting \
  --output-format=json

Look at jobs[0].read.clat_ns.percentile["99.900000"] and the equivalent for write. Thresholds:

  • < 0.3 ms: excellent (true NVMe-class)
  • 0.3–0.5 ms: acceptable
  • ≥ 0.5 ms: poor / oversubscribed
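
Pulling the percentile out of fio's JSON by hand gets tedious across many runs. The following is a minimal sketch of how the lookup and the thresholds above could be automated. The function names and the sample data are hypothetical, written for illustration; only the JSON path is taken from the text above.

```python
import json

# Thresholds in milliseconds, taken from the list above.
EXCELLENT_MS = 0.3
ACCEPTABLE_MS = 0.5


def p999_ms(fio_result: dict, direction: str) -> float:
    """Return the p99.9 completion latency in ms for 'read' or 'write'."""
    percentiles = fio_result["jobs"][0][direction]["clat_ns"]["percentile"]
    return percentiles["99.900000"] / 1_000_000  # nanoseconds -> milliseconds


def grade(latency_ms: float) -> str:
    """Map a p99.9 latency onto the three bands used in this brief."""
    if latency_ms < EXCELLENT_MS:
        return "excellent"
    if latency_ms < ACCEPTABLE_MS:
        return "acceptable"
    return "poor"


# Hypothetical, hand-written sample shaped like fio's JSON output.
# For a real run, replace this with json.load() on a saved result file.
SAMPLE = json.loads("""
{"jobs": [{"read":  {"clat_ns": {"percentile": {"99.900000": 250000}}},
           "write": {"clat_ns": {"percentile": {"99.900000": 1200000}}}}]}
""")

for direction in ("read", "write"):
    ms = p999_ms(SAMPLE, direction)
    print(f"{direction}: p99.9 = {ms:.2f} ms -> {grade(ms)}")
```

To use it on real data, add --output=result.json to the fio command and load that file instead of SAMPLE.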

4. fsync/fdatasync latency (the database/etcd test)

Critical: fio clat with fdatasync=1 does not include the sync time. Read fio's separate fsync/fdatasync/sync_file_range percentiles (fio 3.5+), use sync=1 (O_SYNC), or use an external script.

Option A - fio with O_SYNC:

fio --name=sync_write \
  --directory=/var/tmp \
  --size=1G \
  --ioengine=sync \
  --sync=1 \
  --rw=randwrite \
  --bs=4k \
  --numjobs=1 \
  --runtime=60 \
  --time_based \
  --group_reporting

Option B - Python fsync micro-benchmark (matches our methodology):

python3 -c '
import os, threading, time

def worker(n):
    # Time 10 create + write + fsync cycles; return latencies in ms.
    delays = []
    for i in range(10):
        fname = f"/var/tmp/fsync_test_{n}_{i}"
        start = time.perf_counter()
        with open(fname, "w") as f:
            f.write("x" * 4096)
            f.flush()             # move data from the Python buffer to the kernel
            os.fsync(f.fileno())  # then force it to stable storage
        delays.append((time.perf_counter() - start) * 1000)
        os.remove(fname)
    return delays

# Run 4 workers in parallel so their fsyncs contend for the same disk queue.
threads, results = [], []
for n in range(4):
    t = threading.Thread(target=lambda n=n: results.append(worker(n)))
    threads.append(t)
    t.start()
for t in threads:
    t.join()

all_d = [d for r in results for d in r]
print(f"Concurrent fsync: min={min(all_d):.2f}ms max={max(all_d):.2f}ms avg={sum(all_d)/len(all_d):.2f}ms")
'
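
The script above times file creation, write, and sync together and reports only min/max/avg. A complementary sketch, closer to how a write-ahead log behaves, appends to a single file and times only the sync call, then reports percentiles. This is an illustration under stated assumptions, not the methodology used for the numbers in this brief: the function names, the 2 KiB record size, and the 200-sample default are all hypothetical choices.

```python
import math
import os
import tempfile
import time


def fdatasync_latencies_ms(directory: str, count: int = 200, size: int = 2048) -> list[float]:
    """Append `count` records to one file, timing only the sync call (ms)."""
    sync = getattr(os, "fdatasync", os.fsync)  # fall back where fdatasync is missing
    payload = b"x" * size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory, prefix="wal_probe_")
    try:
        for _ in range(count):
            os.write(fd, payload)  # unbuffered write, goes straight to the kernel
            start = time.perf_counter()
            sync(fd)               # force the data to stable storage
            latencies.append((time.perf_counter() - start) * 1000)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile; pct is in the range (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    # /var/tmp is used because /tmp may be a RAM-backed tmpfs.
    target = "/var/tmp" if os.path.isdir("/var/tmp") else tempfile.gettempdir()
    samples = fdatasync_latencies_ms(target)
    for pct in (50, 99):
        print(f"fdatasync p{pct}: {percentile(samples, pct):.2f} ms")
    print(f"fdatasync max: {max(samples):.2f} ms")
```

With 200 samples the p99.9 is simply the maximum, so raise count to at least a few thousand before quoting a p99.9 figure.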

5. Production-style monitoring

# Real-time disk stats
iostat -x 1 10

# VM-level overview
vmstat 1 10

# I/O pressure stall information (PSI, Linux 4.20+)
cat /proc/pressure/io

# eBPF block I/O histogram (requires bpfcc-tools)
sudo biolatency-bpfcc -D 10 1

# Per-operation block I/O trace for 10 seconds (timeout ends the trace)
sudo timeout 10 biosnoop-bpfcc -d sda
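
The /proc/pressure/io file listed above is plain text with a "some" line and a "full" line, each carrying avg10, avg60, avg300, and total fields. A minimal sketch of turning it into numbers for alerting or logging follows. The parser name and the sample values are hypothetical; on a host where PSI is unavailable or disabled the sketch falls back to the sample.

```python
from pathlib import Path

# Hypothetical sample in the format the kernel uses for PSI files.
SAMPLE = """some avg10=1.25 avg60=0.40 avg300=0.10 total=123456
full avg10=0.80 avg60=0.20 avg300=0.05 total=98765"""


def parse_pressure(text: str) -> dict[str, dict[str, float]]:
    """Parse PSI text into {"some": {"avg10": ...}, "full": {...}}."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {key: float(value)
                        for key, value in (field.split("=") for field in fields)}
    return result


def read_io_pressure(path: str = "/proc/pressure/io") -> dict[str, dict[str, float]]:
    """Read the live PSI file, falling back to SAMPLE if it cannot be read."""
    try:
        return parse_pressure(Path(path).read_text())
    except OSError:
        return parse_pressure(SAMPLE)


if __name__ == "__main__":
    pressure = read_io_pressure()
    # "some" = share of time at least one task was stalled on I/O, in percent.
    print(f"some avg10: {pressure['some']['avg10']:.2f}%")
```

The avg fields are percentages of wall-clock time; total is cumulative stall time in microseconds.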

Filesystem and mount options

The Ubuntu cloud image defaults for / on ext4:

rw,relatime,discard,errors=remount-ro,commit=30

Option | Effect on benchmark
commit=30 | Journal flush every 30s instead of 5s. Batches metadata writes, reducing journal overhead but increasing the window of unwritten data.
discard | Online TRIM. Can add synchronous latency on virtualized storage where TRIM commands are expensive.
barrier (default) | Ensures journal ordering. Safe but adds a flush point. Only safe to disable with battery-backed cache.
noatime | Eliminates read-triggered metadata writes. Safe for most workloads; reduces write amplification.

To remount without discard (test impact):

mount -o remount,noatime,nodiscard /
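
Before and after a remount it helps to confirm which options are actually in effect. The sketch below parses /proc/mounts-style text and reports whether discard is active and what commit interval applies. The helper names and the sample line are hypothetical; the 5-second fallback is ext4's default commit interval.

```python
# Hypothetical sample line in /proc/mounts format:
# device mountpoint fstype options dump pass
SAMPLE_MOUNTS = "/dev/sda1 / ext4 rw,relatime,discard,errors=remount-ro,commit=30 0 0\n"


def mount_options(mounts_text: str, mountpoint: str = "/") -> set[str]:
    """Return the option set for `mountpoint`; the last matching line wins."""
    options: set[str] = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mountpoint:
            options = set(fields[3].split(","))
    return options


def commit_seconds(options: set[str], default: int = 5) -> int:
    """Return the commit= interval, or the ext4 default when it is not set."""
    for opt in options:
        if opt.startswith("commit="):
            return int(opt.split("=", 1)[1])
    return default


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as fh:
            text = fh.read()
    except OSError:
        text = SAMPLE_MOUNTS  # fall back to the sample off-Linux
    opts = mount_options(text)
    print(f"discard active: {'discard' in opts}")
    print(f"journal commit interval: {commit_seconds(opts)} s")
```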

Kernel tuning for write-heavy workloads

If benchmarking reveals bursty latency rather than consistently slow latency, the culprit may be writeback batching rather than the disk itself.

Create /etc/sysctl.d/99-dirty-writeback.conf:

# Start background writeback earlier; cap dirty cache
vm.dirty_background_bytes = 536870912
vm.dirty_bytes = 2147483648
vm.dirty_expire_centisecs = 1500

Apply: sysctl --system

On a VPS with limited RAM, this is often unnecessary. It becomes critical on hosts with 32+ GB RAM where default ratio-based thresholds allow multi-gigabyte dirty bursts.
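
To see why RAM size changes the picture, the sketch below compares ratio-based thresholds with fixed caps of 512 MiB (background) and 2 GiB (hard limit). It assumes the common defaults of vm.dirty_background_ratio=10 and vm.dirty_ratio=20 and applies them to total RAM. The kernel actually applies the ratios to available (free plus reclaimable) memory, so these figures are upper bounds, and the function name is hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Fixed caps discussed in this section: 512 MiB background, 2 GiB hard limit.
BACKGROUND_CAP = 512 * MIB
DIRTY_CAP = 2 * GIB


def ratio_threshold_bytes(ram_bytes: int, ratio_percent: int) -> int:
    """Upper bound on a ratio-based dirty threshold for a given RAM size."""
    return ram_bytes * ratio_percent // 100


for ram_gib in (12, 32, 256):
    ram = ram_gib * GIB
    background = ratio_threshold_bytes(ram, 10)  # assumed default background ratio
    hard = ratio_threshold_bytes(ram, 20)        # assumed default dirty ratio
    print(f"{ram_gib:>3} GiB RAM: ratio-based {background / GIB:5.1f} / {hard / GIB:5.1f} GiB"
          f" vs fixed {BACKGROUND_CAP / GIB:.1f} / {DIRTY_CAP / GIB:.1f} GiB")
```

On a 12 GiB host the fixed hard cap is close to what the ratio already allows, while on a 256 GiB host it cuts the potential dirty burst by more than an order of magnitude.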

Evidence, Comparisons, and Related Context

Expected baselines (from sources)

Storage type | Sequential write | Random 4K QD1 read p99 | fdatasync p99
Consumer NVMe (local) | 3,000–7,000 MB/s | <0.3 ms | <0.5 ms
SATA SSD (local) | 500 MB/s | 0.5–2 ms | 1–5 ms
Enterprise SAS HDD | 150–200 MB/s | 10–20 ms | ~10 ms
Cloud SSD (AWS gp3) | 1,000 MB/s | 1–5 ms | 2–10 ms
Budget VPS "NVMe" (Contabo) | 100–300 MB/s | 0.5–1.5 ms | 30–100 ms

The Contabo fsync latency is the outlier: its sequential throughput is competitive, but its synchronous-write latency is HDD-class or worse despite the NVMe label.

Why providers advertise IOPS, not latency

Provider marketing benchmarks run at QD32 with many jobs. At QD32, an SSD can reorder and parallelize requests, inflating IOPS while each individual operation waits longer. A web server or database running at QD1 doesn't benefit from that parallelism; it benefits from fast individual completion. As the linuxblog.io benchmark showed, a provider with the highest QD1 IOPS can still have the worst QD1 latency due to oversubscription and noisy neighbors.
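
The trade-off between queue depth and per-operation latency follows from Little's law: in-flight requests = IOPS × mean latency. A short sketch with illustrative numbers (hypothetical, not measurements from this brief) shows how a headline IOPS figure at QD32 can hide a slower individual operation than a modest figure at QD1.

```python
def mean_latency_ms(queue_depth: int, iops: float) -> float:
    """Little's law rearranged: mean latency = queue depth / IOPS."""
    return queue_depth / iops * 1000  # seconds -> milliseconds


# Hypothetical figures: a marketing-style QD32 result vs a QD1 result.
advertised = mean_latency_ms(queue_depth=32, iops=20_000)
single = mean_latency_ms(queue_depth=1, iops=3_000)

print(f"QD32 at 20,000 IOPS: {advertised:.2f} ms mean per operation")
print(f"QD1  at  3,000 IOPS: {single:.2f} ms mean per operation")
```

Little's law constrains only the mean. Tail percentiles can still diverge widely between two disks with similar IOPS, which is why this brief measures p99.9 rather than relying on IOPS alone.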

Tool comparison

Tool | Best for | Avoid for
dd | 30-second sequential sanity check | Random IOPS, latency percentiles, fsync
fio | Comprehensive workload simulation | Quick one-liners without reading docs
sysbench | Standardized cross-platform comparison | Fsync-specific latency (no native fsync mode)
ioping | Simple latency ping to a file | Throughput or IOPS
iostat | Ongoing device utilization monitoring | Deep latency histograms
biolatency (eBPF) | Production latency distribution tracing | Benchmarking (it's a monitor, not a load generator)
VPS-Disk-Latency-Bench | Standardized VPS comparison script | Raw block devices (it's file-based)

Limitations and Critiques

  • fio complexity: fio has hundreds of parameters. Misconfigured jobs (e.g., testing buffered I/O when you meant direct I/O) produce meaningless results. Always verify with --output-format=json and inspect all three latency figures: slat (submission), clat (completion), and lat (total).
  • The 10 ms etcd rule is debated: Red Hat noted that the old IBM guidance ("fdatasync p99 < 10 ms") was causing customer confusion. Ten milliseconds is a survival threshold, not a performance target. For a responsive cluster, aim for <1–2 ms.
  • Cloud image defaults aren't tunable at boot: commit=30 and discard come from the Ubuntu cloud image fstab. Changing them requires editing /etc/fstab and rebooting, or a live remount. The impact of discard on virtualized fsync latency is documented anecdotally but not rigorously quantified in the sources we found.
  • eBPF tools require privileges: biolatency, biosnoop, and ext4slower need root, plus kernel headers for the BCC builds (the *-bpfcc packages) or kernel BTF for libbpf CO-RE builds. They're invaluable for production diagnosis but not available in restricted containers or hardened kernels.
  • Our Contabo sample is n=1: We tested one instance of one plan at one point in time. Contabo's consistency score at VPSBenchmarks is 53/100, meaning high variance across instances. Your mileage will vary.

Open Questions

  1. What is the exact quantitative impact of discard on fsync latency for QEMU virtio disks backed by qcow2 on NVMe?
  2. Is there a safe, provider-agnostic way to detect whether a VPS disk is locally attached NVMe, network block, or oversubscribed RAID?
  3. Can a standardized fio jobfile be adopted across the community to replace provider-biased benchmark scripts?
  4. How does mq-deadline vs none I/O scheduler affect tail latency on virtualized block devices with noisy neighbors?

Practical Takeaways

  1. Never trust a single dd result. Run it with oflag=direct and oflag=dsync, then follow up with fio at QD1.
  2. Always measure p99.9 latency, not average. Averages hide the outliers that kill application responsiveness.
  3. Test fsync explicitly. If you run databases, etcd, or anything with a WAL, use the Python fsync script or fio with --sync=1. A disk that does 1 GB/s sequential but 50 ms fsyncs will ruin your cluster.
  4. Check your mount options. If discard is active and you see erratic fsync spikes, try remounting with nodiscard and retest.
  5. Monitor in production, not just at provision time. Install sysstat and bpfcc-tools, and occasionally run biolatency-bpfcc to catch latency regressions caused by neighbor contention.
  6. Apply the Contabo lesson. Budget VPS storage is fine for staging, static sites, and CPU-bound workloads. It's a poor choice for latency-sensitive control planes unless you accept the retry-storm risk.

Sources Used


This brief incorporates direct empirical measurements from a fresh Ubuntu 26.04 installation on a Contabo VPS (June 2026) alongside publicly documented best practices from cloud providers, storage vendors, and the Linux kernel community.