hwbench: Adding fio engine
This commit adds a first engine_module, cmdline, which executes a
single fio command line.

This code has been tested with fio-3.19, which is therefore the minimum
supported release.

To enable this mode, engine_module must be set to "cmdline".
The expected command line to forward to fio must be provided in the engine_module_parameter_base.

The command line will be tweaked by hwbench to ensure:
- runtime consistency with other engines: --time_based and --runtime are added
- output consistency: --output-format=json+ is added
- job naming: --name is adjusted to match hwbench's job name
- logs: --write_{bw,lat,hist,iops}_log are enabled and averaged over
  20-second windows (--log_avg_msec=20000)
- cache invalidation: each benchmark invalidates the cache (--invalidate=1)
  to ensure out-of-cache testing

Please note that:
- fio's runtime is automatically inherited from hwbench's runtime value.
- the --numjobs value is fed from 'stressor_range', making it possible to
  study the scalability of a device with minimal configuration.

If any of these values is already present in
engine_module_parameter_base, hwbench replaces it with the value
computed from the benchmark description.

A sample configuration file (configs/fio.conf) is provided as an example. It:
- tests /dev/nvme0n1 with a randread 4k profile
- automatically creates two benchmarks as per the stressor_range
  value ("4,6"):
-- one with numjobs=4
-- one with numjobs=6

A test suite is added to ensure proper parsing and benchmark job creation.

Documentation is also added to detail the behavior of this first
implementation.

Signed-off-by: Erwan Velu <e.velu@criteo.com>
ErwanAliasr1 authored and anisse committed Dec 23, 2024
1 parent bc61cec commit 901b13e
Showing 12 changed files with 402 additions and 2 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -26,6 +26,7 @@ The current version of hwbench supports 3 different engines.
- [stress-ng](https://github.com/ColinIanKing/stress-ng): no need to present this very popular low-level benchmarking tool
- spike: a custom engine used to make fans spike. Very useful to study the cooling strategy of a server.
- sleep: a stupid sleep call used to observe how the system is behaving in idle mode
- [fio](https://github.com/axboe/fio): a flexible storage benchmarking tool, see [documentation](./documentation/fio.md)

Benchmark performance metrics are extracted and saved for later analysis.

15 changes: 15 additions & 0 deletions configs/fio.conf
@@ -0,0 +1,15 @@
# This configuration will :
# - test /dev/nvme0n1 in 4k randread for 40 seconds
# -- first with 4 stressors
# -- then with 6 stressors
[global]
runtime=40
monitor=all

[randread_cmdline]
engine=fio
engine_module=cmdline
engine_module_parameter_base=--filename=/dev/nvme0n1 --direct=1 --rw=randread --bs=4k --ioengine=libaio --iodepth=256 --group_reporting --readonly
hosting_cpu_cores=all
hosting_cpu_cores_scaling=none
stressor_range=4,6
53 changes: 53 additions & 0 deletions documentation/fio.md
@@ -0,0 +1,53 @@
# FIO

hwbench can use [fio](https://github.com/axboe/fio) to perform storage benchmarking.
The current implementation requires fio >= 3.19.
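
The version requirement above can be checked along the following lines. This is a minimal sketch, not code from the commit: the helper names `parse_fio_version` and `is_supported` are hypothetical, and it assumes that `fio --version` prints a string such as `fio-3.19`. Comparing `(major, minor)` tuples avoids rejecting a future 4.0 release, which separate `major >= 3` and `minor >= 19` tests would do.

```python
def parse_fio_version(stdout: bytes) -> tuple[int, int]:
    """Extract (major, minor) from output such as b"fio-3.19\n" (assumed format)."""
    version = stdout.split(b"-")[1].strip()
    major, minor = version.split(b".")[:2]
    return int(major), int(minor)


def is_supported(stdout: bytes, minimum: tuple[int, int] = (3, 19)) -> bool:
    """Return True when the reported version is at least `minimum`."""
    # Tuples compare element by element: major first, then minor.
    return parse_fio_version(stdout) >= minimum


print(is_supported(b"fio-3.19\n"))
```

Any stricter validation of the version string (suffixes, release candidates) is left out of this sketch.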

# Concept
Fio can be operated in three (3) different modes, selected with the `engine_module` directive. Only the command line mode is implemented so far.


## Command line

When `engine_module=cmdline` is used, the content of `engine_module_parameter_base` will be passed directly to fio with some limitations.

The following fio keywords are automatically defined, or replaced if present, by hwbench:

- `--runtime`: set to match the exact duration of the current hwbench benchmark.
- `--time_based`: mandatory so that the benchmark lasts exactly `runtime` seconds.
- `--output-format`: hwbench needs the output in `json+` format for easy integration.
- `--name`: hwbench uses the current job name to ensure it is unique across runs.
- `--numjobs`: defined by `stressor_range`, which can be a single value or a list of values. Each value generates a new benchmark.
- `--write_{bw|lat|hist|iops}_log`: hwbench automatically collects the performance logs, averaged over 20 seconds (`--log_avg_msec=20000`), so that hwgraph can draw time-based graphs.
- `--invalidate`: hwbench ensures that every benchmark runs out of cache.

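As an illustration of the replacement rule above, the sketch below overrides user-supplied options with enforced ones. The function `enforce_options` and its arguments are hypothetical and simplified compared with what hwbench actually does; options are matched on the exact name that precedes `=`.

```python
from typing import Optional


def enforce_options(
    user_args: list[str], enforced: dict[str, Optional[str]]
) -> list[str]:
    """Return user_args with every enforced option overridden (hypothetical helper)."""

    def option_name(arg: str) -> str:
        # "--runtime=30" -> "--runtime"; flags such as "--readonly" are unchanged
        return arg.split("=", 1)[0]

    # Drop any user-supplied occurrence of an enforced option...
    kept = [arg for arg in user_args if option_name(arg) not in enforced]
    # ...then append the enforced form: a bare flag when the value is None
    forced = [
        name if value is None else f"{name}={value}"
        for name, value in enforced.items()
    ]
    # Sorting gives a stable string representation of the command line
    return sorted(kept + forced)


args = enforce_options(
    ["--rw=randread", "--runtime=30", "--numjobs=10"],
    {"--runtime": "40", "--time_based": None, "--numjobs": "4"},
)
print(" ".join(args))
```

Building a new list, instead of removing items from the list being iterated, sidesteps the usual pitfall of skipping elements during in-place removal.
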
### Sample configuration file

The following job defines two benchmarks on the same device (nvme0n1).

The `randread_cmdline` job will create :
- `randread_cmdline_0` benchmark with ``numjobs=4`` extracted from `stressor_range` list
- `randread_cmdline_1` benchmark with ``numjobs=6`` extracted from `stressor_range` list

```
[randread_cmdline]
runtime=600
engine=fio
engine_module=cmdline
engine_module_parameter_base=--filename=/dev/nvme0n1 --direct=1 --rw=randread --bs=4k --ioengine=libaio --iodepth=256 --group_reporting --readonly
hosting_cpu_cores=all
hosting_cpu_cores_scaling=none
stressor_range=4,6
```
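
The expansion of a job into benchmarks described above can be sketched as follows. `expand_job` is a hypothetical helper written for this illustration; it only handles a comma-separated `stressor_range` and ignores any other syntax hwbench may accept.

```python
def expand_job(job_name: str, stressor_range: str) -> list[tuple[str, int]]:
    """Return one (benchmark_name, numjobs) pair per stressor_range value."""
    values = [int(value) for value in stressor_range.split(",")]
    # The benchmark name is the job name suffixed with its position in the list
    return [
        (f"{job_name}_{position}", numjobs)
        for position, numjobs in enumerate(values)
    ]


for bench_name, numjobs in expand_job("randread_cmdline", "4,6"):
    print(f"{bench_name}: --numjobs={numjobs}")
```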

Please note that `hosting_cpu_cores` only selects the set of cores fio is pinned to. A possible usage is to combine a list of cores with `hosting_cpu_cores_scaling` to study the performance of the same storage device from different NUMA domains.

## External file execution
Hwbench executes an already existing fio job file.

Not yet implemented.

## Automatic job definition
Hwbench automatically creates jobs based on some hardware detection and profiles.

Not yet implemented.
4 changes: 2 additions & 2 deletions hwbench/bench/benchmark.py
@@ -123,9 +123,9 @@ def pre_run(self):
cpu_location = ""
if p.get_pinned_cpu():
if isinstance(p.get_pinned_cpu(), (int, str)):
- cpu_location = " on CPU {:3d}".format(p.get_pinned_cpu())
+ cpu_location = " pinned on CPU {:3d}".format(p.get_pinned_cpu())
elif isinstance(p.get_pinned_cpu(), list):
- cpu_location = " on CPU [{}]".format(
+ cpu_location = " pinned on CPU [{}]".format(
h.cpu_list_to_range(p.get_pinned_cpu())
)
else:
46 changes: 46 additions & 0 deletions hwbench/bench/test_fio.py
@@ -0,0 +1,46 @@
from . import test_benchmarks_common as tbc


class TestFio(tbc.TestCommon):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.load_mocked_hardware(
            cpucores="./hwbench/tests/parsing/cpu_cores/v2321",
            cpuinfo="./hwbench/tests/parsing/cpu_info/v2321",
            numa="./hwbench/tests/parsing/numa/8domainsllc",
        )
        self.load_benches("./hwbench/config/fio.conf")
        self.parse_jobs_config()
        self.QUADRANT0 = list(range(0, 16)) + list(range(64, 80))
        self.QUADRANT1 = list(range(16, 32)) + list(range(80, 96))
        self.ALL = list(range(0, 128))

    def test_fio(self):
        """Check fio syntax."""
        assert self.benches.count_benchmarks() == 2
        assert self.benches.count_jobs() == 1
        assert self.benches.runtime() == 80

        for bench in self.benches.benchs:
            self.assertIsNone(bench.validate_parameters())
            # Without "assert" this comparison would be evaluated and discarded
            assert bench.get_parameters().get_name() == "randread_cmdline"

        bench_0 = self.get_bench_parameters(0)
        assert (
            bench_0.get_engine_module_parameter_base()
            == "--bs=4k --direct=1 --filename=/dev/nvme0n1 --group_reporting \
--invalidate=1 --iodepth=256 --ioengine=libaio --log_avg_msec=20000 --name=randread_cmdline_0 \
--numjobs=4 --output-format=json+ --readonly --runtime=40 --rw=randread --time_based \
--write_bw_log=fio/randread_cmdline_0_bw.log --write_hist_log=fio/randread_cmdline_0_hist.log \
--write_iops_log=fio/randread_cmdline_0_iops.log --write_lat_log=fio/randread_cmdline_0_lat.log"
        )

        bench_1 = self.get_bench_parameters(1)
        assert (
            bench_1.get_engine_module_parameter_base()
            == "--bs=4k --direct=1 --filename=/dev/nvme0n1 --group_reporting \
--invalidate=1 --iodepth=256 --ioengine=libaio --log_avg_msec=20000 --name=randread_cmdline_1 \
--numjobs=6 --output-format=json+ --readonly --runtime=40 --rw=randread --time_based \
--write_bw_log=fio/randread_cmdline_1_bw.log --write_hist_log=fio/randread_cmdline_1_hist.log \
--write_iops_log=fio/randread_cmdline_1_iops.log --write_lat_log=fio/randread_cmdline_1_lat.log"
        )
18 changes: 18 additions & 0 deletions hwbench/config/fio.conf
@@ -0,0 +1,18 @@
# This configuration will :
# - test /dev/nvme0n1 in 4k randread for 40 seconds
# -- first with 4 stressors
# -- then with 6 stressors
#
# As --runtime=30, --numjobs=10 and --name=plop are set by the user, they should be replaced by the values computed by hwbench (e.g. runtime=40)
[global]
runtime=40
monitor=all

[randread_cmdline]
engine=fio
engine_module=cmdline
engine_module_parameter_base=--filename=/dev/nvme0n1 --direct=1 --rw=randread --bs=4k --ioengine=libaio --iodepth=256 --group_reporting --readonly --runtime=30 --numjobs=10 --name=plop
hosting_cpu_cores=all
hosting_cpu_cores_scaling=none
stressor_range=4,6

26 changes: 26 additions & 0 deletions hwbench/config/test_parse_fio.py
@@ -0,0 +1,26 @@
from unittest.mock import patch
from ..environment.mock import MockHardware
from ..bench import test_benchmarks_common as tbc


class TestParseConfig(tbc.TestCommon):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.hw = MockHardware()
        self.load_benches("./hwbench/config/fio.conf")

    def test_sections_name(self):
        """Check if sections names are properly detected."""
        sections = self.get_jobs_config().get_sections()
        assert sections == [
            "randread_cmdline",
        ]

    def test_keywords(self):
        """Check if all keywords are valid."""
        try:
            with patch("hwbench.utils.helpers.is_binary_available") as iba:
                iba.return_value = True
                self.get_jobs_config().validate_sections()
        except Exception as exc:
            assert False, f"'validate_sections' detected a syntax error {exc}"
213 changes: 213 additions & 0 deletions hwbench/engines/fio.py
@@ -0,0 +1,213 @@
import json
import pathlib
from typing import Any


from ..bench.parameters import BenchmarkParameters
from ..bench.engine import EngineBase, EngineModuleBase
from ..bench.benchmark import ExternalBench
from ..utils.helpers import fatal


class EngineModuleCmdline(EngineModuleBase):
    """This class implements the EngineModuleBase for fio"""

    def __init__(self, engine: EngineBase, engine_module_name: str, fake_stdout=None):
        super().__init__(engine, engine_module_name)
        self.engine_module_name = engine_module_name
        self.load_module_parameter(fake_stdout)

    def load_module_parameter(self, fake_stdout=None):
        # if needed add module parameters to your module
        self.add_module_parameter("cmdline")

    def validate_module_parameters(self, p: BenchmarkParameters):
        msg = super().validate_module_parameters(p)
        FioCmdLine(self, p).parse_parameters()
        return msg

    def run_cmd(self, p: BenchmarkParameters):
        return FioCmdLine(self, p).run_cmd()

    def run(self, p: BenchmarkParameters):
        return FioCmdLine(self, p).run()

    def fully_skipped_job(self, p) -> bool:
        return FioCmdLine(self, p).fully_skipped_job()


class Engine(EngineBase):
    """The main fio class."""

    def __init__(self, fake_stdout=None):
        super().__init__("fio", "fio")
        self.add_module(EngineModuleCmdline(self, "cmdline", fake_stdout))

    def run_cmd_version(self) -> list[str]:
        return [
            self.get_binary(),
            "--version",
        ]

    def run_cmd(self) -> list[str]:
        return []

    def parse_version(self, stdout: bytes, _stderr: bytes) -> bytes:
        self.version = stdout.split(b"-")[1].strip()
        return self.version

    def version_major(self) -> int:
        if self.version:
            return int(self.version.split(b".")[0])
        return 0

    def version_minor(self) -> int:
        if self.version:
            return int(self.version.split(b".")[1])
        return 0

    def parse_cmd(self, stdout: bytes, stderr: bytes):
        return {}


class Fio(ExternalBench):
    """The Fio stressor."""

    def __init__(
        self, engine_module: EngineModuleBase, parameters: BenchmarkParameters
    ):
        ExternalBench.__init__(self, engine_module, parameters)
        self.parameters = parameters
        self.engine_module = engine_module
        self.log_avg_msec = 20000  # write_*_log are averaged at 20sec
        self._parse_parameters()
        # Tests can skip this part
        if isinstance(parameters.out_dir, pathlib.PosixPath):
            parameters.out_dir.joinpath("fio").mkdir(parents=True, exist_ok=True)

    def version_compatible(self) -> bool:
        engine = self.engine_module.get_engine()
        # Compare (major, minor) as a tuple so that e.g. 4.0 is accepted
        return (engine.version_major(), engine.version_minor()) >= (3, 19)

    def _parse_parameters(self):
        self.runtime = self.parameters.runtime
        if self.runtime * 1000 < self.log_avg_msec:
            fatal(
                f"Fio runtime cannot be lower than the average log time ({self.log_avg_msec})."
            )

    def need_skip_because_version(self):
        if self.skip:
            # we already skipped this benchmark, we can't know the reason anymore
            # because we might not have run the version command.
            return ["echo", "skipped benchmark"]
        if not self.version_compatible():
            print(f"WARNING: skipping benchmark {self.name}, needs fio >= 3.19")
            self.skip = True
            return ["echo", "skipped benchmark"]
        return None

    def run_cmd(self) -> list[str]:
        skip = self.need_skip_because_version()
        if skip:
            return skip

        # Let's build the command line to run the tool
        args = [
            self.engine_module.get_engine().get_binary(),
        ]

        return self.get_taskset(args)

    def get_default_fio_command_line(self, args: list) -> list:
        """Return the default fio arguments"""

        def remove_arg(args, item) -> list:
            if isinstance(item, str):
                return [arg for arg in args if not arg.startswith(item)]
            else:
                # We need to ensure that value-based items have the right value.
                # This avoids a case where the user already defined a value we need to control.
                # Iterate over a copy: removing from the list being walked would skip entries.
                for arg in list(args):
                    if arg.startswith(item[0]):
                        if arg != f"{item[0]}={item[1]}":
                            print(
                                f"{self.parameters.get_name_with_position()}: Fio parameter {item[0]} is now set to {item[1]}"
                            )
                        args.remove(arg)

            return args

        name = self.parameters.get_name_with_position()
        enforced_items = [
            ["--runtime", f"{self.parameters.get_runtime()}"],
            "--time_based",
            ["--output-format", "json+"],
            ["--numjobs", self.parameters.get_engine_instances_count()],
            ["--name", name],
            ["--invalidate", 1],
            ["--log_avg_msec", self.log_avg_msec],
        ]
        for log_type in ["bw", "lat", "hist", "iops"]:
            enforced_items.append(f"--write_{log_type}_log=fio/{name}_{log_type}.log")

        for enforced_item in enforced_items:
            args = remove_arg(args, enforced_item)
            if isinstance(enforced_item, str):
                args.append(enforced_item)
            else:
                args.append(f"{enforced_item[0]}={enforced_item[1]}")

        return args

    def parse_cmd(self, stdout: bytes, stderr: bytes) -> dict[str, Any]:
        if self.skip:
            return self.parameters.get_result_format() | self.empty_result()
        try:
            ret = json.loads(stdout)
        except json.decoder.JSONDecodeError:
            print(
                f"{self.parameters.get_name_with_position()}: Cannot load fio's JSON output"
            )
            return self.parameters.get_result_format() | self.empty_result()

        return {"fio_results": ret} | self.parameters.get_result_format()

    @property
    def name(self) -> str:
        return self.engine_module.get_engine().get_name()

    def run_cmd_version(self) -> list[str]:
        return self.engine_module.get_engine().run_cmd_version()

    def parse_version(self, stdout: bytes, _stderr: bytes) -> bytes:
        return self.engine_module.get_engine().parse_version(stdout, _stderr)

    def empty_result(self):
        """Default empty results for fio"""
        return {
            "effective_runtime": 0,
            "skipped": self.skip,
            "fio_results": {"jobs": []},
        }


class FioCmdLine(Fio):
    def parse_parameters(self):
        """Removing fio arguments set by the engine"""
        # We need to ensure we have a proper fio command line
        # Let's remove duplicates and enforce some arguments
        args = self.parameters.get_engine_module_parameter_base().split()

        # Overriding empb to represent the real executed command.
        # The list has unique members and is sorted to ensure a constant string representation.
        self.parameters.engine_module_parameter_base = " ".join(
            sorted(list(set(self.get_default_fio_command_line(args))))
        )

    def run_cmd(self) -> list[str]:
        # Let's build the command line to run the tool
        return (
            super().run_cmd()
            + self.parameters.get_engine_module_parameter_base().split()
        )