Commit fa15a92

Update benchmarks README
1 parent cadf37a commit fa15a92

1 file changed

Lines changed: 3 additions & 157 deletions

File tree

benchmarks/README.md

@@ -1,158 +1,4 @@
-# Candidate Benchmark Programs
+# Benchmarks

-This directory contains the candidate programs for the benchmark suite. They are
-candidates, not officially part of the suite yet, because we [intend][rfc] to
-record various metrics about the programs and then run a principal component
-analysis to find a representative subset of candidates that doesn't contain
-effectively duplicate workloads.
-
-[rfc]: https://github.com/bytecodealliance/rfcs/pull/4
-
-## Building
-
-Build an individual benchmark program via:
-
-```
-$ ./build.sh path/to/benchmark/dir/
-```
-
-Build all benchmark programs by running:
-
-```
-$ ./build-all.sh
-```
-
-## Minimal Technical Requirements
-
-In order for the benchmark runner to successfully execute a Wasm program and
-record its execution, it must:
-
-* Export a `_start` function of type `[] -> []`.
-
-* Import `bench.start` and `bench.end` functions, both of type `[] -> []`.
-
-* Call `bench.start` exactly once during the execution of its `_start`
-  function. This is when the benchmark runner will start recording execution
-  time and performance counters.
-
-* Call `bench.end` exactly once during execution of its `_start` function, after
-  `bench.start` has already been called. This is when the benchmark runner will
-  stop recording execution time and performance counters.
-
-* Provide reproducible builds via Docker (see [`build.sh`](./build.sh)).
-
-* Be located in a `sightglass/benchmarks/$BENCHMARK_NAME` directory. Typically
-  the benchmark is named `benchmark.wasm`, but benchmarks with multiple files
-  should use names like `<benchmark name>-<subtest name>.wasm` (e.g.,
-  `libsodium-chacha20.wasm`).
-
-* Input workloads must be files that live in the same directory as the `.wasm`
-  benchmark program. The benchmark program is run within the directory where it
-  lives on the filesystem, with that directory pre-opened in WASI. The workload
-  must be read via a relative file path.
-
-  If, for example, the benchmark processes JSON input, then its input workload
-  should live at `sightglass/benchmarks/$BENCHMARK_NAME/input.json`, and it
-  should open that file as `"./input.json"`.
-
-* Define the expected `stdout` output in a `./<benchmark name>.stdout.expected`
-  sibling file located next to the `benchmark.wasm` file (e.g.,
-  `benchmark.stdout.expected`). The runner will assert that the actual
-  execution's output matches the expectation.
-
-* Define the expected `stderr` output in a `./<benchmark name>.stderr.expected`
-  sibling file located next to the `benchmark.wasm` file. The runner will assert
-  that the actual execution's output matches the expectation.
-
-Many of the above requirements can be checked by running the `.wasm` file
-through the `validate` command:
-
-```
-$ cargo run -- validate path/to/benchmark.wasm
-```
-
-## Compatibility Requirements for Native Execution
-
-Sightglass can also measure the performance of a subset of benchmarks compiled
-to native code (i.e., not WebAssembly). Compiling these benchmarks without
-changing their source code involves a delicate interface with the [native
-engine] and some additional requirements beyond the [Minimal Technical
-Requirements] noted above:
-
-[native engine]: ../engines/native
-[Minimal Technical Requirements]: #minimal-technical-requirements
-
-* Generate an ELF shared library linked to the [native engine] shared library to
-  provide definitions for `bench_start` and `bench_end`.
-
-* Rename the `main` function to `native_entry`. For C- and C++-based source this
-  can be done with a simple define directive passed to `cc` (e.g.,
-  `-Dmain=native_entry`).
-
-* Provide reproducible builds via a `Dockerfile.native` file (see
-  [`build-native.sh`](./build-native.sh)).
-
-Note that support for native execution is optional: adding a WebAssembly
-benchmark does not imply the need to support its native equivalent &mdash; CI
-will not fail if it is not included.
-
-## Additional Requirements
-
-> Note: these requirements are lifted directly from [the benchmarking
-> RFC][rfc].
-
-In addition to the minimal technical requirements, for a benchmark program to be
-useful to Wasmtime and Cranelift developers, it should additionally meet the
-following requirements:
-
-* Candidates should be real, widely used programs, or at least extracted kernels
-  of such programs. These programs are ideally taken from domains where Wasmtime
-  and Cranelift are currently used, or domains where they are intended to be a
-  good fit (e.g. serverless compute, game plugins, client Web applications,
-  server Web applications, audio plugins, etc.).
-
-* A candidate program must be deterministic (modulo Wasm nondeterminism like
-  `memory.grow` failure).
-
-* A candidate program must have two associated input workloads: one small and
-  one large. The small workload may be used by developers locally to get quick,
-  ballpark numbers for whether further investment in an optimization is worth
-  it, without waiting for the full, thorough benchmark suite to complete.
-
-* Each workload must have an expected result, so that we can validate executions
-  and avoid accepting "fast" but incorrect results.
-
-* Compiling and instantiating the candidate program and then executing its
-  workload should take *roughly* one to six seconds total.
-
-  > Napkin math: We want the full benchmark to run in a reasonable amount of
-  > time, say twenty to thirty minutes, and we want somewhere around ten to
-  > twenty programs altogether in the benchmark suite to balance diversity,
-  > simplicity, and time spent in execution versus compilation and
-  > instantiation. Additionally, for good statistical analyses, we need *at
-  > least* 30 samples (ideally more like 100) from each benchmark program. That
-  > leaves an average of about one to six seconds for each benchmark program to
-  > compile, instantiate, and execute the workload.
-
-* Inputs should be given through I/O and results reported through I/O. This
-  ensures that the compiler cannot optimize the benchmark program away.
-
-* Candidate programs should only import WASI functions. They should not depend
-  on any other non-standard imports, hooks, or runtime environment.
-
-* Candidate programs must be open source under a license that allows
-  redistributing, modifying and redistributing modified versions. This makes
-  distributing the benchmark easy, allows us to rebuild Wasm binaries as new
-  versions are released, and lets us do source-level analysis of benchmark
-  programs when necessary.
-
-* Repeated executions of a candidate program must yield independent samples
-  (ignoring priming Wasmtime's code cache). If the execution times keep taking
-  longer and longer, or exhibit harmonics, they are not independent and this can
-  invalidate any statistical analyses of the results we perform. We can easily
-  check for this property with either [the chi-squared
-  test](https://en.wikipedia.org/wiki/Chi-squared_test) or [Fisher's exact
-  test](https://en.wikipedia.org/wiki/Fisher%27s_exact_test).
-
-* The corpus of candidates should include programs that use a variety of
-  languages, compilers, and toolchains.
+The set of benchmarks here has been copied from
+[Sightglass](https://github.com/bytecodealliance/sightglass/benchmarks). In
+general, the benchmarks here will mostly be consistent with the set of
+benchmarks in that repository.
