Commit 7e1e38b

Full review (#8) (#9)
* Initial commit for WasmScore
* Add GitHub Actions to compile and test x64 benchmarks (#1)
* Refactor driver to match refactored Sightglass
* Continue refactor and cleanup; better use of df
* Print out efficiency and wasm scores
* Remove unneeded comments
* Add printing of results to a file and the screen
* Improve quiet printing
* Add simdscore placeholder and validate runall
* Update tag for Docker images
* Update workflow to include running of all available benchmarks
* Update wasmtime commit
* Add code of conduct (a copy from the wasmtime repo)
* Add contributor documentation
* Add license agreement
* Update security policy
* Add Dependabot support for pip
* Simplify config.inc
* Use a local benchmarks directory instead of the Sightglass version
* Fix failures caused by a missing results directory on some runs
* Remove unnecessary installs in Dockerfile
* Separate security section into its own file
* Update printed comments
* Update container entry point and build message
* Update README and add example screenshots in the asset folder
1 parent 851ba65 commit 7e1e38b

8 files changed

Lines changed: 69 additions & 72 deletions


CONTRIBUTING.md

Lines changed: 0 additions & 5 deletions
@@ -10,8 +10,3 @@ For more information about contributing to this project you can consult the
 [Code of Conduct]: CODE_OF_CONDUCT.md
 [Organizational Code of Conduct]: ORG_CODE_OF_CONDUCT.md
 [online documentation]: https://bytecodealliance.github.io/wasmtime/contributing.html
-
-# Security
-
-If you think you have found a security issue in WasmScore, please file a github issue with details, reproducible steps, and a clearly defined impact.
-Once an issue is reported, we will assess, respond, and priortize a solution. In the case that WasmScore has planned updates at a regular time candence, the fix of a security vulnerability may warrant an intermediate release of WasmScore depending on the severity of the vulnerability.

Dockerfile

Lines changed: 4 additions & 8 deletions
@@ -51,14 +51,6 @@ RUN python3 -m pip install termgraph \
     && python3 -m pip install termcolor \
     && python3 -m pip install pyyaml
 
-# Install docker
-RUN apt-get update && apt-get -y install ca-certificates curl gnupg lsb-release
-RUN curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
-RUN echo \
-  "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
-  $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
-RUN apt-get update && apt-get -y install docker-ce docker-ce-cli containerd.io
-
 # Install sightglass
 WORKDIR /
 RUN git clone --recurse-submodules ${SIGHTGLASS_REPO} sightglass
@@ -92,3 +84,7 @@ ADD benchmarks /sightglass/benchmarks
 WORKDIR /
 COPY wasmscore.py /sightglass/wasmscore.py
 COPY wasmscore.sh /
+
+# Set default entry and command
+ENTRYPOINT ["/bin/bash", "/wasmscore.sh"]
+CMD ["-t", "wasmscore"]
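The new `ENTRYPOINT`/`CMD` pair means a plain `docker run <image>` now launches `/wasmscore.sh -t wasmscore`, while any arguments passed to `docker run` replace only the `CMD` portion. A small sketch (hypothetical helper, just mimicking Docker's documented resolution rule) of how the two combine:

```python
def docker_command(entrypoint, cmd, run_args=()):
    """Sketch of Docker's rule: ENTRYPOINT always runs; the user's
    run arguments replace CMD, which only supplies defaults."""
    return entrypoint + (list(run_args) or cmd)

ENTRYPOINT = ["/bin/bash", "/wasmscore.sh"]
CMD = ["-t", "wasmscore"]

print(docker_command(ENTRYPOINT, CMD))                       # default test
print(docker_command(ENTRYPOINT, CMD, ["-t", "simdscore"]))  # CMD overridden
```

So `docker run wasmscore -t simdscore` would run the driver with `-t simdscore` instead of the default test, without repeating the entrypoint path.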

README.md

Lines changed: 38 additions & 30 deletions
@@ -1,11 +1,38 @@
 # WasmScore
 
-This benchmark is designed to provide a simple, convenient and portable view of the performance of WebAssembly outside the browser on the underlying platform it is run on. It uses a containerized suite of real codes and micros leveraged directly from [Sightglass](https://github.com/bytecodealliance/sightglass) to benchmark the platform and provide a “WasmScore” that is based on a formula that aggregates execution time. In addition, the driver for the benchmark serves as an easy-to-use tool for executing any user determined assortment of Sightglass benchmarks supported by the driver.
-
-One of the most important and challenging aspect of benchmarking is deciding how to interpret the results; should you consider the results to be good or bad? To decide, you really need context on what it is you are trying to achieve, and this often starts with a baseline used to serve as a point of comparison. That baseline could be the execution of that same original source but before some transformation was applied when lowered to WebAssembly, or that baseline could be a modified configuration of the runtime that executes the WebAssembly. In the case of WasmScore, a novel aspect is that for every Wasm real code and micro that is run, WasmScore also executes the native versions of those code compiled from the same high-level source used to generate the Wasm. In this way WasmScore provides a comparison point for the Wasm performance which will ideally be the theoretical upper for the performance of WebAssembly. This feature allows a user to quickly gauge the performance impact of using Wasm instead of using a native compile of the same code when run on that particular platform. It allows developers to see opportunity to improve compilers, or to improve Wasm runtimes, or improve the Wasm spec, or to suggest other solutions (such as Wasi) to address gaps.
-
-Another important feature of WasmScore is simplicity and convenience. Specifically, the user is not expected to have to build the benchmark where they might have to deal with installing or updating dependencies. The user is also not expected contend interpreting the need for turning on or off a myriad of flags and features; to get a platforms WasmScore the user simply runs wasmscore.sh inside the container. Still, while it is meant for the user to simply pull a containerized image and then run the benchmark on the desired platform without worrying, WasmScore can of course be built and then run either within or outside (TODO) the containerized environment. In either case is intended for the compile of all codes to properly utilizes underlying hardware features. To that end, the ideal use case and indeed the target use case for WasmScore is for a quick, simple and consistent cross platform view of Wasm performance. The benchmark especially wants to target usecases and applications that are emerging for Wasm in standalone client and cloud environments. WasmScore is intended to be run on X86-64 and AArch64 Linux platforms.
-
+## Intro
+WasmScore aims to provide a view of WebAssembly performance when executed outside the browser. It uses a containerized suite of benchmarks (both user-facing codes and purpose-built benchmarks) and leverages [Sightglass](https://github.com/bytecodealliance/sightglass) to benchmark the underlying platform. A score is provided, based on a formula that aggregates the execution times of the suites that make up the "wasmscore" test. In addition to scoring wasm performance, the benchmark is also a tool capable of executing any assortment of other tests, suites, or benchmarks supported by the driver. WasmScore is a work in development.
+
+## Description
+One of the most important and challenging aspects of benchmarking is deciding how to interpret the results; should you consider the results to be good or bad? To decide, you really need context on what it is you are trying to achieve, and this often starts with a baseline that serves as a point of comparison. For example, that baseline could be the execution of the same original source before some transformation was applied when lowering to WebAssembly, or that baseline could be a modified configuration of the runtime that executes the WebAssembly. In the case of WasmScore, for every Wasm real code and micro that is run, WasmScore also executes, as a baseline, the native code compiled from the same high-level source used to generate the Wasm. In this way WasmScore provides a comparison point for the Wasm performance that is ideally the theoretical upper bound for the performance of WebAssembly. This allows a user to quickly gauge the performance impact of using Wasm instead of a native compile of the same code when run on that particular platform. It allows developers to see opportunities to improve compilers, to improve Wasm runtimes, to improve the Wasm spec, or to suggest other solutions (such as WASI) to address gaps.
+
+## Benchmarks
+Typically a benchmark reports either the amount of work done over a constant amount of time, or the time taken to do a constant amount of work. The benchmarks here all do the latter. The initial commit of available benchmarks was pulled from Sightglass; however, the benchmarks used with WasmScore come from the local directory here and have no dependency on the benchmarks stored there. How the benchmarks here are built and run does, though, depend directly on changes to the external Sightglass repo.
+
+Also, benchmarks are often categorized based on their origin. Two such buckets are (1) codes written with the intent of being user facing (library paths, application use cases) and (2) codes written specifically to benchmark some important or common code construct or platform feature. WasmScore does not necessarily favor either bucket, as both are valuable for evaluating standalone Wasm performance depending on what you want to know. The extent to which it favors one depends on the test run; currently there is only the primary "wasmscore" test, though a "simdscore" test is planned.
+
+## Goals
+A standalone benchmark that:
+- Is convenient to build and run, with easy-to-interpret results
+- Is portable and enables cross-platform comparisons
+- Provides a breadth of coverage for current standalone binaries
+- Is convenient to analyze
+
+## WasmScore Suites
+Any number of tests can be created, but WasmScore is the initial and default test. It includes a mix of relevant in-use codes and targeted benchmarks of Wasm performance outside the browser, broken down into categories:
+- App: ['Meshoptimizer']
+- Core: ['Ackermann', 'Ctype', 'Fibonacci']
+- Crypto: ['Base64', 'Ed25519', 'Seqhash']
+- AI: (Coming)
+- Regex: (Coming)
+
+## Plan
+Next steps include:
+- Improving stability and user experience
+- Adding benchmarks to the AI, Regex, and App suites
+- Adding more benchmarks
+- Completing the SIMD test
+- Publishing a list of planned milestones
 
 ## Usage
 
@@ -15,17 +42,17 @@ Download and run the latest prebuilt benchmark image:
 
 **X86-64:**
 ```
-docker pull ghcr.io/jlb6740/wasmscore/wasmscore_x86_64:latest
+docker pull ghcr.io/bytecodealliance/wasm-score/wasmscore_x86_64_linux:latest
 ```
 ```
-docker run -it ghcr.io/jlb6740/wasmscore/wasmscore_x86_64:latest /bin/bash /wasmscore.sh
+docker run -it ghcr.io/bytecodealliance/wasm-score/wasmscore_x86_64_linux:latest
 ```
 **AArch64:**
 ```
-docker pull ghcr.io/jlb6740/wasmscore/wasmscore_aarch64:latest
+docker pull ghcr.io/bytecodealliance/wasm-score/wasmscore_aarch64_linux:latest
 ```
 ```
-docker run -it ghcr.io/jlb6740/wasmscore/wasmscore_aarch64:latest /bin/bash /wasmscore.sh
+docker run -it ghcr.io/bytecodealliance/wasm-score/wasmscore_aarch64_linux:latest
 ```
 
 ### Build and Run Yourself
@@ -36,29 +63,10 @@ To build:
 ```
 To run from this local build:
 ```
-docker run -ti wasmscore /bin/bash wasmscore.sh --help
+docker run -ti wasmscore <--help>
 ```
 
 To build containerless:
 > Not yet supported
 
-### Other Useful Commands
-
-For a detached setup that allows for copying files to the image or entering the container (being mindful of the container name), use the following commands:
-```
-docker run -ti -d wasmscore /bin/bash
-```
-```
-wasmscore_container_id=$(docker ps | grep -m 1 wasmscore | awk '{ print $1 }')
-```
-```
-docker cp <file> ${wasmscore_container_id}:
-```
-or
-```
-docker exec -ti ${wasmscore_container_id} /bin/bash
-```
-
-## Example Screenshots
-
 
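The new README says the score "aggregates the execution times" of the suites, but the diff does not show the formula itself. As a purely hypothetical illustration of that kind of aggregation (the function name, reference time, and scale below are all made up, and a geometric mean is only a common choice for combining benchmark times, not necessarily WasmScore's):

```python
import math

def wasm_score(times_sec, reference_sec=1.0, scale=100.0):
    """Hypothetical aggregate: geometric mean of per-benchmark execution
    times, inverted and scaled so that faster runs yield a higher score."""
    geomean = math.exp(sum(math.log(t) for t in times_sec) / len(times_sec))
    return scale * reference_sec / geomean

# Three made-up benchmark timings whose geometric mean is exactly 1.0 s
print(wasm_score([0.5, 2.0, 1.0]))  # 100.0
```

A geometric mean has the convenient property that halving every benchmark's time doubles the score regardless of each benchmark's absolute magnitude, which is why it is popular for suite-level summaries.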

SECURITY.md

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+If you think you have found a security issue in WasmScore, please file a GitHub issue with details, reproducible steps, and a clearly defined impact.
+Once an issue is reported, we will assess, respond, and prioritize a solution. In the case that WasmScore has planned updates at a regular time cadence, the fix of a security vulnerability may warrant an intermediate release of WasmScore, depending on the severity of the vulnerability.

build.sh

Lines changed: 6 additions & 4 deletions
@@ -12,17 +12,19 @@ docker tag ${IMAGE_NAME} ${IMAGE_NAME}_${ARCH}_${KERNEL}:latest
 docker tag ${IMAGE_NAME} ${IMAGE_NAME}_${ARCH}_${KERNEL}:${IMAGE_VER}
 
 echo ""
-echo "To run from this local build use command:"
-echo "> docker run -ti ${IMAGE_NAME} /bin/bash wasmscore.sh --help"
+echo "The entry point is a wrapper for the python script 'wasmscore.py'"
+echo "To run from this local build use command (for a list of more options use --help):"
+echo "> docker run -ti ${IMAGE_NAME} <options>"
 echo ""
 echo "To stop and rm all ${IMAGE_NAME} containers:"
 echo "> docker rm \$(docker stop \$(docker ps -a -q --filter ancestor=${IMAGE_NAME}:latest --format=\"{{.ID}}\"))"
 echo ""
 echo "For a detached setup that allows for copying files to the image or"
 echo "entering the container, use the following commands:"
-echo "> docker run -ti -d ${IMAGE_NAME} /bin/bash"
+echo "> docker run --entrypoint=/bin/bash -ti -d ${IMAGE_NAME}"
 echo "> wasmscore_container_id=\$(docker ps | grep -m 1 ${IMAGE_NAME} | awk '{ print \$1 }')"
+echo ""
 echo "> docker cp <file> \${wasmscore_container_id}:"
 echo "or"
-echo "> docker exec -ti \${wasmscore_container_id} /bin/bash"
+echo "> docker exec -ti \${wasmscore_container_id} /bin/bash"
 echo ""
(binary asset, 98.3 KB)
(binary asset, 274 KB)

wasmscore.py

Lines changed: 19 additions & 25 deletions
@@ -3,7 +3,6 @@
 
 import os
 import sys
-import datetime
 from datetime import datetime
 import argparse
 import subprocess
@@ -24,9 +23,6 @@
 available suites: See list
 available tests: WasmScore (default), SimdScore
 
-
-
-example usage: ./wasmscore.sh -b shootout -r wasmtime_app
 """
 ),
 )
@@ -67,7 +63,6 @@
 )
 
 parser.add_argument(
-    "-l",
     "--list",
     action="store_true",
     help="List all available suites and individual benchmarks to run",
@@ -367,8 +362,25 @@ def run_benchmarks(benchmark, run_native=False):
     logging.info("Running benchmark ...")
     logging.info("Run native ... %s", run_native)
 
-    native_df = None
+    results_dir = f"{SG_BENCHMARKS_BASE}/results/"
+
+    create_results_path_cmd_string = f"mkdir -p {results_dir}"
+    try:
+        logging.info(
+            "Trying mkdir for results_path ... %s", create_results_path_cmd_string
+        )
+        output = subprocess.check_output(
+            create_results_path_cmd_string,
+            shell=True,
+            text=True,
+            stderr=subprocess.STDOUT,
+        )
+        logging.debug("%s", output)
+    except subprocess.CalledProcessError as error:
+        print(f"mkdir for build folder failed with error code {error.returncode}")
+        sys.exit(error.returncode)
 
+    native_df = None
     if run_native and sg_benchmarks_native[benchmark]:
         print_verbose(f"Collecting Native ({benchmark}).")
@@ -382,7 +394,6 @@ def run_benchmarks(benchmark, run_native=False):
         )
         logging.debug("native_benchmark_path ... %s", native_benchmark_path)
 
-        results_dir = f"{SG_BENCHMARKS_BASE}/results/"
         results_path = f"{results_dir}/{benchmark}" + "_native_results.csv"
         logging.debug("results_path ... %s", results_path)
 
@@ -434,22 +445,6 @@ def run_benchmarks(benchmark, run_native=False):
             print(f"Building native failed with error code {error.returncode}")
             sys.exit(error.returncode)
 
-        create_results_path_cmd_string = f"mkdir -p {results_dir}"
-        try:
-            logging.info(
-                "Trying mkdir for results_path ... %s", create_results_path_cmd_string
-            )
-            output = subprocess.check_output(
-                create_results_path_cmd_string,
-                shell=True,
-                text=True,
-                stderr=subprocess.STDOUT,
-            )
-            logging.debug("%s", output)
-        except subprocess.CalledProcessError as error:
-            print(f"mkdir for build folder failed with error code {error.returncode}")
-            sys.exit(error.returncode)
-
         cli_cmd_string = (
             "LD_LIBRARY_PATH=/sightglass/engines/native/ "
             "/sightglass/target/release/sightglass-cli benchmark "
@@ -549,7 +544,6 @@ def run_benchmarks(benchmark, run_native=False):
         wasm_benchmark_path = f"{SG_BENCHMARKS_BASE}" + sg_benchmarks_wasm[benchmark]
         logging.debug("wasm_benchmark_path ... %s", wasm_benchmark_path)
 
-        results_dir = f"{SG_BENCHMARKS_BASE}/results/"
         results_path = f"{results_dir}/{benchmark}" + "_wasm_results.csv"
         logging.debug("results_path ... %s", results_path)
 
@@ -867,7 +861,7 @@ def main():
 
     if ARGS_DICT["list"]:
         print("")
-        print("Scores\n------")
+        print("Tests\n------")
         print(yaml.dump(perf_tests, sort_keys=True, default_flow_style=False))
         print("Suites\n------")
         print(yaml.dump(perf_suites, sort_keys=True, default_flow_style=False))
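The wasmscore.py change hoists the `mkdir -p` subprocess call to the top of `run_benchmarks` so the shared `results_dir` exists before either the native or the wasm pass (fixing the missing-results-directory failures mentioned in the commit message). The same effect can be had without shelling out at all; the sketch below is not the commit's code, and `ensure_results_dir` is a hypothetical helper name:

```python
import os
import tempfile

def ensure_results_dir(base):
    """Equivalent of `mkdir -p <base>/results/` via the standard library:
    idempotent, raises no error when the directory already exists."""
    results_dir = os.path.join(base, "results")
    os.makedirs(results_dir, exist_ok=True)
    return results_dir

# Demo against a throwaway base directory
with tempfile.TemporaryDirectory() as base:
    d = ensure_results_dir(base)
    ensure_results_dir(base)  # second call is a no-op, no exception
    print(os.path.isdir(d))  # True
```

Using `os.makedirs(..., exist_ok=True)` also avoids `shell=True` and the per-call subprocess overhead, though the commit's version has the advantage of matching the script's existing logging-and-exit error handling.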
