Write up Rust native benchmark

2024-08-19 20:06:16 +00:00 · 2024-08-19 20:06:16 +00:00 · e97dee9a72
commit e97dee9a72
parent 34aa1eb66b
1 changed files with 18 additions and 7 deletions
--- a/Language-benchmarks.md
+++ b/Language-benchmarks.md
@@ -17,10 +17,13 @@ To validate the computation, the benchmark program displays the eigenvalues of $
 The language comparison benchmark uses 64-bit floating point matrices of size $N = 60$. Other variants of the benchmark, used to compare different design decisions within Rust, are described at the end.
 ## Running the benchmark
-### Rust
+### Rust web
 - To build and run, call `trunk serve --release` from the `rust-benchmark` folder and go to the URL that Trunk is serving.
-  - The `--release` flag is crucial. By turning off development features like debug symbols, it makes the compiled code literally a hundred times faster on Aaron's machine. However, it also seems prevent the benchmark computation from showing up in the Firefox profiler.
+  - The [`--release`](https://doc.rust-lang.org/cargo/reference/profiles.html#release) flag is crucial. By turning on compiler optimizations, turning off overflow checks, and changing other build settings, it makes the compiled code literally a hundred times faster on Aaron's machine. However, it also seems to prevent the benchmark computation from showing up in the Firefox profiler.
-### Scala
+### Rust native
 - To build and run, call `cargo run --release` from the `rust-benchmark-native` folder.
  - As with the web benchmark, the `--release` flag is crucial: it makes the compiled code about a hundred times faster on Aaron's machine.
 ### Scala web
 - To build, call `sbt` from the `scala-benchmark` folder, and then call `fullLinkJS` from the `sbt` prompt.
  - The benchmark page points to the JavaScript file produced by `fullLinkJS`. Calling `fastLinkJS` won't update the code the benchmark page uses, even if compilation succeeds.
  - Using `fullLinkJS` instead of `fastLinkJS` is important. Doing a full build rather than a quick build provides more opportunities for optimization, making the transpiled code nearly twice as fast on Aaron's machine.
@@ -43,14 +46,22 @@ where `rand_mat_next` is pre-allocated outside the loop.
 Both running under Ubuntu 22.04 (64-bit) on an [AMD Ryzen 7 7840U](https://www.amd.com/en/products/processors/laptop/ryzen/7000-series/amd-ryzen-7-7840u.html) processor.
 ## Results
-### Firefox
+### Rust vs. Scala on the web
 #### Firefox
 The Rust version typically ran 6–11 times as fast as the Scala version, and its speed was much more consistent.
 - Rust run time: 110–120 ms
 - Scala run time: 700–1200 ms
-### Chromium
+#### Chromium
 The Rust version typically ran 5–7 times as fast as the Scala version, with comparable consistency.
- Rust 80–90 ms
+- Rust run time: 80–90 ms
- Scala: 520–590 ms
+- Scala run time: 520–590 ms
 ### Web vs. native in Rust
 The native version typically ran 1.00–1.15 times as fast as the web version on Chromium, and 1.4–1.5 times as fast as the web version on Firefox, with slightly more consistency.
 - Rust native run time: 77 ms
 For this benchmark, WebAssembly achieved its aim of executing at near-native speed.
 *When I first added the GTK interface to the Rust native benchmark, the run time became unusually long for a while, hovering around 260 ms. I never figured out what was causing that. The run time eventually returned to typical after some combination of rebuilding, waiting, and shutting my laptop down for the night. —Aaron*
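The web-versus-native ratios can be roughly cross-checked against the run times listed above. The sketch below is only illustrative arithmetic: `speedup_range` is a hypothetical helper, not part of the benchmark code, and it treats the native time as a single 77 ms figure, so its output will not match ranges taken from individual runs exactly.

```rust
/// Hypothetical helper: ratio of a (min, max) run-time range for the slower
/// build to a single run time for the faster build.
fn speedup_range(slow_ms: (f64, f64), fast_ms: f64) -> (f64, f64) {
    (slow_ms.0 / fast_ms, slow_ms.1 / fast_ms)
}

fn main() {
    // Run times copied from the results above, in milliseconds.
    let native_ms = 77.0;
    let chromium_ms = (80.0, 90.0);
    let firefox_ms = (110.0, 120.0);

    let (lo, hi) = speedup_range(chromium_ms, native_ms);
    println!("native vs. Chromium: {lo:.2} to {hi:.2} times as fast");

    let (lo, hi) = speedup_range(firefox_ms, native_ms);
    println!("native vs. Firefox: {lo:.2} to {hi:.2} times as fast");
}
```

The computed ranges land close to the ones quoted above, which is what one would expect if the quoted figures came from comparing runs made under similar conditions.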
 ## Rust benchmark variants
 ### Low-precision variant
 - For matrices of size $N = 50$, using 32-bit floating point instead of 64-bit made the computation about 15% faster (60 ms instead of 70 ms). However, for $N \ge 54$, the 32-bit floating point variant would hang indefinitely! Maybe the target precision doesn't change to accommodate the lower-precision data type?
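The guess above can be made concrete with a small sketch. It does not reproduce the benchmark's actual stopping criterion, which is not shown here. It only illustrates how a relative tolerance chosen for 64-bit floats can sit below what 32-bit floats can resolve, so that a convergence test written against it might never pass. The tolerance value `1e-12` and the helper names are assumptions made for illustration.

```rust
/// Hypothetical check: can a relative tolerance be resolved by f32 arithmetic?
/// Anything smaller than machine epsilon is lost when added to a value near 1.
fn reachable_in_f32(rel_tol: f64) -> bool {
    rel_tol >= f32::EPSILON as f64
}

/// The same check for f64.
fn reachable_in_f64(rel_tol: f64) -> bool {
    rel_tol >= f64::EPSILON
}

/// One possible remedy (an assumption, not the benchmark's code): express the
/// tolerance as a multiple of the data type's own machine epsilon.
fn scaled_tol_f32(multiple: f32) -> f32 {
    multiple * f32::EPSILON
}

fn main() {
    let rel_tol = 1e-12_f64; // assumed tolerance, chosen with f64 in mind

    println!("f64 epsilon: {:e}", f64::EPSILON);
    println!("f32 epsilon: {:e}", f32::EPSILON);
    println!("reachable in f64: {}", reachable_in_f64(rel_tol));
    println!("reachable in f32: {}", reachable_in_f32(rel_tol));

    // In f32, a perturbation of this size vanishes entirely.
    println!("1.0_f32 + tol == 1.0_f32: {}", 1.0_f32 + rel_tol as f32 == 1.0_f32);

    println!("tolerance scaled to f32: {:e}", scaled_tol_f32(8.0));
}
```

If the hang does come from a fixed tolerance, scaling it to the element type in this way would be one thing to try. That would not by itself explain why the problem only appears for $N \ge 54$, so the cause may lie elsewhere.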