Write up Rust native benchmark
parent 34aa1eb66b
commit e97dee9a72
@@ -17,10 +17,13 @@ To validate the computation, the benchmark program displays the eigenvalues of $
|
|||||||
|
|
||||||
The language comparison benchmark uses 64-bit floating point matrices of size $N = 60$. Other variants of the benchmark, used to compare different design decisions within Rust, are described at the end.
|
The language comparison benchmark uses 64-bit floating point matrices of size $N = 60$. Other variants of the benchmark, used to compare different design decisions within Rust, are described at the end.
|
||||||
## Running the benchmark
|
## Running the benchmark
|
||||||
### Rust
|
### Rust web
|
||||||
- To build and run, call `trunk serve --release` from the `rust-benchmark` folder and go to the URL that Trunk is serving.
|
- To build and run, call `trunk serve --release` from the `rust-benchmark` folder and go to the URL that Trunk is serving.
|
||||||
- The `--release` flag is crucial. By turning off development features like debug symbols, it makes the compiled code literally a hundred times faster on Aaron's machine. However, it also seems to prevent the benchmark computation from showing up in the Firefox profiler.
|
- The [`--release`](https://doc.rust-lang.org/cargo/reference/profiles.html#release) flag is crucial. By turning on compiler optimizations, turning off overflow checks, and changing other build settings, it makes the compiled code literally a hundred times faster on Aaron's machine. However, it also seems to prevent the benchmark computation from showing up in the Firefox profiler.
|
||||||
### Scala
|
### Rust native
|
||||||
|
- To build and run, call `cargo run --release` from the `rust-benchmark-native` folder.
|
||||||
|
- As with the web benchmark, the `--release` flag is crucial: it makes the compiled code about a hundred times faster on Aaron's machine.
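
Because forgetting `--release` changes the timings so drastically, a benchmark program could guard against it. The following is a minimal sketch, not code from the benchmark itself: `build_kind` is a hypothetical helper, and it assumes Cargo's default profiles, where `debug_assertions` is enabled for development builds and disabled for release builds (a custom profile can change this).

```rust
/// Reports which kind of build is running, based on the standard
/// `debug_assertions` cfg. Hypothetical helper, for illustration only.
fn build_kind() -> &'static str {
    if cfg!(debug_assertions) {
        "debug"
    } else {
        "release"
    }
}

fn main() {
    // Warn before timing anything if optimizations are probably off.
    if build_kind() == "debug" {
        eprintln!("warning: debug build; timings will not be representative");
    }
    println!("build kind: {}", build_kind());
}
```

Run with `cargo run` and then `cargo run --release` to see the reported build kind change.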
|
||||||
|
### Scala web
|
||||||
- To build, call `sbt` from the `scala-benchmark` folder, and then call `fullLinkJS` from the `sbt` prompt.
|
- To build, call `sbt` from the `scala-benchmark` folder, and then call `fullLinkJS` from the `sbt` prompt.
|
||||||
- The benchmark page points to the JavaScript file produced by `fullLinkJS`. Calling `fastLinkJS` won't update the code the benchmark page uses, even if compilation succeeds.
|
- The benchmark page points to the JavaScript file produced by `fullLinkJS`. Calling `fastLinkJS` won't update the code the benchmark page uses, even if compilation succeeds.
|
||||||
- Using `fullLinkJS` instead of `fastLinkJS` is important. Doing a full build rather than a quick build provides more opportunities for optimization, making the transpiled code nearly twice as fast on Aaron's machine.
|
- Using `fullLinkJS` instead of `fastLinkJS` is important. Doing a full build rather than a quick build provides more opportunities for optimization, making the transpiled code nearly twice as fast on Aaron's machine.
|
||||||
@@ -43,14 +46,22 @@ where `rand_mat_next` is pre-allocated outside the loop.
|
|||||||
|
|
||||||
Both running under Ubuntu 22.04 (64-bit) on an [AMD Ryzen 7 7840U](https://www.amd.com/en/products/processors/laptop/ryzen/7000-series/amd-ryzen-7-7840u.html) processor.
|
Both running under Ubuntu 22.04 (64-bit) on an [AMD Ryzen 7 7840U](https://www.amd.com/en/products/processors/laptop/ryzen/7000-series/amd-ryzen-7-7840u.html) processor.
|
||||||
## Results
|
## Results
|
||||||
### Firefox
|
### Rust vs. Scala on the web
|
||||||
|
#### Firefox
|
||||||
The Rust version typically ran 6–11 times as fast as the Scala version, and its speed was much more consistent.
|
The Rust version typically ran 6–11 times as fast as the Scala version, and its speed was much more consistent.
|
||||||
- Rust run time: 110–120 ms
|
- Rust run time: 110–120 ms
|
||||||
- Scala run time: 700–1200 ms
|
- Scala run time: 700–1200 ms
|
||||||
### Chromium
|
#### Chromium
|
||||||
The Rust version typically ran 5–7 times as fast as the Scala version, with comparable consistency.
|
The Rust version typically ran 5–7 times as fast as the Scala version, with comparable consistency.
|
||||||
- Rust 80–90 ms
|
- Rust run time: 80–90 ms
|
||||||
- Scala: 520–590 ms
|
- Scala run time: 520–590 ms
|
||||||
|
### Web vs. native in Rust
|
||||||
|
The native version typically ran 1.00–1.15 times as fast as the web version on Chromium, and 1.4–1.5 times as fast as the web version on Firefox, with slightly more consistency.
|
||||||
|
- Rust native run time: 77 ms
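
The speedup factors quoted above can be roughly reproduced from the listed run times. The sketch below assumes a single representative native time of 77 ms and reuses the web run-time ranges from the Firefox and Chromium results; `speedup_range` is a hypothetical helper, and the computed bounds only approximate the quoted ones, since the original figures presumably reflect run-to-run variation that is not listed here.

```rust
/// Divides the low and high ends of a slower run-time range by a single
/// faster run time, giving the range of speedup factors.
/// Hypothetical helper, for illustration only.
fn speedup_range(slow_ms: (f64, f64), fast_ms: f64) -> (f64, f64) {
    (slow_ms.0 / fast_ms, slow_ms.1 / fast_ms)
}

fn main() {
    let native_ms = 77.0; // Rust native run time reported above

    // Rust web run times reported in the browser results.
    let chromium = speedup_range((80.0, 90.0), native_ms);
    let firefox = speedup_range((110.0, 120.0), native_ms);

    println!("native vs. Chromium: {:.2} to {:.2} times as fast", chromium.0, chromium.1);
    println!("native vs. Firefox:  {:.2} to {:.2} times as fast", firefox.0, firefox.1);
}
```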
|
||||||
|
|
||||||
|
For this benchmark, WebAssembly achieved its aim of executing at near-native speed.
|
||||||
|
|
||||||
|
*When I first added the GTK interface to the Rust native benchmark, the run time became unusually long for a while, hovering around 260 ms. I never figured out what was causing that. The run time eventually returned to typical after some combination of rebuilding, waiting, and shutting my laptop down for the night. —Aaron*
|
||||||
## Rust benchmark variants
|
## Rust benchmark variants
|
||||||
### Low-precision variant
|
### Low-precision variant
|
||||||
- For matrices of size $N = 50$, using 32-bit floating point instead of 64-bit made the computation about 15% faster (60 ms instead of 70 ms). However, for $N \ge 54$, the 32-bit floating point variant would hang indefinitely! Maybe the target precision doesn't change to accommodate the lower-precision data type?
|
- For matrices of size $N = 50$, using 32-bit floating point instead of 64-bit made the computation about 15% faster (60 ms instead of 70 ms). However, for $N \ge 54$, the 32-bit floating point variant would hang indefinitely! Maybe the target precision doesn't change to accommodate the lower-precision data type?
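
The guess about target precision can be illustrated with machine epsilon. This is only a sketch of the hypothesis, not a diagnosis: the stopping criterion the eigenvalue routine actually uses is not known here, and both the tolerance `1e-12` and the helper `tolerance_is_resolvable` are made up for illustration. The point is that a relative tolerance tuned for 64-bit floats can be smaller than what 32-bit floats are able to resolve, in which case an iteration that waits for that tolerance might never terminate.

```rust
/// Returns true if the relative tolerance `tol` is no finer than the
/// machine epsilon `eps` of the floating-point type doing the work.
/// Hypothetical helper, for illustration only.
fn tolerance_is_resolvable(tol: f64, eps: f64) -> bool {
    tol >= eps
}

fn main() {
    // Assumed tolerance, chosen to be reasonable for 64-bit floats.
    let tol = 1e-12;

    println!("f64 epsilon: {:e}", f64::EPSILON);
    println!("f32 epsilon: {:e}", f32::EPSILON);

    println!("resolvable in f64: {}", tolerance_is_resolvable(tol, f64::EPSILON));
    println!(
        "resolvable in f32: {}",
        tolerance_is_resolvable(tol, f64::from(f32::EPSILON))
    );
}
```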
|
||||||
|