Write up Rust and Scala benchmarks

Vectornaut 2024-08-12 22:26:58 +00:00
parent d9cf029e6e
commit d06e8551ad

Language-benchmarks.md

@ -0,0 +1,51 @@
## Background
Among the [languages we considered](Coding-environment), Rust and Scala were the only two that we'd be enthusiastic about using. Each one has a key disadvantage:
- Rust doesn't have quiet syntax, so we'd have to invest in writing and maintaining a preprocessor.
- Scala doesn't currently have a WebAssembly target (although there are plans for one), so we'd be limited to the performance of the JavaScript target.
The decision between these two languages basically comes down to comparing the maintenance cost of a Rust preprocessor to the performance cost of JavaScript.
## Benchmark computation
To evaluate the performance cost, Aaron wrote a benchmark program in Rust and in Scala (compiled to JavaScript). It does the following computation, given an even dimension $N$ and an integer period $R$:
- Initialize a random matrix $A \colon \mathbb{R}^N \to \mathbb{R}^N$ whose entries are roughly independent and uniformly distributed in $[-1, 1]$. (The independence and uniformity probably aren't very good: we used a very simple hash function to make it easy to get the same matrix in both versions.)
- Initialize an orthogonal matrix $T \colon \mathbb{R}^N \to \mathbb{R}^N$ that splits into a direct sum of rotations with periods in $\big\{R, \tfrac{R}{2}, \tfrac{R}{3}, \tfrac{R}{4}\big\}$.
- Compute $A,\;TA,\;T^2A,\;\ldots,\;T^{R-1}A$ using $R$ left-multiplications by $T$.
- Find the eigenvalues of $A,\;\ldots,\;T^{R-1}A$.
To validate the computation, the benchmark program displays the eigenvalues of $T^r A$, with $r \in \{0, \ldots, R\}$ controlled by a slider. Displaying the eigenvalues isn't part of the benchmark computation, so it isn't timed.
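
The steps above can be sketched in plain Rust, without a linear algebra library and without the eigenvalue step. Everything named here is an assumption made for illustration: the `Mat` type, the `hash_entry` mixer (a stand-in for the benchmark's hash function, which isn't shown on this page), and the choice to cycle the rotation blocks through the angles $2\pi k/R$ for $k = 1, 2, 3, 4$. The sketch does $R$ left-multiplications, so it ends with $T^R A$, which should match $A$ up to rounding because every block returns to the identity after $R$ steps.

```rust
use std::f64::consts::TAU;

/// Row-major square matrix stored in a flat vector (hypothetical helper type).
#[derive(Clone, Debug)]
struct Mat {
    n: usize,
    data: Vec<f64>,
}

impl Mat {
    fn zeros(n: usize) -> Self {
        Mat { n, data: vec![0.0; n * n] }
    }

    /// Returns the product `self * rhs` as a newly allocated matrix.
    fn mul(&self, rhs: &Mat) -> Mat {
        let n = self.n;
        let mut out = Mat::zeros(n);
        for i in 0..n {
            for k in 0..n {
                let a = self.data[i * n + k];
                for j in 0..n {
                    out.data[i * n + j] += a * rhs.data[k * n + j];
                }
            }
        }
        out
    }
}

/// Assumed stand-in for the benchmark's simple hash function: an integer
/// mixer that maps an entry index to a value in [-1, 1].
fn hash_entry(index: u64) -> f64 {
    let mut x = index.wrapping_mul(0x9E37_79B9_7F4A_7C15);
    x ^= x >> 32;
    x = x.wrapping_mul(0xD6E8_FEB8_6659_FD93);
    x ^= x >> 32;
    // Keep the top 53 bits so the quotient lies in [0, 1).
    let unit = (x >> 11) as f64 / (1u64 << 53) as f64;
    2.0 * unit - 1.0
}

/// Deterministic pseudo-random matrix: the same entries on every run.
fn rand_mat(n: usize) -> Mat {
    let data = (0..(n * n) as u64).map(hash_entry).collect();
    Mat { n, data }
}

/// Block-diagonal matrix of 2x2 rotations. Block `b` rotates by
/// `TAU * k / period`, with `k` cycling through 1, 2, 3, 4.
fn rot_step(n: usize, period: u32) -> Mat {
    assert!(n % 2 == 0, "dimension must be even");
    let mut t = Mat::zeros(n);
    for b in 0..n / 2 {
        let k = (b % 4 + 1) as f64;
        let (s, c) = (TAU * k / period as f64).sin_cos();
        let i = 2 * b;
        t.data[i * n + i] = c;
        t.data[i * n + i + 1] = -s;
        t.data[(i + 1) * n + i] = s;
        t.data[(i + 1) * n + i + 1] = c;
    }
    t
}

/// Returns A, TA, ..., T^R A, computed with R left-multiplications by T.
fn orbit(n: usize, period: u32) -> Vec<Mat> {
    let t = rot_step(n, period);
    let mut mats = vec![rand_mat(n)];
    for _ in 0..period {
        let next = t.mul(mats.last().unwrap());
        mats.push(next);
    }
    mats
}

/// Largest absolute entrywise difference between two matrices.
fn max_abs_diff(a: &Mat, b: &Mat) -> f64 {
    a.data.iter().zip(&b.data).map(|(x, y)| (x - y).abs()).fold(0.0, f64::max)
}

fn main() {
    let mats = orbit(8, 12);
    let err = max_abs_diff(&mats[0], &mats[12]);
    println!("max |T^R A - A| = {err:e}");
}
```

Comparing the first and last matrices this way gives a cheap numerical sanity check that complements the eigenvalue display.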
## Running the benchmark
### Rust
- To build and run, call `trunk serve --release` from the `rust-benchmark` folder and go to the URL that Trunk is serving.
- The `--release` flag is crucial. By turning off development features like debug symbols, it makes the compiled code literally a hundred times faster on Aaron's machine. However, it also seems to prevent the benchmark computation from showing up in the Firefox profiler.
### Scala
- To build, call `sbt` from the `scala-benchmark` folder, and then call `fullLinkJS` from the `sbt` prompt.
- The benchmark page points to the JavaScript file produced by `fullLinkJS`. Calling `fastLinkJS` won't update the code the benchmark page uses, even if compilation succeeds.
- Using `fullLinkJS` instead of `fastLinkJS` is important. Doing a full build rather than a quick build provides more opportunities for optimization, making the transpiled code nearly twice as fast on Aaron's machine.
- To run, launch a web server for the `scala-benchmark` folder and go to the URL that it's serving.
## Program details
### Rust
To make the Rust computation more similar to the Scala computation, we do the successive left-multiplications using the code
```rust
rand_mat = &rot_step * rand_mat;
```
which might allocate new memory for the result every time it runs. We could avoid the allocation by doing something like
```rust
rot_step.mul_to(&rand_mat, &mut rand_mat_next);
rand_mat.copy_from(&rand_mat_next);
```
where `rand_mat_next` is pre-allocated outside the loop.
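
As a rough, library-free illustration of the same idea, the sketch below multiplies into a buffer that is allocated once and then swaps the two buffers instead of copying. The flat `&[f64]` layout, this `mul_to` function, and `apply_steps` are assumptions made for the example, not the benchmark's actual code, and the swap is a substitute for the copy shown above.

```rust
/// Writes the product `lhs * rhs` of two row-major n-by-n matrices into
/// `out`, overwriting its contents without allocating.
fn mul_to(lhs: &[f64], rhs: &[f64], out: &mut [f64], n: usize) {
    out.fill(0.0);
    for i in 0..n {
        for k in 0..n {
            let a = lhs[i * n + k];
            for j in 0..n {
                out[i * n + j] += a * rhs[k * n + j];
            }
        }
    }
}

/// Applies `steps` left-multiplications by `rot_step` to `rand_mat`,
/// reusing one scratch buffer for every product.
fn apply_steps(rot_step: &[f64], rand_mat: &mut Vec<f64>, n: usize, steps: usize) {
    // Allocated once, outside the loop.
    let mut rand_mat_next = vec![0.0; n * n];
    for _ in 0..steps {
        mul_to(rot_step, rand_mat.as_slice(), &mut rand_mat_next, n);
        // Swapping exchanges the two vectors' buffers, so no entries are copied.
        std::mem::swap(rand_mat, &mut rand_mat_next);
    }
}

fn main() {
    // Quarter-turn rotation of the plane: four steps return to the start.
    let rot_step = vec![0.0, -1.0, 1.0, 0.0];
    let mut rand_mat = vec![1.0, 2.0, 3.0, 4.0];
    apply_steps(&rot_step, &mut rand_mat, 2, 4);
    println!("{rand_mat:?}");
}
```

Swapping saves the extra pass over the entries that a copy would need, at the cost of leaving the previous product in the scratch buffer between iterations.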
## Browser details
- Firefox 128.0.3 (64-bit)
- Ungoogled Chromium 127.0.6533.88
Both browsers were running under Ubuntu 22.04 (64-bit) on an [AMD Ryzen 7 7840U](https://www.amd.com/en/products/processors/laptop/ryzen/7000-series/amd-ryzen-7-7840u.html) processor.
## Results
### Firefox
The Rust version typically ran 6–11 times as fast as the Scala version, and its speed was much more consistent.
- Rust run time: 110–120 ms
- Scala run time: 700–1200 ms
### Chromium
The Rust version typically ran 5–7 times as fast as the Scala version, with comparable consistency.
- Rust run time: 80–90 ms
- Scala run time: 520–590 ms