30 Coding environment
Vectornaut edited this page 2024-08-07 21:30:06 +00:00

The current proposal for an implementation language is Nim, which has a well-developed static type system, with generic functions and operator overloading, and a clean, mostly brace-free syntax. It compiles to C, C++, and JavaScript, and has foreign function interfaces for all three. There is also a third-party LLVM-based compiler (nlvm) that targets WebAssembly directly and was active not too long ago. So Nim seems like a promising choice, and is likely the way to go unless some issue or better alternative pops up.

However, a single Nim program cannot call both C/C++ and JavaScript, since it is compiled through one backend or the other. Dyna3 might consist of multiple programs that somehow call each other, though, so it might be possible to use both external JavaScript and C++ libraries in the overall project. I am not sure.

One early question is: how might we present 3D scenes of constructions to people using Dyna3 and allow them to interact with the constructions? Some possibilities include:

  • Three.js
  • Ganja.js
  • direct WebGL, either through a JavaScript interface or an Emscripten C/C++ interface
  • threepp, a C++ port of three.js, in case we decide that C++ is our target and that mixing it with JavaScript is hard enough that we need a C++ graphics library
  • various other C++ graphics libraries, e.g. Magnum, if we need to go there for the same reason

Language desiderata

This section is very Glen-opinionated ;-) Non-negotiable items:

  • Clean, non-noisy, DRY syntax, which implies significant indentation for certain (since we will always indent code anyway!), essentially no braces or such delimiters for code grouping, elimination of parentheses to the extent possible, etc.
  • Strong static typing with a well-developed type system
  • Able to compile to a browser-executable format (JavaScript or WASM)
  • Able to call some existing significant body of packages so we don't have to reinvent the wheel

Other items (should be fleshed out, but things like easy lists and dicts, nice comprehension expressions for them, what else?)

Language lists/data

The Stack Overflow survey is potentially relevant/useful. I've annotated the language list below with the SO used/want-to-use score for each language we mention. Tellingly, the highest-ranked statically typed language with significant indentation is Scala, followed by F#, and then Nim. So we seem to have found the main options. This suggests two things: (1) since both Nim and F# seem to have issues for us, perhaps we should also give Scala a closer look; and (2) if Scala doesn't work out either, we may have to go the syntax crufter/decrufter route, in which case why not go with Rust, which has the highest "want-to-use" score (and has for 9 years running!) and the sixth-highest "use" score?
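
To make the ranking claim above easy to re-check, here is a minimal Python sketch that sorts the used/want-to-use scores as annotated in the language headings on this page. The numbers are copied from those headings; the set of languages counted as "statically typed with significant indentation" and the function names are my own assumptions for illustration.

```python
# (used %, want-to-use %) as annotated in the language headings on this page.
SO_SCORES = {
    "Julia": (2.1, 61.7),
    "Nim": (0.9, 50.5),
    "Scala": (3.0, 50.9),
    "C++": (18.3, 53.1),
    "F#": (2.2, 53.1),
    "Ruby": (4.7, 50.1),
    "Ada": (1.1, 40.2),
    "Rust": (28.7, 82.2),
}

# Assumption: the candidates that are statically typed and use significant indentation.
INDENT_STATIC = {"Scala", "F#", "Nim"}


def rank_by_used(names):
    """Return the given language names sorted by 'used' score, highest first."""
    return sorted(names, key=lambda name: SO_SCORES[name][0], reverse=True)


def top_wanted():
    """Return the language with the highest 'want-to-use' score."""
    return max(SO_SCORES, key=lambda name: SO_SCORES[name][1])
```

Calling `rank_by_used(INDENT_STATIC)` reproduces the Scala, F#, Nim ordering mentioned above, and `top_wanted()` picks out Rust among the languages listed here.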

Next let's try to collect various language options, with their pros and cons:

Civet (SO Unranked, unsurprisingly)

  • 👍 Nicest syntax I've ever used
  • Tied to the dreadful TypeScript type system (probably disqualifying)

Julia (SO 2.1/61.7)

  • 👍 has been very useful for prototyping
  • Littered with "end" keywords; no clear current path for building for in-browser use

Nim (SO 0.9/50.5)

  • 👍 Nice rational syntax with lots of features
  • 👍 Compiles to JavaScript or WASM (via either C/C++ > emscripten or a third-party LLVM backend)
  • ⇓ The user community is very small, and even the most basic tools and libraries seem sporadically maintained. (Aaron)
    • The best-maintained linear algebra library I can find is Neo. It's the only linear algebra library mentioned in the SciNim scientific computing ecosystem overview. Unfortunately, it's been broken by a language update since January 2023, and the author doesn't seem to be using Nim anymore. I think the package is unlikely to be fixed unless we do it ourselves.
    • There seems to be a command line interface bug in nlvm that causes the basic examples in the README to fail. We can work around it, but it makes me worry that if we include nlvm in our toolchain, we may end up having to maintain it.
    • [Glen:] Oooh, I agree that this whole shallow copy and different garbage collectors thing and memory management being critical but also not being very well surfaced in the language is very scary for maintaining a package like Neo. Seems like perhaps Nim is still more of a laboratory language than a real live in the wild one...
  • ⇓ I haven't been able to build working WASM from any code using Neo, the recommended linear algebra library. I've tried nlvm and Emscripten with various build options. I suspect that the bug in Neo mentioned above is causing the build failures. (Aaron)
  • ⇓ Clunky comprehensions (see the examples on the home page) and dicts are not in the language, but rather in the library. (Based on one small slightly hard to find section in the manual, {"a": "b", "c": "d"} is a literal representation for a fixed-length array of pairs of strings, apparently in an effort to be agnostic to different possible dict-like implementations. So in particular, the types in each pair have to match, and all pairs must be the same type.)
  • ⇓ Some odd rigidity: for example, indentation for continuing expressions is only allowed in certain special places, like after a binary operator or a parenthesis. Araq, the "benevolent overlord" of Nim, definitely displays his opinionated rigidity on the forums. Of course, maybe that's what it takes to see a new language through to success...
  • (Bottom line, there will definitely be a number of very unusual idiosyncrasies with Nim, such as foo_bar being the same identifier as foObaR, that will at least require getting used to or maybe even working around. But it's not clear that any are showstoppers.)
  • +- Has significant in-language syntax-aware macro facilities, which is a blessing and curse. We may be able to use it to provide more comfortable syntax for some of the idiosyncrasies that rub us the wrong way; but on the other hand, we could end up making our code sort of non-standard and therefore tricky for someone from the Nim language community such as it is to read intuitively...
  • It's sad that the "where continuing expressions can break" decision is just wrong ;-) -- there's lots of discussion about this sort of thing on the web, but the overall synthesis seems to be that for several reasons
   myvar = longThing
      + otherLongThing

is better than

   myvar = longThing +
      otherLongThing

The gist of the most significant argument for the upper version is that one reads the beginnings of lines more easily (they are aligned, after all), and looking there it's easy to see how the parts of the long expressions are connected, and hence code written in the upper style is easier to understand. Perhaps Nim is still new/flexible enough that we could contribute a PR to also make the split+indent just before a binary operator legal as well, backwards-compatibly still allowing a split+indent after a binary operator? I think if we go with Nim I would like to try to do that. But we can wait until after we make the decision as to which language to start the "full implementation" in.

  • Aaron reports the state of linear algebra packages is not good, and one has broken based on language/compiler changes, which is all kind of scary -- we might have to do a significant amount of maintenance ourselves. This is the curse of that 0.9 "use" rating on SO.
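
For anyone puzzled by the foo_bar/foObaR remark above: as I understand the Nim manual, two identifiers are equal when their first characters match exactly and the remaining characters match after dropping underscores and ignoring case. Here is a minimal Python sketch of that comparison. The function names are made up for illustration, and it only considers ASCII identifiers.

```python
def nim_normalize(ident: str) -> str:
    """Normalize an identifier the way Nim compares them (ASCII only).

    The first character is kept as-is (so it stays case-sensitive); in the
    rest, underscores are dropped and letters are lowercased.
    """
    if not ident:
        return ident
    return ident[0] + ident[1:].replace("_", "").lower()


def nim_same_identifier(a: str, b: str) -> bool:
    """Return True if Nim would treat a and b as the same identifier."""
    return nim_normalize(a) == nim_normalize(b)
```

So `nim_same_identifier("foo_bar", "foObaR")` is true, while `Foo_bar` and `foo_bar` stay distinct because their first characters differ in case.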

Lobster (SO unranked)

  • (Is this worth a more careful comparison with Nim, or is it just too small/fringe?)
  • Aaron consulted the Lobster community, who say that intercommunication with JavaScript is cumbersome

Scala, now with significant indentation (SO 3/50.9)

  • (I think targeting the JVM invalidates this? Or is there now a decent build process to WASM? Is Scala sufficiently more active/bigger community/more mainstream than Nim to make it worth looking into this more?)
  • Actually, there is Scala.js, which compiles to JavaScript and seems quite mature. There does not currently seem to be a way to compile to WASM, but one appears to be pretty close as an optional backend in Scala.js (see https://github.com/scala-js/scala-js/issues/4928). One can use existing JavaScript frameworks via Scala.js -> JavaScript; there also exist Scala frameworks/servers like Xitrum, but I'm not sure what they do as far as client-side computation. So going with Scala would, at least at first, mean sticking within JavaScript per se, which is a potential trap if WASM ends up being needed for performance reasons (assuming that a language well-compiled to WASM will be faster than basically equivalent JavaScript). On the other hand, the ongoing work on the WASM backend could provide a way out of that trap, although some commenters online say that the WASM backend will likely not be any faster than the pure-JavaScript backend, because it will not be able to get close enough to the underlying WASM architecture to take advantage of its possible speedups. So there is some risk here; I'm not sure how to evaluate whether it's worth looking deeper.
  • Nevertheless, being the statically-typed significant-indentation language with the highest SO score is suggestive that we should put a bit of effort into seeing what using it would be like.
  • Scala and Java have a bunch of numerical linear algebra packages. Some of them have been benchmarked against each other. EJML seems to dominate the benchmarks for matrices smaller than 100×100. The benchmarks don't include Breeze or Slash, though. So far, Slash has been the easiest to get up and running.
  • Reactive web frameworks include:
    • Laminar. Confusing variety of similar-looking ways to do things. Documentation very difficult to use.
    • Outwatch. Documentation looks detailed and well-written. Build environment seems complicated, but maybe it's not all necessary?
    • REScala. Actively developed. Nice documentation of reactive data structures, but little explanation of how to use these in a web app. The to-do list example and other examples are valuable.
  • I think the musings/reservations I discuss above really just come down to one question: will adopting a language originally tied to the JVM, and for the moment (for our purposes) tied to the JavaScript engines in browsers, ultimately hobble the performance of Dyna3, which will surely have compute-intensive aspects? One way to get at this is to do the "hello, world" app in Scala (which should be trivial via Scala.js, as it will end up 100% JavaScript when compiled), then do the same "linear algebra hello world" in Scala as in the other systems under consideration, where the linear algebra will be compiled to WASM. If we make sure the linear algebra task is a noticeable compute-time consumer, we can then just benchmark the identical task in Scala and the other candidates.
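
To pin down what "benchmark the identical task" could mean, here is a rough Python sketch of the harness shape: a deterministic input, one compute-heavy linear algebra task, and a best-of-N wall-clock timer. Everything in it is an assumption for illustration. The naive matrix product just stands in for whatever task we actually pick, and the real comparison would port the same task and input to Scala.js and the WASM candidates.

```python
import time


def make_matrix(n, seed):
    """Deterministic n-by-n test matrix, so every candidate gets identical input."""
    return [[((i * n + j) * seed) % 7 + 1.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Naive dense matrix product of two matrices stored as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def benchmark(task, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` calls of task()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    a, b = make_matrix(40, 3), make_matrix(40, 5)
    print(f"best of 5: {benchmark(lambda: matmul(a, b)):.4f} s")
```

Taking the best of several runs is one simple way to reduce noise from warm-up and scheduling; for JIT-compiled targets like JavaScript we would probably also want explicit warm-up runs before timing.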

C++ with our own civet-like syntax unmangler (SO 18.3/53.1)

  • ⇓⇓ Big con: we would have to implement this!

F# (SO 2.2/53.1)

  • Deeply tied to the .NET ecosystem.
  • ⇓ The basic template for a Bolero application seems very heavy. It's not clear how to write a bare-bones application.
  • ⇓ Projects are auto-generated and opaquely structured.
  • [Glen:] These things do seem icky.

Ruby (SO 4.7/50.1)

  • Well-established, with a large user community.
  • Can transpile to JavaScript using Opal.
  • May someday support WebAssembly as a build target using Artichoke.
  • ⇓ However, Artichoke is pre-release...
  • ⇓ ... and it doesn't support WASM builds yet.
  • ⇓ There may be other ways to compile Ruby to WASM, but I get the impression that they wouldn't be any more stable.
  • I don't think dynamic typing is right for Dyna3
  • Has a linear algebra library written in C/C++: NMatrix.

Ada (SO 1.1/40.2)

  • Well-established.

Rust with our own civet-like syntax unmangler (SO 28.7/82.2)

  • ⇓⇓ Big con: we would have to implement this! (but it is probably less work than for C++ since the syntax is much simpler/cleaner; i.e., we could probably actually base it on a proper parser, like civet does, rather than by local string transforms, which is the only way I could think of to do unmangled C++)
  • If we decide to bite the bullet and implement our own unmangling, is Rust enough better than the far more mature C++ with vast legions of existing packages to make it worth doing this for Rust rather than C++?
  • There's at least one linear algebra library, nalgebra, that's explicitly designed to support WASM builds.
  • Large, active, and growing user community.
  • One way to build a syntax crufter/decrufter would be to have a parser that starts basically just like the Rust parser, and then modify it to accept the syntax we prefer. The official state of the Rust grammar is sad: there was an accepted RFC 1331 to create an official grammar, which led to the grammar working group, which was disbanded in April 2024. So it is not at all clear what the state is. On the other hand, there are at least three (lively) parsers out there other than the official Rust compiler, namely rust-analyzer, which is the basis for the "official" Rust LSP used by VS Code etc.; syn, a library within Rust used for writing macros; and the IntelliJ Rust plugin, which includes a BNF grammar for Rust (so possibly the best for our purposes?). I would likely start from one of these sources in trying to generate the parser for Crust (or whatever we'd call the decrufted Rust, maybe CLR -- the name of a popular rust remover ;-).
    • Oh, actually, there is the tree-sitter library which provides (incrementally-updatable!) parsers for a huge list of languages, including every language in this list on this wiki page except for Civet, Lobster, and F#. So I think if we want to do crufting/decrufting for any language that tree-sitter provides a grammar for, that's really the place to start. We can fold bits from the Nim or Python parsers into the Rust (or C++ even) parser, say, to get what we want. And it seems we can cache the parse of files to speed re-parsing when they change, which is a nice feature.
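
To make "decrufting" concrete, here is a toy Python sketch of just the block-delimiter piece: it turns indentation-delimited blocks into brace-delimited ones. This is the naive line-by-line transform, not the parser-based approach discussed above (a real tool would work on a tree-sitter or similar parse), and it assumes space-only indentation, an unindented first line, and consistent dedents. The function name and the sample input syntax are invented for illustration.

```python
def indent_to_braces(source: str) -> str:
    """Rewrite indentation-delimited blocks as brace-delimited blocks.

    Toy transform: a line indented deeper than the previous one makes the
    previous line open a block; a dedent closes every block it leaves.
    """
    out = []
    stack = [0]  # indentation widths of the currently open blocks
    for raw in source.splitlines():
        if not raw.strip():
            continue  # blank lines carry no block structure in this toy
        indent = len(raw) - len(raw.lstrip(" "))
        if indent > stack[-1]:
            if not out:
                raise ValueError("first line must not be indented")
            out[-1] += " {"  # the previous line opens a new block
            stack.append(indent)
        while indent < stack[-1]:
            stack.pop()  # leaving a block: close it at its parent's indent
            out.append(" " * stack[-1] + "}")
        out.append(raw.rstrip())
    while len(stack) > 1:  # close whatever is still open at end of input
        stack.pop()
        out.append(" " * stack[-1] + "}")
    return "\n".join(out)


if __name__ == "__main__":
    sample = "fn area(w, h)\n    let a = w * h\n    if a > 10\n        log(a)\n    a"
    print(indent_to_braces(sample))
```

The output is not valid Rust yet (no semicolons, no types), which is part of the point: each piece of cruft we want to drop needs its own rule, and that is why starting from a real grammar looks more robust than stacking up local string transforms.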