# math5
Yet another math core prototype for a possible future of mathjs.
This project is a revision of the
[typocomath](https://code.studioinfinity.org/glen/typocomath) prototype
(which was the fourth in the series picomath, pocomath, typocomath, hence
the current name math5), preparing for an initial implementation of the
Dispatcher engine to assemble and run the methods specified in TypeScript.

Motivations for the refactor:

1. I observed that the `.d.ts` files that were being generated as a result
of the TypeScript compilation step did not contain sufficient type information
to see what each implementation/factory for each operation of the resulting
math module would do. This lack suggested that the TypeScript definitions of
the implementations and factories were not actually being fully typechecked.
2. I felt that there was still a significant amount of redundancy in the
implementation files. For example, typocomath/src/numbers/arithmetic
reiterates for every arithmetic operation "foo" that "foo" implements the
number signature for the "foo" operation. It seemed preferable to specify
that this module is for the "number" type fewer times, and not to have to
mention the name of each operation twice.
3. I did not love the creation of aliased operation names like "addReal" that
were actually implementations of the operation "add" but with different
signatures. I found that mechanism confusing.

You can verify that the new code compiles and generates implementation
information by cloning the repository and then running `pnpm install`,
`npx ts-patch install`, and then `pnpm go`.

Outcomes of the refactor, corresponding to the motivations:

1a. You can browse the generated `build/**/*.d.ts` files to see that they
now contain full, detailed type information on every implementation and
factory, including the exact types of the dependencies.

1b. The TypeScript compiler now correctly detected (which it had not done in
typocomath) that the intermediate real square roots in the complex `sqrt`
implementation might be used even though they had come out to `NaN`. This
outcome is direct evidence that the TypeScript compiler is now type-checking
more strictly, so we are getting greater value from using TypeScript to
specify the operations' behavior. (In addition, it led to adding the `isnan`
predicate so that the code would compile.)

2. There is less repeated information. For example,
math5/src/numbers/arithmetic only mentions `number` twice, and only mentions
each operation once.

3. Implementations/factories are now only exported under their actual
operation names, just with different signatures specified. The default name
of a dependency is the name of the operation, but when you have dependencies
on a given operation with different signatures, you can name the dependency
arbitrarily and then specify which operation it is an instance of.
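
The naming rule in outcome 3 can be illustrated with a small sketch. This is not the math5 API: the `DepSpec` shape, its `op` field, the `addMixed` name, and the `resolveOperation` helper are all hypothetical, and signatures are written as plain strings purely for illustration.

```typescript
// Hypothetical shapes, invented for this sketch; math5's real types differ.
type DepSpec = {
  op?: string // which operation this dependency is an instance of
  sig: string // desired signature, written here as a plain string
}

// The operation defaults to the dependency's own name unless `op` is given.
function resolveOperation(depName: string, spec: DepSpec): string {
  return spec.op ?? depName
}

// Two dependencies on `add` with different signatures: one keeps the
// default name, the other has an arbitrary local name plus an explicit `op`.
const deps: Record<string, DepSpec> = {
  add: { sig: '(a: number, b: number) => number' },
  addMixed: {
    op: 'add',
    sig: '(a: Complex<number>, b: number) => Complex<number>',
  },
}

for (const [name, spec] of Object.entries(deps)) {
  console.log(`${name} -> operation '${resolveOperation(name, spec)}'`)
}
```

Both entries resolve to the operation `add`, so no aliased operation name such as "addReal" is needed.
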

Other potential advantages of the refactor:

* Assembling the implementation specifications (the main task of which
is resolving and injecting dependencies) into a running math engine could
potentially work by parsing the `.d.ts` files as generated; we would not
necessarily need to instrument the TypeScript code with macros to generate
the additional information needed to correctly assemble the factories.
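
To make "resolving and injecting dependencies" concrete, here is a minimal sketch of an assembly step. It is an assumption-laden toy, not the planned Dispatcher: the `Spec` shape, the `needs` list, and the `assemble` function are hypothetical, and dependencies are matched by operation name only, ignoring the signatures that a real engine would have to read from the `.d.ts` files or macro output.

```typescript
// Hypothetical shapes for this sketch only.
type Fn = (...args: any[]) => unknown
type Spec =
  | { kind: 'independent'; impl: Fn }
  | { kind: 'dependent'; needs: string[]; factory: (deps: Record<string, Fn>) => Fn }

// Build every operation, constructing its dependencies first (depth-first).
function assemble(specs: Record<string, Spec>): Map<string, Fn> {
  const table = new Map(Object.entries(specs))
  const built = new Map<string, Fn>()
  const inProgress = new Set<string>()

  const build = (name: string): Fn => {
    const done = built.get(name)
    if (done) return done
    const spec = table.get(name)
    if (!spec) throw new Error(`no implementation for '${name}'`)
    if (inProgress.has(name)) throw new Error(`dependency cycle at '${name}'`)
    inProgress.add(name)
    let fn: Fn
    if (spec.kind === 'independent') {
      fn = spec.impl
    } else {
      // Inject the already-built dependencies into the factory.
      const deps: Record<string, Fn> = {}
      for (const dep of spec.needs) deps[dep] = build(dep)
      fn = spec.factory(deps)
    }
    inProgress.delete(name)
    built.set(name, fn)
    return fn
  }

  for (const name of table.keys()) build(name)
  return built
}

const math = assemble({
  add: { kind: 'independent', impl: (a: number, b: number) => a + b },
  multiply: { kind: 'independent', impl: (a: number, b: number) => a * b },
  absquare: {
    kind: 'dependent',
    needs: ['add', 'multiply'],
    factory: ({ add, multiply }) => (z: { re: number; im: number }) =>
      add(multiply(z.re, z.re), multiply(z.im, z.im)),
  },
})

const absquareOp = math.get('absquare')
if (absquareOp) console.log(absquareOp({ re: 3, im: 4 })) // prints 25
```
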

Some disadvantages of the refactor:

* The presentation of the code is slightly more verbose in places. The
primary cause of this is the switch to a "builder" interface for collecting
implementations/factories, as advised by TypeScript guru
[jcalz](https://stackoverflow.com/questions/79025259) in order to get
narrower type inference as desired. So for example in
src/Complex/arithmetic, every factory (a "dependent implementation", as
opposed to an "independent" one that has no dependencies) is wrapped in
its own call to `dependent(dependencySpecifiers, factories)`. And that
whole chain of `dependent` calls has to be kicked off with a call to
`implementations()` and wrapped up with a call to `ship()`. Of course, the
names of those functions could be changed, but it appears that currently
there is no way to avoid these wrappers if we want TypeScript to do narrow
type inference/typechecking.
* When one module provides multiple implementations for the same
operation, but with different signatures, it must export several of these
bundles of implementations, each generated with an
`implementations(). ... .ship()` sequence, because each bundle can only
contain a given operation once. The names of these bundles are arbitrary.
I think this artificial division is a little cumbersome/confusing. See
src/Complex/arithmetic for an example, in which there is a default export
with the "common" signatures for operations, and a `mixed` export with
variants for `add` and `divide` that operate on a Complex and an ordinary
number.
* The notation for the desired signatures for dependencies can still be
a bit arcane/cumbersome. It's very simple when the desired dependency
consists of the common signature for that operation. But for more unusual
situations, it can become intricate. For example, in src/Complex/arithmetic,
the `absquare` (absolute value squared, an operation needed to define
division and square root) factory needs as a dependency the addition
operation on the return type of the `absquare` operation on the base type
of the Complex number. This has ended up being specified as:
```
add: {sig: commonSignature<'add', CommonReturn<'absquare', T>>()}
```
which is a bit of a mouthful. It's possible that better utilities for
expressing desired signatures could be devised; I'd want to wait until we had
collected a larger number of use cases before trying to design them. (If
this absquare case is essentially a one-off, it doesn't really matter
if it is a bit elaborate.)
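
The builder chain described in the first bullet above can be sketched as follows. This is a simplified stand-in rather than the code in this repository: the names `implementations`, `dependent`, and `ship` come from the text, but the `Builder` class, the `independent` method, the `sig` helper, and all signatures here are assumptions made for illustration. The point is only to show how a chain of calls lets TypeScript infer a narrow type for each dependency and for the shipped bundle.

```typescript
// Phantom-typed placeholder: carries a function type F at compile time only.
type Sig<F> = { readonly __type: F }
const sig = <F>() => ({} as Sig<F>)

// Dependency specifiers, e.g. {add: {sig: sig<(a: number, b: number) => number>()}}
type DepSpecs = Record<string, { sig: Sig<unknown> }>
// Map specifiers to the functions a factory will actually receive.
type Resolved<S> = {
  [K in keyof S]: S[K] extends { sig: Sig<infer F> } ? F : never
}

class Builder<Impls extends object> {
  readonly impls: Impls
  constructor(impls: Impls) {
    this.impls = impls
  }

  // Implementations with no dependencies (hypothetical method name).
  independent<New extends object>(fns: New): Builder<Impls & New> {
    return new Builder({ ...this.impls, ...fns })
  }

  // Factories that will have the dependencies named in `specs` injected.
  dependent<
    S extends DepSpecs,
    F extends Record<string, (deps: Resolved<S>) => unknown>
  >(specs: S, factories: F): Builder<Impls & F> {
    void specs // only guides type inference in this sketch
    return new Builder({ ...this.impls, ...factories })
  }

  // Finish the chain and hand back the fully typed bundle.
  ship(): Impls {
    return this.impls
  }
}

const implementations = () => new Builder({})

type Complex = { re: number; im: number }

const bundle = implementations()
  .independent({
    conj: (z: Complex): Complex => ({ re: z.re, im: -z.im }),
  })
  .dependent(
    { add: { sig: sig<(a: number, b: number) => number>() } },
    {
      // `add` is inferred as (a: number, b: number) => number
      absquare: ({ add }) => (z: Complex) => add(z.re * z.re, z.im * z.im),
    }
  )
  .ship()

// Inject a concrete `add` by hand to try the factory out.
const absquare = bundle.absquare({ add: (a, b) => a + b })
console.log(absquare({ re: 3, im: 4 })) // prints 25
```

Because each bundle is an object keyed by operation name, a second signature for the same operation would need its own chain, which is the limitation noted in the second bullet.
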

### Looking ahead

If this format is pursued, the next step would be to extract type information
that's needed to assemble the factories into working operations by injecting
the proper dependencies. There are two possible sources for this information:
(1) parsing the `.d.ts` files generated by tsc during the build, or
(2) generating strings encoding the types using the `$$typeToString` facility
of the `ts-macros` package.

To help pick between the two, this version is instrumented with
`$$typeToString` to record the types of all of the exported implementation
objects, so that one can simply compare the output of `pnpm go` with the
`.d.ts` files.

At a first look, some features relevant to the choice are:

A) With ts-macros (method 2), the type information it generates is
available immediately upon importing the JavaScript files generated by
the `tsc` build step. With method 1, we would need to insert an additional
build step after `tsc` that parses the `.d.ts` files and produces one or more
small JavaScript modules (or possibly JSON files) that contain the type
information in a usable format.

B) On the other hand, the type specifications in the `.d.ts` files appear
to have many more type definitions resolved and expanded out for us, making
them easier to read, parse, and use in the operations-assembly process.

C) With ts-macros we have a couple of additional package dependencies and an
additional installation step (`npx ts-patch install`). For either method, we
will have a TypeScript type parser module that we will need to write and
maintain.

Given these points, on balance at the moment I would lean ever so slightly
toward just parsing the `.d.ts` files -- it seems like less trouble overall
despite the additional build step -- but I could totally go either way.