Importing and combining Pocomath instances inconsistent #56

Open
opened 2022-10-06 01:13:03 +00:00 by glen · 2 comments
Owner

Right now it's confusing: some of the Pocomath modules export a bunch of identifiers (operation names) with values as implementations; others export a single Pocomath instance; some combine sub-files just by re-exporting their contents; others combine sub-files by using Pocomath.merge.

It would be preferable to have one consistent convention for importing and merging Pocomath modules.

This issue is an attempt to brainstorm such a convention, and can serve as a place for further brainstorming and discussion/refinement of the convention.

I think the ideal would be that a Pocomath module just exports some stuff, and you combine Pocomath modules always and only by re-exporting the submodules; and when you have everything all exported, you can just pass the whole shebang to a Pocomath instance creator to actually generate a working instance.

Issue #55 can provide some guide to turning this into a more specific proposal. It suggests using complicated keys, looking potentially like `add(a: bigint, b: bigint): bigint` or perhaps even more intricate. This would be OK if JavaScript allowed arbitrary strings as identifiers exported by a module, but it does not. So we should not use these keys as the names of exports in Pocomath modules.

So one solution would be to make the exported symbols be arbitrary to some extent, chosen by some convention to assure that different modules are always exporting different symbols. Then we would look _only_ at the _values_ of exported symbols to get semantics for a Pocomath instance. And we could look at all such values to make sure we get all the semantics.

Then we would need conventions for how to interpret the different values found.
For example, if it is an object with signature keys and appropriate values, those implementations would be added to the corresponding operations of the Pocomath instance.

If we decide to allow mathjs expression language specifications of implementations, then any string value like `absquare(z: Complex<T>) = add(absquare(re(z)), absquare(im(z)))` would be interpreted as such.
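To make the two conventions concrete, here is a minimal sketch of how an instance creator might scan aggregated exports, looking only at the values: object values with signature keys become implementations, string values become expression-language definitions. The name `collectSemantics` and the returned shape are hypothetical, not an existing Pocomath API.

```javascript
// Hypothetical sketch: gather semantics from export *values* only;
// export names are ignored (they exist just to avoid clashes).
const collectSemantics = (moduleExports) => {
   const implementations = {}  // signature key -> implementation spec
   const expressions = []      // mathjs expression-language definitions
   for (const value of Object.values(moduleExports)) {
      if (typeof value === 'string') {
         // e.g. 'absquare(z: Complex<T>) = ...' -- parsing not attempted here
         expressions.push(value)
      } else if (value && typeof value === 'object') {
         // an object with signature keys and implementation values
         for (const [signature, impl] of Object.entries(value)) {
            implementations[signature] = impl
         }
      }
   }
   return {implementations, expressions}
}
```

An instance creator could then run this over the union of all re-exported modules and feed the result to typed-function and to the expression compiler.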

And then it would be nice to have a way to specify type definitions as well. How to distinguish such definitions from operation implementations? One possibility would be to have some indication in the name of the exported symbol. ~~And we could have a convention for symbols that define operations, so that we could also export other items that are just for use in other modules, but don't actually directly go into the generated Pocomath instance.~~ (That doesn't really work because of namespace pollution issues with everything being re-exported and aggregated. Such utilities should be segregated into separate modules that don't export any library semantics.)

If we went down this path, then a hypothetical source file `src/complex/core.js` might look like the following (if we want to try to do this in TypeScript it would need to be a bit different):

```
import {direct} from '../core/utils.js'
import {promoteUnary} from './utils.js' // contents shown below

export const complex_core_types = {
   'Complex<T>': {
      base: z => z && typeof z === 'object' && 're' in z && 'im' in z,
      test: testT => z => testT(z.re) && testT(z.im),
      infer: ({typeOf, joinTypes}) => z => joinTypes([typeOf(z.re), typeOf(z.im)]),
      from: {
         T: ({'zero(T)': zero}) => t => ({re: t, im: zero(t)}),
         U: ({convert, 'zero(T)': zero}) => u => {
            const t = convert(u)
            return ({re: t, im: zero(t)})
         },
         'Complex<U>': convert => cu => ({re: convert(cu.re), im: convert(cu.im)})
      }
   }
}

export const complex_core_operations = {
   're(Complex<T>): T': direct(z => z.re),
   'im(Complex<T>): T': direct(z => z.im),
   'complex(T, T): Complex<T>': direct((x, y) => ({re: x, im: y})),
   'add(Complex<T>, Complex<T>): Complex<T>': ({
      'add(T,T)': add,
      'complex(T,T)': complex}) =>
      (w, z) => complex(add(w.re, z.re), add(w.im, z.im)),
}

export const complex_core_operation2 =
   'absquare(z: Complex<T>) = add(absquare(re(z)), absquare(im(z)))'

export const complex_core_operation3 = promoteUnary('negate')
export const complex_core_operation4 = promoteUnary('zero')
```

So the exports of this file (which could be propagated by re-exporting) would: establish the `Complex<T>` generic type; define `re`, `im`, `complex`, and `add` operations via JavaScript/typed-function implementations; define `absquare` on complex numbers via a mathjs expression language definition; and provide implementations for `negate` and `zero` by using a utility from another file. Note the utility would not get injected into the resulting Pocomath instance because it is not re-exported.
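Under this convention, combining modules is always and only re-exporting, so a hypothetical aggregator file (the filenames and the `makePocomathInstance` creator are illustrative, not existing code) could be as simple as:

```
// src/complex/all.js (hypothetical): combine sub-files purely by re-export
export * from './core.js'
export * from './rounding.js'

// and at the top level (also hypothetical):
// import * as semantics from './complex/all.js'
// const math = makePocomathInstance(semantics)
```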

Here's the section of `src/complex/utils.js` that provides a utility `promoteUnary` that, given the name of a unary operation, generates an implementation of that operation on complex numbers by simply applying the operation to the real and imaginary parts of the complex number separately:

```
export const promoteUnary = name =>
   `${name}(z: Complex<T>) = complex(${name}(re(z)), ${name}(im(z)))`
```

Moreover, another file in the complex section could also import `promoteUnary` and use it to generate other operations more appropriate to be located there; for example, a hypothetical `src/complex/rounding.js` could look like:

```
import {promoteUnary} from './utils.js'

export const complex_rounding_operation1 = promoteUnary('floor')
export const complex_rounding_operation2 = promoteUnary('ceil')
export const complex_rounding_operation3 = promoteUnary('fix')
export const complex_rounding_operation4 = promoteUnary('round')
```

I don't love the length of the identifiers being exported from the library semantics modules. But with this design, in which essentially all of the exports would be re-exported in aggregate and those aggregates then collected, all of the exports of all of these library modules end up in the same namespace, so we need a scheme that makes sure the names do not clash. I am not sure how to avoid the length. I guess some length could be saved by not requiring `_operation`, and instead just using `_type` anywhere in an identifier to mark that its value should be used to define a type. So then the above would become

```
import {promoteUnary} from './utils.js'

export const complex_rounding1 = promoteUnary('floor')
export const complex_rounding2 = promoteUnary('ceil')
export const complex_rounding3 = promoteUnary('fix')
export const complex_rounding4 = promoteUnary('round')
```

I guess this isn't so bad. This seems like a proposal worth reviewing/pursuing.
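With that shortened convention, classifying an export would reduce to a substring check on its name. A sketch (the `_type` marker rule is the proposal above, but `classifyExport` itself is a hypothetical helper):

```javascript
// Hypothetical: any identifier containing '_type' marks a type definition;
// every other export from a semantics module supplies operation semantics.
const classifyExport = (name) =>
   name.includes('_type') ? 'types' : 'operations'
```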

Author
Owner

Although this is not really on-topic, just another plug for supplying library semantics with mathjs expressions:

  1. Although in all the expressions above I used `re(z)` and `im(z)`, the mathjs expression language does allow property access, so one could write `z.re` and `z.im`. The main reason I didn't is that we are already contemplating tracking the types of function calls, but I am not sure exactly how the mathjs expression language would track the types of property accesses. We would need some way of specifying that if z is an entity of type `Complex<T>` then z.re and z.im are of type T. Not sure what form that would take, and whether it is worth it depends on the difference in performance between `re(z)` and `z.re` in the compiled JavaScript function of a mathjs expression; does the latter actually get compiled to an inline field access, or does it get converted into a function call, for example? So there may or may not be much of a difference. I also wonder what the difference in performance in JavaScript is between `const fgh = x => f(g(h(x)))` and

```
const gh = x => g(h(x))
const fgh = x => f(gh(x))
```

since it seems like compiling expressions, because the functions are generated by crawling up the parse tree, always results in things like the latter (we don't build up a JavaScript code string and produce a function by having JavaScript read that code, as with eval).

  2. It seems as though even the humble `add` is easier to write in the expression language:

```
add(w: Complex<T>, z: Complex<T>) = complex(add(w.re, z.re), add(w.im, z.im))
```
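The performance question in point 1 can be probed with a rough micro-benchmark. This is only a sketch: timings vary by engine and optimization tier, so no numbers are claimed, and the timing harness here is ad hoc.

```javascript
// Compare a directly nested composition with one built through an
// intermediate closure, as bottom-up compilation of a parse tree produces.
const f = x => x + 1
const g = x => 2 * x
const h = x => x - 3

const fghNested = x => f(g(h(x)))

const gh = x => g(h(x))
const fghComposed = x => f(gh(x))

// Crude timer: run the function over many inputs; the accumulator keeps
// the loop from being optimized away entirely.
const time = (fn, n = 1e6) => {
   const start = process.hrtime.bigint()
   let acc = 0
   for (let i = 0; i < n; i++) acc += fn(i)
   return {ns: Number(process.hrtime.bigint() - start), acc}
}

const a = time(fghNested)
const b = time(fghComposed)
console.log('nested:', a.ns, 'ns; composed:', b.ns, 'ns; same result:', a.acc === b.acc)
```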

Author
Owner

A follow-up to point 1 in the previous comment: Note it would also be possible to generate JavaScript code from the parse tree of a mathjs expression, and then turn that into a function with the Function() constructor, which takes a string for the function body. But it's not clear to me in such a scheme how one would hook in the definitions of other mathjs operations, since only the global scope is available in the function body supplied to the Function() constructor. For example, suppose you want `f(z: Complex<number>) = floor(z.re/2)`. You know z is `Complex<number>`, so `z.re/2` is a number, so you can select the number implementation of the floor operation -- but how do you get that particular function, the number implementation of floor, to be called from code you supply as the function body in `Function('z', 'SOMETHING(z.re/2)')`? I don't really see a way... So I guess you just build up functions as above, rather than trying to generate JavaScript and then at the top level make that the body of a Function().

But note even this is a bit of an extension to what mathjs does now; currently when we `.compile()` a parse tree, we get a function that takes a scope and returns a value. We want a way to go from a function definition expression like `f(x,y) = 2x + y` to a JavaScript function of two arguments x and y that returns 2x + y. But that shouldn't be so hard: at the leaves we have `(x,y) => 2`, `(x,y) => x`, and `(x,y) => y`, and we then build up by multiplying the first two and adding the third. (_Maybe_ it's a worthwhile optimization to have the result at the multiplication node notice that the 2 is constant, so it can return the function that takes the result of `(x,y) => x` and multiplies it by 2, rather than the function that multiplies it by the result of `(x,y) => 2`.)
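The bottom-up construction just described can be sketched directly. The node-builder names here are hypothetical illustrations, not mathjs's actual compiler:

```javascript
// Build f(x, y) = 2x + y from the parse-tree leaves up: every node becomes
// a closure over the full argument list, combined by the node's operation.
const constant = c => (x, y) => c
const argX = (x, y) => x
const argY = (x, y) => y
const mulNode = (a, b) => (x, y) => a(x, y) * b(x, y)
const addNode = (a, b) => (x, y) => a(x, y) + b(x, y)

const f = addNode(mulNode(constant(2), argX), argY)  // f(x, y) = 2x + y

// The optimization mentioned above: when one operand is a known constant,
// close over the value itself instead of calling a constant-returning leaf.
const mulByConstant = (c, b) => (x, y) => c * b(x, y)
const fFolded = addNode(mulByConstant(2, argX), argY)
```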
