WIP: feat: define an Ohm.js parser for maje #47

Draft
glen wants to merge 12 commits from ohm_parse into main
Owner

Adds an [Ohm](https://ohmjs.org/) parser for maje, the mathjs expression language. The grammar is 200 lines, compared to 1850 for mathjs src/expression/parse.js. The comparison is not like for like, though: the Ohm grammar is only a recognizer and performs no semantic interpretation, whereas mathjs parse.js also produces an AST for the expression.

This is certainly not fully correct yet, but it appears to be a plausible start that nominally captures every feature of maje per mathjs parse.js. The next step toward making this merge-ready is to ensure that every example in mathjs test/unit-tests/expression/parse.test.js either parses successfully or fails to parse, as appropriate. That should shake out the (probably numerous) grammar bugs. I think the only other ingredient needed to merge this is to ensure that it's possible to generate good error messages when parsing fails. Semantics, including any kind of AST if we decide to build one on top of the concrete syntax tree that Ohm produces, will be left for future PRs.
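
A minimal sketch of how the "succeeds or fails, as appropriate" check could be organized, assuming the recognizer is wrapped as a plain function from string to boolean. The names `findMismatches`, `recognize`, and the `{ input, valid }` case shape are hypothetical and are not taken from this PR or from the mathjs test suite.

```javascript
// Hypothetical helper, not part of this PR.
// `recognize` is any function (string) => boolean. For an Ohm grammar it
// could plausibly be (s) => grammar.match(s).succeeded().
// `cases` lists inputs together with whether they are expected to parse.
function findMismatches (recognize, cases) {
  const mismatches = []
  for (const { input, valid } of cases) {
    const accepted = recognize(input)
    // Record both wrongly accepted and wrongly rejected inputs.
    if (accepted !== valid) {
      mismatches.push({ input, expected: valid, accepted })
    }
  }
  return mismatches
}
```

An empty result means the recognizer agrees with every expectation. Keeping the invalid inputs in the same table makes it harder to forget the "fails in parsing" half of the requirement.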

feat: define an Ohm.js parser for maje
All checks were successful
/ test (pull_request) Successful in 19s
998e9c80c5
Author
Owner

Note that the scannerless nature of Ohm, coupled with maje's convention of not generally allowing newlines at the "top level" while allowing them inside nested expressions, complicates the Ohm grammar: it obliges us to thread a `white` parameter through most of the rules, indicating what kind of whitespace is active. (This parameter effectively produces two "flavors" of every rule: one in which newlines are ordinary whitespace, and one in which newlines are not always allowed.)

Similarly, because range notation `a:b:c` must not be recognized in the true branch of the ternary conditional, where its `:` would clash with the conditional's own `:`, there is another parameter `qRange` in all of the rules above `range`, indicating whether we are currently skipping range constructions.

It's conceivable there might be another mechanism for ensuring that ranges are not allowed after `condition ? ...`, but that would be a refinement for the future. For now it is best to just get this parsing to work.
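
For illustration, here is a rough sketch of the kind of preprocessing suggested by the later commit titled "feat: add transducer to take nested \n to \r": newlines inside brackets are rewritten to `\r`, so that a grammar could treat `\r` as ordinary whitespace and `\n` as significant. This is an assumption about the idea, not the PR's actual code. The function name `transduceNewlines` is hypothetical, and the sketch deliberately ignores string literals and comments, which a real transducer would need to handle.

```javascript
// Hypothetical sketch, not the transducer from this PR.
// Rewrites '\n' to '\r' whenever it occurs inside at least one open
// bracket, leaving top-level newlines untouched.
// Simplification: brackets inside string literals or comments are
// counted like any other bracket.
function transduceNewlines (source) {
  const openers = '([{'
  const closers = ')]}'
  let depth = 0
  let out = ''
  for (const ch of source) {
    if (openers.includes(ch)) {
      depth += 1
    } else if (closers.includes(ch)) {
      // Clamp at zero so a stray closer does not make later text look nested.
      depth = Math.max(0, depth - 1)
    }
    out += (ch === '\n' && depth > 0) ? '\r' : ch
  }
  return out
}
```

With such a pass in front of the grammar, the distinction between nested and top-level newlines is carried by the characters themselves, which may reduce how much the `white` parameter has to do.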

test: first set of parse tests
All checks were successful
/ test (pull_request) Successful in 18s
45f8f8087a
test: another set of parse tests and associated fixes
All checks were successful
/ test (pull_request) Successful in 18s
ae98a750e7
test: number tests and associated grammar fixes
All checks were successful
/ test (pull_request) Successful in 19s
507056064a
chore: update to latest ohm, and remove 'holes' workaround
All checks were successful
/ test (pull_request) Successful in 18s
cf0706286a
feat: add transducer to take nested \n to \r
All checks were successful
/ test (pull_request) Successful in 18s
ae19e46adc
feat: add mageParse to match newlineTransduced string
All checks were successful
/ test (pull_request) Successful in 18s
e4a759e07e
test: doublequote string tests and associated fixes
All checks were successful
/ test (pull_request) Successful in 18s
4244b52158
All checks were successful
/ test (pull_request) Successful in 18s
This pull request is marked as a work in progress.
Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u origin ohm_parse:ohm_parse
git switch ohm_parse
Reference
StudioInfinity/nanomath!47