feat: define parse() function that takes a string to a Node tree #54

Open
glen wants to merge 21 commits from nodes into pc_notation
Owner

This POC is built on top of the #52 grammar. It adds a traversal of the concrete syntax tree produced directly by the parser_combinator grammar to translate it to a Node tree along the lines of mathjs parse().

feat: parse Object rule
All checks were successful
/ test (pull_request) Successful in 17s
c856413dcf
feat: parse Matrix rule
All checks were successful
/ test (pull_request) Successful in 17s
362c96b205
feat: parse Symbol and String rules
All checks were successful
/ test (pull_request) Successful in 17s
e5c00a4101
feat: parse LeftHandOperators and add a Node dump() viewer
All checks were successful
/ test (pull_request) Successful in 17s
50914143e2
feat: parse all ordinary left-associative operators, and add linearize method
All checks were successful
/ test (pull_request) Successful in 17s
1619762b5d
feat: parse Pow rule (and fix aliasing bug in plucking)
All checks were successful
/ test (pull_request) Successful in 18s
0584e82f8b
feat: parse Unary, UnaryPercentage, and Parentheses, fix name of addition
All checks were successful
/ test (pull_request) Successful in 18s
cfd3b341e8
feat: parse Rule2 and ImplicitMultiplication, constrain function application
All checks were successful
/ test (pull_request) Successful in 17s
2afaf3b7d1
feat: parse Conditional
All checks were successful
/ test (pull_request) Successful in 17s
397b73aabb
Author
Owner

This approach is going well, and is shaking out some minor bugs in the grammar. Note that I am converting to a streamlined collection of nodes. For example, a conditional like `test ? yes : no` is just converted to an ApplyNode of `cond` to `[test, yes, no]`, and ApplyNode itself is a hybrid of FunctionNode and OperatorNode (as has long been planned for mathjs).
As of this comment, all that remains is to provide abstract syntax conversions for the Range and Assignment rules (and also, only single-entry Blocks are currently supported; we shall see how far we can get in the testing without that).
Ultimately, it should be just fine to convert an instance of Range to an ordinary ApplyNode, and even Assignment: `assign(x, 7)` is a perfectly reasonable abstract syntax for `x = 7`, and I don't see why we can't also support `assign(f(x,y), x*y^2)`. However, we will likely need a BlockNode for multi-entry Blocks, as there seems to be more structure going on than just a call to some notional `block(expr1, expr2, expr3)` function, at least insofar as collecting the return value.
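
To make the streamlined representation concrete, here is a minimal sketch of what such nodes might look like. The class shapes, the `head` and `args` field names, and the `SymbolNode`/`ConstantNode` helpers are assumptions for illustration only, not the actual classes in this branch.

```javascript
// Hypothetical minimal node classes; the real nanomath classes may differ.
class SymbolNode {
  constructor (name) { this.name = name }
}
class ConstantNode {
  constructor (value) { this.value = value }
}
// One node type for both function calls and operators: a head applied to args.
class ApplyNode {
  constructor (head, args) {
    this.head = head // name of the function or operator, e.g. 'cond'
    this.args = args // array of child nodes
  }
}

// test ? yes : no   would become   cond(test, yes, no)
const conditional = new ApplyNode('cond', [
  new SymbolNode('test'), new SymbolNode('yes'), new SymbolNode('no')
])

// x = 7   would become   assign(x, 7)
const assignment = new ApplyNode('assign', [
  new SymbolNode('x'), new ConstantNode(7)
])
```

With this shape, operators and named functions need no separate node types; only the `head` differs.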

Once those are done, it would be nice to get through all of the tests in mathjs/test/unit-tests/expression/parse.test.js, but there is one oddity. That test file currently tests both the parsing and the evaluation of an extensive catalogue of expressions. It would actually be better to test these two aspects of expression handling separately, especially because of cases in which two different parse trees for the same expression could lead to the same value, but we actually only want one of the two parse trees. (This has explicitly come up in the review of PRs to mathjs, in which supplied tests didn't actually disambiguate the possible parses of some expressions.)

I don't think that there is a bank of expressions and corresponding parse trees in the mathjs test suite. Therefore, the approach would be to make the `linearize()` function on a Node that I've thrown together for the testing slightly less clunky, and then just go through the bank of expressions in parse.test.js and write down all of their linearizations, so that we can test the parse trees directly going forward. That seems like a worthwhile effort. It's a big job; we'll see if I get through it.
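
As a rough illustration of why testing only the evaluated value can miss a wrong parse, the sketch below builds two different trees for `2 * 3 * 4` that evaluate to the same number but linearize differently. The node shape (a string for a leaf, `{head, args}` for an application), the `multiply` head name, and the output format of `linearize` are all assumptions here, not the format used in this branch.

```javascript
// Hypothetical node shape: a string for a leaf, or {head, args} for an application.
function linearize (node) {
  if (typeof node === 'string') return node
  return `${node.head}(${node.args.map(linearize).join(', ')})`
}

// Toy evaluator that only knows numeric leaves and binary 'multiply'.
function evaluate (node) {
  if (typeof node === 'string') return Number(node)
  const [a, b] = node.args.map(evaluate)
  if (node.head === 'multiply') return a * b
  throw new Error(`unknown head ${node.head}`)
}

// Two candidate parse trees for "2 * 3 * 4"
const leftAssoc = {
  head: 'multiply',
  args: [{ head: 'multiply', args: ['2', '3'] }, '4']
}
const rightAssoc = {
  head: 'multiply',
  args: ['2', { head: 'multiply', args: ['3', '4'] }]
}

console.log(evaluate(leftAssoc) === evaluate(rightAssoc)) // true (both are 24)
console.log(linearize(leftAssoc)) // multiply(multiply(2, 3), 4)
console.log(linearize(rightAssoc)) // multiply(2, multiply(3, 4))
```

A test that compares linearizations would tell these two parses apart, while a test on the value alone would accept either.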

feat: parse Range
All checks were successful
/ test (pull_request) Successful in 18s
9dc5059462
feat: parse Assignment
All checks were successful
/ test (pull_request) Successful in 18s
93469bc4b8
Author
Owner

Although I have Assignment, it doesn't do anything special for functions. So next I plan to switch to an explicit `lambda()` operator for defining functions, make it the result of parsing (for example) `(x,y) => x + 3*y`, and then make an assignment to a function call be syntactic sugar for assigning the corresponding lambda-expression to the symbol of the function name. That should be the last thing before adding Block. (This change will make the nanomath expression language a non-breaking extension of the current mathjs one.)
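
A sketch of the planned desugaring, under assumptions: nodes are either a string (a symbol) or `{head, args}`, a function call `f(x, y)` is `{head: 'f', args: ['x', 'y']}`, and `lambda` takes its parameters followed by the body. None of these details are fixed by this PR; the helper name `desugarAssignment` is hypothetical.

```javascript
// Hypothetical node shape: a string for a symbol, or {head, args} for an application.
// Rewrites assign(f(x, y), body) as assign(f, lambda(x, y, body)).
// Simplification: any non-symbol assignment target is treated as a function call.
function desugarAssignment (node) {
  if (typeof node === 'string' || node.head !== 'assign') return node
  const [target, body] = node.args
  if (typeof target === 'string') return node // plain x = 7, nothing to do
  return {
    head: 'assign',
    args: [target.head, { head: 'lambda', args: [...target.args, body] }]
  }
}

// f(x, y) = x*y^2
const sugared = {
  head: 'assign',
  args: [
    { head: 'f', args: ['x', 'y'] },
    { head: 'multiply', args: ['x', { head: 'pow', args: ['y', '2'] }] }
  ]
}
const desugared = desugarAssignment(sugared)
// desugared is assign(f, lambda(x, y, multiply(x, pow(y, 2))))
```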

feat: Add JavaScript => notation for in-line anonymous function definition
All checks were successful
/ test (pull_request) Successful in 18s
ca5f9ac92c
feat: parse multi-line Blocks
All checks were successful
/ test (pull_request) Successful in 18s
2cf94a3e42
feat: carry expression extent and text over to abstract syntax tree
All checks were successful
/ test (pull_request) Successful in 18s
5cb9a07c0c
test: set up a bank of expressions for parse/evaluate testing
All checks were successful
/ test (pull_request) Successful in 17s
d56e18e100
Author
Owner

The expression bank has uncovered a lexing error: the `#foo` comment is getting lost when lexing `"#foo\n#bar\n"`. This happens because comments are not allowed on newline tokens, so the pending `#foo` comment then gets overwritten by `#bar`. Instead, consecutive comments should coalesce, separated by a newline.
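
The intended behavior can be sketched in isolation as follows. This is not the lexer in this branch; `collectComments` is a hypothetical helper that only models the "pending comment" being carried forward, assuming comments start with `#` and run to the end of the line.

```javascript
// Hypothetical helper: collects the comment text a lexer would carry forward
// to the next token that is allowed to hold a comment.
function collectComments (source) {
  let pending = null
  for (const line of source.split('\n')) {
    const trimmed = line.trim()
    if (!trimmed.startsWith('#')) continue
    // Overwriting here (pending = trimmed) would drop any earlier comment;
    // instead, join consecutive comments with a newline.
    pending = pending === null ? trimmed : `${pending}\n${trimmed}`
  }
  return pending
}

console.log(JSON.stringify(collectComments('#foo\n#bar\n'))) // "#foo\n#bar"
```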

test: no trailing comma in linearize, keep multiline comments, more tests
All checks were successful
/ test (pull_request) Successful in 17s
af8e405e4f
Author
Owner

Multiple comments now coalesce as desired

feat: parse Relational rule
All checks were successful
/ test (pull_request) Successful in 18s
5d2b71ac8e
refactor: convert : to irange to indicate inclusive range
All checks were successful
/ test (pull_request) Successful in 19s
b5b8544f56
Reference
StudioInfinity/nanomath!54