Add a config object and approximate equality #18

Open
opened 2025-04-13 16:52:36 +00:00 by glen · 7 comments
Owner

One feature in pocomath and mathjs 14 is that generally speaking, number equality is checked with some numerical fuzziness, allowing numbers that are very close to be considered equal. This convention seems to be a response to roundoff errors that occur in floating point calculations, preventing such issues as 0.1 + 0.2 not being equal to 0.3. Of course, it creates other issues; for example, approximate equality is not transitive: we can find numbers a, b, c such that a and b are considered equal, and b and c are considered equal, but a and c are not considered equal.

Nevertheless, in an effort to conform to existing conventions in mathjs (and in particular to make sure we produce the same results on the polynomialRoot test), we need to implement a configuration object and set tolerances for approximate equality.

However, experience has shown in the mathjs world that it would be best to have different values of these tolerance parameters for different types. And nanomath has the ability for any property to be indexed by type. So I think we should just set things up so that `math.resolve('config', [BigNumber])` gives the bignumber configuration.

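As a rough illustration of that idea (hypothetical sketch only; `configByType` and `resolveConfig` are stand-ins, not the actual nanomath API), the per-type lookup could behave something like this:

```js
// Hypothetical sketch only: the names and shapes below are illustrative, not the nanomath API.
const configByType = new Map([
  ['number',    { relTol: 1e-12, absTol: 1e-15 }],
  ['BigNumber', { relTol: 1e-60, absTol: 1e-63 }],
])

// Stand-in for what math.resolve('config', [SomeType]) could return:
function resolveConfig(typeName) {
  // Fall back to the plain-number configuration when a type has no entry of its own.
  return configByType.get(typeName) ?? configByType.get('number')
}

console.log(resolveConfig('BigNumber')) // { relTol: 1e-60, absTol: 1e-63 }
```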
glen added the priority and design labels 2025-04-13 16:52:36 +00:00
Author
Owner

One might reasonably wonder whether something like the tolerance parameters, which seem well-tied to a specific type, is an exception, and that at least the rest of the config object should not be tied to type. However, reviewing other config settings, it seems that many of them could plausibly be construed as having to do with type. For example, `predictable`, which says whether to return a complex square root, might be something someone would want to vary by input type. So for now I will just go with this plan.

Author
Owner

Resolved by #19. Leaving open for now because of the design questions, which Jos may want to comment on.

glen removed the priority label 2025-04-16 04:29:44 +00:00
Collaborator

Maybe we can think through a way to configure the precision based on the max precision offered by the numeric type at hand, so the same config works for both number and BigNumber, and automatically adjusts depending on the configured `precision` of BigNumber?

So let's say number has roughly 16 digits, and we want a `relTol` of 12 digits (`1e-12`) and `absTol` of 15 digits (`1e-15`). Then `relTol` is the available digits minus 4, and `absTol` is the available digits minus 1. Could we configure `relTol: 4` and `absTol: 1` then? When configuring `BigNumber` with say `precision: 64`, that would translate to `1e-60` and `1e-63` respectively.

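A quick sketch of how that translation could work (hypothetical helper; the names `relTolDigits`/`absTolDigits` and the assumption of 16 digits for `number` are placeholders, and a real BigNumber tolerance would itself need to be a BigNumber):

```js
// Hypothetical helper: translate "digits below the type's available precision"
// into concrete tolerances for a given numeric type.
function toleranceFor(typeName, { relTolDigits = 4, absTolDigits = 1, precision = 64 } = {}) {
  // Roughly 16 significant decimal digits for IEEE-754 doubles; `precision` for BigNumber.
  const availableDigits = typeName === 'BigNumber' ? precision : 16
  return {
    relTol: Math.pow(10, -(availableDigits - relTolDigits)),
    absTol: Math.pow(10, -(availableDigits - absTolDigits)),
  }
}

console.log(toleranceFor('number'))                       // { relTol: 1e-12, absTol: 1e-15 }
console.log(toleranceFor('BigNumber', { precision: 64 })) // { relTol: 1e-60, absTol: 1e-63 }
```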
Author
Owner

Answering in the opposite order:

> So let's say number has roughly 16 digits, and we want a relTol of 12 digits (1e-12) and absTol of 15 digits (1e-15). Then relTol is the available digits minus 4, and absTol is the available digits minus 1. Could we configure relTol: 4 and absTol: 1 then? When configuring BigNumber with say precision: 64, that would translate to 1e-60 and 1e-63 respectively.

That's a pretty reasonable idea. Should that be on a per-value basis? In other words, even when precision: 64 is specified, I think (if I am recalling correctly) it is possible to create an individual BigNumber with more or fewer digits of precision. When an operation involves such a BigNumber, should absTol and relTol be interpreted relative to that number's precision, or should they stay at their standard values for BigNumber based on the default precision given by config.precision?

> so the same config works for both number and BigNumber,

Should I take that as an indication that you would like to have non-function values not able to resolve based on a type list, the way that methods do? Right now, methods and non-function properties resolve the same way. But I could change it so that, explicitly, there is only the possibility of a single value for a non-function property, independent of any type list. Then there would be no possibility of (say) configuration depending on type. Or should I take it that you just would like the tolerances to be able to be independent of type, by expressing them in terms of the native precision, and leave in the mechanism that other values could depend on type? Let me know which way you'd like to go -- as with the type entity scheme, I'd rather change this before going on to the benchmarks, if it's going to need to change. Thanks.

Collaborator

So far it's just a wild idea; I would be curious to see if that could work out nicely.

> Should that be on a per-value basis?

I think that would complicate things quite a bit; I would prefer to keep things simple and configure it not per value but on a global level.

> Should I take that as an indication that you would like to have non-function values not able to resolve based on a type list, the way that methods do?

I'm not sure if I understand what you mean here.

I think under the hood we will have to translate to actual tolerances per data type that we can test a value against. So then, this "smart", dynamic config would be a convenience way on top of this. Maybe we can do something like this:

```ts
type Tolerance =
    | number                                       // one number like 1e-12 applied to all numeric data types
    | Record<'number' | 'BigNumber' | ..., number> // an object with a different number for every data type
    | { "auto": number }                           // a number like "2" meaning the max available digits of the type minus 2

interface Config {
  absTol: Tolerance
  relTol: Tolerance
}
```
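To illustrate the "translate to actual tolerances per data type" step, a hypothetical sketch of how the three notations could be normalized for a given type (the `digitsOf` table and `resolveTolerance` name are placeholders, not proposed API):

```js
// Hypothetical sketch: normalize the three Tolerance notations into one concrete number
// for a given numeric type; digitsOf maps a type name to its available decimal digits.
const digitsOf = { number: 16, BigNumber: 64 } // BigNumber digits would come from config.precision

function resolveTolerance(tolerance, typeName) {
  if (typeof tolerance === 'number') return tolerance          // e.g. 1e-12, applied to every type
  if ('auto' in tolerance) {                                    // e.g. { auto: 2 }: available digits minus 2
    return Math.pow(10, -(digitsOf[typeName] - tolerance.auto))
  }
  return tolerance[typeName]                                    // e.g. { number: 1e-12, BigNumber: 1e-60 }
}

console.log(resolveTolerance({ auto: 4 }, 'BigNumber'))                  // 1e-60
console.log(resolveTolerance(1e-12, 'number'))                           // 1e-12
console.log(resolveTolerance({ number: 1e-12, BigNumber: 1e-60 }, 'number')) // 1e-12
```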
Author
Owner

@josdejong wrote in #18 (comment):

> > Should I take that as an indication that you would like to have non-function values not able to resolve based on a type list, the way that methods do?

> I'm not sure if I understand what you mean here.

The _basic_ functionality of a TypeDispatcher `TD` is `TD.resolve(name, listOfTypes)`. A _method_ `foo` on TD actually gets assigned to `args => TD.resolve('foo', types(args))(args)` -- so there is a natural place to get the list of types from, namely the actual arguments. But the resolve facility works for any identifier, including ones that happen to correspond to value properties, not methods. So the current design is: to get the tolerances for BigNumber, you do `TD.resolve('config', [BigNumber])` and use the `relTol` and `absTol` from that; it may be a different object, with different values for these tolerances, than `TD.resolve('config', [NumberT])`. This mechanism seemed like a natural way to use type-based resolve to handle the need for different config values for different types.
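A greatly simplified sketch of that mechanism (this is not the actual TypeDispatcher code; `TinyDispatcher` and its string type keys are just for illustration):

```js
// Greatly simplified stand-in for a TypeDispatcher: resolve anything by (name, list of types).
class TinyDispatcher {
  constructor() { this.registry = new Map() } // name -> Map(typeKey -> implementation or value)

  register(name, typeList, impl) {
    if (!this.registry.has(name)) this.registry.set(name, new Map())
    this.registry.get(name).set(typeList.join(','), impl)
  }

  resolve(name, typeList) {
    return this.registry.get(name)?.get(typeList.join(','))
  }

  // A method becomes a wrapper that resolves against the runtime argument types:
  method(name) {
    return (...args) => this.resolve(name, args.map(a => typeof a))(...args)
  }
}

const TD = new TinyDispatcher()
TD.register('negate', ['number'], x => -x)
TD.register('config', ['BigNumber'], { relTol: 1e-60, absTol: 1e-63 })

console.log(TD.method('negate')(3))              // -3
console.log(TD.resolve('config', ['BigNumber'])) // { relTol: 1e-60, absTol: 1e-63 }
```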

However, tolerance was the poster child for type-dependent configuration. I will switch to the scheme you propose, in which the relevant parameters are not type-dependent. (I think they should have new names because they have different semantics than in mathjs 14; how about `relFuzz` and `absFuzz`?) With that case of type-dependence of config gone, should I

(A) No longer resolve config with different types, so there is only one config object? or even

(B) Absolutely turn off type-based resolution for non-function values, so that it is an error to `TD.resolve('bar', [BooleanT])` if `bar` has not been registered as a function? (This change would be a special case in the general mechanism of type-based lookup in a TypeDispatcher, which I don't see much motivation to restrict in that way.)

(C) Or actually just leave type-based lookup for `config` alone, as it may come up for other reasons, and just not use it by default for approximate comparisons given the new scheme. If we go this route, then one could define the new `absFuzz` differently for BigNumber than for NumberT, so that, say, you'd have to be within a factor of 10 of the precision limit to be considered equal to 0, but only within a factor of 1000 of the precision limit to be considered equal to `bignumber(0)`.

Personally, I would just do (C), but if you think (A) or (B) is important to the design of this as a potential new base for mathjs, I can definitely switch over. Please let me know.

Collaborator

Ah, I understand what you mean now; I wasn't aware of the `TD.resolve('config', [BigNumber])` solution, your idea for type-based configuration. I think my latest example proposal basically has type-based config, but only for the relevant options `absTol` and `relTol`: they can be an object with the numeric type as key and the tolerance as value. But not all config is type-based (I think), like `randomSeed` and `matrix`.

The mathjs config currently looks like:

```js
const config = {
  relTol: 1e-12,
  absTol: 1e-15,
  matrix: 'Matrix',
  number: 'number',
  precision: 64,
  predictable: false,
  randomSeed: null
}
```

So, how will the configuration object concretely look with your latest plan?

I do not see much use for the type-based resolution right now, but it also doesn't do harm, so I'm ok with (C): leaving it as it is and just not using this option. No need right now to add logic to disable this as a special case in the TypeDispatcher.

About the naming: I think it should at least have something like "tol" or "tolerance" in the name. How about `absTolDigits` and `relTolDigits`? But if we do something along the lines of [my idea here](https://code.studioinfinity.org/StudioInfinity/nanomath/issues/18#issuecomment-2868), I think we can keep things backward compatible, and the "smart" notation like `{auto: 2}` (or `{fuzz: 2}`) is something on top of that.

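For instance, one possible shape of the combined config, purely illustrative and not a decided design: the existing plain values keep working, and the "smart" digits-based notation is an optional alternative on top.

```js
// Purely illustrative, not a decided design.
const config = {
  relTol: 1e-12,        // or, alternatively: { auto: 4 }  (available digits minus 4)
  absTol: 1e-15,        // or, alternatively: { auto: 1 }  (available digits minus 1)
  matrix: 'Matrix',
  number: 'number',
  precision: 64,
  predictable: false,
  randomSeed: null
}
```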