Properly implement Ueda and Yamashita's regularized Newton method #130

Open
opened 2025-11-07 10:37:27 +00:00 by Vectornaut · 0 comments

As of pull request #118, we carry out realization using a cheap imitation of Ueda and Yamashita's uniformly regularized Newton's method [[UY](https://code.studioinfinity.org/StudioInfinity/dyna3/wiki/Numerical-optimization#uniform-regularization)]. We leave out the term of Ueda and Yamashita's regularization that involves the norm of the first derivative of the loss function. This has at least two downsides. One downside is practical: when the lowest eigenvalue of the Hessian is zero, our regularization is zero, so the regularized Hessian fails to be safely positive-definite. The other downside is conceptual: since we depart from Ueda and Yamashita's assumptions, we can't rely on their convergence results.

I informally tested a few regularization methods and decided that a proper implementation of Ueda and Yamashita’s method gave the most consistent convergence and the nicest-looking realizations. I therefore recommend switching to that method. To make the regularization easier to understand, and perhaps better adapted to our problem, I recommend taking the norm of the first derivative with respect to a meaningful metric on the configuration space.
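To make the role of the missing gradient-norm term concrete, here is a minimal sketch of one uniformly regularized Newton step in the style of Ueda and Yamashita. This is an illustrative Python/NumPy sketch, not the dyna3 engine code; the constants `c1`, `c2`, and the exponent `delta` are generic tuning parameters, and the function name is hypothetical.

```python
import numpy as np

def regularized_newton_step(grad, hess, c1=1.0, c2=1.0, delta=1.0):
    """One uniformly regularized Newton step (sketch, not dyna3's code).

    The shift mu combines a term that cancels any negative curvature of the
    Hessian with a gradient-norm term. The second term is the one the issue
    says is currently left out: it keeps mu positive even when the smallest
    Hessian eigenvalue is exactly zero.
    """
    # eigvalsh returns eigenvalues in ascending order, so [0] is the smallest
    lam_min = np.linalg.eigvalsh(hess)[0]
    curvature_shift = max(0.0, -lam_min)
    mu = c1 * curvature_shift + c2 * np.linalg.norm(grad) ** delta
    # Solve the regularized Newton system (H + mu I) d = -g
    n = len(grad)
    return np.linalg.solve(hess + mu * np.eye(n), -grad)
```

With a singular but positive-semidefinite Hessian such as `diag(0, 1)`, the curvature term vanishes, yet the gradient-norm term still makes the shifted system safely positive-definite, which is exactly the practical downside described above.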

Vectornaut added the enhancement and engine labels 2025-11-07 10:37:27 +00:00