Write more about uniform regularization

Vectornaut 2025-10-29 00:26:15 +00:00
parent 5222bc7193
commit 2b241fb585

@@ -63,11 +63,19 @@ If $f$ is convex, its second derivative is positive-definite everywhere, so the
#### Uniform regularization
Given an inner product $(\_\!\_, \_\!\_)$ on $V$, we can make the modified second derivative $f^{(2)}_p(\_\!\_, \_\!\_) + \lambda (\_\!\_, \_\!\_)$ positive-definite by choosing a large enough coefficient $\lambda$. We can say precisely what it means for $\lambda$ to be large enough by expressing $f^{(2)}_p$ as $(\_\!\_, \tilde{F}^{(2)}_p\_\!\_)$ and taking the lowest eigenvalue $\lambda_{\text{min}}$ of $\tilde{F}^{(2)}_p$. The modified second derivative is positive-definite when $\delta := \lambda_{\text{min}} + \lambda$ is positive. We typically make a “minimal modification,” choosing $\lambda$ just a little larger than $\max\{-\lambda_{\text{min}}, 0\}$. This makes $\delta$ small when $\lambda_{\text{min}}$ is negative and $\lambda$ small when $\lambda_{\text{min}}$ is positive.
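To make the recipe concrete, here is a minimal numerical sketch (not part of the original text) of choosing $\lambda$ this way. It assumes coordinates in which the chosen inner product is the standard dot product, so that $\tilde{F}^{(2)}_p$ is represented by a symmetric matrix; the function name and the small margin are illustrative.
```python
import numpy as np

def uniform_regularization_coefficient(second_derivative, margin=1e-6):
    """Smallest safe lambda making `second_derivative + lambda * I` positive-definite.

    Assumes `second_derivative` is the symmetric matrix representing the operator
    F^(2)_p in coordinates where the inner product is the standard dot product.
    The `margin` keeps delta = lambda_min + lambda a little above zero
    (a "minimal modification").
    """
    lambda_min = np.linalg.eigvalsh(second_derivative).min()  # lowest eigenvalue
    return max(-lambda_min, 0.0) + margin

# Example: an indefinite second derivative gets a coefficient just above 1,
# while an already positive-definite one would get only the small margin.
indefinite = np.array([[2.0, 0.0], [0.0, -1.0]])
lam = uniform_regularization_coefficient(indefinite)
print(lam)                                               # ~1.000001
print(np.linalg.eigvalsh(indefinite + lam * np.eye(2)))  # all eigenvalues now positive
```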
Uniform regularization can be seen as interpolating between Newton’s method in regions where the second derivative is solidly positive-definite and gradient descent in regions where the second derivative is far from positive-definite. To see why, consider the regularized Newton step $v$ defined by the equation
```math
f^{(1)}_p(\_\!\_) + f^{(2)}_p(v, \_\!\_) + \lambda (v, \_\!\_) = 0,
```
the standard Newton step $w$ defined by the equation
```math
f^{(1)}_p(\_\!\_) + f^{(2)}_p(w, \_\!\_) = 0,
```
and the gradient descent step $u$ defined by the equation
```math
f^{(1)}_p(\_\!\_) + (u, \_\!\_) = 0.
``` ```
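One way to spell out the interpolation (a sketch added here, writing $f^{(1)}_p$ as $(\_\!\_, g_p)$ for some vector $g_p$, a notation not used above, and using the symmetry of $f^{(2)}_p$): the three equations solve to
```math
v = -\bigl(\tilde{F}^{(2)}_p + \lambda \operatorname{id}\bigr)^{-1} g_p, \qquad w = -\bigl(\tilde{F}^{(2)}_p\bigr)^{-1} g_p, \qquad u = -g_p.
```
As $\lambda \to 0$, the regularized step $v$ approaches the standard Newton step $w$ wherever $\tilde{F}^{(2)}_p$ is invertible, while as $\lambda \to \infty$, the rescaled step $\lambda v$ approaches the gradient descent step $u$.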
_To be continued_