Bring up subtleties of regularization and frozen entries
parent bdd7926ce2
commit 00d7d1b369
1 changed file with 7 additions and 4 deletions
@@ -119,11 +119,14 @@ The minimization routine is implemented in [`engine.rs`](../src/branch/main/app-
2. Find the Hessian $H(f) := d\operatorname{grad}(f)$, as described in "The second derivative of the loss function."
   * Recall that we express $H(f)$ as a matrix in the standard basis for $\operatorname{End}(\mathbb{R}^n)$.
3. If the Hessian isn't positive-definite, make it positive-definite by adding $-c \lambda_\text{min} I$, where $\lambda_\text{min}$ is its lowest eigenvalue, $I$ is the identity, and $c > 1$ is a parameter of the minimization routine. In other words, find the regularized Hessian
-$$H_\text{reg}(f) := H(f) + \begin{cases}0 & \lambda_\text{min} > 0 \\ -c \lambda_\text{min} I & \text{otherwise} \end{cases}.$$
+$$H_\text{reg}(f) := H(f) + \begin{cases}0 & \lambda_\text{min} > 0 \\ -c \lambda_\text{min} I & \lambda_\text{min} \le 0 \end{cases}.$$
   * The parameter $c$ is passed to `realize_gram` as the argument `reg_scale`.
-   * Ueda and Yamashita add an extra regularization term that's proportional to a power of $\|\operatorname{grad}(f)\|$, but we don't bother.
-4. Find the base step $u$, which is defined by the property that $-\operatorname{grad}(f) = H(f)\,u$.
-5. Backtrack by reducing the step size until we find a step that reduces the loss at a good fraction of the maximum possible rate.
+   * When $\lambda_\text{min}$ is exactly zero, our regularization doesn't do anything, so $H_\text{reg}(f)$ isn't actually positive-definite. Ueda and Yamashita add an extra regularization term that's proportional to a power of $\|\operatorname{grad}(f)\|$, which takes care of this problem.
+4. Project the negative gradient and the regularized Hessian onto the orthogonal complement of the frozen subspace of $\operatorname{End}(\mathbb{R}^n)$.
+   * For this write-up, we'll denote the projection by $\mathcal{Q}$.
+5. Find the base step $u \in \operatorname{End}(\mathbb{R}^n)$, which is defined by two properties: satisfying the equation $-\mathcal{Q} \operatorname{grad}(f) = H_\text{reg}(f)\,u$ and being orthogonal to the frozen subspace.
+   * When we say in the code that we're "projecting" the regularized Hessian, we're really turning it into an operator that can be used to express both properties.
+6. Backtrack by reducing the step size until we find a step that reduces the loss at a good fraction of the maximum possible rate.
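
To make steps 3 through 6 concrete, here are a few sketches. They're written against `nalgebra`-style dense types as an assumption, and none of the function names come from `engine.rs`. First, step 3's eigenvalue shift, with `reg_scale` in the role of $c$:

```rust
use nalgebra::DMatrix;

// A sketch of step 3's spectrum shift, not code from `engine.rs`. The
// argument `reg_scale` plays the role of the parameter c > 1.
fn regularize(hess: &DMatrix<f64>, reg_scale: f64) -> DMatrix<f64> {
    // lowest eigenvalue of the (symmetric) Hessian
    let lambda_min = hess.clone().symmetric_eigen().eigenvalues.min();
    if lambda_min > 0.0 {
        hess.clone()
    } else {
        // add -c λ_min I, lifting the lowest eigenvalue to (c - 1)|λ_min|;
        // note that this does nothing when λ_min is exactly zero
        hess + DMatrix::identity(hess.nrows(), hess.ncols()) * (-reg_scale * lambda_min)
    }
}
```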
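
For step 4, if the frozen subspace is spanned by standard basis matrices of $\operatorname{End}(\mathbb{R}^n)$, i.e. by individual frozen entries as the commit title suggests, then applying $\mathcal{Q}$ just zeroes those entries; the `frozen` index list here is hypothetical:

```rust
use nalgebra::DMatrix;

// Apply Q to an element of End(ℝⁿ): with an entry-spanned frozen subspace,
// projecting onto the orthogonal complement zeroes the frozen entries.
fn project_out_frozen(mat: &mut DMatrix<f64>, frozen: &[(usize, usize)]) {
    for &(row, col) in frozen {
        mat[(row, col)] = 0.0;
    }
}
```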
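For step 5, one standard way to turn $H_\text{reg}(f)$ into an operator expressing both properties, in the sense of the bullet above, is to overwrite the rows and columns of the frozen coordinates with identity rows and columns and then solve a single linear system; this is a sketch of the technique, not necessarily what `realize_gram` does:

```rust
use nalgebra::{DMatrix, DVector};

// Solve for the base step u with the frozen coordinates pinned to zero.
// `hess_reg` is the n² × n² regularized Hessian in the standard basis of
// End(ℝⁿ); `neg_grad_proj` is -Q grad(f) flattened to length n². The names
// and flattening convention (entry (i, j) at index i + j·n) are assumptions.
fn base_step(
    mut hess_reg: DMatrix<f64>,
    neg_grad_proj: DVector<f64>,
    frozen_flat: &[usize],
) -> DVector<f64> {
    for &k in frozen_flat {
        // row k now reads u[k] = 0, since the right-hand side is zero there
        // after projection; clearing column k keeps the operator symmetric
        hess_reg.row_mut(k).fill(0.0);
        hess_reg.column_mut(k).fill(0.0);
        hess_reg[(k, k)] = 1.0;
    }
    hess_reg
        .lu()
        .solve(&neg_grad_proj)
        .expect("the regularized Hessian should be invertible")
}
```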
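And a sketch of the backtracking in step 6, in the usual Armijo form: accept a step once the actual loss decrease is at least some fraction of the decrease predicted by the directional derivative. All parameter names are illustrative.

```rust
use nalgebra::DMatrix;

// Shrink the step until the loss drops at `min_efficiency` times the
// maximum possible rate, or give up after `max_tries` reductions.
fn backtrack(
    loss: impl Fn(&DMatrix<f64>) -> f64,
    config: &DMatrix<f64>,
    base_step: &DMatrix<f64>,
    neg_grad: &DMatrix<f64>,
    min_efficiency: f64, // the "good fraction" of the maximum rate
    backoff: f64,        // step-size reduction factor, e.g. 0.5
    max_tries: usize,
) -> Option<DMatrix<f64>> {
    let loss_at_start = loss(config);
    // the maximum possible rate of loss decrease along the step direction
    let max_rate = neg_grad.dot(base_step);
    let mut scale = 1.0;
    for _ in 0..max_tries {
        let candidate = config + base_step * scale;
        if loss(&candidate) <= loss_at_start - min_efficiency * scale * max_rate {
            return Some(candidate);
        }
        scale *= backoff;
    }
    None
}
```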
### Reconstructing a rigid subassembly