From 00d7d1b36986748eddb4a6667cf984571e22a2c4 Mon Sep 17 00:00:00 2001
From: Vectornaut
Date: Mon, 27 Jan 2025 07:29:36 +0000
Subject: [PATCH] Bring up subtleties of regularization and frozen entries

---
 Gram-matrix-parameterization.md | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/Gram-matrix-parameterization.md b/Gram-matrix-parameterization.md
index 61d1fe0..31ee5db 100644
--- a/Gram-matrix-parameterization.md
+++ b/Gram-matrix-parameterization.md
@@ -119,11 +119,14 @@ The minimization routine is implemented in [`engine.rs`](../src/branch/main/app-
  2. Find the Hessian $H(f) := d\operatorname{grad}(f)$, as described in "The second derivative of the loss function."
     * Recall that we express $H(f)$ as a matrix in the standard basis for $\operatorname{End}(\mathbb{R}^n)$.
  3. If the Hessian isn't positive-definite, make it positive definite by adding $-c \lambda_\text{min}$, where $\lambda_\text{min}$ is its lowest eigenvalue and $c > 1$ is a parameter of the minimization routine. In other words, find the regularized Hessian
-    $$H_\text{reg}(f) := H(f) + \begin{cases}0 & \lambda_\text{min} > 0 \\ -c \lambda_\text{min} & \text{otherwise} \end{cases}.$$
+    $$H_\text{reg}(f) := H(f) + \begin{cases}0 & \lambda_\text{min} > 0 \\ -c \lambda_\text{min} & \lambda_\text{min} \le 0 \end{cases}.$$
     * The parameter $c$ is passed to `realize_gram` as the argument `reg_scale`.
-    * Ueda and Yamashita add an extra regularization term that's proportional to a power of $\|\operatorname{grad}(f)\|$, but we don't bother.
- 4. Find the base step $u$, which is defined by the property that $-\operatorname{grad}(f) = H(f)\,u$.
- 5. Backtrack by reducing the step size until we find a step that reduces the loss at a good fraction of the maximum possible rate.
+    * When $\lambda_\text{min}$ is exactly zero, our regularization doesn't do anything, so $H_\text{reg}(f)$ isn't actually positive-definite. Ueda and Yamashita add an extra regularization term that's proportional to a power of $\|\operatorname{grad}(f)\|$, which takes care of this problem.
+ 4. Project the negative gradient and the regularized Hessian onto the orthogonal complement of the frozen subspace of $\operatorname{End}(\mathbb{R}^n)$.
+    * For this write-up, we'll write the projection as $\mathcal{Q}$.
+ 5. Find the base step $u \in \operatorname{End}(\mathbb{R}^n)$, which is defined by two properties: satisfying the equation $-\mathcal{Q} \operatorname{grad}(f) = H_\text{reg}(f)\,u$ and being orthogonal to the frozen subspace.
+    * When we say in the code that we're "projecting" the regularized Hessian, we're really turning it into an operator that can be used to express both properties.
+ 6. Backtrack by reducing the step size until we find a step that reduces the loss at a good fraction of the maximum possible rate.
 
 ### Reconstructing a rigid subassembly
 
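For concreteness, here is a rough Rust sketch of steps 3–5 as described in the revised list. This is not the actual `realize_gram` code in `engine.rs`; it assumes nalgebra-style `DMatrix`/`DVector` types, hypothetical function names (`regularize_hessian`, `base_step`), and a `frozen` slice listing the flattened indices of the frozen entries.

```rust
// Rough sketch of steps 3-5, not the actual `realize_gram` code in `engine.rs`.
// Assumes nalgebra types; the gradient and base step are the flattened
// coordinates of elements of End(R^n), and `frozen` lists the flattened
// indices of the frozen entries.
use nalgebra::{DMatrix, DVector};

/// Step 3: if the lowest eigenvalue λ_min of the Hessian isn't positive, add
/// -c λ_min times the identity, where c is the `reg_scale` parameter. (As
/// noted above, this leaves the Hessian unchanged when λ_min is exactly zero.)
fn regularize_hessian(hess: &DMatrix<f64>, reg_scale: f64) -> DMatrix<f64> {
    let min_eigval = hess
        .clone()
        .symmetric_eigen()
        .eigenvalues
        .iter()
        .cloned()
        .fold(f64::INFINITY, f64::min);
    if min_eigval > 0.0 {
        hess.clone()
    } else {
        let n = hess.nrows();
        hess + DMatrix::<f64>::identity(n, n) * (-reg_scale * min_eigval)
    }
}

/// Steps 4-5: express both defining properties of the base step as one linear
/// system by zeroing the frozen coordinates of the negative gradient and
/// replacing the frozen rows and columns of the regularized Hessian with rows
/// and columns of the identity. The solution vanishes on the frozen
/// coordinates and solves the projected Newton system on the unfrozen ones.
fn base_step(
    hess_reg: &DMatrix<f64>,
    neg_grad: &DVector<f64>,
    frozen: &[usize],
) -> Option<DVector<f64>> {
    let mut h = hess_reg.clone();
    let mut g = neg_grad.clone();
    for &k in frozen {
        h.row_mut(k).fill(0.0);
        h.column_mut(k).fill(0.0);
        h[(k, k)] = 1.0;
        g[k] = 0.0;
    }
    h.lu().solve(&g)
}
```

Replacing the frozen rows and columns with identity rows and columns, and zeroing the corresponding gradient entries, is one standard way to fold "orthogonal to the frozen subspace" into a single invertible linear system; it matches the spirit of the note about the "projected" Hessian acting as an operator that expresses both properties.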
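Step 6 can likewise be read as an Armijo-style backtracking line search. A minimal sketch, with illustrative parameter names (`min_efficiency`, `backoff`, `max_backoffs`) rather than the actual `realize_gram` arguments:

```rust
use nalgebra::DVector;

/// Step 6: backtracking line search. Starting from the base step, scale the
/// step back until the loss decreases by at least `min_efficiency` times the
/// decrease predicted by the directional derivative (an Armijo-style test).
fn backtrack(
    loss: impl Fn(&DVector<f64>) -> f64,
    config: &DVector<f64>,
    base_step: &DVector<f64>,
    neg_grad: &DVector<f64>,
    min_efficiency: f64,
    backoff: f64,
    max_backoffs: usize,
) -> Option<DVector<f64>> {
    let loss_here = loss(config);
    // Maximum possible (first-order) rate of loss decrease along the step.
    let rate = neg_grad.dot(base_step);
    let mut t = 1.0;
    for _ in 0..max_backoffs {
        let candidate = config + base_step * t;
        if loss(&candidate) <= loss_here - min_efficiency * t * rate {
            return Some(candidate);
        }
        t *= backoff;
    }
    None
}
```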