# 2. 最佳秩 k 逼近

\begin{aligned} \mathop{\min}\limits_{r(X)\le k}\quad \|A-X\|^2_F \end{aligned} \tag{2.1.1}

\begin{aligned} X_k = \sum_{i=1}^k \delta_iu_iv_i^T = U_1 \Sigma_1V_1^T \end{aligned}

$U_1 = [u_1,u_2,···,u_k],V_1 = [v_1,v_2,···,v_k]$ 为 $A$ SVD 分解的最大 $k$ 个特征值对应的左右特征向量。

## 2.1 最佳秩 k 逼近的证明

$$W \cdot e \preceq e, e^TW \preceq e^T \text{（即 } W \text{的每行、每列之和均 } \le \text{1）}\tag{2.1.2}$$

$$a^T(I-W)b \ge 0$$

$$(Q_t)_{ij}= \begin{cases} \max\lbrace \sum_{k\ne j,k \le t}w_{jk}, \sum_{k\ne j,k \le t}w_{kj} \rbrace& i=j \\ -w_{ij}& i \ne j \end{cases}$$

1. $Q_t + Q$ 为对角矩阵，且 $Q_t + Q \preceq I$（由 2.1.2 可得）
2. $Q_t$ 的各行之和、各列之和均非负
3. 对于选定的 $j$，$Q_t$ 的第 $j$ 个对角元素随 $t$ 单调递增

$$\alpha_t = [\alpha_t(1) – \alpha_t(t),···,\alpha_t(t-1) – \alpha_t(t)]^T,\quad t = n,···,2$$

$\alpha_{t-1}$ 的维数比 $\alpha_t$ 的维数少一维。由于 $a$ 中元素递减，所以 $\alpha_t$ 为非负向量。类似地，定义 $\lbrace \beta_t\rbrace$

$$a^T(I-W)b \ge a^TQ_nb = \alpha_n^TQ_n\beta_n$$

\begin{aligned} & \alpha_t^TQ_t\beta_t – (\alpha_t – \alpha_t(t)e)^TQ_t(\beta_t – \beta_t(t)e) \\ =& \alpha_t^TQ_t\beta_t – \alpha_t^TQ_t(\beta_t – \beta_t(t)e) + (\alpha_t(t)e)^TQ_t(\beta_t – \beta_t(t)e) \\ =& \alpha_t^TQ_t\beta_t(t)e + (\alpha_t(t)e)^TQ_t(\beta_t – \beta_t(t)e) \ge 0 \end{aligned}

\begin{aligned} \Rightarrow \alpha_t^TQ_t\beta_t \ge & (\alpha_t – \alpha_t(t)e)^T Q_t (\beta_t – \beta_t(t)e) \\ \ge & \alpha_{t-1}^T Q_{t-1} \beta_{t-1} \end{aligned}

$$\max\limits_{U^TU = V^TV = I} tr(AUBV^T) = \sum_{i=1}^n\sigma_i(A)\sigma_i(B) \tag{2.1.3}$$

\begin{aligned} & \max\limits_{U^TU = V^TV = I} tr(AUBV^T) \\ &= \max tr(diag(a)U diag(b) V^T) \end{aligned}

\begin{aligned} tr(diag(a)U diag(b) V^T) &= \sum_{i,j}a_iu_{ij}v_{ij}b_j \\ &\le \sum_{i,j}a_i|u_{ij}v_{ij}|b_j\\ &\triangleq \sum_{i,j}a_iw_{ij}b_j = a^TWb\\ \end{aligned}

$$tr(diag(a)U diag(b) V^T) \le a^Tb = \sum_{i=1}^n\sigma_i(A)\sigma_i(B)$$

$$|A|^2_F = \sum_i\sigma_i(A)^2 \tag{2.1.4}$$

$$\because A\cdot\vec{x} = \sigma\cdot\vec{x},\quad |A^T – \lambda E| = |A-\lambda E|$$
\begin{aligned} \therefore A^T\cdot\vec{x} =& \sigma\cdot\vec{x} \\ AA^T\cdot\vec{x} =& \sigma^2\cdot\vec{x} \\ \end{aligned}
$$\therefore \sum_i\sigma^2 = tr(AA^T) = \sum_{i,j}a_{ij}\cdot a_{ij} = \|A\|^2_F$$

\begin{aligned} \|A-B\|^2_F &= \sum_{i,j}(a_{ij} – b_{ij})^2 \\ &= \sum_{i,j}(a_{ij}^2 + b_{ij}^2) – 2\sum{i,j}a_{ij}b_{ij}\\ &= \|A\|^2_F + \|B\|^2_F – atr(AB^T)\\ &\ge \sum_{i=1}^n\sigma_i(A)^2 + \sum_i\sigma_i(B)^2 – 2\sum_{i=1}^n\sigma_i(A)\sigma_i(B)\quad (由 2.1.3 \; 2.1.4)\\ &= \sum_{i=1}^n(\sigma_i(A) – \sigma_i(B))^2 \\ &= \sum_{i=1}^k(\sigma_i(A) – \sigma_i(B))^2 + \sum_{i=k+1}^n\sigma_i(A)^2 \\ &\ge \sum_{i=k+1}^n\sigma_i(A)^2 \end{aligned}

# 3. 引用

[1] 孙素敏，“子空间聚类及其应用”，西安建筑科技大学硕士学位论文

[2] 赵科科，“低秩矩阵分解的正则化方法与应用”，浙江大学博士学位论文

[mathjax]