MathDB
AGMC 2021 Prelim Q3

Source:

January 10, 2023
algorithm, function, analytic geometry, vector

Problem Statement

Last year, Master Cheung became famous for multi-rotation. This year, he has come to DAMO to make noodles for the sweeping monk. One day, the software engineer Xiao Li chats with Master Cheung about his job. Xiao Li mainly researches and designs algorithms for tuning the parameters of various products. These parameters are usually obtained by minimising a loss function $f$ on $\mathbb{R}^n$. In Xiao Li's most recent project, the loss function is supplied by another team. For security and technical reasons, that team cannot reveal the internal details of the function to Xiao Li; they only provide an interface that returns the value $f(\text{x})$ for any $\text{x}\in\mathbb{R}^n$. Therefore, Xiao Li must minimise $f$ using function values alone. Moreover, every evaluation of $f$ consumes considerable computing resources. Fortunately, the dimension $n$ is not very high (around $10$), and the colleague who provides the function tells Xiao Li that $f$ may be assumed to be smooth.

This problem reminds Master Cheung of his antique radio. To hear a programme on the radio, you must turn its knob carefully while paying attention to the quality of the reception, until the quality is at its best. Throughout this process, no one knows the exact relationship between the angle of the knob and the quality of the reception. Master Cheung and Xiao Li realise that minimising $f$ is just like tuning a machine with multiple knobs: suppose each coordinate of $\text{x}$ is controlled by one knob, and $f(\text{x})$ measures some performance of the machine. One only needs to adjust the knobs over and over while observing the value of $f$; with luck, the best $\text{x}$ can be found. The two therefore propose an iterative algorithm, named Automated Forward/Backward Tuning ($\text{AFBT}$), to minimise $f$.

In the $k$-th iteration, the algorithm perturbs each coordinate of $\text{x}_k$ forward and backward, producing the $2n$ points $\{\text{x}_k\pm t_k\text{e}^i : i=1,\dots,n\}$, where $t_k$ is the step size; it then lets $\text{y}_k$ be a point with the smallest function value among these points, and checks whether $\text{y}_k$ decreases $f$ sufficiently. If so, it sets $\text{x}_{k+1}=\text{y}_k$ and doubles the step size; otherwise, it sets $\text{x}_{k+1}=\text{x}_k$ and halves the step size. In the algorithm, $\text{e}^i$ denotes the $i$-th coordinate vector of $\mathbb{R}^n$, whose $i$-th entry is $1$ and all other entries are $0$; $\mathbf{1}(\cdot)$ is the indicator function: $\mathbf{1}(f(\text{x}_k)-f(\text{y}_k)\ge t_k^2)$ takes the value $1$ if $f(\text{x}_k)-f(\text{y}_k)$ is at least $t_k^2$, and $0$ otherwise.
$\text{AFBT}$ algorithm. Input: $\text{x}_0\in\mathbb{R}^n$, $t_0>0$. For $k=0,1,2,\dots$, perform the following loop:

1: $\text{y}_k:=\operatorname{argmin}\{f(\text{y}):\text{y}\in\{\text{x}_k\pm t_k\text{e}^i : i=1,\dots,n\}\}$ # Evaluate the loss function at the $2n$ trial points.

2: $s_k:=\mathbf{1}[f(\text{x}_k)-f(\text{y}_k)\ge t_k^2]$ # Sufficient decrease? Yes: $s_k=1$; no: $s_k=0$.

3: $\text{x}_{k+1}:=(1-s_k)\text{x}_k+s_k\text{y}_k$ # Update the iterate.

4: $t_{k+1}:=2^{2s_k-1}t_k$ # Update the step size: $s_k=1$ doubles it; $s_k=0$ halves it.
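For concreteness, here is a minimal Python sketch of the loop above. The names (`afbt`, `f`, `x0`, `t0`, `max_iter`) and the finite iteration cap are our own choices for illustration; the problem's loop itself runs indefinitely.

```python
import numpy as np

def afbt(f, x0, t0, max_iter=1000):
    """Automated Forward/Backward Tuning (AFBT), as specified above.

    f        : callable returning f(x) for x in R^n (function values only)
    x0       : starting point, shape (n,)
    t0       : initial step size, > 0
    max_iter : iteration cap (the problem's loop never terminates)
    """
    x, t = np.asarray(x0, dtype=float), float(t0)
    fx = f(x)
    n = x.size
    for _ in range(max_iter):
        # Step 1: evaluate f at the 2n trial points x_k +/- t_k e^i.
        candidates = [x + sign * t * e
                      for e in np.eye(n) for sign in (1.0, -1.0)]
        values = [f(y) for y in candidates]
        j = int(np.argmin(values))
        y, fy = candidates[j], values[j]
        # Step 2: sufficient decrease test, s_k = 1 iff f(x_k) - f(y_k) >= t_k^2.
        if fx - fy >= t * t:
            x, fx = y, fy   # Step 3: accept y_k ...
            t *= 2.0        # Step 4: ... and double the step size.
        else:
            t /= 2.0        # Steps 3-4: keep x_k and halve the step size.
    return x, fx
```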
Now, we make assumptions on the loss function $f:\mathbb{R}^n\to\mathbb{R}$.

Assumption 1. $f$ is convex: for any $\text{x},\text{y}\in\mathbb{R}^n$ and $\alpha\in[0,1]$, we have $f((1-\alpha)\text{x}+\alpha\text{y})\le(1-\alpha)f(\text{x})+\alpha f(\text{y})$.

Assumption 2. $f$ is differentiable on $\mathbb{R}^n$ and $\nabla f$ is $L$-Lipschitz continuous on $\mathbb{R}^n$.

Assumption 3. The level sets of $f$ are bounded: for any $\lambda\in\mathbb{R}$, the set $\{\text{x}\in\mathbb{R}^n : f(\text{x})\le\lambda\}$ is bounded.

Based on Assumptions 1 and 2, one can prove that for all $\text{x},\text{y}\in\mathbb{R}^n$,
$$\langle\nabla f(\text{x}),\text{y}-\text{x}\rangle \le f(\text{y})-f(\text{x}) \le \langle\nabla f(\text{x}),\text{y}-\text{x}\rangle + \frac{L}{2}\|\text{x}-\text{y}\|^2,$$
where the left inequality is the first-order characterisation of convexity and the right one is the standard descent lemma for $L$-Lipschitz gradients. You may refer to any convex analysis textbook for more properties of convex functions. Note that under Assumptions 1 and 3, $f$ attains its minimum value, which we denote by $f^*$.

Prove that under Assumptions 1-3, the iterates of $\text{AFBT}$ satisfy $\lim_{k\to\infty}f(\text{x}_k)=f^*$.
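As an empirical sanity check (not part of the problem), one can run the `afbt` sketch above on a function satisfying Assumptions 1-3, for instance $f(\text{x})=\|\text{x}-\text{x}^*\|^2$, which is convex with a $2$-Lipschitz gradient and bounded level sets, and observe $f(\text{x}_k)\to f^*=0$:

```python
import numpy as np

# f(x) = ||x - x_star||^2 satisfies Assumptions 1-3; its minimum is f* = 0.
rng = np.random.default_rng(0)
x_star = rng.standard_normal(10)          # n = 10, as in the story
f = lambda x: float(np.sum((x - x_star) ** 2))

x, fx = afbt(f, x0=np.zeros(10), t0=1.0, max_iter=500)
print(fx)  # should be close to f* = 0
```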