AGMC 2021 Prelim Q3
Source:
January 10, 2023
Tags: algorithm, function, analytic geometry, vector
Problem Statement
Last year, Master Cheung became famous for his multi-rotation noodle pulling. This year, he has come to DAMO to make noodles for the sweeping monk. One day, the software engineer Xiao Li chats with Master Cheung about his job. Xiao Li mainly researches and designs algorithms for tuning the parameters $w$ of various kinds of products. These parameters can normally be obtained by minimising a loss function $f$ on $\mathbb{R}^n$. In Xiao Li's recent project, the loss function is provided by another team. For security and technical reasons, that team makes it difficult for Xiao Li to learn the internal details of $f$; they only provide an interface that returns the value $f(w)$ for any $w \in \mathbb{R}^n$. Therefore, Xiao Li can only use function values to minimise $f$. Moreover, each evaluation of $f$ consumes a lot of computing resources. Fortunately, the dimension $n$ is not very high. The colleague who provides the function also tells Xiao Li to assume, for now, that $f$ is smooth.
This problem reminds Master Cheung of his antique radio. To hear a programme on it, you must turn its knob carefully while paying attention to the quality of the received signal, until the quality is best. In this process, no one knows the relationship between the angle of the knob and the reception quality. Master Cheung and Xiao Li realise that minimising $f$ is like tuning a machine with multiple knobs: suppose each weight of $w$ is controlled by one knob, and $f(w)$ measures a certain performance of the machine. We only need to adjust the knobs again and again while observing the value of $f$; then there is hope of finding the best $w$. As a result, the two suggest an iterative algorithm, named Automated Forward/Backward Tuning (AFBT), to minimise $f$. In the $k$-th iteration, the algorithm perturbs the individual weights of $x_k$ to obtain the $2n$ trial points $\{x_k \pm t_k e_i : i = 1, \dots, n\}$, where $t_k$ is the step size; it then lets $y_k$ be a point with the smallest function value among these points. Next it checks whether $y_k$ decreases $f$ sufficiently: if so, it takes $x_{k+1} = y_k$ and doubles the step size; otherwise, it takes $x_{k+1} = x_k$ and halves the step size. In the algorithm, $e_i$ is the $i$-th coordinate vector in $\mathbb{R}^n$: its $i$-th entry is $1$ and the other entries are $0$; $\mathbb{1}(\cdot)$ is the indicator function: if the decrease $f(x_k) - f(y_k)$ is at least the square of the step size $t_k$, then $\mathbb{1}$ takes the value $1$; otherwise, it takes the value $0$.
AFBT algorithm
Input $x_1 \in \mathbb{R}^n$, $t_1 > 0$. For $k = 1, 2, 3, \dots$, perform the following loop:
1: $y_k \in \arg\min \{ f(x) : x \in \{ x_k \pm t_k e_i : i = 1, \dots, n \} \}$. #Evaluate the loss function at the trial points.
2: $s_k = \mathbb{1}\!\left( f(y_k) \le f(x_k) - t_k^2 \right)$. #Is the decrease sufficient? Yes: $s_k = 1$; no: $s_k = 0$.
3: $x_{k+1} = s_k y_k + (1 - s_k) x_k$. #Update the iterate.
4: $t_{k+1} = 2^{2 s_k - 1} t_k$. #Update the step size. $s_k = 1$: the step size doubles; $s_k = 0$: it is halved.
Now we make some assumptions on the loss function $f$.
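The loop above can be sketched in a few lines of Python. The quadratic test function, the starting point, and the iteration count below are illustrative choices, not part of the problem; the sufficient-decrease threshold $t_k^2$ follows the description in the text.

```python
import numpy as np

def afbt(f, x1, t1, iters=200):
    """Automated Forward/Backward Tuning: minimise f using function values only.

    Accept the best trial point y_k and double t_k when
    f(x_k) - f(y_k) >= t_k**2; otherwise stay at x_k and halve t_k.
    """
    n = x1.size
    x, t = x1.astype(float), float(t1)
    for _ in range(iters):
        # The 2n trial points x_k +/- t_k e_i (one coordinate perturbed at a time).
        trials = [x + sign * t * e
                  for e in np.eye(n) for sign in (+1.0, -1.0)]
        y = min(trials, key=f)                  # best trial point y_k
        s = 1 if f(y) <= f(x) - t**2 else 0     # sufficient-decrease indicator s_k
        x = y if s else x                       # x_{k+1} = s_k y_k + (1 - s_k) x_k
        t = 2.0 * t if s else t / 2.0           # t_{k+1} = 2^{2 s_k - 1} t_k
    return x

# Toy example (illustrative): a smooth convex quadratic with minimiser (1, -2).
f = lambda w: (w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2
x_star = afbt(f, np.array([5.0, 5.0]), t1=1.0)
```

Note that each iteration uses only the $2n$ function values at the trial points, matching the setting where the internals of $f$ are hidden behind an evaluation interface.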
Assumption 1. $f$ is a convex function: for any $x, y \in \mathbb{R}^n$ and $\alpha \in [0, 1]$, we have $f(\alpha x + (1 - \alpha) y) \le \alpha f(x) + (1 - \alpha) f(y)$.
Assumption 2. $f$ is differentiable on $\mathbb{R}^n$, and $\nabla f$ is $L$-Lipschitz continuous on $\mathbb{R}^n$.
Assumption 3. The level sets of $f$ are bounded: for any $\lambda \in \mathbb{R}$, the set $\{ x \in \mathbb{R}^n : f(x) \le \lambda \}$ is bounded.
Based on Assumptions 1 and 2, we can prove that for any $x, y \in \mathbb{R}^n$,
$$f(x) + \nabla f(x)^{\top} (y - x) \;\le\; f(y) \;\le\; f(x) + \nabla f(x)^{\top} (y - x) + \frac{L}{2} \| y - x \|^2.$$
You can refer to any textbook on convex analysis for more properties of convex functions.
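As a quick numerical sanity check of the two bounds (the convexity lower bound and the $L$-smoothness upper bound), consider the illustrative smooth convex function $f(w) = \|w\|^2$, for which $\nabla f(w) = 2w$ and $L = 2$; this example is not part of the problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative smooth convex function: f(w) = ||w||^2, gradient 2w, L = 2.
f = lambda w: float(w @ w)
grad = lambda w: 2.0 * w
L = 2.0

for _ in range(1000):
    x, y = rng.normal(size=3), rng.normal(size=3)
    lower = f(x) + grad(x) @ (y - x)                 # convexity lower bound
    upper = lower + 0.5 * L * np.sum((y - x) ** 2)   # L-smoothness upper bound
    assert lower <= f(y) <= upper + 1e-9
```

For this particular quadratic the upper bound is tight (it holds with equality), since $f(y) - f(x) - \nabla f(x)^{\top}(y - x) = \|y - x\|^2 = \frac{L}{2}\|y - x\|^2$ when $L = 2$.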
Prove that, under Assumptions 1-3, the sequence $\{x_k\}$ generated by AFBT satisfies
$$\lim_{k \to \infty} f(x_k) = \min_{x \in \mathbb{R}^n} f(x).$$