Calculus of Variations, Classical Mechanics, Optimal Control, and Convex Optimization

by allenlu2007

The following four fields belong to different disciplines, but they are highly related and interconnected.

Calculus of variations: math

Classical mechanics: physics

Optimal control: control/engineering

Convex optimization: engineering/math

 

Lagrange Heritage

The keyword that ties these fields together is Lagrange. For example:

(i) Calculus of variations:  Euler-Lagrange equation

(ii) Classical mechanics: Lagrangian mechanics

(iii) Optimal control: Lagrangian

(iv) Convex optimization: Lagrange multiplier; Lagrangian dual problem

Lagrange himself contributed directly to (i) and (ii); (iii) and (iv) can only be said to be related to Lagrange. The following is an excellent reference article that ties (i), (ii), and (iii) together.

http://yima.csl.illinois.edu/psfile/ECE553/sussmann-willems.pdf

Optimization

More concretely, the key concept that ties these fields together is optimization.

What is optimization? The simplest explanation is finding a minimum (or maximum).

For example, the quadratic expressions of secondary school and the calculus of university are about finding the minimum or maximum value of a known function, i.e. a point {x, y}.

More advanced optimization instead looks for an objective function whose corresponding real value (R may be a physical quantity such as time, distance, energy, or action, or a non-physical quantity such as cost, loss, etc.) is minimized, i.e. {f(x), R}. The four fields (i)-(iv) above all essentially belong to this type of optimization.

 

See the Wikipedia definition and explanation of optimization:

An optimization problem can be represented in the following way:

Given: a function f : A \to R from some set A to the real numbers
Sought: an element x0 in A such that f(x0) ≤ f(x) for all x in A (“minimization”) or such that f(x0) ≥ f(x) for all x in A (“maximization”).

A is usually called the search space, and is typically specified by a set of constraints, equalities or inequalities.

f is usually called the cost function, loss function, or objective function. In specific applications such as (ii) and (iii) it is called the Lagrangian.
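A minimal numerical sketch of this definition (the function f and the interval A below are made-up examples, not from the text):

```python
from scipy.optimize import minimize_scalar

# A made-up objective function f : R -> R
f = lambda x: (x - 3.0)**2 + 1.0

# Unconstrained case: A = R, the minimizer is x0 = 3
print(minimize_scalar(f).x)

# Constrained case: A = [0, 2] specified by inequality constraints;
# the minimizer moves to the boundary x0 = 2
print(minimize_scalar(f, bounds=(0.0, 2.0), method='bounded').x)
```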

 

Why Convex Optimization? ((i) a minimum exists, and (ii) global minimum = local minimum)

The biggest challenge in a general optimization problem is the existence of local minima/maxima. Common algorithms (Newton's method, gradient methods, etc.) may converge to a local minimum/maximum rather than the global minimum, and sometimes may not converge at all. More sophisticated methods (e.g. annealing) also cannot guarantee convergence to the global minimum.
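A small sketch of this failure mode (the double-well function below is an assumed toy example, not from the text):

```python
import numpy as np
from scipy.optimize import minimize

# A non-convex "double well" with two local minima; the left one is the global minimum
f = lambda x: (x[0]**2 - 1.0)**2 + 0.3*x[0]

# A gradient-based method converges to whichever local minimum is near its start point
for x0 in (-2.0, 2.0):
    res = minimize(f, x0=np.array([x0]), method='BFGS')
    print(f"start {x0:+.1f} -> x* = {res.x[0]:+.3f}, f = {res.fun:+.3f}")
```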

If the cost function f and the search space A are both convex, it is guaranteed that local minimum = global minimum. The converse does not hold.

For most natural, non-complex physical problems, local minimum = global minimum (least time, least action, least energy, etc.), even though they are not necessarily convex optimization problems. Still, most of them can be formulated as convex optimization problems.

Another handy feature of convex optimization is that it admits the Lagrangian dual problem: both the primal and the dual are convex optimization problems. One can solve two problems at once and use them to self-validate the correctness of the algorithm.
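A tiny worked example (my own illustration, not from the text). Primal: minimize x^2 subject to 1 - x \le 0. Lagrangian: L(x, \lambda) = x^2 + \lambda (1 - x) with \lambda \ge 0. Dual function: g(\lambda) = \min_x L(x, \lambda) = \lambda - \lambda^2/4. The dual problem, maximize g(\lambda) over \lambda \ge 0, is again convex and gives \lambda^\ast = 2 with g(\lambda^\ast) = 1, which matches the primal optimum f(x^\ast) = 1 at x^\ast = 1. The agreement of the two optimal values is exactly the kind of self-validation mentioned above.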

Another important reason why convexity is needed concerns the conditions for a minimum (first-order variation = 0; second-order variation, Hessian > 0). This rules out stationary points that are saddle points, and assumes the minimum is not on the boundary.
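A minimal symbolic sketch of these two conditions (the quadratic f below is an assumed example):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + x*y + y**2          # an assumed convex quadratic

# First-order condition: gradient = 0
grad = [sp.diff(f, v) for v in (x, y)]
stationary = sp.solve(grad, (x, y))
print(stationary)              # {x: 0, y: 0}

# Second-order condition: Hessian positive definite (rules out a saddle point)
H = sp.hessian(f, (x, y))
print(H.is_positive_definite)  # True
```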

 

See the Wikipedia definition of convex optimization:

An optimization problem (also referred to as a mathematical programming problem or minimization problem) of finding some x^\ast \in \mathcal{X} such that

f(x^\ast) = \min \{f(x):x \in \mathcal{X}\},

where \mathcal{X} \subset \mathbb{R}^n is the feasible set and f(x):\mathbb{R}^n \rightarrow \mathbb{R} is the objective, is called convex if \mathcal{X} is a closed convex set and f(x) is convex on \mathbb{R}^n.

Alternatively, an optimization problem of the form

\begin{align}
&\operatorname{minimize}& & f(x) \\
&\operatorname{subject\;to}
& &g_i(x) \leq 0, \quad i = 1,\dots,m
\end{align}

is called convex if the functions f, g_1 \ldots g_m : \mathbb{R}^n \rightarrow \mathbb{R} are convex.
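A minimal sketch of this standard form using the cvxpy modeling package (the particular f and g_1 below are made-up examples); the dual variable reported for the constraint is the Lagrange multiplier discussed later:

```python
import cvxpy as cp

# minimize f(x) subject to g_1(x) <= 0, with convex f and g_1 (assumed example)
x = cp.Variable(2)
f = cp.square(x[0] - 1) + cp.square(x[1] - 2)
g1 = x[0] + x[1] - 1

prob = cp.Problem(cp.Minimize(f), [g1 <= 0])
prob.solve()

print(x.value)                         # optimal point, on the boundary of the feasible set
print(prob.constraints[0].dual_value)  # the associated Lagrange multiplier
```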

 

Calculus of Variations – Euler-Lagrange Equation

The calculus of variations was born from solving the brachistochrone (curve of fastest descent) problem posed by Bernoulli in 1696. Several famous mathematical physicists of that era (including Newton and Leibniz) offered their own solutions; the answer is a cycloid, shown in the figure below.

An intuitive explanation: suppose A and B are at the same height. Along a straight line the bead never arrives; but if it descends too far and then climbs back up, the travel time becomes too long, so that is not optimal either. The optimal solution lies in between.

[Figure: brachistochrone from A to B; the optimal curve is a cycloid]

Consider the functional

 J[y] = \int_{x_1}^{x_2}  L[x,y(x),y'(x)]\, dx  \, .

where

x1, x2 are constants,
y(x) is twice continuously differentiable,
y′(x) = dy/dx,
L[x, y(x), y′(x)] is twice continuously differentiable with respect to its arguments x, y, y′.

In some specific applications L is also called the Lagrangian.

Euler–Lagrange equation:

 \frac{\partial L}{\partial y} - \frac{d}{dx} \frac{\partial L}{\partial y'} = 0
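A quick symbolic check of this equation (a sympy sketch; the arc-length Lagrangian L = \sqrt{1 + y'^2} is my own choice of example): for that L, the functional J[y] is the length of the curve, and the Euler-Lagrange equation should reduce to y'' = 0, i.e. the shortest path is a straight line.

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')(x)
yp = y.diff(x)

L = sp.sqrt(1 + yp**2)                     # arc-length Lagrangian

# Euler-Lagrange: dL/dy - d/dx (dL/dy') = 0
EL = sp.diff(L, y) - sp.diff(sp.diff(L, yp), x)
print(sp.simplify(EL))                     # -y''(x)/(y'(x)**2 + 1)**(3/2), hence y''(x) = 0
```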

 

Lagrangian Mechanics – Least Action Principle

The starting point is the action, denoted  \mathcal{S} , of a physical system. It is defined as the integral of the Lagrangian L between two instants of time t1 and t2 

 \mathcal{S}[\mathbf{q}(t)] = \int_{t_1}^{t_2} L(\mathbf{q}(t),\mathbf{\dot{q}}(t), t) dt
where L = T – V  (kinetic energy – potential energy)

Lagrangian mechanics looks for the least-action (or stationary-action) solution, which is obtained simply by plugging into the Euler-Lagrange equation above. Essentially, Lagrangian mechanics is an application of the calculus of variations. The only remaining question is whether L = T - V has a more intuitive interpretation (see Feynman's discussion of least action).
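As a sanity check that L = T - V reproduces Newton's second law, a small sympy sketch for a particle of mass m falling in a uniform gravitational field (my own illustrative choice of system):

```python
import sympy as sp

t, m, g = sp.symbols('t m g', positive=True)
q = sp.Function('q')(t)           # height of the particle
qdot = q.diff(t)

L = m*qdot**2/2 - m*g*q           # L = T - V

# Euler-Lagrange in time: dL/dq - d/dt (dL/dq') = 0
EL = sp.diff(L, q) - sp.diff(sp.diff(L, qdot), t)
print(sp.Eq(EL, 0))               # -g*m - m*q''(t) = 0, i.e. q'' = -g (Newton's second law)
```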

 

Optimal Control

For the relationship between optimal control and the calculus of variations, see the reference article above.

First, the standard form from the calculus of variations:

[Figure: equations (1)-(2), the standard calculus-of-variations problem]

Optimal control makes one modification to (2), as follows:

[Figure: equation (3), the optimal control formulation with control input u(t) and dynamics f(q, u, t)]

Mathematically, (3) is just a generalization of (2) or (1). The conceptual breakthrough, however, is the introduction of a new control input u(t) and a dynamic system f.

(1) is an autonomous system: basically we search the space of q(t) for the optimal solution (as in the brachistochrone problem) under the boundary condition constraints (i.e. q(a) and q(b)).

(3) introduces a new control input u(t) and a dynamic system f(q, u, t). The optimal solution is obtained through the control u(t) (sometimes f(.) itself can even be designed or changed). Of course, (3) is still fundamentally an optimization problem; only the search space becomes u(t).

In addition, the Lagrangian in the cost function contains both q and u. This is regulated optimization/control: one can simultaneously optimize q(t) and u(t) for minimum control/energy/power, etc.
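A minimal discretized sketch of such a regulated problem (the double-integrator dynamics, horizon, and cost weights below are all assumed for illustration; cvxpy is used because the discretized problem is a convex QP):

```python
import numpy as np
import cvxpy as cp

T, dt = 50, 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])     # state q = (position, velocity)
B = np.array([[0.0], [dt]])               # control u = acceleration

q = cp.Variable((2, T + 1))
u = cp.Variable((1, T))

# The "Lagrangian" in the cost contains both q and u: regulate the state and the control effort
cost = cp.sum_squares(q[0, :]) + cp.sum_squares(u)

constraints = [q[:, 0] == np.array([1.0, 0.0])]                               # initial condition
constraints += [q[:, t + 1] == A @ q[:, t] + B @ u[:, t] for t in range(T)]  # dynamics f(q, u)

prob = cp.Problem(cp.Minimize(cost), constraints)
prob.solve()
print(prob.value)   # optimal regulated cost; the search is over the control sequence u(t)
```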

 

So far, we have tied calculus of variations, Lagrangian mechanics, and optimal control together: (i) all are optimization problems; (ii) in each case the objective function is the integral of a Lagrangian. Sounds straightforward and simple.

The next step is to deal with constraints. The vast majority of optimization problems are subject to some constraints.

The simplest constraints are the boundary conditions attached to the Lagrangian in the cost function (recall (t1, p1) and (t2, p2) in the least action principle, or p1 and p2 in the brachistochrone problem). When integrating the Lagrangian, these boundary conditions can mostly be substituted directly into the Euler-Lagrange equation to solve.

There are also constraints in integral form (e.g. a hanging necklace/chain, with a constraint on total length).

More complex constraints come in equation form, e.g. g(x) = 0, or in differential form as in (3) above, or even in inequality form, e.g. g(x) < 0. How do we handle these?

 

Lagrange Multiplier

This is where another of Lagrange's contributions comes in: the Lagrange multiplier, used mainly to handle constraints in optimization problems. At first glance, the Lagrange multiplier and the Lagrangian in mechanics or the calculus of variations look like two unrelated topics: the Lagrangian sits inside an integral and is governed by the Euler-Lagrange equation, while the Lagrange multiplier is used to impose constraints.

Remarkably, the two are closely related:

1. Lagrange multiplier interpretation 1 (math): at the optimum the level sets of f and g are tangent (their normal directions coincide), and the multiplier is simply the scaling constant between the two gradients (a small symbolic sketch follows this list).

2. Lagrange multiplier interpretation 2 (physics): a generalized constraint force that balances the Lagrangian.

3. Therefore, in constrained convex optimization, the augmented objective f(x) + \sum_i \lambda_i g_i(x) is simply called the Lagrangian.

4. The Legendre transformation leads to the Hamiltonian: is the Hamiltonian essentially a Lagrange-multiplier construction?
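A small symbolic sketch of interpretation 1, the tangency (parallel-gradient) condition; the f and g below are my own example: maximize f = x + y on the unit circle g = x^2 + y^2 - 1 = 0 by solving \nabla f = \lambda \nabla g together with g = 0.

```python
import sympy as sp

x, y, lam = sp.symbols('x y lambda', real=True)
f = x + y                      # assumed objective
g = x**2 + y**2 - 1            # assumed constraint g = 0

# Stationarity of f - lambda*g: grad f = lambda * grad g, plus the constraint itself
eqs = [sp.diff(f, v) - lam*sp.diff(g, v) for v in (x, y)] + [g]
print(sp.solve(eqs, (x, y, lam)))
# two tangency points (+-1/sqrt(2), +-1/sqrt(2)), with lambda = +-1/sqrt(2) as the scaling constant
```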

The Legendre transformation is a mapping of tangents …

For a strictly convex function, the Legendre transformation can be interpreted as a mapping between the graph of the function and the family of tangents of the graph. (For a function of one variable, the tangents are well-defined at all but at most countably many points, since a convex function is differentiable at all but at most countably many points.)
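A standard one-line example (a fact from mechanics, included here for illustration): for the free-particle Lagrangian L(v) = \frac{1}{2} m v^2, the Legendre transform is H(p) = \sup_v \left( p v - L(v) \right) = \frac{p^2}{2m}, with the supremum attained where p = \partial L / \partial v = m v. The slope (tangent) p is the conjugate momentum, and H is the Hamiltonian.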


Convex Optimization


 

 

Common questions

* Are the Lagrangian (physics) and the Lagrange multiplier (optimization) two unrelated concepts, or related ones?

See http://www.slimy.com/~steuard/teaching/tutorials/Lagrange.html
