Optimal Time Control – Bang Bang Controller

by allenlu2007

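In control theory, a bang-bang controller is also called an on-off controller. It is a feedback controller that switches between two states. It is commonly used for constant-temperature control in air conditioners, heaters, water heaters, and similar appliances.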

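The main advantage of a bang-bang controller is simplicity. The downside, from everyday experience, is that the plant (air conditioner, etc.) often starts acting only after the temperature is already too hot or too cold. It is also relatively sensitive to noise and prone to false switching. Most implementations therefore add hysteresis to the controller, which is why a bang-bang controller is also called a hysteresis controller.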
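To make the hysteresis idea concrete, here is a minimal sketch of an on-off heating thermostat with a dead band. The function names, the setpoint, the band width, and the toy room model (fixed heating and cooling rates per step) are all my own illustrative assumptions, not taken from any real controller.

```python
def hysteresis_step(temp, heater_on, setpoint=20.0, band=1.0):
    """Return the new heater state for an on-off thermostat with hysteresis."""
    if temp < setpoint - band / 2:
        return True            # too cold: switch the heater on
    if temp > setpoint + band / 2:
        return False           # too warm: switch the heater off
    return heater_on           # inside the dead band: keep the previous state


def simulate(steps=200, temp=15.0, heat_rate=0.5, cool_rate=0.3):
    """Toy room model (assumed): temperature moves by a fixed amount per step."""
    heater_on = False
    history = []
    for _ in range(steps):
        heater_on = hysteresis_step(temp, heater_on)
        temp += heat_rate if heater_on else -cool_rate
        history.append((temp, heater_on))
    return history


history = simulate()
```

Inside the dead band the controller keeps its previous state, so small noise around the setpoint does not toggle the plant on every sample. The price is that the temperature oscillates within roughly the band instead of settling.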

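It is hard to associate a bang-bang controller with an optimal controller. Yet under certain conditions a bang-bang controller is the optimal controller. This looks odd at first, but an optimal controller usually optimizes (minimizes) one specific quantity (time, fuel) in a noiseless setting. For a thermostat, MMSE (minimum mean-square error) is probably a better criterion. In a noisy (Gaussian noise) environment, a bang-bang controller is generally not the best controller. Only in the LQR case does the optimal controller happen to coincide with the MMSE controller. In general, a bang-bang controller and an MMSE controller differ considerably.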

 

Example 1:  PRODUCTION AND CONSUMPTION (going all in)

NewImage

NewImage

NewImageNewImage

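  * Key features:  (i) constrained control, a belongs to [0, 1]; (ii) x(t) is a nonlinear function of t (exponential in this case); (iii) go all in: grow the stock first, then consume it. Equivalently, invest everything in the single option with the best return and harvest once.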
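The equations of this example were in the images, so the sketch below rests on an assumption: I take the usual production and consumption model x' = k a x with payoff equal to the integral of (1 - a) x over [0, T], and a policy that reinvests everything (a = 1) until some switching time and then consumes everything (a = 0). The parameter values k, T, x0 and the function names are made up for illustration.

```python
import math


def payoff(switch_time, k=1.0, T=3.0, x0=1.0):
    """Total consumption for: reinvest all until switch_time, then consume all."""
    x_at_switch = x0 * math.exp(k * switch_time)   # x' = k x while a = 1
    return x_at_switch * (T - switch_time)         # x stays constant while a = 0


def best_switch_time(k=1.0, T=3.0, n=301):
    """Brute-force search over a uniform grid of candidate switching times."""
    grid = [T * i / (n - 1) for i in range(n)]
    return max(grid, key=lambda s: payoff(s, k=k, T=T))


best = best_switch_time()
```

Under these assumptions, setting the derivative of the payoff to zero gives a switching time of T - 1/k, which the grid search should reproduce to within one grid step.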

 

Example 2:  Bang Bang Principle for Linear Systems

See article 1 and article 2 for reference.

It turns out that the bang-bang controller has a deeper meaning.

1. linear dynamic system

2. u is in a convex set, or it is bounded by a polygon

3. we can always find a path at finite time to reach 

4. bang-bang control is optimal for the minimum-time problem

 

Problem Statement

Consider general linear time-invariant dynamics

$\displaystyle \dot x=Ax+Bu$    (4.51)

where $ x\in\mathbb{R}^n$ and $ u\in U\subset \mathbb{R}^m$

$\displaystyle U=\{u\in\mathbb{R}^m: u_i\in[-1,1],\, i=1,\dots,m\}.$   (4.52)

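For simplicity, (4.52) assumes that all m components have the same magnitude bound. This can be generalized to the case of different bounds.

Assume there exists a control u taking values in the set (4.52) that steers x from the initial state x0 to a given final state x1 in minimal time.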

 

Another notation

NewImage

NewImage

M = A (nxn) and N = B (nxm)

 

Controllability = Reachability + Stability

Before solving, we first need to make sure a solution exists. Setting time aside for now, can x0 be steered to x1 at all? This is called reachability.

First define the controllability matrix G:

$\displaystyle G=[N,\ MN,\ M^2N,\ \dots,\ M^{n-1}N]$

 

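rank G = n confirms reachability.  Is reachable enough to be controllable?  Intuitively it seems so. Wrong!   With bounded control, a system whose state blows up exponentially can sometimes still pass the reachability test, yet it will eventually fail.  The other crucial ingredient in control theory is stability!

The first step toward controllability is that G passes the reachability test (full rank).   The second step is that the matrix M must be stable.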

Controllable =  Reachable + Stable

Theorem 2.3 shows that rank(G) = n guarantees reachability.   Stability means that with α=0 (no external input), a small perturbation either does not make the system explode (Lyapunov stability) or decays exponentially (exponential stability).  For a linear dynamic system we use exponential stability.

Exponential stability =>  every eigenvalue of M must have a real part less than 0.

Putting this together:  controllable = reachable ( rank G = n ) + stable ( Re eig(M) < 0)

NewImage

Example 1 (double integrator model)

NewImage

NewImage

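Although G has full rank, the matrix M is not stable (both of its eigenvalues are 0, not negative).  So the system does not satisfy the controllability criterion above.

With x = (q, v):  q_dot = v,  v_dot = a.   So by the criterion above, the bounded acceleration model does not qualify as controllable.  Does that mean we cannot bring every initial condition (any q and v) to rest at the origin? No.    In fact, the bounded acceleration model can reach the origin and stop there in finite time from any initial condition.  Moreover, the time-optimal control turns out to be a bang-bang controller. See the example in the next section.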
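The two checks for the double integrator can be reproduced numerically. This is a small stdlib-only sketch, assuming M = [[0, 1], [0, 0]] and N = (0, 1)' as implied by q_dot = v, v_dot = a. The helper names `matmul` and `eig2` are my own, and `eig2` only handles 2x2 matrices.

```python
import cmath


def matmul(A, B):
    """Plain-Python matrix product for small lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def eig2(M):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2


M = [[0.0, 1.0], [0.0, 0.0]]     # double integrator: q_dot = v, v_dot = a
N = [[0.0], [1.0]]

MN = matmul(M, N)
G = [[N[0][0], MN[0][0]],        # G = [N, MN] for n = 2
     [N[1][0], MN[1][0]]]
det_G = G[0][0] * G[1][1] - G[0][1] * G[1][0]

full_rank = abs(det_G) > 1e-12                        # rank G = 2 ?
stable = all(lam.real < 0 for lam in eig2(M))         # Re eig(M) < 0 ?
```

Here `full_rank` comes out true while `stable` comes out false, which is exactly the situation described above: the rank test passes but the eigenvalue test does not.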

So what is controllability good for?

From Wiki, complete state controllability (or simply controllability if no other context is given) describes the ability of an external input to move the internal state of a system from any initial state to any other final state in a finite time interval.  

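So a system that satisfies the controllability criterion can always be steered from any initial condition to any final state in finite time.  But a system that fails the criterion may still be steerable from any initial condition to any final state in finite time.  In other words, the criterion is sufficient, not necessary.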


State Feedback Rescues Controllability (Assume G is full rank)

Can state feedback turn an uncontrollable system into a controllable one? Yes.   But the precondition is that G must have full rank.

NewImage


Back to the double integrator model: state feedback can make it controllable.

 

NewImage
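As a rough illustration of how state feedback stabilizes the double integrator, the sketch below applies u = -K x and looks at the eigenvalues of the closed-loop matrix M - N K. The gain K = (2, 3) is an arbitrary choice of mine (it places the eigenvalues at -1 and -2), and the helpers are hypothetical and limited to the 2x2 single-input case. It ignores the bound on u.

```python
import cmath


def eig2(M):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2


def closed_loop(M, N, K):
    """Return M - N K, where N is a column (list) and K a row (list)."""
    return [[M[i][j] - N[i] * K[j] for j in range(2)] for i in range(2)]


M = [[0.0, 1.0], [0.0, 0.0]]     # double integrator
N = [0.0, 1.0]                   # input enters through the acceleration
K = [2.0, 3.0]                   # assumed feedback gain, u = -K x

M_cl = closed_loop(M, N, K)      # [[0, 1], [-2, -3]]
eigs = eig2(M_cl)
stable = all(lam.real < 0 for lam in eigs)
```

The closed-loop characteristic polynomial is s^2 + 3 s + 2, so both eigenvalues have negative real part even though the open-loop M has a double eigenvalue at 0.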

 

Another double integrator example, this one not controllable:  rank G = 1 < 2.  Even state feedback cannot make M stable.

 

NewImage

Besides state feedback, another approach is output feedback.  See the reference.

Example 2 (Inverted Pendulum Model) 

 NewImage

 

 

Linear Optimal Time Control

Once reachability is settled, the next step is the optimal control problem. This includes optimal time control, optimal fuel control, or optimal …

There are two ways to solve it: (i) use the control Hamiltonian (PMP);  (ii) solve directly

 

(i) Hamiltonian  H(x, u, p, po) = <p, Ax+Bu> + po

Hamiltonian maximization condition implies

<p*(t), Bu*(t)> = max_{u in U} <p*(t), Bu>   for all t in [to, t*]

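which can be rewritten as

$\displaystyle \sum_{i=1}^m \langle p^*(t), b_i\rangle u_i^*(t)=\max_{u\in U}\sum_{i=1}^m \langle p^*(t), b_i\rangle u_i$

where b1, .., bm are the columns of B.  Each component ui(t) of the optimal control can be chosen independently, so the solution is immediate:

$\displaystyle u_i^*(t)=\mathrm{sgn}\,\langle p^*(t), b_i\rangle$  whenever $ \langle p^*(t), b_i\rangle\neq 0$,  $ i=1,\dots,m$.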

The discussion above may be too abstract.  Take the double integrator model as an example:

NewImage

NewImage

NewImage

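The third case, p2*=0, seems to imply that u is arbitrary.  However, if p2*=0 over an interval of time, then p1*=0 as well by 4.49.  That case would violate … theorem.  So p2*=0 can occur only at isolated points, which are exactly the instants where the bang-bang control jumps.

Therefore the time-optimal control takes only two values, maximum acceleration (u=+1) and maximum braking (u=-1), and it switches at most once.  The actual sign sequence (accelerate or brake first) and the switching time depend on the initial condition.  A phase-plane diagram makes this clearer.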
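Since the equations were in images, here is a short sketch of why there is at most one switch. It assumes the double integrator is written as $ \dot x_1=x_2,\ \dot x_2=u$ with $ |u|\le 1$; the constants $ c_1, c_2$ are my own labels. The adjoint equation $ \dot p^*=-A^Tp^*$ gives

$\displaystyle \dot p_1^*=0,\qquad \dot p_2^*=-p_1^*$

$\displaystyle p_1^*(t)=c_1,\qquad p_2^*(t)=c_2-c_1t$

and, because $ B=(0,1)^T$, the maximization condition reduces to

$\displaystyle u^*(t)=\mathrm{sgn}\,p_2^*(t)$

Since $ p_2^*$ is affine in t and not identically zero, it changes sign at most once, so $ u^*$ switches between +1 and -1 at most once.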

NewImage

NewImage 

The time-optimal control law has two features: (i) it is a bang-bang controller; (ii) it is a state feedback law. That is, u* is originally obtained only as an open-loop control that depends on x* (state) and p* (costate).  But in the graphical solution we can eliminate the dependence on p*, so that u* depends only on the state x* (x* = (x, x_dot)').

 

Open loop:  u depends on both x and p and even t

Closed loop: u depends only on x (?) so that we can make it state feedback control !!
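A small simulation shows the closed-loop form in action. It is only a sketch under my own assumptions: the plant is x'' = u with |u| <= 1, the feedback law uses the standard switching curve x + v|v|/2 = 0, and the rollout is a crude forward-Euler integration with a step size and stopping tolerance that I picked arbitrarily.

```python
def time_optimal_u(x, v):
    """Bang-bang state feedback for x'' = u, |u| <= 1, target (0, 0)."""
    s = x + 0.5 * v * abs(v)     # s = 0 is the switching curve
    if s > 0:
        return -1.0              # right of the curve: brake / push left
    if s < 0:
        return 1.0               # left of the curve: accelerate / push right
    if v > 0:
        return -1.0              # on the curve: ride it into the origin
    if v < 0:
        return 1.0
    return 0.0                   # already at the origin


def simulate(x, v, dt=1e-3, tol=1e-2, t_max=20.0):
    """Forward-Euler rollout; returns (arrival_time, number_of_switches)."""
    t, switches, prev_u = 0.0, 0, None
    while (x * x + v * v) ** 0.5 > tol and t < t_max:
        u = time_optimal_u(x, v)
        if prev_u is not None and u != prev_u:
            switches += 1
        prev_u = u
        x, v = x + v * dt, v + u * dt    # both updates use the old state
        t += dt
    return t, switches


arrival_time, switches = simulate(1.0, 0.0)
```

Starting at rest from x = 1, the analytic minimum time is 2 * sqrt(1) = 2, with one switch from u = -1 to u = +1. The simulated arrival time should land close to that, a little early because the loop stops once the state is within the tolerance of the origin. Note that the controller never looks at the costate or the clock, only at (x, v).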

 NewImage

For general time-optimal control, is the solution always (i) a bang-bang controller and (ii) a state feedback law (u depends only on x)?

LQR is not a bang-bang controller, but it does follow a state feedback law.

  

(ii) The other approach is to solve directly

NewImage

The assumption of normality, which was needed to prove the bang-bang property of time-optimal controls for $ U$ a hypercube, is quite strong. A different, weaker version of the bang-bang principle could be formulated as follows. Rather than wishing for every time-optimal control to be bang-bang, we could ask whether every state $ x_1$ reachable from $ x_0$ by some control is also reachable from $ x_0$ in the same time by a bang-bang control; in other words, whether reachable sets for bang-bang controls coincide with reachable sets for all controls. This would imply that, even though not all time-optimal controls are necessarily bang-bang, we can always select one that is bang-bang. It turns out that this modified bang-bang principle holds for every linear control system (no controllability assumption is necessary) and every control set $ U$ that is a convex polyhedron.



The above is the bang-bang principle.

In short:

1. linear dynamic system

2. bounded control (inside polyhedron)

3. optimal time (least time)

4. Strong assertion:  every time-optimal control is bang-bang (it takes values at the vertices of the control polyhedron, with discontinuous jumps between them, as in a bang-bang controller) ==>  needs the normality assumption, a controllability-type condition

5. Weak assertion: there exists a time-optimal control that is bang-bang.  ==> no controllability assumption is needed

 

4. and 5. need to be further clarified. 

 

 

 

 

 

 

 

 

 

 
