PLA | 牛的大腦

1.假設
X={x1(維度1的值), x2(維度1的值)}
W={w1(維度1權重),w2(維度2權重)}
而且以下條件可以判斷allow和deny
if x1*y1+x2*y2 > threshold:
　allow
else:
　deny

2.上述可定義以下
(W^T)*(X) > threshold, allow
(W^T)*(X) < threshold, deny
=以下
h(X)=sign( (W^T)*(X) – threshold )
sign表示取得正或負的結果
若h(X)為正,表示有過threshold,allow
若h(X)為負,表示沒過threshold,deny
ps:
h也稱為perceptron
ps:
inner product=x1*y1+x2*y2+….xn*yn,這裡以符號(X^T)*(Y)表示
　

3
將原本的X多加一個維度，此值永遠為+1
將原本的W多加一個維度，此值為負的threshold
所以X和W會變成以下
X={x0(+1),x1(維度1的值), x2(維度1的值)}
W={w0(-threshold),w1(維度1權重),w2(維度2權重)}
所以公式(從step2得到)
h(X)=sign( (W^T)*(X) – threshold )
會變成以下
h(X)=sign( (W^T)*(X) + (-threshold)*(+1) )

4
由於w0和x0都在W,X內，可一起用inner product表示
w0=(-threshold)
x0=(+1)
所以公式(從step3得到)
h(X)=sign( (W^T)*(X) + (-threshold)*(+1) )
可化減為以下
h(X)=sign( (W^T)*(X) )
若h(X)為+1,有過threshold
若h(x)為-1 ,沒過threshold
=以下
h(x)=sign(w0x0+w1x1+w2x2)
ps:
截距公式: (-1)+X*(1/A)+Y*(1/B)=0
ps:
x 對應到一個點
y 對應到結果 1or0
h(hypothesis) 對應到線

………………………………………………………………..．

perceptrons learning algorithm

H=all possible perceptrons
目標: 要從H中找出一條g, 而g可以用一條線做為1和0界線

Algorithm
1
先讓 g0為開始的第一條線
並讓g0=W
2. Cyclic PLA,
for data in X: 把所有點放進去檢查
if ismistake(W,data): 若這個點不對
w=correct(data) 更新
return w
完成後得到W of Perceptron Learning Algorithm

ismistake() 檢查這個點是否有錯
sign[ ((W^T)*Xn) ] != Yn

correct() 根據錯誤調整斜率
要正的,但結果是負的, 表示角度太大, 把他轉小接近X, W+(+1)*X
要負的,但結果是正的, 角度太小, 把他轉開遠離X, W+(-1)*X

更新原則
假設以下兩種點x 分錯的情況
1.它是類別1，但卻被分在類別2,這代表 wt 向量與 x 之間的夾角太大，
所以要讓它們之間的夾角變小，可用向量相加的方式來做到, 可透過W+(+1)*X得到新的W
2.它是類別 2，但卻被分在類別1,這代表 wt 向量與 x 向量之間的夾角太小，
所以要讓他們之間的夾角變大，可用 wt 向量減掉 x 向量來做到, 可透過W+(-1)*X得到新的W

refer
http://wizmann.tk/ml-foundations-pla.html
http://shaoxiongjiang.com/2013/03/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E5%85%A5%E9%97%A8-%E6%84%9F%E7%9F%A5%E5%99%A8-perceptron/
http://cpmarkchang.logdown.com/posts/189108-machine-learning-perceptron-algorithm

…….

linear and non-linear separable
linear separable表示不同的資料結果可以用一條線分開
所以當找到這條線,PLA就會停下來
non-linear separable表示不同的資料結果無法用一條線分開
在此情況下, 無法找到一條完美的線(無犯錯的線)
若要解決此辦法,就要退而求其次
找一條犯的錯最小的線
ps:
Off-Training-Set error is used for estimating how your hypothesis perform on the dataset which is not used in training

modify PLA algorithm/Pocket Algorithm
每跑一次,就算一下線的分數,
不斷的跑,若下一次線的分數較好,則使用此線
跑N次後,看最後是那一條線
ps:
modify PLA algorithm 會比 PLA 慢, 因為會做較多事所以用較多資源而變慢

…………….

example for python

#vi file.csv
-1,8,9
-1,7,8
1,1,2
1,3,4

#vi pla.py

rawdata=loadtxt('file.csv', delimiter=',')
feature=rawdata[:,1:]
tag=rawdata[:,0]
datas=list()
for f1,t1 in zip(feature,tag):
　datas.append([f1,t1])

def pla(datas):
　w = datas[0][0]
　iteration = 0
　while True:
　　iteration += 1
　　false_data = 0
　　for data in datas:
　　　t = dot(w, data[0])
　　　if sign(data[1]) != sign(t):
　　　　error = data[1]
　　　　false_data += 1
　　　　w = w+error * data[0]

　　print 'iter%d (%d / %d)' % (iteration, false_data, len(datas))
　　if not false_data:
　　　break
　return w
pla(datas)

ps:
以上example若遇到non-linear separable不會自動停下來

分享此文：