course notes

intro

find the best separating model among all the feasible ones for linearly separable data

the distance between the hyperplane and the training samples of either class that lie closest to it is maximized

linearly separable, hard margin

1. correct classification, and the constraint is scale-free:

for convenience, take c = 1
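
A sketch of the constraint behind these two notes, assuming the usual notation (training samples (x_i, y_i) with y_i ∈ {−1, +1} and hyperplane w^T x + b = 0; the symbols are my choice, not from the lecture):

```latex
% correct classification of every sample, up to an arbitrary scale c > 0
\[ y_i\,(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge c, \qquad i = 1, \dots, m \]
% rescaling (w, b) by any positive factor leaves the hyperplane unchanged,
% so the scale is free; fixing c = 1 gives the canonical form
\[ y_i\,(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge 1, \qquad i = 1, \dots, m \]
```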

2. optimality: the distance (margin) is maximized:

is equivalent to

…solve…
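
Assuming the missing formulas are the standard hard-margin ones, a sketch of the objective and its equivalent form:

```latex
% maximize the margin subject to correct classification
\[ \max_{\mathbf{w},\,b}\ \frac{2}{\lVert\mathbf{w}\rVert}
   \quad \text{s.t.} \quad y_i\,(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge 1 \]
% equivalently, a convex quadratic program
\[ \min_{\mathbf{w},\,b}\ \frac{1}{2}\lVert\mathbf{w}\rVert^{2}
   \quad \text{s.t.} \quad y_i\,(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge 1,\ \ i = 1, \dots, m \]
```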

not linearly separable, soft margin

the data is not linearly separable, or the margin would be too small

experiment: for each C in 2^-5, 2^-4, …, 2^15, record the error on each of the 10 cross-validation folds (columns 1 … 10) and their sum (last column)

find the C that yields the smallest summed error
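
A minimal sketch of this experiment in Python, assuming scikit-learn is available; the synthetic data set and the linear kernel are placeholders for whatever the course actually used:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# placeholder data; in the experiment this would be the provided training set
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

best_C, best_err = None, np.inf
for exponent in range(-5, 16):               # C = 2^-5, 2^-4, ..., 2^15
    C = 2.0 ** exponent
    # 10-fold cross-validation accuracy for this C
    acc = cross_val_score(SVC(kernel="linear", C=C), X, y, cv=10)
    err = np.sum(1.0 - acc)                  # summed error over the 10 folds
    if err < best_err:
        best_C, best_err = C, err

print(f"best C = {best_C}, summed CV error = {best_err:.3f}")
```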

non-linear SVM

map every sample into a high-dimensional feature space

the mapping function itself is hard to construct explicitly, but the dot product of two mapped samples is easier to compute
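
A standard concrete example of this (not from the lecture): for 2-D inputs, the squared dot product is itself a dot product in a 3-D feature space, so the kernel value can be computed without ever building φ:

```latex
\[ K(\mathbf{x}, \mathbf{z}) = (\mathbf{x}^{\top}\mathbf{z})^{2}
   = x_1^2 z_1^2 + 2\,x_1 x_2\, z_1 z_2 + x_2^2 z_2^2
   = \phi(\mathbf{x})^{\top}\phi(\mathbf{z}),
   \qquad \phi(\mathbf{x}) = \bigl(x_1^2,\ \sqrt{2}\,x_1 x_2,\ x_2^2\bigr) \]
```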

the kernel

the kernel matrix should be (a quick numerical check is sketched after this list):

  • symmetric
  • positive semi-definite
  • i.e. it satisfies Mercer's condition
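
A small sketch of the check, assuming an RBF kernel and NumPy (function and variable names are mine): build the Gram matrix and verify symmetry and non-negative eigenvalues numerically.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma=1.0):
    """Gram matrix with K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * sq_dists)

X = np.random.default_rng(0).normal(size=(50, 5))
K = rbf_kernel_matrix(X)

print("symmetric:", np.allclose(K, K.T))
# positive semi-definite: all eigenvalues >= 0 (up to numerical tolerance)
print("PSD:", np.all(np.linalg.eigvalsh(K) >= -1e-8))
```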

multi-kernel learning
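
For reference, multiple kernel learning is usually written as a non-negative combination of base kernels whose weights are learned together with the classifier (a general statement, not a detail from the lecture):

```latex
\[ K(\mathbf{x}, \mathbf{z}) = \sum_{m=1}^{M} \beta_m K_m(\mathbf{x}, \mathbf{z}),
   \qquad \beta_m \ge 0 \]
% a non-negative combination of valid kernels is again a valid kernel
```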

application

…blabla…

multi-class problem: Error-Correcting Output Codes (ECOC)
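
A minimal ECOC sketch for multi-class SVMs, assuming scikit-learn; the iris data set and the code_size value are arbitrary choices for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OutputCodeClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# each class is assigned a binary code word; one binary SVM is trained per code bit,
# and a test point goes to the class with the nearest code word
ecoc = OutputCodeClassifier(SVC(kernel="linear", C=1.0), code_size=2.0, random_state=0)
print("10-fold CV accuracy:", cross_val_score(ecoc, X, y, cv=10).mean())
```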

reading notes

modeling

maximize margin
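
The step behind "maximize margin", reconstructed in the standard way: distance from a sample to the hyperplane, and the margin under the canonical scaling:

```latex
\[ \operatorname{dist}\bigl(\mathbf{x}_i,\ \{\mathbf{x} : \mathbf{w}^{\top}\mathbf{x} + b = 0\}\bigr)
   = \frac{\lvert \mathbf{w}^{\top}\mathbf{x}_i + b \rvert}{\lVert\mathbf{w}\rVert} \]
% with the canonical scaling the closest samples satisfy y_i(w^T x_i + b) = 1,
% so the margin between the two classes is
\[ \gamma = \frac{2}{\lVert\mathbf{w}\rVert} \]
```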

Lagrange multiplier

why the Lagrange multiplier method works
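
A sketch of the usual argument: fold the constraints into the objective with multipliers α_i ≥ 0; the inner maximization blows up whenever a constraint is violated, so the min-max reproduces the constrained problem:

```latex
\[ L(\mathbf{w}, b, \boldsymbol{\alpha})
   = \frac{1}{2}\lVert\mathbf{w}\rVert^{2}
   - \sum_{i=1}^{m} \alpha_i \bigl[ y_i(\mathbf{w}^{\top}\mathbf{x}_i + b) - 1 \bigr],
   \qquad \alpha_i \ge 0 \]
% if some constraint is violated, the max over alpha >= 0 is +infinity;
% if all constraints hold, the max is the original objective, hence
\[ \min_{\mathbf{w}, b}\ \max_{\boldsymbol{\alpha} \ge 0}\ L(\mathbf{w}, b, \boldsymbol{\alpha})
   = \min_{\mathbf{w}, b}\ \frac{1}{2}\lVert\mathbf{w}\rVert^{2}
   \ \ \text{s.t.}\ \ y_i(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge 1 \]
```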

problem solving and duality

transforming to the dual problem
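
The standard dual, reconstructed: set the derivatives of L with respect to w and b to zero, then substitute back:

```latex
\[ \frac{\partial L}{\partial \mathbf{w}} = 0 \ \Rightarrow\ \mathbf{w} = \sum_{i=1}^{m} \alpha_i y_i \mathbf{x}_i,
   \qquad
   \frac{\partial L}{\partial b} = 0 \ \Rightarrow\ \sum_{i=1}^{m} \alpha_i y_i = 0 \]
% substituting back gives the dual problem
\[ \max_{\boldsymbol{\alpha}}\ \sum_{i=1}^{m} \alpha_i
   - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j\, \mathbf{x}_i^{\top}\mathbf{x}_j
   \quad \text{s.t.} \quad \alpha_i \ge 0,\ \ \sum_{i=1}^{m} \alpha_i y_i = 0 \]
```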

KKT conditions
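
The KKT conditions for the hard-margin problem, in their standard form:

```latex
\[ \alpha_i \ge 0, \qquad
   y_i(\mathbf{w}^{\top}\mathbf{x}_i + b) - 1 \ge 0, \qquad
   \alpha_i \bigl[ y_i(\mathbf{w}^{\top}\mathbf{x}_i + b) - 1 \bigr] = 0 \]
% complementary slackness: either alpha_i = 0 or the sample lies exactly on the margin;
% the samples with alpha_i > 0 are the support vectors
```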

soft margin

add an offset (slack variable ξ_i) to each margin constraint
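
A sketch of the soft-margin primal this note points at: each margin constraint is relaxed by a slack ξ_i, and the total slack is penalized with weight C:

```latex
\[ \min_{\mathbf{w},\, b,\, \boldsymbol{\xi}}\
   \frac{1}{2}\lVert\mathbf{w}\rVert^{2} + C \sum_{i=1}^{m} \xi_i
   \quad \text{s.t.} \quad
   y_i(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0 \]
% in the dual, the only change is the box constraint 0 <= alpha_i <= C
```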

kernel methods

the final (dual) form contains the samples only through dot products
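
Because samples enter the dual only through dot products, each x_i^T x_j can be replaced by a kernel evaluation; a sketch of the resulting dual and decision function (soft-margin version):

```latex
\[ \max_{\boldsymbol{\alpha}}\ \sum_{i} \alpha_i
   - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j\, K(\mathbf{x}_i, \mathbf{x}_j)
   \quad \text{s.t.} \quad 0 \le \alpha_i \le C,\ \ \sum_{i} \alpha_i y_i = 0 \]
% only the support vectors (alpha_i > 0) contribute to the decision function
\[ f(\mathbf{x}) = \operatorname{sign}\Bigl( \sum_{i} \alpha_i y_i\, K(\mathbf{x}_i, \mathbf{x}) + b \Bigr) \]
```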

Mercer's condition

what makes a function a valid kernel
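
The usual characterization: K is a valid kernel, i.e. a dot product in some feature space, exactly when every Gram matrix it produces on a finite sample is symmetric and positive semi-definite:

```latex
\[ K(\mathbf{x}, \mathbf{z}) = \phi(\mathbf{x})^{\top}\phi(\mathbf{z}) \ \text{for some map } \phi
   \ \Longleftrightarrow\
   \forall\, \{\mathbf{x}_1, \dots, \mathbf{x}_n\}:\
   \bigl[ K(\mathbf{x}_i, \mathbf{x}_j) \bigr]_{i,j} \ \text{is symmetric and positive semi-definite} \]
```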

SMO
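
A sketch of the core SMO step (Platt's analytic two-variable update, reconstructed from the standard description): pick a pair (α_i, α_j), optimize them jointly while keeping Σ_k α_k y_k = 0 fixed, and clip the result to the box [0, C]:

```latex
\[ E_k = f(\mathbf{x}_k) - y_k, \qquad
   \eta = K_{ii} + K_{jj} - 2K_{ij} \]
% unconstrained optimum along the feasible line, clipped to [L, H] \subseteq [0, C]
\[ \alpha_j^{\text{new}} = \operatorname{clip}\Bigl( \alpha_j + \frac{y_j (E_i - E_j)}{\eta},\ L,\ H \Bigr),
   \qquad
   \alpha_i^{\text{new}} = \alpha_i + y_i y_j \bigl( \alpha_j - \alpha_j^{\text{new}} \bigr) \]
% repeat over pairs chosen heuristically until the KKT conditions hold within a tolerance
```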