Feature Extraction - Definition
Given a set of features F = {x_1, ..., x_N}, the feature extraction ("construction") problem is to map F to a new feature set F' that maximizes the learner's ability to classify patterns.
Find a projection matrix W that maps N-dimensional vectors to M-dimensional vectors while keeping the error low.
Assume that the M < N new features z_i are linear combinations of the original N features:
z_i = w_i1 x_1 + ... + w_iN x_N
z = W^T x
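As a minimal sketch of this mapping (in Python with NumPy; the matrix W here is just a random placeholder, whereas PCA or LDA would choose it from the data):

```python
import numpy as np

N, M = 10, 3                    # original and reduced dimensionality, M < N
rng = np.random.default_rng(0)

x = rng.normal(size=N)          # one N-dimensional sample
W = rng.normal(size=(N, M))     # projection matrix (placeholder; PCA/LDA would learn it)

z = W.T @ x                     # z = W^T x: an M-dimensional feature vector
print(z.shape)                  # (3,)
```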
What do we expect from such a basis?
- Uncorrelated components, so the representation cannot be reduced further
- Components with large variance; otherwise they bear no information
Algebraic definition of PCs
PCA
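A rough sketch of the algebra behind PCA (eigendecomposition of the sample covariance matrix; the data here is synthetic and the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))   # 500 samples, 5 correlated features

Xc = X - X.mean(axis=0)                 # center the data
C = np.cov(Xc, rowvar=False)            # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)    # eigh: symmetric matrix, ascending eigenvalues

order = np.argsort(eigvals)[::-1]       # sort components by decreasing variance
W = eigvecs[:, order[:2]]               # keep the top M = 2 principal directions

Z = Xc @ W                              # projected data
print(np.cov(Z, rowvar=False))          # ~diagonal: projected components are uncorrelated
```

The off-diagonal entries of the projected covariance are numerically zero, which is exactly the "uncorrelated" property asked of the basis above.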
PCA for Image Compression
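One common way this is applied, sketched here on a synthetic grayscale "image" (each row treated as a sample; keeping k = 16 of 128 components is a value chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.normal(size=(128, 128)).cumsum(axis=0).cumsum(axis=1)  # smooth synthetic image

mean = img.mean(axis=0)
Xc = img - mean                                  # center the rows
C = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)
W = eigvecs[:, np.argsort(eigvals)[::-1][:16]]   # keep k = 16 principal directions

coeffs = Xc @ W                                  # compressed representation (128 x 16)
recon = coeffs @ W.T + mean                      # approximate reconstruction (128 x 128)
print(np.abs(img - recon).mean())                # reconstruction error from 16 of 128 components
```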
Is PCA a good criterion for classification?
- Data variation determines the projection direction
- What's missing?
- Class information (illustrated in the sketch below)
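A small hypothetical example makes the point: two classes that differ only along a low-variance axis. The first principal component, chosen purely by variance, points along the other axis and ignores the class structure entirely.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two classes: large variance along axis 0, but they differ only along axis 1.
class0 = rng.normal([0, 0], [5.0, 0.3], size=(200, 2))
class1 = rng.normal([0, 2], [5.0, 0.3], size=(200, 2))
X = np.vstack([class0, class1])

C = np.cov(X - X.mean(axis=0), rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)
pc1 = eigvecs[:, np.argmax(eigvals)]     # first principal component

print(pc1)   # ~[±1, 0]: points along the high-variance axis, ignoring the class labels
```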
What is a good projection?
- Similarly, what is a good criterion?
- Separating different classes
What class information may be useful?
Between-class distance
- Distance between the centroids of different classes
Within-class distance
- Accumulated distance of each instance to the centroid of its own class (see the sketch below)
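These two distances are usually written as scatter matrices; a sketch for a two-class problem on synthetic data (S_B from the centroid difference, S_W from each sample's deviation from its own centroid):

```python
import numpy as np

rng = np.random.default_rng(0)
X0 = rng.normal([0, 0], 1.0, size=(200, 2))   # class 1 samples
X1 = rng.normal([3, 1], 1.0, size=(200, 2))   # class 2 samples

m0, m1 = X0.mean(axis=0), X1.mean(axis=0)

# Between-class scatter: outer product of the difference between centroids.
d = (m1 - m0).reshape(-1, 1)
S_B = d @ d.T

# Within-class scatter: accumulated scatter of each sample around its own centroid.
S_W = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
print(S_B.shape, S_W.shape)   # (2, 2) (2, 2)
```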
Linear discriminant analysis (LDA) finds the most discriminant projection by
- maximizing between-class distance
- and minimizing within-class distance
Linear Discriminant Analysis
Find a low-dimensional space such that, when x is projected onto it, the classes are well separated.
Means and Scatter after projection
Good Projection
- Means are as far apart as possible
- Scatter is as small as possible
- Fisher Linear Discriminant
J(w) = (m1 - m2)^2 / (s1^2 + s2^2)
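For two classes, the w maximizing J(w) has the well-known closed form w ∝ S_W^{-1}(m1 - m2); a sketch that computes this direction on synthetic data and evaluates J on the 1-D projections:

```python
import numpy as np

rng = np.random.default_rng(0)
X0 = rng.normal([0, 0], [3.0, 0.5], size=(200, 2))   # class 1 samples
X1 = rng.normal([1, 2], [3.0, 0.5], size=(200, 2))   # class 2 samples

m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
S_W = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)   # within-class scatter

w = np.linalg.solve(S_W, m1 - m0)        # Fisher direction: w ∝ S_W^{-1} (m1 - m0)
w /= np.linalg.norm(w)

z0, z1 = X0 @ w, X1 @ w                  # project each class onto w
J = (z0.mean() - z1.mean()) ** 2 / (((z0 - z0.mean()) ** 2).sum() + ((z1 - z1.mean()) ** 2).sum())
print(J)                                 # (m1 - m2)^2 / (s1^2 + s2^2) along w
```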