Deep Learning
Breakthrough results in
- Image classification
- Speech Recognition
- Machine Translation
- Multi-modal learning
Deep Neural Network
- Problem: training networks with many hidden layers doesn't work very well.
- Local minima; very slow training if the weights are initialized to zero.
- Diffusion of gradient.
- Hierarchical representations help represent complex functions.
- NLP: character -> word -> chunk -> clause -> sentence
- Image: pixel -> edge -> texton -> motif -> part -> object
- Deep Learning: Learning a hierarchy of internal representations
- Learned internal representations at the hidden layers (a trainable feature extractor)
- Feature learning
Unsupervised Pre-training
We will use greedy, layer-wise pre-training (sketched in code after this list):
- Train one layer at a time
- Fix the parameters of previous hidden layers
- Previous layers are viewed as feature extractors
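A minimal sketch of this procedure, assuming PyTorch and stacked autoencoders as the layer-wise objective; the layer sizes, random data, and loop lengths are placeholders and not part of the original notes. Each round trains one new encoder against a throwaway decoder, then freezes it and passes its activations up as inputs for the next layer.

```python
import torch
import torch.nn as nn

# Hypothetical layer sizes and unlabeled data; replace with your own.
sizes = [784, 256, 64]                      # input -> hidden 1 -> hidden 2
X = torch.rand(1000, sizes[0])              # unlabeled training examples

encoders = []
inputs = X
for d_in, d_out in zip(sizes[:-1], sizes[1:]):
    enc = nn.Linear(d_in, d_out)            # the one layer being trained now
    dec = nn.Linear(d_out, d_in)            # throwaway decoder, used only for reconstruction
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
    for _ in range(50):                     # a few full-batch steps, for brevity
        recon = dec(torch.sigmoid(enc(inputs)))
        loss = nn.functional.mse_loss(recon, inputs)
        opt.zero_grad()
        loss.backward()
        opt.step()
    encoders.append(enc)
    # Fix this layer's parameters; its outputs become the next layer's inputs.
    with torch.no_grad():
        inputs = torch.sigmoid(enc(inputs))
```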
Tuning the Classifier
After pre-training the layers:
- Add output layer
- Train the whole network using supervised learning (backpropagation), as sketched below
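Continuing the hypothetical sketch above, fine-tuning could look like this: stack the pre-trained encoders, append a fresh output layer, and backpropagate a supervised loss through the whole network. The labels and class count are placeholders.

```python
# Continuing the sketch above: `encoders` holds the pre-trained hidden layers.
num_classes = 10                                     # hypothetical supervised task
y = torch.randint(0, num_classes, (X.shape[0],))     # placeholder labels

layers = []
for enc in encoders:
    layers += [enc, nn.Sigmoid()]
layers.append(nn.Linear(sizes[-1], num_classes))     # newly added output layer
model = nn.Sequential(*layers)

opt = torch.optim.Adam(model.parameters(), lr=1e-4)  # all weights are now trainable
for _ in range(50):
    loss = nn.functional.cross_entropy(model(X), y)  # supervised loss, backprop end to end
    opt.zero_grad()
    loss.backward()
    opt.step()
```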
Deep neural network architectures
- Feed-forward NN
- Stacked autoencoders (multi-layer neural nets with target output = input)
- Stacked restricted Boltzmann machines
- Convolutional neural networks
- Output layer: predicts the supervised target.
- Hidden layers: learn more abstract representations as you move up the network.
- Input layer: raw sensory inputs.
A Neural Network
Training: Back Propagation of Error
- Calculate total error at the top
- Calculate contributions to error at each step going backwards
- The weights are modified as the error is propagated (a numpy sketch follows below)
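A minimal numpy sketch of these three steps for a one-hidden-layer network with sigmoid units and squared error; the data, shapes, and learning rate are illustrative only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))                  # toy input batch
y = rng.normal(size=(32, 1))                  # toy regression targets
W1 = rng.normal(size=(4, 8)) * 0.1            # input -> hidden weights
W2 = rng.normal(size=(8, 1)) * 0.1            # hidden -> output weights
lr = 0.1

for _ in range(100):
    # Forward pass
    h = sigmoid(X @ W1)
    out = h @ W2
    # 1. Total error at the top (gradient of 0.5 * ||out - y||^2)
    err = out - y
    # 2. Contributions to the error at each step, going backwards
    d_W2 = h.T @ err
    d_h = err @ W2.T
    d_W1 = X.T @ (d_h * h * (1 - h))          # sigmoid derivative
    # 3. Modify the weights as the error is propagated
    W2 -= lr * d_W2 / len(X)
    W1 -= lr * d_W1 / len(X)
```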
Training Deep Networks
Difficulties of supervised training of deep networks
1. Early layers of an MLP do not get trained well
- Diffusion of gradient: the error signal attenuates as it propagates back to earlier layers (illustrated in the sketch after this list)
- This leads to very slow training
- The error reaching earlier layers drops quickly because the top layers can "mostly" solve the task on their own
2. Deep networks tend to have more local-minima problems than shallow networks during supervised training
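A small numpy illustration of the gradient-diffusion point, assuming sigmoid hidden units: backpropagating through each layer multiplies the error signal by the sigmoid derivative (at most 0.25) and the transposed weights, so its norm shrinks rapidly on the way to the earlier layers. The depth, width, and weight scale here are arbitrary.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
depth, width = 10, 10
Ws = [rng.normal(size=(width, width)) * 0.5 for _ in range(depth)]

# Forward pass: record each layer's activations.
a, acts = rng.normal(size=width), []
for W in Ws:
    a = sigmoid(W @ a)
    acts.append(a)

# Backward pass: watch the error signal shrink toward the earlier layers.
grad = np.ones(width)                            # error at the output
for layer in range(depth - 1, -1, -1):
    a = acts[layer]
    grad = Ws[layer].T @ (grad * a * (1 - a))    # sigmoid derivative <= 0.25
    print(f"layer {layer}: |grad| = {np.linalg.norm(grad):.3e}")
```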
Training of neural networks
- Forward propagation: inputs are fed forward through the network layer by layer, each layer computing a weighted sum of its inputs and applying an activation function.
Activation Functions
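The original post does not list the functions themselves; as a sketch, here are three common choices (sigmoid, tanh, ReLU) in numpy, applied in a single feed-forward step with placeholder weights.

```python
import numpy as np

# Common activation functions.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    return np.tanh(z)

def relu(z):
    return np.maximum(0.0, z)

# One feed-forward step: a weighted sum of the inputs followed by a nonlinearity.
rng = np.random.default_rng(0)
x = rng.normal(size=4)                       # placeholder input vector
W, b = rng.normal(size=(3, 4)), np.zeros(3)  # placeholder weights and biases
print(sigmoid(W @ x + b), tanh(W @ x + b), relu(W @ x + b))
```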
Autoencoder
An unlabeled set of training examples
{ x^(1), x^(2), x^(3), ... }, x^(i) ∈ R^n
Set the target values to be equal to the inputs: y^(i) = x^(i).
The network is trained to output its input (to learn the identity function):
h_{W,b}(x) ≈ x
The solution may be trivial unless the network is constrained (e.g., a hidden layer narrower than the input).
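A minimal PyTorch sketch of a single autoencoder trained so that h_{W,b}(x) ≈ x; the sizes and data are placeholders. Making the hidden layer narrower than the input is one standard way to keep the learned mapping from collapsing to a trivial identity.

```python
import torch
import torch.nn as nn

n, bottleneck = 64, 8                        # hypothetical sizes: inputs in R^n, narrow code
X = torch.rand(500, n)                       # unlabeled examples {x^(1), x^(2), ...}

autoencoder = nn.Sequential(
    nn.Linear(n, bottleneck), nn.Sigmoid(),  # encoder: compress the input
    nn.Linear(bottleneck, n),                # decoder: reconstruct the input
)
opt = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)

for _ in range(200):
    loss = nn.functional.mse_loss(autoencoder(X), X)  # target output = input
    opt.zero_grad()
    loss.backward()
    opt.step()
```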