Watch the 3blue1brown series before the lecture: https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi
Moreover, read the Illustrated Guide to Recurrent Neural Networks by Michael Nguyen.
See the Neural Network Zoo by the Asimov Institute.
Our goal is to create a Neural Network that recognizes handwritten digits.
We use the MNIST dataset, which contains 60k training examples + 10k test examples.
Open the “feed-forward-nn-hand-written-recognition” Jupyter notebook.
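Before opening the notebook, it can help to see the bare mechanics of a feed-forward pass. The sketch below is illustrative, not the notebook's code: the layer sizes (784 → 128 → 10) and random weights are assumptions, chosen to match MNIST's 28×28 images and 10 digit classes.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical architecture: 784 inputs (28x28 pixels), one hidden
# layer of 128 neurons, 10 outputs (digits 0-9). Weights are random;
# a real network would learn them by backpropagation.
rng = np.random.default_rng(0)
W1 = rng.normal(0, 0.01, (128, 784))
b1 = np.zeros(128)
W2 = rng.normal(0, 0.01, (10, 128))
b2 = np.zeros(10)

def forward(image):
    """One forward pass: flatten the image, hidden layer, then output."""
    x = image.reshape(784) / 255.0      # flatten and normalize pixels
    h = relu(W1 @ x + b1)               # hidden layer activation
    return softmax(W2 @ h + b2)         # probabilities over the 10 digits

probs = forward(rng.integers(0, 256, (28, 28)))
```

The output is a probability distribution over the ten digits; the predicted digit is its argmax.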
Our goal is to create an RNN that writes songs like Freddie Mercury.
We use all of Queen’s songs as the dataset.
Open the “rnn-and-lstm-sing-like-freddy” Jupyter notebook.
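The core idea of a character-level RNN is a hidden state that is carried from one character to the next. The sketch below is a single untrained RNN step in NumPy; the vocabulary, hidden size, and weights are illustrative assumptions, not the notebook's code.

```python
import numpy as np

# Toy character-level RNN step. The "corpus" and sizes are illustrative.
vocab = sorted(set("we are the champions"))
V, H = len(vocab), 16
rng = np.random.default_rng(1)
Wxh = rng.normal(0, 0.1, (H, V))   # input -> hidden
Whh = rng.normal(0, 0.1, (H, H))   # hidden -> hidden (the recurrence)
Why = rng.normal(0, 0.1, (V, H))   # hidden -> output

def step(ch, h):
    """Consume one character; return the next hidden state and logits."""
    x = np.zeros(V)
    x[vocab.index(ch)] = 1.0           # one-hot encode the character
    h = np.tanh(Wxh @ x + Whh @ h)     # new state mixes input and memory
    return h, Why @ h                  # logits over the next character

h = np.zeros(H)
for ch in "we are":
    h, logits = step(ch, h)
```

After training, sampling from the softmax of the logits and feeding the sample back in generates text one character at a time; an LSTM replaces the `tanh` recurrence with a gated cell that remembers longer contexts.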
It is hard to know in advance the best architecture for your problem.
We have to experiment with different hyperparameters: the number of layers, neurons per layer, learning rate, and activation functions.
Machine learning is empirical!
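One systematic way to experiment is a grid search: try every combination of hyperparameter values and keep the best. The sketch below is a minimal, self-contained illustration; `train_and_score` is a hypothetical stand-in for an actual training run that returns validation accuracy.

```python
import itertools

# Illustrative hyperparameter grid (values are examples, not recommendations).
grid = {
    "layers": [1, 2, 3],
    "neurons": [32, 128],
    "learning_rate": [0.1, 0.01],
}

def train_and_score(layers, neurons, learning_rate):
    # Hypothetical stand-in: a real version would train the network with
    # these settings and return its validation accuracy.
    return 1.0 / (layers * neurons * learning_rate)

# Evaluate every combination and keep the best-scoring configuration.
best = max(
    (dict(zip(grid, combo)) for combo in itertools.product(*grid.values())),
    key=lambda cfg: train_and_score(**cfg),
)
```

The grid grows multiplicatively with each added hyperparameter, which is exactly why tuning is empirical and expensive.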
Too few layers/neurons: Underfitting. The problem might be too complex to be represented with so few neurons.
Too many layers/neurons: Overfitting. The network might just “memorize” the training data instead of learning.
Read a simple explanation of activation functions here.
Read this nice explanation on how to choose activation and loss functions.
The batch size and number of training iterations should also be tuned.
Read the tradeoff batch size vs number of iterations to train a NN discussion on Stack Overflow.
Dropout is a technique used to reduce overfitting in neural networks.
Basically, during training half of the neurons on a particular layer are deactivated at random. This improves generalization because it forces the layer to learn the same “concept” with different neurons.
During the prediction phase, dropout is deactivated.
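The mechanism above can be sketched in a few lines of NumPy. This is the common “inverted dropout” variant, where training-time activations are scaled by `1/keep_prob` so that prediction needs no adjustment; the keep probability of 0.5 matches the “half of the neurons” description, and the layer size is an arbitrary example.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, keep_prob=0.5, training=True):
    """Randomly zero out neurons during training; identity at prediction."""
    if not training:
        return activations             # dropout is off when predicting
    mask = rng.random(activations.shape) < keep_prob
    # Scale by 1/keep_prob so the expected activation stays the same.
    return activations * mask / keep_prob

h = np.ones(1000)                      # example layer activations
train_out = dropout(h, keep_prob=0.5, training=True)
test_out = dropout(h, training=False)
```

At prediction time the layer is returned unchanged, which is why no extra rescaling step is needed when the model is deployed.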
(Extracted from Leonardo Araujo Santos’s online book)
The course contents are copyrighted (c) 2018 - onwards by TU Delft and their respective authors and licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.