Administrative Information
Title | Recurrent Neural Networks |
---|---|
Duration | 60 minutes |
Module | B |
Lesson Type | Lecture |
Focus | Technical - Deep Learning |
Topic | Multiple (Recurrent Neural Networks (RNN), Backpropagation Through Time (BPTT), Long Short-Term Memory (LSTM)) |
Keywords
Recurrent Neural Networks (RNN), Backpropagation Through Time (BPTT), Long Short-Term Memory (LSTM)
Learning Goals
- Learning the fundamentals of Recurrent Neural Networks (RNN), Backpropagation Through Time (BPTT) and Long Short-Term Memory (LSTM)
Expected Preparation
Learning Events to be Completed Before
Obligatory for Students
- Revision of backpropagation algorithm
- Ian Goodfellow and Yoshua Bengio and Aaron Courville: Deep Learning, MIT Press, 2016, Chapter 10
- François Chollet: Deep Learning with Python, Manning Publications, 2017, Chapter 6: Deep Learning for text and sequences
Optional for Students
None.
References and background for students
None.
Recommended for Teachers
None.
Lesson materials
Instructions for Teachers
At the beginning, a general overview of sequential data is recommended. Here you can discuss the main challenges of modeling sequential data, including the receptive field, multidimensionality and its multiscale nature.
Next, introduce RNNs, starting with the basic principles, and show that training an RNN after unfolding in time is very similar to training an MLP, except that the input, recurrent and output weight matrices are shared across all time steps (see the sketch below).
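A minimal NumPy sketch of the unfolded forward pass that you could show alongside this point (all shapes and initial values are illustrative assumptions); note how the same W_x, W_h and b are reused at every time step:

```python
import numpy as np

def rnn_forward(x_seq, W_x, W_h, b, h):
    """x_seq: (T, input_dim) -> hidden states (T, hidden_dim)."""
    hs = []
    for x_t in x_seq:                          # unfold over time
        h = np.tanh(x_t @ W_x + h @ W_h + b)   # same W_x, W_h, b at every step
        hs.append(h)
    return np.stack(hs)

rng = np.random.default_rng(0)
T, D, H = 8, 4, 16                             # illustrative sizes
out = rnn_forward(rng.normal(size=(T, D)),
                  rng.normal(size=(D, H)) * 0.1,
                  rng.normal(size=(H, H)) * 0.1,
                  np.zeros(H), np.zeros(H))
print(out.shape)                               # (8, 16)
```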
Then introduce Backpropagation Through Time (BPTT) and its truncated variant, in which gradients are propagated only through a fixed window of recent time steps (a training-loop sketch follows).
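A hedged PyTorch sketch of truncated BPTT (model size, truncation length and data are illustrative assumptions): the sequence is processed in chunks, and detaching the hidden state at each chunk boundary cuts the computation graph so gradients flow only through the last k steps:

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=4, hidden_size=16, batch_first=True)
head = nn.Linear(16, 1)
opt = torch.optim.SGD(list(rnn.parameters()) + list(head.parameters()), lr=0.01)

x = torch.randn(1, 100, 4)        # one long sequence of length 100
y = torch.randn(1, 100, 1)
k = 20                            # truncation length
h = torch.zeros(1, 1, 16)         # (num_layers, batch, hidden)

for t0 in range(0, 100, k):
    h = h.detach()                # cut the graph: no gradient past this point
    out, h = rnn(x[:, t0:t0 + k], h)
    loss = nn.functional.mse_loss(head(out), y[:, t0:t0 + k])
    opt.zero_grad()
    loss.backward()               # BPTT only through the last k steps
    opt.step()
```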
Next, discuss how the vanishing gradient problem makes vanilla RNNs impractical for learning long-range dependencies (a numeric illustration follows).
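A small NumPy illustration of the effect (random recurrent weights of small magnitude are an assumption chosen to make the decay visible): backpropagating through the chain of tanh derivatives and recurrent weights shrinks the gradient norm roughly geometrically with the number of steps:

```python
import numpy as np

rng = np.random.default_rng(0)
H, T = 16, 50
W_h = rng.normal(size=(H, H)) * 0.1           # small recurrent weights

hs = [np.zeros(H)]
for _ in range(T):                            # forward pass
    hs.append(np.tanh(hs[-1] @ W_h + rng.normal(size=H)))

grad = np.ones(H)                             # arbitrary dL/dh_T to start from
for t in range(T, 0, -1):                     # backward pass through time
    grad = ((1.0 - hs[t] ** 2) * grad) @ W_h.T
    if (T - t + 1) % 10 == 0:
        print(f"{T - t + 1:3d} steps back: |grad| = {np.linalg.norm(grad):.2e}")
```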
To solve the vanishing gradient problem, introduce the LSTM architecture, which has an inner memory (also referred to as the memory cell) that is updated additively, without any activation function applied between time steps, so the gradient does not vanish along this path. Make it really clear that the gating mechanism is genuinely controlled by the data (see the sketch of one LSTM step below).
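A minimal NumPy sketch of a single LSTM step (the weight layout is an illustrative assumption): all four gates are computed from the current input and previous hidden state, i.e. from the data, and the memory cell c is updated purely additively:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """W: (4, input_dim, hidden), U: (4, hidden, hidden), b: (4, hidden)."""
    f = sigmoid(x @ W[0] + h @ U[0] + b[0])   # forget gate
    i = sigmoid(x @ W[1] + h @ U[1] + b[1])   # input gate
    o = sigmoid(x @ W[2] + h @ U[2] + b[2])   # output gate
    g = np.tanh(x @ W[3] + h @ U[3] + b[3])   # candidate values
    c = f * c + i * g         # additive memory update: no activation on c itself
    h = o * np.tanh(c)        # hidden state exposed to the next layer
    return h, c

rng = np.random.default_rng(0)
D, H = 4, 8                   # illustrative sizes
W = rng.normal(size=(4, D, H)) * 0.1
U = rng.normal(size=(4, H, H)) * 0.1
b = np.zeros((4, H))
h, c = lstm_step(rng.normal(size=D), np.zeros(H), np.zeros(H), W, U, b)
```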
In the final part of the lecture, show that LSTM (and RNN) layers can be stacked on top of each other and run in one or both directions, yielding uni- and bidirectional networks (illustrated in the sketch below).
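A minimal PyTorch sketch of a stacked bidirectional LSTM (all sizes are illustrative assumptions): num_layers stacks three layers on top of each other, and bidirectional=True runs each layer in both directions, doubling the output feature dimension:

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=32, hidden_size=64, num_layers=3,
               bidirectional=True, batch_first=True)
x = torch.randn(8, 50, 32)    # (batch, time, features)
out, (h_n, c_n) = lstm(x)
print(out.shape)              # torch.Size([8, 50, 128]) -- 2 directions * 64
```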
Outline
- Overview of sequential data
- Recurrent neural networks basics
- Backpropagation through time
- Vanishing gradient
- Long Short-Term Memory
- Stacking RNN/LSTM layers
Duration (Min) | Description |
---|---|
10 | Sequential data introduction |
15 | Recurrent neural networks and Backpropagation through time |
5 | Vanishing gradients in RNNs |
20 | LSTMs |
5 | Stacking RNN/LSTM layers |
5 | Conclusions |
Acknowledgements
Balint Gyires-Tóth (Budapest University of Technology and Economics)
The Human-Centered AI Masters programme was co-financed by the Connecting Europe Facility of the European Union under Grant №CEF-TC-2020-1 Digital Skills 2020-EU-IA-0068.