Seminar in Deep Reinforcement Learning (FS 2020)
As a seminar participant, you are invited to attend all the talks and make a presentation. Your presentation should be in English. The presentation should last 35 minutes plus about 10 minutes of discussion.
Disclaimer: This is a seminar, so we will focus on research and skip most of the basics. If you feel like you cannot follow the discussions, we invite you to check out this lecture or at least this talk.
Presentation & Discussion
The seminar will be exciting if the presentations are exciting. Here is a two-page guideline on how to give a great scientific presentation. Here are some additional guidelines: 1, 2 and 3. You can find further guidance on structuring your talk, as well as resources and ideas about what each topic should address, here.
We further expect the presentation to motivate a lively discussion. We encourage discussions during and after the presentations as a main objective of this seminar. It may help discussions if you also try to be critical about the presented work. These are all scientific papers, but if you have been in science long enough...
Due to the COVID-19 outbreak, we will continue the seminar in digital form, with discussions over Zoom. All talks can be found here (you should have received the corresponding password by email). If you still have a presentation upcoming, please have it ready as a video the day before you would have given it in the seminar.
Your grade will mostly depend on your presentation. In addition, we also grade how actively you participate in the discussions throughout the whole semester. Further, there will be a programming challenge alongside the seminar, in which you can take part to improve your grade.
The idea of the coding challenge is for you to code a deep reinforcement learning algorithm from scratch once. You can implement whichever algorithm you like and take inspiration from existing code libraries; however, your agent should be your own implementation. The goal is to learn variations of the card game Blackjack. Further instructions and the environment to train on can be found here. Please hand in your code, including a description of your results, by 19.05. so that we can discuss your implementations in the last seminar.
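To give a feel for the overall shape of such an agent, here is a minimal sketch under loud assumptions. `ToyBlackjack` is a hypothetical stand-in written for this illustration and is not the seminar's environment; its rules (infinite deck, dealer draws below 17, no splitting or doubling) and its `reset`/`step` interface are assumptions. The learner is plain tabular Q-learning, which is not a deep method: a deep agent for the challenge would replace the table `q` with a neural network trained on the same kind of update target.

```python
import random
from collections import defaultdict

# Infinite deck: ace is stored as 1, all face cards count as 10 (assumed rules).
CARDS = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10]


def hand_value(cards):
    """Return (total, usable_ace); one ace counts as 11 if that does not bust."""
    total = sum(cards)
    if 1 in cards and total + 10 <= 21:
        return total + 10, True
    return total, False


class ToyBlackjack:
    """Hypothetical toy environment, NOT the one provided for the challenge."""

    def __init__(self, rng):
        self.rng = rng

    def _draw(self):
        return self.rng.choice(CARDS)

    def _obs(self):
        total, usable = hand_value(self.player)
        return (total, self.dealer[0], usable)

    def reset(self):
        self.player = [self._draw(), self._draw()]
        self.dealer = [self._draw(), self._draw()]
        return self._obs()

    def step(self, action):
        """action 1 = hit, 0 = stick. Returns (observation, reward, done)."""
        if action == 1:
            self.player.append(self._draw())
            if hand_value(self.player)[0] > 21:
                return self._obs(), -1.0, True  # player busts
            return self._obs(), 0.0, False
        # Player sticks: dealer draws until reaching at least 17.
        while hand_value(self.dealer)[0] < 17:
            self.dealer.append(self._draw())
        p, d = hand_value(self.player)[0], hand_value(self.dealer)[0]
        reward = 1.0 if d > 21 or p > d else (0.0 if p == d else -1.0)
        return self._obs(), reward, True


def train(episodes=100_000, alpha=0.05, gamma=1.0, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning; returns a dict state -> [Q(stick), Q(hit)]."""
    rng = random.Random(seed)
    env = ToyBlackjack(rng)
    q = defaultdict(lambda: [0.0, 0.0])
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                action = max((0, 1), key=lambda a: q[state][a])  # exploit
            next_state, reward, done = env.step(action)
            # Bootstrapped target; terminal transitions use the reward alone.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


if __name__ == "__main__":
    q = train()
    for total in (12, 16, 20):
        values = q[(total, 10, False)]
        print(total, "vs dealer 10:", "hit" if values[1] > values[0] else "stick")
```

The environment and the learner only communicate through `reset` and `step`, so the learning rule can be swapped for a network-based one without touching the game logic.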
How To Sign Up
There will be two presentations per week, so there is a limited number of slots (topics), which will be assigned based on preference. If you have not yet received an email confirming your spot in the seminar, write a sentence about your background (courses, projects, ...) in deep reinforcement learning to Oliver Richter to get a spot on the waiting list in case someone cancels.
After You Got Your Topic
We established the following rules to ensure a high quality of the talks and hope that these will result in a good grade for you:
- At least 5 weeks before your talk: first meeting with your mentor (you need to read the assigned literature before this meeting).
- At least 3 weeks before your talk: meet your mentor to discuss the structure of your talk.
- At least 1 week before your talk: give the talk in front of your mentor who will provide feedback.
- On the presentation date: we expect an electronic copy of your slides.
| Date | Presenters | Topic | Mentor | Slides |
| 25.02.2020 | Zhao Ma, Constantin Le Clei | Deep Learning and Neural Architecture | Zhao Meng | [pdf][pptx] |
| 03.03.2020 | Alexander Nedergaard, Xiang Li | On Policy vs. Off-Policy vs. Batch-Policy Learning | Gino Brunner | [pdf][pdf] |
| 10.03.2020 | Samriddhi Jain, Yunke Ao | Deep reinforcement learning in continuous action spaces | Oliver Richter | [pdf][pptx] |
| 17.03.2020 | Cliff Li, Lucas Brunner | Hierarchical deep reinforcement learning | Béni Egressy | [pdf][pdf] |
| 24.03.2020 | Tommaso Macri, Jérémy Scheurer | Deep reinforcement learning and stochastic planning in games | Kevin Roth | [pdf][pdf] |
| 31.03.2020 | Adrian Hoffmann, Lee Sharkey | Model based vs. model free deep reinforcement learning | Pascal Weber | [pdf][pdf] |
| 07.04.2020 | Thomas Langerak, Sébastien Foucher | Deep reinforcement learning in partial observability | Lukas Faber | [pdf][pdf] |
| 21.04.2020 | Florian Turati, Orhan Saeedi | Multi-Armed Bandits | Ye Wang | [pdf][pdf] |
| 28.04.2020 | Felix Schur | Non-differentiable optimization | Henri Devillez | [pdf] |
| 05.05.2020 | Philippe Blatter, Steven Battilana | Meta-Learning | Giambattista Parascandolo | [pdf][pdf] |
| 12.05.2020 | - | No seminar - free time for coding challenge | | |
| 19.05.2020 | Ramon Witschi, Nicolas Zucchet | Continual Learning | Damian Pascual | [pdf][pdf] |
| 26.05.2020 | Oliver Richter | Discussion and Review | | |