Seminar in Deep Reinforcement Learning (FS 2020)

Organization

When & Where: Tuesdays 10:15 @ ETZ G 91
First seminar: 18.02.2020
Last seminar: 26.05.2020
Coordinators: Roger Wattenhofer & Oliver Richter

As a seminar participant, you are invited to attend all the talks and to give a presentation. Your presentation should be in English and should last 35 minutes, followed by about 10 minutes of discussion.

Disclaimer: This is a seminar, so we will focus on research and skip most of the basics. If you feel that you cannot follow the discussions, we invite you to check out this lecture or at least this talk.

Presentation & Discussion

The seminar will be exciting if the presentations are exciting. Here is a two-page guideline on how to give a great scientific presentation. Here are some additional guidelines: 1, 2 and 3. You can find further guidance on structuring your talk, as well as resources and ideas about what each topic should address, here.

We further expect the presentation to motivate a lively discussion. We encourage discussions during and after the presentations as a main objective of this seminar. It may help discussions if you also try to be critical about the presented work. These are all scientific papers, but if you have been in science long enough...

COVID-19 Situation

Due to the COVID-19 outbreak, we will continue the seminar in digital form, with discussions over Zoom. All talks can be found here (you should have received the corresponding password by email). If you still have a presentation upcoming, please have it ready as a video the day before you would have given it in the seminar.

Grade

Your grade will mostly depend on your presentation. In addition, we also grade how actively you participate in the discussions throughout the whole semester. Further, there will be a programming challenge alongside the seminar, in which you can take part to improve your grade.

Coding Challenge

The idea of the coding challenge is that you implement a deep reinforcement learning algorithm from scratch at least once. You can implement whichever algorithm you like and take inspiration from existing code libraries; however, your agent should be your own implementation. The goal is to learn variations of the card game Blackjack. Further instructions and the environment to train on can be found here. Please hand in your code, including a description of your results, by 19.05. so that we can discuss your implementations in the last seminar.
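As a rough illustration of what "from scratch" can look like, the sketch below trains a one-hidden-layer policy network with REINFORCE using only the Python standard library. The ToyBlackjack class is a hypothetical, simplified stand-in (infinite deck, hit or stick only) and is not the seminar's environment; its reset/step interface, the observation encoding, and all hyperparameters are assumptions chosen for illustration.

```python
import math
import random


class ToyBlackjack:
    """Hypothetical simplified Blackjack (infinite deck); NOT the seminar's environment."""

    def __init__(self, rng):
        self.rng = rng

    def _draw(self):
        # 1 is an ace, 2-10 count at face value, J/Q/K (11-13) count as 10.
        return min(self.rng.randint(1, 13), 10)

    @staticmethod
    def _total(cards):
        # Count one ace as 11 whenever that does not bust the hand.
        total = sum(cards)
        usable = 1 in cards and total + 10 <= 21
        return (total + 10 if usable else total), usable

    def _obs(self):
        total, usable = self._total(self.player)
        return (total / 21.0, self.dealer[0] / 10.0, float(usable))

    def reset(self):
        self.player = [self._draw(), self._draw()]
        self.dealer = [self._draw(), self._draw()]
        return self._obs()

    def step(self, action):
        """action 1 = hit, 0 = stick. Returns (observation, reward, done)."""
        if action == 1:
            self.player.append(self._draw())
            if self._total(self.player)[0] > 21:
                return self._obs(), -1.0, True
            return self._obs(), 0.0, False
        while self._total(self.dealer)[0] < 17:  # dealer hits below 17
            self.dealer.append(self._draw())
        p, d = self._total(self.player)[0], self._total(self.dealer)[0]
        reward = 1.0 if d > 21 or p > d else (0.0 if p == d else -1.0)
        return self._obs(), reward, True


class ReinforceAgent:
    """One-hidden-layer policy network trained with REINFORCE, written by hand."""

    def __init__(self, rng, n_in=3, n_hidden=8, lr=0.05):
        self.rng, self.lr = rng, lr
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        logit = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        return h, 1.0 / (1.0 + math.exp(-logit))  # probability of "hit"

    def act(self, x):
        return 1 if self.rng.random() < self._forward(x)[1] else 0

    def learn(self, episode, ret):
        """episode: list of (observation, action); ret: the episode's final return.

        Updates are applied step by step for brevity (an assumption of this sketch).
        """
        for x, a in episode:
            h, p_hit = self._forward(x)
            g = self.lr * ret * (a - p_hit)  # lr * return * d log pi(a|x) / d logit
            for j, hj in enumerate(h):
                g_pre = g * self.w2[j] * (1.0 - hj * hj)  # backprop through tanh
                self.w2[j] += g * hj
                self.b1[j] += g_pre
                for i, xi in enumerate(x):
                    self.w1[j][i] += g_pre * xi
            self.b2 += g


def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    env, agent = ToyBlackjack(rng), ReinforceAgent(rng)
    returns = []
    for _ in range(episodes):
        obs, done, episode = env.reset(), False, []
        while not done:
            action = agent.act(obs)
            episode.append((obs, action))
            obs, reward, done = env.step(action)
        agent.learn(episode, reward)  # undiscounted: every step gets the final reward
        returns.append(reward)
    return agent, returns


if __name__ == "__main__":
    _, history = train()
    print("mean return over the last 500 episodes:", sum(history[-500:]) / 500)
```

Keeping the environment behind a small reset/step interface means the same agent code could be pointed at the actual challenge environment once its interface is known; a baseline or a value network would be natural next steps, but they are left out here to keep the sketch short.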

How To Sign Up

There will be two presentations per week, so there is a limited number of slots (topics), which will be assigned based on preference. If you have not yet received an email confirming your spot in the seminar, send Oliver Richter a sentence about your background (courses, projects, ...) in deep reinforcement learning to get a spot on the waiting list in case someone cancels.

After You Got Your Topic

We established the following rules to ensure a high quality of the talks and hope that these will result in a good grade for you:

At least 5 weeks before your talk: first meeting with your mentor (you need to read the assigned literature before this meeting).
At least 3 weeks before your talk: meet your mentor to discuss the structure of your talk.
At least 1 week before your talk: give the talk in front of your mentor who will provide feedback.
On the presentation date: we expect an electronic copy of your slides.

Schedule

Date	Presenter(s)	Title	Mentor	Slides
18.02.2020	Oliver Richter	Introduction		[pdf]
25.02.2020	Zhao Ma, Constantin Le Clei	Deep Learning and Neural Architecture	Zhao Meng	[pdf][pptx]
03.03.2020	Alexander Nedergaard, Xiang Li	On Policy vs. Off-Policy vs. Batch-Policy Learning	Gino Brunner	[pdf][pdf]
10.03.2020	Samriddhi Jain, Yunke Ao	Deep reinforcement learning in continuous action spaces	Oliver Richter	[pdf][pptx]
17.03.2020	Cliff Li, Lucas Brunner	Hierarchical deep reinforcement learning	Béni Egressy	[pdf][pdf]
24.03.2020	Tommaso Macri, Jérémy Scheurer	Deep reinforcement learning and stochastic planning in games	Kevin Roth	[pdf][pdf]
31.03.2020	Adrian Hoffmann, Lee Sharkey	Model based vs. model free deep reinforcement learning	Pascal Weber	[pdf][pdf]
07.04.2020	Thomas Langerak, Sébastien Foucher	Deep reinforcement learning in partial observability	Lukas Faber	[pdf][pdf]
21.04.2020	Florian Turati, Orhan Saeedi	Multi-Armed Bandits	Ye Wang	[pdf][pdf]
28.04.2020	Felix Schur	Non-differentiable optimization	Henri Devillez	[pdf]
05.05.2020	Philippe Blatter, Steven Battilana	Meta-Learning	Giambattista Parascandolo	[pdf][pdf]
12.05.2020	-	No seminar - free time for coding challenge
19.05.2020	Ramon Witschi, Nicolas Zucchet	Continual Learning	Damian Pascual	[pdf][pdf]
26.05.2020	Oliver Richter	Discussion and Review