Seminar in Deep Neural Networks (FS 2025)
Organization
When & Where: Tuesdays 10:15 - 12:00, ETZ E 9.
First seminar: 18.02.2025
Last seminar: 27.05.2025
Easter holidays: No seminar on 22.04.2025
Coordinators: Frédéric Berdoz, Benjamin Estermann, Roger Wattenhofer.
Background
This is a seminar: we will focus on recent research and skip most of the basics. We assume that all participants are familiar with the fundamentals of deep neural networks. If you feel like you cannot follow the discussions, please check out this playlist, this lecture, the book by Francois Chollet on Deep Learning with Python, or any other lectures or books on deep neural networks. As a seminar participant, you are expected to attend all the talks and give a presentation.
Seminar Timeline
- When you sign up for the seminar on myStudies, you will automatically be placed on a waiting list.
- Before the start of the semester, we will publish a list of papers (you will be notified by e-mail). You can then send us your preferences.
- We will assign the papers (and thereby grant definitive spots in the seminar) on a first-come, first-served basis, depending on how quickly you submit your preferences. The presentations will be scheduled in the same order. You will also be assigned a mentor who is familiar with the topic and will help you prepare your presentation.
- In the first week of the semester, there will be no presentations. Instead, we will give an introduction to the seminar and some tips on scientific presentations.
- After that, every week two students will present their respective papers.
Preparation Timeline
- Around 4 weeks before your talk: first meeting with your mentor where you discuss the structure of the talk.
- Between 4 weeks and 1 week before your talk: you meet with your mentor as often as both parties find necessary to make progress.
- At least 1 week before your talk: your presentation is ready, and you give it as a test run to your mentor only.
- Your mentor will get a copy of your test-run slides. (These slides will not become public, but they may influence your seminar grade.)
- Your mentor will give you feedback, and you are supposed to update your final presentation based on this feedback.
- Please send us your final slides no later than the day before your presentation.
Your Presentation
- Your presentation should be 30 minutes long.
- After your presentation, you should moderate a lively discussion about the presented work for up to 15 minutes.
- It may help the discussion if you are also critical of the presented work.
- Your presentation should take into account these presentation guidelines.
- Beyond these guidelines, you may find other useful tips about good scientific presentations online, for instance here or here.
- All material copied from others (figures, explanations, examples, or equations) must be properly referenced.
Grade
The most important part of your grade is the quality of your presentation, both in content and style. In addition, we grade how well you direct the discussion with the audience during and after your presentation, and how actively you participate in the discussions throughout the semester. Finally, we also value the quality of your mentor-only test presentation.
Papers
You can find the list of available papers here. Send us an ordered list (by preference) of up to 5 papers. We try to assign the papers on a first-come, first-served basis according to your preferences, while also taking into account the availability of the mentors. To maximize the chance of getting a paper from your list, we recommend choosing sufficiently diverse papers. If you do not have any preferences, still send us an e-mail and we will assign a paper to you.
Schedule
Date | Presenter | Title | Mentor | Slides
---|---|---|---|---
February 18 | Frédéric Berdoz | Introduction to Scientific Presentations | - | [pdf]
February 25 | Qi Ma | InstructPix2Pix: Learning To Follow Image Editing Instructions | Florian Grötschla | TBA
February 25 | Harald Semmelrock | rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking | Benjamin Estermann | TBA
March 04 | Jakob Hütteneder | Guiding a Diffusion Model with a Bad Version of Itself | Till Aczel | TBA
March 04 | Yanik Künzi | Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% | Luca Lanzendörfer | TBA
March 11 | Alexandre Elsig | Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations | Andreas Plesner | TBA
March 11 | Adam Suma | DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | Samuel Dauncey | TBA
March 18 | Niccolò Avogaro | Vision Transformers Need Registers | Frédéric Berdoz | TBA
March 18 | Valentin Abadie | Towards Foundation Models for Knowledge Graph Reasoning | Florian Grötschla | TBA
March 25 | Nandor Köfarago | High-Fidelity Audio Compression with Improved RVQGAN | Luca Lanzendörfer | TBA
March 25 | Coralie Sage | Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet | Samuel Dauncey | TBA
April 01 | Anna Kosovskaia | You Only Cache Once: Decoder-Decoder Architectures for Language Models | Benjamin Estermann | TBA
April 01 | Florian Zogaj | Multimodal Neurons in Artificial Neural Networks | Andreas Plesner | TBA
April 08 | Diego Arapovic | Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning | Andreas Plesner | TBA
April 08 | Lukas Rüttgers | Convolutional Differentiable Logic Gate Networks | Till Aczel | TBA
April 15 | Giovanni De Muri | In-context Learning and Induction Heads | Samuel Dauncey | TBA
April 15 | Jonas Mirlach | Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion | Till Aczel | TBA
April 22 | - | Easter Break | - | -
April 29 | Hua Zun | Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision | Florian Grötschla | TBA
April 29 | Lyubov Dudchenko | Training Large Language Models to Reason in a Continuous Latent Space | Samuel Dauncey | TBA
May 06 | Marian Schneider | Byte Latent Transformer: Patches Scale Better Than Tokens | Frédéric Berdoz | TBA
May 06 | Nicola Farronato | Erdös Goes Neural - an Unsupervised Learning Framework for Combinatorial Optimization on Graphs | Saku Peltonen | TBA
May 13 | Pepijn Cobben | Components Beat Patches: Eigenvector Masking For Visual Representation Learning | Till Aczel | TBA
May 13 | Frederik Barba | On Evaluating Adversarial Robustness | Andreas Plesner | TBA
May 20 | Yiming Wang | Mixture-of-Depths: Dynamically allocating compute in transformer-based language models | Frédéric Berdoz | TBA
May 20 | Sebastian Brunner | It’s Not What Machines Can Learn, It’s What We Cannot Teach | Saku Peltonen | TBA
May 27 | - | TBA | - | -