Distributed Computing
ETH Zurich

Seminar in Deep Neural Networks (FS 2025)

Organization

When & Where: Tuesdays 10:15 - 12:00, ETZ E 9.
First seminar: 18.02.2025
Last seminar: 27.05.2025
Easter holidays: No seminar on 22.04.2025
Coordinators: Frédéric Berdoz, Benjamin Estermann, Roger Wattenhofer.

Background

This is a seminar, so we will focus on recent research and skip most of the basics. We assume that all participants are familiar with the fundamentals of deep neural networks. If you feel like you cannot follow the discussions, please check out this playlist, this lecture, the book Deep Learning with Python by François Chollet, or any other lectures or books on deep neural networks. As a seminar participant, you are asked to attend all the talks and to give a presentation.

Seminar Timeline

Preparation Timeline

Your Presentation

Grade

The most important part of your grade is the quality of your presentation, both in content and in style. In addition, we grade how well you direct the discussion with the audience during and after your presentation, how actively you participate in the discussions throughout the semester, and, finally, the quality of your mentor-only test presentation.

Papers

You can find the list of available papers here. Send us an ordered list (by preference) of up to 5 papers. We assign papers on a first-come, first-served basis according to your preferences, while also taking the availability of the supervisors into account. To maximize the chance of getting a paper from your list, we recommend diversifying your choices sufficiently. If you have no preference, still send us an e-mail and we will assign a paper to you.
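To make the assignment procedure concrete, here is a minimal, purely illustrative sketch in Python (the function, names, and capacities are made up; this is not the script we actually use): papers are handed out in order of arrival, walking down each student's preference list and skipping papers that are already taken or whose supervisor has no capacity left.

    # Illustrative only: first-come, first-served assignment by preference order,
    # respecting a per-supervisor capacity. All names and numbers are made up.
    def assign_papers(requests, supervisor_of, supervisor_capacity):
        """requests: list of (student, preference list), in order of arrival."""
        assignment = {}                               # student -> paper
        taken = set()                                 # papers already assigned
        load = {s: 0 for s in supervisor_capacity}    # talks per supervisor
        for student, preferences in requests:
            for paper in preferences:
                mentor = supervisor_of[paper]
                if paper not in taken and load[mentor] < supervisor_capacity[mentor]:
                    assignment[student] = paper
                    taken.add(paper)
                    load[mentor] += 1
                    break
        return assignment

    # Example with made-up data:
    requests = [("Alice", ["Paper A", "Paper B"]), ("Bob", ["Paper A", "Paper C"])]
    supervisor_of = {"Paper A": "Mentor 1", "Paper B": "Mentor 1", "Paper C": "Mentor 2"}
    supervisor_capacity = {"Mentor 1": 1, "Mentor 2": 2}
    print(assign_papers(requests, supervisor_of, supervisor_capacity))
    # -> {'Alice': 'Paper A', 'Bob': 'Paper C'}

The example also shows why diversifying your list helps: Bob's first choice is already gone, but his list still contains a free paper.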

Schedule

Date | Presenter | Title | Mentor | Slides
February 18 | Frédéric Berdoz | Introduction to Scientific Presentations | - | [pdf]
February 25 | Qi Ma | InstructPix2Pix: Learning To Follow Image Editing Instructions | Florian Grötschla | TBA
February 25 | Harald Semmelrock | rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking | Benjamin Estermann | TBA
March 04 | Jakob Hütteneder | Guiding a Diffusion Model with a Bad Version of Itself | Till Aczel | TBA
March 04 | Yanik Künzi | Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% | Luca Lanzendörfer | TBA
March 11 | Alexandre Elsig | Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations | Andreas Plesner | TBA
March 11 | Adam Suma | DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | Samuel Dauncey | TBA
March 18 | Niccolò Avogaro | Vision Transformers Need Registers | Frédéric Berdoz | TBA
March 18 | Valentin Abadie | Towards Foundation Models for Knowledge Graph Reasoning | Florian Grötschla | TBA
March 25 | Nandor Köfarago | High-Fidelity Audio Compression with Improved RVQGAN | Luca Lanzendörfer | TBA
March 25 | Coralie Sage | Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet | Samuel Dauncey | TBA
April 01 | Anna Kosovskaia | You Only Cache Once: Decoder-Decoder Architectures for Language Models | Benjamin Estermann | TBA
April 01 | Florian Zogaj | Multimodal Neurons in Artificial Neural Networks | Andreas Plesner | TBA
April 08 | Diego Arapovic | Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning | Andreas Plesner | TBA
April 08 | Lukas Rüttgers | Convolutional Differentiable Logic Gate Networks | Till Aczel | TBA
April 15 | Giovanni De Muri | In-context Learning and Induction Heads | Samuel Dauncey | TBA
April 15 | Jonas Mirlach | Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion | Till Aczel | TBA
April 22 | - | Easter Break | - | -
April 29 | Hua Zun | Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision | Florian Grötschla | TBA
April 29 | Lyubov Dudchenko | Training Large Language Models to Reason in a Continuous Latent Space | Samuel Dauncey | TBA
May 06 | Marian Schneider | Byte Latent Transformer: Patches Scale Better Than Tokens | Frédéric Berdoz | TBA
May 06 | Nicola Farronato | Erdös Goes Neural - an Unsupervised Learning Framework for Combinatorial Optimization on Graphs | Saku Peltonen | TBA
May 13 | Pepijn Cobben | Components Beat Patches: Eigenvector Masking For Visual Representation Learning | Till Aczel | TBA
May 13 | Frederik Barba | On Evaluating Adversarial Robustness | Andreas Plesner | TBA
May 20 | Yiming Wang | Mixture-of-Depths: Dynamically allocating compute in transformer-based language models | Frédéric Berdoz | TBA
May 20 | Sebastian Brunner | It’s Not What Machines Can Learn, It’s What We Cannot Teach | Saku Peltonen | TBA
May 27 | - | TBA | - | -