Distributed Computing
ETH Zurich

Seminar in Deep Neural Networks (FS 2025)

Organization

When & Where: Tuesdays 10:15 - 12:00, ETZ E 9.
First seminar: 18.02.2025
Last seminar: 27.05.2025
Easter holidays: No seminar on 22.04.2025
Coordinators: Frédéric Berdoz, Benjamin Estermann, Roger Wattenhofer.

Background

This is a seminar; we will focus on recent research and skip most of the basics. We assume that all participants are familiar with the fundamentals of deep neural networks. If you feel like you cannot follow the discussions, please check out this playlist, this lecture, the book Deep Learning with Python by Francois Chollet, or any other lectures or books on deep neural networks. As a seminar participant, you are asked to attend all the talks and to give a presentation.

Seminar Timeline

Preparation Timeline

Your Presentation

Grade

The most important part of your grade will be the quality of your presentation, both in content and style. In addition, we grade how well you lead the discussion with the audience, during and after the presentation. Besides your final presentation, we also grade how actively you participate in the discussions throughout the semester, and we value the quality of your mentor-only test presentation. Attendance is mandatory.

Papers

You can find the list of available papers here. Send us an ordered list (by preference) of up to 5 papers. We try to assign the papers first-come, first-served according to your preferences, while also taking supervisor availability into account. To maximize the chance that you get a paper from your list, we recommend that you diversify your choices sufficiently. If you do not have any preference, still send us an e-mail and we will assign a paper to you.

Schedule

Date | Presenter | Title | Mentor | Slides
February 18 | Frédéric Berdoz | Introduction to Scientific Presentations | - | [pdf]
February 25 | Qi Ma | InstructPix2Pix: Learning To Follow Image Editing Instructions | Florian Grötschla | [pdf]
February 25 | Harald Semmelrock | rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking | Benjamin Estermann | [pdf]
March 04 | Jakob Hütteneder | Guiding a Diffusion Model with a Bad Version of Itself | Till Aczel | [pdf]
March 04 | Yanik Künzi | Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% | Luca Lanzendörfer | [pdf]
March 11 | Alexandre Elsig | Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations | Andreas Plesner | [pdf]
March 11 | Adam Suma | DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | Samuel Dauncey | [pdf]
March 18 | Niccolò Avogaro | Vision Transformers Need Registers | Frédéric Berdoz | [pdf]
March 18 | Valentin Abadie | Towards Foundation Models for Knowledge Graph Reasoning | Florian Grötschla | [pdf]
March 25 | Nandor Köfarago | High-Fidelity Audio Compression with Improved RVQGAN | Luca Lanzendörfer | [pdf]
March 25 | Coralie Sage | Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet | Samuel Dauncey | [pdf]
April 01 | Marian Schneider | Byte Latent Transformer: Patches Scale Better Than Tokens | Frédéric Berdoz | TBA
April 01 | Florian Zogaj | Multimodal Neurons in Artificial Neural Networks | Andreas Plesner | TBA
April 08 | Lukas Rüttgers | Convolutional Differentiable Logic Gate Networks | Till Aczel | TBA
April 15 | Giovanni De Muri | In-context Learning and Induction Heads | Samuel Dauncey | TBA
April 15 | Jonas Mirlach | Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion | Till Aczel | TBA
April 22 | - | Easter Break | - | -
April 29 | Hua Zun | Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision | Florian Grötschla | TBA
April 29 | Ménéilik Nouvellon | Training Large Language Models to Reason in a Continuous Latent Space | Samuel Dauncey | TBA
May 06 | Diego Arapovic | Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning | Andreas Plesner | TBA
May 06 | Nicola Farronato | Erdös Goes Neural: An Unsupervised Learning Framework for Combinatorial Optimization on Graphs | Saku Peltonen | TBA
May 13 | Frederik Barba | On Evaluating Adversarial Robustness | Andreas Plesner | TBA
May 20 | Yiming Wang | nGPT: Normalized Transformer with Representation Learning on the Hypersphere | Andreas Plesner | TBA
May 20 | Sebastian Brunner | It's Not What Machines Can Learn, It's What We Cannot Teach | Saku Peltonen | TBA
May 27 | - | TBA | - | -