Gregor Bachmann (@GregorBachmann1) Twitter Tweets • TwiCopy

Gregor Bachmann

@GregorBachmann1

I am a PhD student @ETH Zürich working on deep learning. MLP-pilled 💊.

https://t.co/yWdDEV6Z15

ID:1527256391806746624

Joined: 19-05-2022 11:54:49

84 Tweets

230 Followers

272 Following

Ayça Takmaz

1 month ago

📢Call for papers

If you work on open-vocabulary 3D scene understanding, consider submitting your work to our #CVPR2024 workshop OpenSUN3D!

⌛️Deadline: April 1st, 2024🏃
Only 1 week left to submit your 8-page full papers or 4-page abstracts!

More info: opensun3d.github.io

Gregor Bachmann

@GregorBachmann1

2 months ago

From stochastic parrot 🦜 to Clever Hans 🐴? In our work with Vaishnavh Nagarajan we carefully analyse the debate surrounding next-token prediction and identify a new failure of LLMs due to teacher-forcing 👨🏻‍🎓! Check out our work arxiv.org/abs/2403.06963 and the linked thread!

Vaishnavh Nagarajan

2 months ago

🗣️ “Next-token predictors can’t plan!” ⚔️ “False! Every distribution is expressible as a product of next-token probabilities!” 🗣️

In work w/ Gregor Bachmann, we carefully flesh out this emerging, fragmented debate & articulate a key new failure. 🔴 arxiv.org/abs/2403.06963

Lorenzo Noci

2 months ago

Why can the learning rate in neural networks transfer from small to large models (in both width and depth)? It turns out that the sharpness dynamics can explain it. Check out our new work! arxiv.org/abs/2402.17457

w/ Alex Meterez (co-first), Antonio Orvieto and T. Hofmann

Dimitri von Rütte

2 months ago

🚨📜 Announcing our latest work on LLM interpretability: We are able to control a model's humor, creativity, quality, truthfulness, and compliance by applying concept vectors to its hidden neural activations. 🧵
arxiv.org/abs/2402.14433

Gregor Bachmann

@GregorBachmann1

3 months ago

Meanwhile I'm trying to finetune GPT2 to solve chess puzzles... There's definitely room for improvement 😅

Ayça Takmaz

3 months ago

Our workshop ☀️OpenSUN 3D🌍 on Open-Vocabulary 3D Scene Understanding will be held in conjunction with #CVPR2024 in Seattle!

Call for papers is out!

Dimitri von Rütte

3 months ago

🚨 Calling on all FABRIC users! We need your help to learn about how you’ve been using FABRIC. Help us by taking 5 minutes to fill out the survey.

Haven’t tried FABRIC yet? Just try it using our Gradio demo! ✨👨‍🎨

📊 Survey: forms.gle/aMWLDW8xvyhkLb…
👾 Demo:

Enis Simsar

5 months ago

🌟 Excited to present LIME, localized image editing via cross-attention regularization without extra data, re-training, or fine-tuning!

Collaboration with Alessio Tonioni, Yongqin Xian, Thomas Hofmann, Federico Tombari

📄 Paper: arxiv.org/pdf/2312.09256
🔗 Project: enisimsar.github.io/LIME

Gregor Bachmann

@GregorBachmann1

5 months ago

I’ll be presenting 'Scaling MLPs' at #NeurIPS2023, tomorrow (Wed) at 10:45am!
Hyped to discuss things like inductive bias, the bitter lesson, compute-optimality and scaling laws 👷⚖️📈

Ayça Takmaz

5 months ago

Today Elisabetta Fedele and I will present our work OpenMask3D at NeurIPS Conference 🎷

Visit our poster to learn more about OpenMask3D or to chat with us!

📍 Great Hall & Hall B1+B2 (level 1) #906
🕰️ 10:45-12:45
🌎 openmask3d.github.io

Francis Engelmann, Federico Tombari, Marc Pollefeys

Yuhui Ding

5 months ago

Inspired by recent breakthroughs in SSMs, we propose a new architecture, Graph Recurrent Encoding by Distance (GRED), for long-range graph representation learning: arxiv.org/abs/2312.01538
with Antonio Orvieto, Bobby and Thomas Hofmann (1/4)

Gregor Bachmann

@GregorBachmann1

6 months ago

Want to train a compute-optimal model but get there faster?
Try shape-adaptive training and follow the optimal curve for different 'shape' configurations 🏎️💨!
Check out Sotiris Anagnostidis and my work for more!
📝arxiv.org/abs/2311.03233

Sotiris Anagnostidis

6 months ago

Scaling laws predict the minimum required amount of compute to reach a given performance, but can we do better? Yes, if we allow for a flexible 'shape' of the model! 🤸

Vaishnavh Nagarajan

7 months ago

Isn’t it arbitrary that a Transformer must produce the K+1'th token by attending to only K vectors in each layer?

In work led by Sachin Goyal, we explore a way to break this rule: by appending copies of a *single* “pause” token to delay the output.

arxiv.org/abs/2310.02226 1/

Ayça Takmaz

7 months ago

Join us for our OpenMask3D demo at the Google AI booth today!

Vaishaal Shankar

7 months ago

I had an argument with Preetum Nakkiran about MLPs 4 years ago. He said with enough data + compute the MLP/ConvNet gap would go to 0. I was convolution-pilled and convinced this wasn't possible. He was right: arxiv.org/abs/2306.13575

Ayça Takmaz

7 months ago

Our ICCV Workshop ☀️OpenSUN3D🌍 on Open-Vocabulary 3D Scene Understanding will take place tomorrow afternoon at #ICCV2023!

Date: October 3rd, Tuesday
Time: 13:20-17:30
Location: E06

More info: OpenSUN3D.github.io

Ayça Takmaz

7 months ago

We will be at #ICCV2023 to present Human3D 🧑‍🤝‍🧑!

📌Poster: Wednesday, October 4th - 10:30-12:30, Paper ID 4949 - Room 'Nord' - 103

Project page: human-3d.github.io
Code & data: github.com/human-3d

Jonas Schult, Irem Kaftan, Mertcan Akçay, Francis Engelmann, Siyu Tang, @VLG-ETHZ
