Jonas Geiping (@jonasgeiping)'s Twitter Profile
Jonas Geiping

@jonasgeiping

Machine Learning Research at the ELLIS Institute & MPI-IS // Investigating fundamental questions in Safety, Security, Privacy & Efficiency of modern ML

ID:1443639893083758598

https://jonasgeiping.github.io/ | Joined 30-09-2021 18:12:52

328 Tweets

1.6K Followers

612 Following

ELLIS Institute Tübingen (@ELLISInst_Tue):

🎙 The second episode of the Cyber Valley Podcast with our Principal Investigator Jonas Geiping is now available 🚀 Tune in to learn about the Safety and Efficiency of AI.
👉 Check it out: institute-tue.ellis.eu/en/news/cyber-…

ELLIS Institute Tübingen (@ELLISInst_Tue):

🎙 The first episode of the @Cyber_Valley Podcast with our Principal Investigators is now out! 🚀 @Orvieto_Antonio #AIPodcast #AIResearch #AI
🔗 Learn more: institute-tue.ellis.eu/en/news/cyber-…
Cyber Valley (@Cyber_Valley):

🚀 Get ready to dive deep into the captivating world of artificial intelligence with us!
The Cyber Valley Podcast is coming soon...
🎙️ Don’t miss our unforgettable episodes, created in collaboration with the ELLIS Institute Tübingen and Antonio Orvieto

Yuxin Wen (@ywen99):

Very interesting paper!

Sharing a similar threat model but with a different focus, our recent paper (arxiv.org/abs/2404.01231), also titled Privacy Backdoors 🤗, achieves strong membership inference performance through poisoning pre-trained models.
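Context for readers new to the threat model: membership inference asks whether a specific example was part of a model's training set. Below is a minimal sketch of the classic loss-thresholding baseline; the model, example, and threshold are hypothetical stand-ins, and the poisoning-based attack in the paper is far stronger than this baseline.

```python
import torch
import torch.nn.functional as F

def loss_based_membership_score(model, x, y):
    """Per-example cross-entropy loss; unusually low loss hints that (x, y) was a training member.

    This is only the classic loss-thresholding baseline, not the poisoning
    attack from the paper; it just illustrates what "membership inference" means.
    """
    model.eval()
    with torch.no_grad():
        logits = model(x.unsqueeze(0))            # (1, num_classes)
        loss = F.cross_entropy(logits, y.unsqueeze(0))
    return loss.item()

# Hypothetical usage: declare (x, y) a member if its loss falls below a
# threshold calibrated on examples known NOT to be in the training set.
# is_member = loss_based_membership_score(model, x, y) < threshold
```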

Jonas Geiping (@jonasgeiping):

How can we define, and how can we compare, style information in generated images?
For example, when trying to figure out whether a generated image accidentally copies an existing art style?

Gowthami Somepalli led our recent investigation into new models and data for this purpose, summarized here:

AK (@_akhaliq):

Measuring Style Similarity in Diffusion Models

Generative models are now widely used by graphic designers and artists. Prior works have shown that these models remember and often replicate content from their training data during generation. Hence, as their proliferation …

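As a rough illustration of what "style similarity" can mean operationally, one common recipe is to embed two images with a pretrained vision encoder and compare the embeddings with cosine similarity. The sketch below uses open_clip purely as a stand-in encoder; the paper trains a dedicated style descriptor, so treat this as an assumption-laden approximation rather than the paper's method.

```python
import torch
import open_clip
from PIL import Image

# Stand-in encoder; the paper trains a dedicated style descriptor instead.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
model.eval()

def style_similarity(path_a: str, path_b: str) -> float:
    """Cosine similarity between image embeddings as a crude style proxy."""
    imgs = torch.stack([preprocess(Image.open(p).convert("RGB"))
                        for p in (path_a, path_b)])
    with torch.no_grad():
        feats = model.encode_image(imgs)
        feats = feats / feats.norm(dim=-1, keepdim=True)
    return (feats[0] @ feats[1]).item()

# Hypothetical usage:
# print(style_similarity("generated.png", "reference_artwork.png"))
```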
Jonas Geiping (@jonasgeiping):

Had a very interesting chat with Sam Charrington for The TWIML AI Podcast recently, broadly about adversarial attacks on LLMs.

I'm long overdue to post a thread about this research and all the ways of coercing LLMs to do and reveal (almost) anything; I'll get to it tomorrow!

Micah Goldblum (@micahgoldblum):

We show how to make data poisoning and backdoor attacks way more potent by synthesizing them from scratch with guided diffusion. 🧵 1/8

Paper: arxiv.org/abs/2403.16365

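For intuition on the mechanism, the sketch below shows one toy classifier-guided denoising step; the paper steers the diffusion sampler with a poisoning objective rather than a plain class label, and every name here is a placeholder.

```python
import torch

def guided_denoise_step(denoiser, classifier, x_t, t, target_class, guidance_scale=1.0):
    """One toy classifier-guided denoising step (Dhariwal & Nichol-style guidance).

    denoiser(x_t, t) -> predicted noise; classifier(x_t, t) -> class logits.
    The paper steers sampling with a poisoning objective instead of a plain
    class label; everything here is a placeholder to show the mechanism.
    """
    # Gradient of log p(target_class | x_t) with respect to the noisy image.
    x_t = x_t.detach().requires_grad_(True)
    log_prob = torch.log_softmax(classifier(x_t, t), dim=-1)[:, target_class].sum()
    grad = torch.autograd.grad(log_prob, x_t)[0]

    # Shift the noise prediction along the guidance direction.
    # (The usual sqrt(1 - alpha_bar_t) factor is folded into guidance_scale here.)
    with torch.no_grad():
        eps = denoiser(x_t, t)
    return eps - guidance_scale * grad   # feed into the standard DDPM/DDIM update
```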
Hamid (@hamid_kazemi22):

Excited to share our latest paper on CLIP model inversion, uncovering surprising NSFW image occurrences and more! Heartfelt thanks to all my amazing collaborators! ♥️ Atoosa Chegini, Jonas Geiping, Soheil Feizi, Tom Goldstein
Paper: huggingface.co/papers/2403.02…
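For context, "model inversion" here means optimizing an image from scratch so that CLIP scores it highly for a given text prompt, then inspecting what shows up. A bare-bones sketch under those assumptions follows; the paper's actual procedure adds regularizers, augmentations, and proper input normalization that this omits.

```python
import torch
import open_clip

model, _, _ = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

def invert_clip(prompt: str, steps: int = 500, lr: float = 0.05):
    """Optimize raw pixels to maximize CLIP image-text similarity (toy version)."""
    with torch.no_grad():
        text_feat = model.encode_text(tokenizer([prompt]))
        text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

    # A faithful setup would also normalize with CLIP's mean/std and add augmentations.
    image = torch.rand(1, 3, 224, 224, requires_grad=True)
    opt = torch.optim.Adam([image], lr=lr)
    for _ in range(steps):
        img_feat = model.encode_image(image.clamp(0, 1))
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
        loss = -(img_feat * text_feat).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return image.detach().clamp(0, 1)
```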

AK (@_akhaliq):

Coercing LLMs to do and reveal (almost) anything

It has recently been shown that adversarial attacks on large language models (LLMs) can 'jailbreak' the model into making harmful statements. In this work, we argue that the spectrum of adversarial attacks on LLMs is much larger …

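The common backbone of these attacks is discrete optimization of an adversarial suffix: choose suffix tokens that minimize the loss of some target continuation. The sketch below is a stripped-down, single-swap-per-step variant in the spirit of GCG-style optimizers; all names and hyperparameters are placeholders, and the paper explores many objectives beyond harmful statements.

```python
import torch
import torch.nn.functional as F

def optimize_suffix(model, prompt_ids, suffix_ids, target_ids, steps=100, topk=64):
    """Toy single-swap-per-step adversarial suffix optimization.

    prompt_ids / suffix_ids / target_ids are 1-D LongTensors; `model` is
    assumed to be a Hugging Face causal LM. All settings are placeholders.
    """
    embed = model.get_input_embeddings()
    tgt_start = prompt_ids.numel() + suffix_ids.numel()

    def target_loss(full_ids):
        logits = model(full_ids.unsqueeze(0)).logits[0]
        return F.cross_entropy(logits[tgt_start - 1:-1], target_ids)

    for _ in range(steps):
        # Gradient of the target loss w.r.t. one-hot suffix tokens.
        one_hot = F.one_hot(suffix_ids, embed.num_embeddings).float().requires_grad_(True)
        inputs = torch.cat([embed(prompt_ids), one_hot @ embed.weight, embed(target_ids)])
        logits = model(inputs_embeds=inputs.unsqueeze(0)).logits[0]
        F.cross_entropy(logits[tgt_start - 1:-1], target_ids).backward()

        # Try the top-k most promising substitutions at one random position; keep the best.
        pos = torch.randint(suffix_ids.numel(), (1,)).item()
        with torch.no_grad():
            best_loss = target_loss(torch.cat([prompt_ids, suffix_ids, target_ids])).item()
        best_tok = suffix_ids[pos].item()
        for tok in (-one_hot.grad[pos]).topk(topk).indices.tolist():
            trial = suffix_ids.clone()
            trial[pos] = tok
            with torch.no_grad():
                loss = target_loss(torch.cat([prompt_ids, trial, target_ids])).item()
            if loss < best_loss:
                best_loss, best_tok = loss, tok
        suffix_ids[pos] = best_tok
    return suffix_ids
```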
John Kirchenbauer (@jwkirchenbauer):

Happy to share that our work studying the reliability of watermarking techniques for AI-generated text detection has been accepted! See y'all in Vienna 🇦🇹
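Background for anyone new to this line of work: the watermark pseudo-randomly splits the vocabulary into a "green" and a "red" list based on the preceding token and softly boosts green tokens during generation; detection is then a one-proportion z-test on how many green tokens appear. Below is a minimal detector sketch with an illustrative hash and parameters, not the exact implementation.

```python
import math
import torch

def greenlist_mask(prev_token: int, vocab_size: int, gamma: float = 0.25, key: int = 15485863):
    """Pseudo-randomly mark a gamma-fraction of the vocabulary 'green', seeded by the previous token."""
    gen = torch.Generator().manual_seed(key * prev_token)
    perm = torch.randperm(vocab_size, generator=gen)
    green = torch.zeros(vocab_size, dtype=torch.bool)
    green[perm[: int(gamma * vocab_size)]] = True
    return green

def watermark_zscore(token_ids: list[int], vocab_size: int, gamma: float = 0.25) -> float:
    """One-proportion z-test: how many tokens fall in their green list vs. chance."""
    hits = sum(
        greenlist_mask(prev, vocab_size, gamma)[tok].item()
        for prev, tok in zip(token_ids, token_ids[1:])
    )
    n = len(token_ids) - 1
    return (hits - gamma * n) / math.sqrt(n * gamma * (1 - gamma))

# Hypothetical usage: a z-score above roughly 4 is strong evidence the text was watermarked.
```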

Neel Jain (@neeljain1717):

Want to invert your image to text?!?!?

Come check out our PEZ optimizer that makes optimizing hard prompts easy, today at 5pm, #606. Also, we show how you can use PEZ to break content filters like those in Midjourney. #NeurIPS #NeurIPS23

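PEZ in a nutshell: keep a continuous soft prompt, project it onto the nearest real token embeddings each step, evaluate the loss on that projected hard prompt, and route the resulting gradient back to the soft prompt. A stripped-down sketch of that loop, with the model-specific loss left as a placeholder and nearest-neighbor projection used for brevity (the paper's projection metric may differ):

```python
import torch

def pez_optimize(embedding_matrix, loss_fn, prompt_len=8, steps=500, lr=0.1):
    """Toy PEZ-style loop: optimize soft embeddings but always evaluate hard (projected) tokens.

    embedding_matrix: (vocab_size, dim) token embedding table.
    loss_fn: placeholder mapping a (prompt_len, dim) embedding sequence to a scalar loss.
    """
    vocab_size, dim = embedding_matrix.shape
    soft = embedding_matrix[torch.randint(vocab_size, (prompt_len,))].clone().requires_grad_(True)
    opt = torch.optim.Adam([soft], lr=lr)

    for _ in range(steps):
        # Project each soft embedding to its nearest real token embedding.
        token_ids = torch.cdist(soft, embedding_matrix).argmin(dim=-1)
        hard = embedding_matrix[token_ids]
        # Evaluate the loss on the hard prompt, but send the gradient to the
        # soft prompt via the straight-through trick.
        hard_st = soft + (hard - soft).detach()
        loss = loss_fn(hard_st)
        opt.zero_grad()
        loss.backward()
        opt.step()

    final_ids = torch.cdist(soft, embedding_matrix).argmin(dim=-1)
    return final_ids  # decode with the tokenizer to read off the hard prompt
```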
Yangsibo Huang (@YangsiboHuang):

I am at now.

I am also on the academic job market, and humbled to be selected as a 2023 EECS Rising Star✨. I work on ML security, privacy & data transparency.

Appreciate any reposts & happy to chat in person! CV+statements: tinyurl.com/yangsibo

Find me at ⬇️
