Shibani Santurkar (@ShibaniSan)'s Twitter Profile
Shibani Santurkar

@ShibaniSan

@OpenAI

ID:2789364932

Website: https://shibanisanturkar.com/ · Joined: 04-09-2014 07:59:16

146 Tweets

2.9K Followers

184 Following

aditya sant(@adi7sant) 's Twitter Profile Photo

Dear Embassy team, I am an Indian citizen studying in San Diego. I misplaced my passport while travelling from US to Greece via Canada. I am in contact with the consulate in Vancouver but desperate for help.
IndiainToronto · EmbassyIndiaDC Ppt · Indian Diplomacy · Dr. S. Jaishankar (Modi Ka Parivar)

OpenAI(@OpenAI) 's Twitter Profile Photo

We're launching ten $100,000 grants for building prototypes of a democratic process for steering AI. Our goal is to fund experimentation with methods for gathering nuanced feedback from everyone on how AI should behave. Apply by June 24, 2023: openai.com/blog/democrati…

Michael Bernstein(@msbernst) 's Twitter Profile Photo

Very cool work by Tatsunori Hashimoto and colleagues: ask LLMs questions from Pew surveys to measure whose opinions the models' outputs most closely reflect.

Percy Liang(@percyliang) 's Twitter Profile Photo

I would not say that LMs *have* opinions, but they certainly *reflect* opinions represented in their training data. OpinionsQA is an LM benchmark with no right or wrong answers. It's rather the *distribution* of answers (and divergence from humans) that's interesting to study.

Tatsunori Hashimoto(@tatsu_hashimoto) 's Twitter Profile Photo

We know that language models (LMs) reflect opinions - from internet pre-training, to developers and crowdworkers, and even user feedback. But whose opinions actually appear in the outputs? We make LMs answer public opinion polls to find out: arxiv.org/abs/2303.17548
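
For intuition, here is a minimal sketch of the comparison this setup enables: treat the LM's probabilities over a question's answer options as a distribution and measure how far it sits from the human respondent distribution. The metric below (1-Wasserstein over the ordered options, rescaled to an alignment score) and the toy numbers are illustrative assumptions, not necessarily the paper's exact estimator or data.

```python
# Sketch: compare an LM's answer distribution on a multiple-choice survey
# question against a human reference distribution. Both distributions and the
# choice of metric are illustrative assumptions, not the paper's data.
import numpy as np

def wasserstein_1d(p, q):
    """1-Wasserstein distance between two distributions over ordered options."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.abs(np.cumsum(p - q)).sum()

# Hypothetical question with 4 ordered options (e.g. "not at all" ... "very").
lm_dist    = np.array([0.10, 0.25, 0.40, 0.25])  # e.g. from LM probs over option letters
human_dist = np.array([0.30, 0.35, 0.20, 0.15])  # e.g. from survey respondent frequencies

max_dist = len(lm_dist) - 1  # worst case: all mass at opposite ends of the scale
alignment = 1 - wasserstein_1d(lm_dist, human_dist) / max_dist
print(f"alignment score: {alignment:.3f}")
```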

Aleksander Madry(@aleks_madry) 's Twitter Profile Photo

As ML models/datasets get bigger + more opaque, we need a *scalable* way to ask: where in the *data* did a prediction come from?

Presenting TRAK: data attribution with (significantly) better speed/efficacy tradeoffs:

w/ Sam Park Kristian Georgiev Andrew Ilyas Guillaume Leclerc 1/6
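
As a rough illustration of the gradient-projection idea behind scalable data attribution: compute per-example gradients, randomly project them to a small dimension, and score training examples by how well their projected gradients align with a test example's gradient. This is a simplified sketch, not TRAK's actual estimator (which includes further correction terms and averaging over model checkpoints); the tiny linear model and synthetic data are assumptions.

```python
# Simplified gradient-projection attribution on a toy linear model.
# Not the TRAK estimator itself; just the core "project gradients, then
# compare" idea, with made-up data.
import torch

torch.manual_seed(0)
n_train, d, k = 100, 20, 8          # training size, parameter dim, projection dim
X = torch.randn(n_train, d)
y = torch.randn(n_train)
w = torch.randn(d, requires_grad=True)   # stand-in "trained" parameters

def per_example_grad(x, target):
    """Gradient of one example's squared loss w.r.t. the parameters."""
    loss = (x @ w - target) ** 2
    return torch.autograd.grad(loss, w)[0]

P = torch.randn(d, k) / k ** 0.5    # random (Johnson-Lindenstrauss-style) projection

train_feats = torch.stack([per_example_grad(X[i], y[i]) @ P for i in range(n_train)])
x_test, y_test = torch.randn(d), torch.tensor(0.5)
test_feat = per_example_grad(x_test, y_test) @ P

# Attribution scores: which training examples' projected gradients align most
# with the test example's?
scores = train_feats @ test_feat
print("top-5 training indices under this simplified score:", scores.topk(5).indices.tolist())
```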

Shibani Santurkar(@ShibaniSan) 's Twitter Profile Photo

Auto data selection is comparable to expert-curated data for pretraining LMs!

The leverage: n-gram overlap between pretraining and downstream data predicts downstream accuracy well (r=0.89). But it's not the whole story - there's lots to uncover about how pretraining data affects downstream tasks.
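
As a toy illustration of the overlap statistic alluded to above, one simple version is the fraction of the downstream dataset's n-grams that also occur in the pretraining text. The exact statistic in the work may be defined differently; the two one-line corpora here are made up.

```python
# Toy n-gram overlap: fraction of downstream n-grams also present in the
# pretraining text. Corpora and n are illustrative assumptions.
from collections import Counter

def ngrams(text, n=2):
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def overlap(pretrain_text, downstream_text, n=2):
    pre, down = ngrams(pretrain_text, n), ngrams(downstream_text, n)
    if not down:
        return 0.0
    covered = sum(c for g, c in down.items() if g in pre)
    return covered / sum(down.values())

pretrain = "the model was trained on a large web corpus of english text"
downstream = "the model answers questions about english text"
print(f"bigram overlap: {overlap(pretrain, downstream):.2f}")
```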

Percy Liang(@percyliang) 's Twitter Profile Photo

I have 6 fantastic students and post-docs who are on the academic job market this year. Here is a short thread summarizing their work along with one representative paper:

rishi(@RishiBommasani) 's Twitter Profile Photo

In August 2021, we launched CRFM with our report on foundation models.
15 months to the day, we have now launched HELM for the holistic evaluation of language models.

Blog: crfm.stanford.edu/2022/11/17/hel…
Website: crfm.stanford.edu/helm/v1.0/
Paper: arxiv.org/pdf/2211.09110…

1/n 🧵

Percy Liang(@percyliang) 's Twitter Profile Photo

Language models are becoming the foundation of language technologies, but when do they work and when do they fail? In a new CRFM paper, we propose Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of LMs. Holistic evaluation includes three elements:

Dimitris Tsipras(@tsiprasd) 's Twitter Profile Photo

LLMs can do in-context learning, but are they 'learning' new tasks or just retrieving ones seen during training? w/ Shivam Garg, Percy Liang, & Greg Valiant we study a simpler Q:

Can we train Transformers to learn simple function classes in-context? 🧵
arxiv.org/abs/2208.01066
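
A minimal sketch of the prompt format this studies: sample a random linear function, present k (x, f(x)) pairs in-context, and ask for the value at a new query point. The transformer and its training loop are omitted; the least-squares baseline and the dimensions below are illustrative assumptions.

```python
# Sketch of the in-context function-learning setup: a prompt of (x, f(x)) pairs
# for a freshly sampled linear function, plus a query point. The transformer
# itself is omitted; a least-squares fit on the in-context pairs is shown as an
# illustrative baseline learner.
import numpy as np

rng = np.random.default_rng(0)
d, k = 5, 20                      # input dimension, number of in-context examples

w = rng.normal(size=d)            # the task: f(x) = w^T x, resampled per prompt
xs = rng.normal(size=(k, d))
ys = xs @ w
x_query = rng.normal(size=d)

# What the model would see: x_1, f(x_1), ..., x_k, f(x_k), x_query -> predict f(x_query).
prompt = [(xs[i], ys[i]) for i in range(k)] + [(x_query, None)]

# Baseline learner for comparison: ordinary least squares on the in-context pairs.
w_hat, *_ = np.linalg.lstsq(xs, ys, rcond=None)
print("least-squares prediction at query:", float(x_query @ w_hat))
print("true value at query:             ", float(x_query @ w))
```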
