Shibani Santurkar (@ShibaniSan)'s Twitter Profile
Shibani Santurkar

@ShibaniSan

@OpenAI

ID:2789364932

Website: https://shibanisanturkar.com/ · Joined: 04-09-2014 07:59:16

146 Tweets

2.9K Followers

184 Following

aditya sant(@adi7sant) 's Twitter Profile Photo

Dear Embassy team, I am an Indian citizen studying in San Diego. I misplaced my passport while travelling from US to Greece via Canada. I am in contact with the consulate in Vancouver but desperate for help.
IndiainToronto · EmbassyIndiaDC Ppt · Indian Diplomacy · Dr. S. Jaishankar (Modi Ka Parivar)

OpenAI(@OpenAI) 's Twitter Profile Photo

We're launching ten $100,000 grants for building prototypes of a democratic process for steering AI. Our goal is to fund experimentation with methods for gathering nuanced feedback from everyone on how AI should behave. Apply by June 24, 2023: openai.com/blog/democrati…

Michael Bernstein(@msbernst) 's Twitter Profile Photo

Very cool work by Tatsunori Hashimoto and colleagues: ask LLMs questions from Pew surveys to measure whose opinions the models' outputs most closely reflect.

Percy Liang(@percyliang) 's Twitter Profile Photo

I would not say that LMs *have* opinions, but they certainly *reflect* opinions represented in their training data. OpinionsQA is an LM benchmark with no right or wrong answers. It's rather the *distribution* of answers (and divergence from humans) that's interesting to study.

Tatsunori Hashimoto(@tatsu_hashimoto) 's Twitter Profile Photo

We know that language models (LMs) reflect opinions - from internet pre-training, to developers and crowdworkers, and even user feedback. But whose opinions actually appear in the outputs? We make LMs answer public opinion polls to find out: arxiv.org/abs/2303.17548
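
For intuition, here is a minimal sketch of the comparison this setup enables: treat the LM's probabilities over a question's answer options as a distribution and measure how far it sits from the human respondent distribution. The metric below (1-Wasserstein over the ordered options, rescaled to an alignment score) and the toy numbers are illustrative assumptions, not necessarily the paper's exact estimator or data.

```python
# Sketch: compare an LM's answer distribution on a multiple-choice survey
# question against a human reference distribution. Both distributions and the
# choice of metric are illustrative assumptions, not the paper's data.
import numpy as np

def wasserstein_1d(p, q):
    """1-Wasserstein distance between two distributions over ordered options."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.abs(np.cumsum(p - q)).sum()

# Hypothetical question with 4 ordered options (e.g. "not at all" ... "very").
lm_dist    = np.array([0.10, 0.25, 0.40, 0.25])  # e.g. from LM probs over option letters
human_dist = np.array([0.30, 0.35, 0.20, 0.15])  # e.g. from survey respondent frequencies

max_dist = len(lm_dist) - 1  # worst case: all mass at opposite ends of the scale
alignment = 1 - wasserstein_1d(lm_dist, human_dist) / max_dist
print(f"alignment score: {alignment:.3f}")
```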

Aleksander Madry(@aleks_madry) 's Twitter Profile Photo

As ML models/datasets get bigger + more opaque, we need a *scalable* way to ask: where in the *data* did a prediction come from?

Presenting TRAK: data attribution with (significantly) better speed/efficacy tradeoffs:

w/ Sam Park Kristian Georgiev Andrew Ilyas Guillaume Leclerc 1/6
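
As a rough illustration of the gradient-projection idea behind scalable data attribution: compute per-example gradients, randomly project them to a small dimension, and score training examples by how well their projected gradients align with a test example's gradient. This is a simplified sketch, not TRAK's actual estimator (which includes further correction terms and averaging over model checkpoints); the tiny linear model and synthetic data are assumptions.

```python
# Simplified gradient-projection attribution on a toy linear model.
# Not the TRAK estimator itself; just the core "project gradients, then
# compare" idea, with made-up data.
import torch

torch.manual_seed(0)
n_train, d, k = 100, 20, 8          # training size, parameter dim, projection dim
X = torch.randn(n_train, d)
y = torch.randn(n_train)
w = torch.randn(d, requires_grad=True)   # stand-in "trained" parameters

def per_example_grad(x, target):
    """Gradient of one example's squared loss w.r.t. the parameters."""
    loss = (x @ w - target) ** 2
    return torch.autograd.grad(loss, w)[0]

P = torch.randn(d, k) / k ** 0.5    # random (Johnson-Lindenstrauss-style) projection

train_feats = torch.stack([per_example_grad(X[i], y[i]) @ P for i in range(n_train)])
x_test, y_test = torch.randn(d), torch.tensor(0.5)
test_feat = per_example_grad(x_test, y_test) @ P

# Attribution scores: which training examples' projected gradients align most
# with the test example's?
scores = train_feats @ test_feat
print("top-5 training indices under this simplified score:", scores.topk(5).indices.tolist())
```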

Shibani Santurkar(@ShibaniSan) 's Twitter Profile Photo

Auto data selection is comparable to expert-curated data for pretraining LMs!

The leverage: n-gram overlap between pretraining and downstream data predicts downstream accuracy well (r=0.89). But it's not the whole story - there's lots to uncover about how pretraining data affects downstream tasks.
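
As a toy illustration of the overlap statistic alluded to above, one simple version is the fraction of the downstream dataset's n-grams that also occur in the pretraining text. The exact statistic in the work may be defined differently; the two one-line corpora here are made up.

```python
# Toy n-gram overlap: fraction of downstream n-grams also present in the
# pretraining text. Corpora and n are illustrative assumptions.
from collections import Counter

def ngrams(text, n=2):
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def overlap(pretrain_text, downstream_text, n=2):
    pre, down = ngrams(pretrain_text, n), ngrams(downstream_text, n)
    if not down:
        return 0.0
    covered = sum(c for g, c in down.items() if g in pre)
    return covered / sum(down.values())

pretrain = "the model was trained on a large web corpus of english text"
downstream = "the model answers questions about english text"
print(f"bigram overlap: {overlap(pretrain, downstream):.2f}")
```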

Percy Liang(@percyliang) 's Twitter Profile Photo

I have 6 fantastic students and post-docs who are on the academic job market this year. Here is a short thread summarizing their work along with one representative paper:

rishi(@RishiBommasani) 's Twitter Profile Photo

In August 2021, we launched CRFM with our report on foundation models.
15 months to the day, we have now launched HELM for the holistic evaluation of language models.

Blog: crfm.stanford.edu/2022/11/17/hel…
Website: crfm.stanford.edu/helm/v1.0/
Paper: arxiv.org/pdf/2211.09110…

1/n 🧵

Percy Liang(@percyliang) 's Twitter Profile Photo

Language models are becoming the foundation of language technologies, but when do they work and when do they fail? In a new CRFM paper, we propose Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of LMs. Holistic evaluation includes three elements:

Dimitris Tsipras(@tsiprasd) 's Twitter Profile Photo

LLMs can do in-context learning, but are they 'learning' new tasks or just retrieving ones seen during training? w/ Shivam Garg, Percy Liang, & Greg Valiant we study a simpler Q:

Can we train Transformers to learn simple function classes in-context? 🧵
arxiv.org/abs/2208.01066
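
A minimal sketch of the prompt format this studies: sample a random linear function, present k (x, f(x)) pairs in-context, and ask for the value at a new query point. The transformer and its training loop are omitted; the least-squares baseline and the dimensions below are illustrative assumptions.

```python
# Sketch of the in-context function-learning setup: a prompt of (x, f(x)) pairs
# for a freshly sampled linear function, plus a query point. The transformer
# itself is omitted; a least-squares fit on the in-context pairs is shown as an
# illustrative baseline learner.
import numpy as np

rng = np.random.default_rng(0)
d, k = 5, 20                      # input dimension, number of in-context examples

w = rng.normal(size=d)            # the task: f(x) = w^T x, resampled per prompt
xs = rng.normal(size=(k, d))
ys = xs @ w
x_query = rng.normal(size=d)

# What the model would see: x_1, f(x_1), ..., x_k, f(x_k), x_query -> predict f(x_query).
prompt = [(xs[i], ys[i]) for i in range(k)] + [(x_query, None)]

# Baseline learner for comparison: ordinary least squares on the in-context pairs.
w_hat, *_ = np.linalg.lstsq(xs, ys, rcond=None)
print("least-squares prediction at query:", float(x_query @ w_hat))
print("true value at query:             ", float(x_query @ w))
```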
