Oreva Ahia (@orevaahia)'s Twitter Profile
Oreva Ahia

@orevaahia

PhD student @uwcse | ex: AI/ML Research Intern @apple | Co-organizer @AISaturdayLagos | Researcher @MasakhaneNLP

ID: 836314434

Link: https://orevaahia.github.io/ · Joined: 20-09-2012 20:39:09

1.7K Tweets

1.4K Followers

971 Following

Sebastian Ruder (@seb_ruder)'s Twitter Profile Photo

Ahia et al. (2023; aclanthology.org/2023.emnlp-mai…) observed that the same is true for current LLMs such as ChatGPT: They segment text in non-English languages into many more tokens and are thus much more costly to use in such languages.

They call this “double unfairness”: higher prices…
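The disparity is easy to check yourself. A minimal sketch, assuming the tiktoken package and its cl100k_base encoding (the tokenizer behind ChatGPT-era OpenAI models); since API pricing is per token, longer segmentations translate directly into higher cost:

```python
# Minimal sketch of the tokenization disparity (assumes: pip install tiktoken).
# cl100k_base is the encoding used by ChatGPT-era OpenAI models.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# Roughly parallel sentences; exact counts depend on the encoding version.
samples = {
    "English": "How are you doing today?",
    "Yoruba": "Báwo ni o ṣe wà lónìí?",
    "Telugu": "మీరు ఈ రోజు ఎలా ఉన్నారు?",
}
for lang, text in samples.items():
    print(f"{lang}: {len(enc.encode(text))} tokens")
```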

Alex Hägele (@haeggee)'s Twitter Profile Photo

If you haven't seen it yet: Mixture-of-Depths is a really nice idea for dynamic compute.
I decided to quickly code up a MoD block in a small GPT and try it out -- if you want to play with it too (and check correctness pls!), the code is here:
github.com/epfml/llm-base…
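For anyone who wants the core idea before opening the repo: a MoD layer is essentially learned top-k token routing around an expensive block. Below is an illustrative PyTorch sketch, not the repo's code; the `MoDBlock` wrapper, the 12.5% capacity default, and the sigmoid gating are assumptions.

```python
import torch
import torch.nn as nn

class MoDBlock(nn.Module):
    """Mixture-of-Depths sketch: a learned router picks the top-k tokens per
    sequence to send through an expensive sub-block; all other tokens skip it
    via the residual stream."""
    def __init__(self, block: nn.Module, d_model: int, capacity: float = 0.125):
        super().__init__()
        self.block = block        # assumed to compute the residual branch f(x)
        self.router = nn.Linear(d_model, 1)
        self.capacity = capacity  # fraction of tokens processed per layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, T, D)
        B, T, D = x.shape
        k = max(1, int(self.capacity * T))
        scores = self.router(x).squeeze(-1)              # (B, T)
        top = scores.topk(k, dim=-1).indices             # (B, k) routed tokens
        idx = top.unsqueeze(-1).expand(-1, -1, D)        # (B, k, D)
        picked = x.gather(1, idx)
        gate = torch.sigmoid(scores.gather(1, top)).unsqueeze(-1)
        # Residual update only for routed tokens, scaled by the router score
        # so the router receives a gradient through the output.
        return x.scatter(1, idx, picked + gate * self.block(picked))
```

Here `block` is assumed to compute only the residual branch (attention + MLP) of a transformer layer; gating its output by the router score is what lets the router train with the rest of the network.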

Aidan Gomez (@aidangomez)'s Twitter Profile Photo

⌘-R

Introducing Command-R, a model focused on scalability, RAG, and Tool Use. We've also released the weights for research use; we hope they're useful to the community!

txt.cohere.com/command-r/

Weijia Shi (@WeijiaShi2)'s Twitter Profile Photo

Happy to share REPLUG🔌 is accepted to #NAACL2024!

We introduce a retrieval-augmented LM framework that combines a frozen LM with a frozen/tunable retriever, improving GPT-3 in language modeling & downstream tasks by prepending retrieved docs to LM inputs.
📄:…
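The mechanics are simple enough to sketch. Assuming Hugging Face-style `lm`/`tokenizer` objects and a hypothetical `retriever.search` API (neither is from the paper's code), REPLUG-style inference looks roughly like:

```python
import torch
import torch.nn.functional as F

def replug_next_token_probs(lm, tokenizer, retriever, query, k=4):
    """Sketch of REPLUG-style inference: prepend each of the top-k retrieved
    docs to the input, run the frozen LM once per doc, and ensemble the
    next-token distributions weighted by retrieval similarity.
    `retriever.search` is a hypothetical API returning (docs, scores)."""
    docs, scores = retriever.search(query, k)
    weights = F.softmax(torch.tensor(scores, dtype=torch.float), dim=0)
    mixture = None
    for doc, w in zip(docs, weights):
        ids = tokenizer(doc + "\n\n" + query, return_tensors="pt").input_ids
        with torch.no_grad():
            probs = F.softmax(lm(ids).logits[0, -1], dim=-1)
        mixture = w * probs if mixture is None else mixture + w * probs
    return mixture
```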

Inna Lin (@iwylin)'s Twitter Profile Photo

Ever find yourself delaying a conversation because you're nervous about how it might go?😩
We developed IMBUE, an #LLM-backed tool, to help you improve #communication skills and manage #emotions, through simulation and just-in-time feedback.

Paper🔗: arxiv.org/pdf/2402.12556…

Roy Xie (@ruoyuxyz)'s Twitter Profile Photo

📢 Excited to share that our paper has been accepted to the main conference! 🌟 Huge shoutout to amazing co-authors Oreva Ahia, tsvetshop, and Antonis Anastasopoulos!

⛱️ See you in Mexico City! 🇲🇽

Roy Xie (@ruoyuxyz)'s Twitter Profile Photo

✨ Can we use interpretability methods to extract linguistic features that characterize dialects❓

🎉 New preprint: arxiv.org/abs/2402.17914 (@ruoyuxyz, @orevaahia, @tsvetshop, @anas_ant)

👉Code & Data: github.com/ruoyuxie/inter…

🧵(1/6)
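As a rough illustration of the recipe (a stand-in, not the paper's exact pipeline): train an inherently interpretable dialect classifier, then read distinguishing features off its weights. A scikit-learn sketch, assuming three or more dialect labels:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

def top_dialect_features(texts, labels, n_top=10):
    """Illustrative stand-in for the paper's approach: fit an interpretable
    n-gram dialect classifier, then take each dialect's highest-weighted
    features as candidate distinguishing markers. Assumes >= 3 dialect
    labels so clf.coef_ has one weight row per class."""
    vec = CountVectorizer(ngram_range=(1, 2), min_df=2)
    X = vec.fit_transform(texts)
    clf = LogisticRegression(max_iter=1000).fit(X, labels)
    vocab = vec.get_feature_names_out()
    return {
        cls: [vocab[i] for i in w.argsort()[::-1][:n_top]]
        for cls, w in zip(clf.classes_, clf.coef_)
    }
```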

Oreva Ahia (@orevaahia)'s Twitter Profile Photo

Check out our work on 'extracting distinguishing dialectal features via interpretable dialect classifiers', led by the amazing Roy Xie! Accepted to the main conference!

Valentin Hofmann (@vjhofmann)'s Twitter Profile Photo

💥 New paper 💥

We discover a form of covert racism in LLMs that is triggered by dialect features alone, with massive harms for affected groups.

For example, GPT-4 is more likely to suggest that defendants be sentenced to death when they speak African American English.

🧵

Sara Hooker (@sarahookr)'s Twitter Profile Photo

Today, I am very proud to share what we have been working on for the last 14 months. ✨

Introducing Aya -- a new state-of-the-art for massively multilingual models. 🔥🎉

Cohere For AI (@CohereForAI)'s Twitter Profile Photo

Today, we’re launching Aya, a new open-source, massively multilingual LLM & dataset to help support under-represented languages. Aya outperforms existing open-source models and covers 101 different languages – more than double the number covered by previous models.

cohere.com/research/aya
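A quick-start sketch for trying the released model; the Hugging Face repo id "CohereForAI/aya-101" and the seq2seq (mT5-style) architecture are assumptions about the public release, not stated in the tweet:

```python
# Hedged sketch: loading the released Aya checkpoint from the Hugging Face Hub.
# The repo id and seq2seq architecture are assumptions about the release.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("CohereForAI/aya-101")
model = AutoModelForSeq2SeqLM.from_pretrained("CohereForAI/aya-101")

inputs = tok("Translate to Yoruba: Good morning, my friend.", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=40)
print(tok.decode(out[0], skip_special_tokens=True))
```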

Deep Learning Indaba (@DeepIndaba)'s Twitter Profile Photo

📢 We are coming to Senegal 🇸🇳! 📢

We are excited to announce that the Deep Learning Indaba 2024 will be held in Dakar, Senegal at Université Amadou Mahtar MBOW from the 1st to the 7th of September.

Applications launch 🚀 is just around the corner! Don't miss it 🔔 #DLI2024 #Indaba2024


Iz Beltagy (@i_beltagy)'s Twitter Profile Photo

OLMo-7b is finally out 🎉, and we are releasing everything: weights, intermediate checkpoints, training code and logs, training data and toolkit, and evaluation and adaptation code and data.

Most of it has been released, and the rest is coming soon. OLMo-65b and Adapted OLMo-7b are…

Jiacheng Liu (Gary) (@liujc1998)'s Twitter Profile Photo

It’s the year 2024, and n-gram LMs are making a comeback!!

We develop infini-gram, an engine that efficiently processes n-gram queries with unbounded n and trillion-token corpora. It takes merely 20 milliseconds to count the frequency of an arbitrarily long n-gram in RedPajama (1.4T…
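The trick that makes unbounded-n queries cheap is a suffix array over the tokenized corpus: counting any n-gram is then two binary searches. A toy sketch in Python (3.10+, for bisect's `key` argument); the real engine builds its array offline over trillions of tokens:

```python
import bisect

def build_suffix_array(tokens):
    # Naive O(n^2 log n) construction, just to show the idea; infini-gram
    # builds its suffix array offline with scalable tooling.
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])

def count_ngram(tokens, sa, query):
    # Two binary searches bracket all suffixes starting with `query`, so a
    # count costs O(|query| * log n) regardless of how long the n-gram is.
    key = lambda i: tokens[i:i + len(query)]
    lo = bisect.bisect_left(sa, query, key=key)
    hi = bisect.bisect_right(sa, query, key=key)
    return hi - lo

corpus = "the cat sat on the mat the cat ran".split()
sa = build_suffix_array(corpus)
print(count_ngram(corpus, sa, ["the", "cat"]))  # -> 2
```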

Alisa Liu (@alisawuffles)'s Twitter Profile Photo

LMs are increasingly large🐘 and proprietary🔒 — what if we could “tune”🔧 them without accessing their internal weights?

Enter: proxy-tuning, which operates on only the *outputs* of LMs at decoding-time to achieve the effect of direct tuning!

📄: arxiv.org/abs/2401.08565 1/
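The decoding-time arithmetic fits in a few lines: add the logit difference between a small tuned "expert" and its untuned "anti-expert" to the large base model's logits. A sketch assuming Hugging Face causal LMs that share one vocabulary:

```python
import torch

def proxy_tuned_logits(base, expert, antiexpert, input_ids):
    # logits_proxy = logits_base + (logits_expert - logits_antiexpert):
    # the small tuned/untuned pair supplies the *direction* of tuning,
    # the large base model supplies the capability. All three models
    # are assumed to share a single vocabulary.
    with torch.no_grad():
        b = base(input_ids).logits[:, -1]
        e = expert(input_ids).logits[:, -1]
        a = antiexpert(input_ids).logits[:, -1]
    return b + (e - a)

# One greedy decoding step (sketch):
# next_id = proxy_tuned_logits(base, expert, anti, ids).argmax(dim=-1)
```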

Terra Blevins (@TerraBlvns)'s Twitter Profile Photo

Expert language models go multilingual!
Introducing ✨X-ELM✨ (Cross-lingual Expert Language Models), a multilingual generalization of the BTM paradigm to efficiently and fairly scale model capacity for many languages!
Paper: arxiv.org/abs/2401.10440
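In BTM-style inference, each expert is a full LM trained independently on one slice of the multilingual corpus, and their predictions are mixed at test time. A hedged sketch (the per-expert weighting here is a given prior; BTM-style methods typically estimate it from each expert's fit to the context):

```python
import torch
import torch.nn.functional as F

def xelm_ensemble_next_token(experts, prior_logits, input_ids):
    # Mix next-token distributions from independently trained expert LMs,
    # weighted by a per-expert prior (e.g. how well each expert models the
    # context language). `experts` are assumed to be HF causal LMs that
    # share one tokenizer/vocabulary.
    weights = F.softmax(prior_logits, dim=0)
    mix = None
    for w, lm in zip(weights, experts):
        with torch.no_grad():
            p = F.softmax(lm(input_ids).logits[:, -1], dim=-1)
        mix = w * p if mix is None else mix + w * p
    return mix
```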

Melanie Sclar (@melaniesclar)'s Twitter Profile Photo

Happy to share that FormatSpread has been accepted! 🎉

Extremely grateful to my advisors Yejin Choi and tsvetshop (as always!), and to Alane Suhr (suhr@sigmoid.social), who was the best collaborator I could have asked for during this project!

See you all in Vienna 😀

AK (@_akhaliq)'s Twitter Profile Photo

Tuning Language Models by Proxy

paper page: huggingface.co/papers/2401.08…

Despite the general capabilities of large pretrained language models, they consistently benefit from further adaptation to better achieve desired behaviors. However, tuning these models has become increasingly…
