Oreva Ahia (@orevaahia)'s Twitter Profile
Oreva Ahia

@orevaahia

PhD student @uwcse | ex: AI/ML Research Intern @apple | Co-organizer @AISaturdayLagos | Researcher @MasakhaneNLP

ID: 836314434

Link: https://orevaahia.github.io/ · Joined: 20-09-2012 20:39:09

1.7K Tweets

1.4K Followers

971 Following

Sebastian Ruder (@seb_ruder)'s Twitter Profile Photo

Ahia et al. (2023; aclanthology.org/2023.emnlp-mai…) observed that the same is true for current LLMs such as ChatGPT: They segment text in non-English languages into many more tokens and are thus much more costly to use in such languages.

They call this “double unfairness”: higher prices…
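The disparity is easy to check yourself. A minimal sketch, assuming the tiktoken package and its cl100k_base encoding (the tokenizer behind ChatGPT-era OpenAI models); since API pricing is per token, longer segmentations translate directly into higher cost:

```python
# Minimal sketch of the tokenization disparity (assumes: pip install tiktoken).
# cl100k_base is the encoding used by ChatGPT-era OpenAI models.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# Roughly parallel sentences; exact counts depend on the encoding version.
samples = {
    "English": "How are you doing today?",
    "Yoruba": "Báwo ni o ṣe wà lónìí?",
    "Telugu": "మీరు ఈ రోజు ఎలా ఉన్నారు?",
}
for lang, text in samples.items():
    print(f"{lang}: {len(enc.encode(text))} tokens")
```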

Alex Hägele (@haeggee)'s Twitter Profile Photo

If you haven't seen it yet: Mixture-of-Depths is a really nice idea for dynamic compute.
I decided to quickly code up a MoD block in a small GPT and try it out -- if you want to play with it too (and check correctness pls!), the code is here:
github.com/epfml/llm-base…
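For anyone who wants the core idea before opening the repo: a MoD layer is essentially learned top-k token routing around an expensive block. Below is an illustrative PyTorch sketch, not the repo's code; the `MoDBlock` wrapper, the 12.5% capacity default, and the sigmoid gating are assumptions.

```python
import torch
import torch.nn as nn

class MoDBlock(nn.Module):
    """Mixture-of-Depths sketch: a learned router picks the top-k tokens per
    sequence to send through an expensive sub-block; all other tokens skip it
    via the residual stream."""
    def __init__(self, block: nn.Module, d_model: int, capacity: float = 0.125):
        super().__init__()
        self.block = block        # assumed to compute the residual branch f(x)
        self.router = nn.Linear(d_model, 1)
        self.capacity = capacity  # fraction of tokens processed per layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, T, D)
        B, T, D = x.shape
        k = max(1, int(self.capacity * T))
        scores = self.router(x).squeeze(-1)              # (B, T)
        top = scores.topk(k, dim=-1).indices             # (B, k) routed tokens
        idx = top.unsqueeze(-1).expand(-1, -1, D)        # (B, k, D)
        picked = x.gather(1, idx)
        gate = torch.sigmoid(scores.gather(1, top)).unsqueeze(-1)
        # Residual update only for routed tokens, scaled by the router score
        # so the router receives a gradient through the output.
        return x.scatter(1, idx, picked + gate * self.block(picked))
```

Here `block` is assumed to compute only the residual branch (attention + MLP) of a transformer layer; gating its output by the router score is what lets the router train with the rest of the network.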

Aidan Gomez (@aidangomez)'s Twitter Profile Photo

⌘-R

Introducing Command-R, a model focused on scalability, RAG, and Tool Use. We've also released the weights for research use; we hope they're useful to the community!

txt.cohere.com/command-r/

Weijia Shi (@WeijiaShi2)'s Twitter Profile Photo

Happy to share REPLUG🔌 is accepted to #NAACL2024!

We introduce a retrieval-augmented LM framework that combines a frozen LM with a frozen/tunable retriever, improving GPT-3 in language modeling & downstream tasks by prepending retrieved docs to LM inputs.
📄:…
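The mechanics are simple enough to sketch. Assuming Hugging Face-style `lm`/`tokenizer` objects and a hypothetical `retriever.search` API (neither is from the paper's code), REPLUG-style inference looks roughly like:

```python
import torch
import torch.nn.functional as F

def replug_next_token_probs(lm, tokenizer, retriever, query, k=4):
    """Sketch of REPLUG-style inference: prepend each of the top-k retrieved
    docs to the input, run the frozen LM once per doc, and ensemble the
    next-token distributions weighted by retrieval similarity.
    `retriever.search` is a hypothetical API returning (docs, scores)."""
    docs, scores = retriever.search(query, k)
    weights = F.softmax(torch.tensor(scores, dtype=torch.float), dim=0)
    mixture = None
    for doc, w in zip(docs, weights):
        ids = tokenizer(doc + "\n\n" + query, return_tensors="pt").input_ids
        with torch.no_grad():
            probs = F.softmax(lm(ids).logits[0, -1], dim=-1)
        mixture = w * probs if mixture is None else mixture + w * probs
    return mixture
```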

Inna Lin (@iwylin)'s Twitter Profile Photo

Ever find yourself delaying a conversation because you're nervous about how it might go?😩
We developed IMBUE, an #LLM-backed tool, to help you improve #communication skills and manage #emotions, through simulation and just-in-time feedback.

Paper🔗: arxiv.org/pdf/2402.12556…

Roy Xie (@ruoyuxyz)'s Twitter Profile Photo

📢 Excited to share that our paper has been accepted to the main conference! 🌟 Huge shoutout to amazing co-authors Oreva Ahia, tsvetshop, and Antonis Anastasopoulos!

⛱️ See you in Mexico City! 🇲🇽

Roy Xie (@ruoyuxyz)'s Twitter Profile Photo

✨ Can we use interpretability methods to extract linguistic features that characterize dialects❓

🎉 New preprint: arxiv.org/abs/2402.17914 (@ruoyuxyz, @orevaahia, @tsvetshop, @anas_ant)

👉Code & Data: github.com/ruoyuxie/inter…

🧵(1/6)
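As a rough illustration of the recipe (a stand-in, not the paper's exact pipeline): train an inherently interpretable dialect classifier, then read distinguishing features off its weights. A scikit-learn sketch, assuming three or more dialect labels:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

def top_dialect_features(texts, labels, n_top=10):
    """Illustrative stand-in for the paper's approach: fit an interpretable
    n-gram dialect classifier, then take each dialect's highest-weighted
    features as candidate distinguishing markers. Assumes >= 3 dialect
    labels so clf.coef_ has one weight row per class."""
    vec = CountVectorizer(ngram_range=(1, 2), min_df=2)
    X = vec.fit_transform(texts)
    clf = LogisticRegression(max_iter=1000).fit(X, labels)
    vocab = vec.get_feature_names_out()
    return {
        cls: [vocab[i] for i in w.argsort()[::-1][:n_top]]
        for cls, w in zip(clf.classes_, clf.coef_)
    }
```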

Oreva Ahia (@orevaahia)'s Twitter Profile Photo

Check out our work on 'extracting distinguishing dialectal features via interpretable dialect classifiers', led by the amazing Roy Xie! Accepted to the main conference!

Valentin Hofmann (@vjhofmann)'s Twitter Profile Photo

💥 New paper 💥

We discover a form of covert racism in LLMs that is triggered by dialect features alone, with massive harms for affected groups.

For example, GPT-4 is more likely to suggest that defendants be sentenced to death when they speak African American English.

🧵

Sara Hooker (@sarahookr)'s Twitter Profile Photo

Today, I am very proud to share what we have been working on for the last 14 months. ✨

Introducing Aya -- a new state-of-the-art for massively multilingual models. 🔥🎉

Cohere For AI (@CohereForAI)'s Twitter Profile Photo

Today, we’re launching Aya, a new open-source, massively multilingual LLM & dataset to help support under-represented languages. Aya outperforms existing open-source models and covers 101 different languages – more than double the number covered by previous models.

cohere.com/research/aya
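A quick-start sketch for trying the released model; the Hugging Face repo id "CohereForAI/aya-101" and the seq2seq (mT5-style) architecture are assumptions about the public release, not stated in the tweet:

```python
# Hedged sketch: loading the released Aya checkpoint from the Hugging Face Hub.
# The repo id and seq2seq architecture are assumptions about the release.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("CohereForAI/aya-101")
model = AutoModelForSeq2SeqLM.from_pretrained("CohereForAI/aya-101")

inputs = tok("Translate to Yoruba: Good morning, my friend.", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=40)
print(tok.decode(out[0], skip_special_tokens=True))
```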

Deep Learning Indaba (@DeepIndaba)'s Twitter Profile Photo

📢 We are coming to Senegal 🇸🇳! 📢

We are excited to announce that the Deep Learning Indaba 2024 will be held in Dakar, Senegal at Université Amadou Mahtar MBOW from the 1st to the 7th of September.

Applications launch 🚀 is just around the corner! Don't miss it 🔔 #DLI2024 #Indaba2024


Iz Beltagy (@i_beltagy)'s Twitter Profile Photo

OLMo-7b is finally out 🎉, and we are releasing everything: weights, intermediate checkpoints, training code and logs, training data and toolkit, and evaluation and adaptation code and data.

Most of it has been released, and the rest is coming soon. OLMo-65b and Adapted OLMo-7b are…

Jiacheng Liu (Gary) (@liujc1998)'s Twitter Profile Photo

It’s the year 2024, and n-gram LMs are making a comeback!!

We develop infini-gram, an engine that efficiently processes n-gram queries with unbounded n and trillion-token corpora. It takes merely 20 milliseconds to count the frequency of an arbitrarily long n-gram in RedPajama (1.4T…
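The trick that makes unbounded-n queries cheap is a suffix array over the tokenized corpus: counting any n-gram is then two binary searches. A toy sketch in Python (3.10+, for bisect's `key` argument); the real engine builds its array offline over trillions of tokens:

```python
import bisect

def build_suffix_array(tokens):
    # Naive O(n^2 log n) construction, just to show the idea; infini-gram
    # builds its suffix array offline with scalable tooling.
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])

def count_ngram(tokens, sa, query):
    # Two binary searches bracket all suffixes starting with `query`, so a
    # count costs O(|query| * log n) regardless of how long the n-gram is.
    key = lambda i: tokens[i:i + len(query)]
    lo = bisect.bisect_left(sa, query, key=key)
    hi = bisect.bisect_right(sa, query, key=key)
    return hi - lo

corpus = "the cat sat on the mat the cat ran".split()
sa = build_suffix_array(corpus)
print(count_ngram(corpus, sa, ["the", "cat"]))  # -> 2
```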

Alisa Liu (@alisawuffles)'s Twitter Profile Photo

LMs are increasingly large🐘 and proprietary🔒 — what if we could “tune”🔧 them without accessing their internal weights?

Enter: proxy-tuning, which operates on only the *outputs* of LMs at decoding-time to achieve the effect of direct tuning!

📄: arxiv.org/abs/2401.08565 1/
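The decoding-time arithmetic fits in a few lines: add the logit difference between a small tuned "expert" and its untuned "anti-expert" to the large base model's logits. A sketch assuming Hugging Face causal LMs that share one vocabulary:

```python
import torch

def proxy_tuned_logits(base, expert, antiexpert, input_ids):
    # logits_proxy = logits_base + (logits_expert - logits_antiexpert):
    # the small tuned/untuned pair supplies the *direction* of tuning,
    # the large base model supplies the capability. All three models
    # are assumed to share a single vocabulary.
    with torch.no_grad():
        b = base(input_ids).logits[:, -1]
        e = expert(input_ids).logits[:, -1]
        a = antiexpert(input_ids).logits[:, -1]
    return b + (e - a)

# One greedy decoding step (sketch):
# next_id = proxy_tuned_logits(base, expert, anti, ids).argmax(dim=-1)
```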

Terra Blevins (@TerraBlvns)'s Twitter Profile Photo

Expert language models go multilingual!
Introducing ✨X-ELM✨ (Cross-lingual Expert Language Models), a multilingual generalization of the BTM paradigm to efficiently and fairly scale model capacity for many languages!
Paper: arxiv.org/abs/2401.10440
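In BTM-style inference, each expert is a full LM trained independently on one slice of the multilingual corpus, and their predictions are mixed at test time. A hedged sketch (the per-expert weighting here is a given prior; BTM-style methods typically estimate it from each expert's fit to the context):

```python
import torch
import torch.nn.functional as F

def xelm_ensemble_next_token(experts, prior_logits, input_ids):
    # Mix next-token distributions from independently trained expert LMs,
    # weighted by a per-expert prior (e.g. how well each expert models the
    # context language). `experts` are assumed to be HF causal LMs that
    # share one tokenizer/vocabulary.
    weights = F.softmax(prior_logits, dim=0)
    mix = None
    for w, lm in zip(weights, experts):
        with torch.no_grad():
            p = F.softmax(lm(input_ids).logits[:, -1], dim=-1)
        mix = w * p if mix is None else mix + w * p
    return mix
```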

Melanie Sclar (@melaniesclar)'s Twitter Profile Photo

Happy to share that FormatSpread has been accepted! 🎉

Extremely grateful to my advisors Yejin Choi and tsvetshop (as always!), and to Alane Suhr (suhr@sigmoid.social), who was the best collaborator I could have asked for during this project!

See you all in Vienna 😀

AK (@_akhaliq)'s Twitter Profile Photo

Tuning Language Models by Proxy

paper page: huggingface.co/papers/2401.08…

Despite the general capabilities of large pretrained language models, they consistently benefit from further adaptation to better achieve desired behaviors. However, tuning these models has become increasingly…
