Tirthankar Ghosal (@TirthankarSlg)'s Twitter Profile
Tirthankar Ghosal

@TirthankarSlg

Scientist @ORNL #NLProc #LLMs #peerreview #SDProc Editor @SIGIRForum Org. #AutoMin2023 @SDProc @wiesp_nlp AC @IJCAIconf @emnlpmeeting Prevly @ufal_cuni @IITPAT

ID: 817603403677253633

Link: https://member.acm.org/~tghosal
Joined: 07-01-2017 05:26:56

3.4K Tweets

513 Followers

1.3K Following

naklecha (@naklecha):

Today, I'm excited to release a repository that implements Llama 3 from scratch: every matrix multiplication, from multi-head attention and positional encoding to every other layer in between, has been carefully unwrapped and explained. Have fun :)

github.com/naklecha/llama…
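For a taste of what such a walkthrough covers, here is a minimal NumPy sketch of a single attention head with rotary position embeddings; the shapes and names are illustrative, not the repo's actual code (the rotated dimensions are regrouped rather than interleaved, which leaves the attention scores unchanged since queries and keys get the same permutation).

    import numpy as np

    def rope(x, pos):
        # Rotary position embedding: rotate each pair of dims by a
        # position-dependent angle (pairs regrouped, not interleaved).
        d = x.shape[-1]
        theta = pos / (10000 ** (np.arange(0, d, 2) / d))   # (seq, d/2)
        x1, x2 = x[..., 0::2], x[..., 1::2]
        return np.concatenate([x1 * np.cos(theta) - x2 * np.sin(theta),
                               x1 * np.sin(theta) + x2 * np.cos(theta)], axis=-1)

    def attention_head(x, Wq, Wk, Wv):
        # x: (seq, d_model); Wq/Wk/Wv: (d_model, d_head)
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        pos = np.arange(x.shape[0])[:, None]
        q, k = rope(q, pos), rope(k, pos)
        scores = q @ k.T / np.sqrt(q.shape[-1])                  # (seq, seq)
        scores += np.triu(np.full(scores.shape, -np.inf), k=1)  # causal mask
        w = np.exp(scores - scores.max(-1, keepdims=True))      # stable softmax
        return (w / w.sum(-1, keepdims=True)) @ v               # (seq, d_head)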

Sebastian Raschka (@rasbt):

A suggestion for an effective 11-step LLM summer study plan:
1) Read* Chapters 1 and 2 on implementing the data loading pipeline (manning.com/books/build-a-… & github.com/rasbt/LLMs-fro…).
2) Watch Karpathy's video on training a BPE tokenizer from scratch (youtube.com/watch?v=zduSFx…).
3)
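To make step 2 concrete before watching: the core of BPE training is just counting adjacent token pairs and merging the most frequent one. A toy sketch (not Karpathy's code):

    from collections import Counter

    def train_bpe(ids, num_merges):
        # ids: list of integer tokens (e.g. raw UTF-8 bytes, values 0..255).
        merges, next_id = {}, 256
        for _ in range(num_merges):
            pairs = Counter(zip(ids, ids[1:]))
            if not pairs:
                break
            pair = max(pairs, key=pairs.get)        # most frequent adjacent pair
            merges[pair] = next_id
            out, i = [], 0
            while i < len(ids):                     # replace the pair everywhere
                if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
                    out.append(next_id); i += 2
                else:
                    out.append(ids[i]); i += 1
            ids, next_id = out, next_id + 1
        return merges

    print(train_bpe(list("abababcab".encode("utf-8")), num_merges=3))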

Cameron R. Wolfe, Ph.D. (@cwolferesearch):

Prompt engineering is one of the most rapidly evolving research topics in AI, but we can (roughly) group recent research on this topic into four categories…

(1) Reasoning: Simple prompting techniques are effective for many problems, but more sophisticated strategies are
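As a concrete instance of the reasoning category: chain-of-thought prompting, in its zero-shot form (Kojima et al., 2022) and its few-shot form (Wei et al., 2022). The snippet below only builds the prompt strings; the question is a placeholder.

    question = "A cafeteria had 23 apples. It used 20 and bought 6 more. How many are left?"

    # Zero-shot CoT: one appended sentence elicits intermediate reasoning steps.
    zero_shot_cot = f"{question}\nLet's think step by step."

    # Few-shot CoT: prepend a worked example so the model imitates the format.
    few_shot_cot = (
        "Q: Roger has 5 balls. He buys 2 cans of 3 balls each. How many balls now?\n"
        "A: Roger started with 5. 2 cans of 3 is 6. 5 + 6 = 11. The answer is 11.\n"
        f"Q: {question}\nA:"
    )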

Sasha Rush (@srush_nlp):

Talk: 'OLMo: Findings of Training an Open LM' by Hannaneh Hajishirzi (AI2) at the Open-Source Generative AI (OSGAI) workshop.

Extremely interesting overview of the 4 parts (Data, Training, Adaptation, Eval) of the OLMo open LLM project. Rare insight into how these processes work at scale.

youtube.com/watch?v=qFZbu2…

NIK (@ns123abc):

“So I asked Ilya Sutskever, OpenAI’s chief scientist, for a reading list. He gave me a list of like 40 research papers and said, ‘If you really learn all of these, you’ll know 90% of what matters today.’ And I did. I plowed through all those things and it all started sorting out

Min Choi (@minchoi):

Llama 3 just changed the LLM game.

People are finding wild use cases at GPT-4 level. There is a massive movement in the open source community.

10 examples (and ways to use Llama 3):
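One straightforward way to try Llama 3 yourself is through Hugging Face transformers; a minimal sketch, assuming access to the gated meta-llama/Meta-Llama-3-8B-Instruct checkpoint and a recent transformers version whose text-generation pipeline accepts chat-style messages.

    import torch
    from transformers import pipeline

    chat = pipeline(
        "text-generation",
        model="meta-llama/Meta-Llama-3-8B-Instruct",  # gated: requires approval
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    messages = [{"role": "user", "content": "Explain attention in two sentences."}]
    print(chat(messages, max_new_tokens=128)[0]["generated_text"])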

Thomas Wolf (@Thom_Wolf):

Llama 3 was trained on 15 trillion tokens of public data. But where can you find such datasets and recipes?

Here comes the first release of 🍷 FineWeb: a high-quality, large-scale filtered web dataset outperforming all current datasets of its scale. We trained 200+ ablation
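A minimal sketch of sampling the release, assuming it is published as HuggingFaceFW/fineweb on the Hugging Face Hub with a "text" field and streaming support (so the multi-terabyte corpus is not downloaded in full):

    from datasets import load_dataset

    fw = load_dataset("HuggingFaceFW/fineweb", split="train", streaming=True)
    for i, doc in enumerate(fw):
        print(doc["text"][:200].replace("\n", " "))   # peek at a few documents
        if i == 4:
            break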

Rosa Zhou (@qiaoyu_rosa):

🔥Thrilled to introduce HypoGeniC: Hypothesis Generation with Large Language Models 🔥
How can LLMs systematically propose and verify hypotheses based on observations for #ScientificDiscovery?
Read our paper to find out!
📄: arxiv.org/abs/2404.04326…
Details in 🧵 (1/n):
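The paper's algorithm is more involved, but the propose-then-verify loop it builds on can be sketched generically; llm below is a hypothetical callable that sends a prompt to any instruction-tuned model, and this is not the HypoGeniC implementation.

    def propose_and_verify(observations, llm, rounds=3):
        scored = []
        for _ in range(rounds):
            # Propose: ask the model for a candidate hypothesis given the data.
            hypothesis = llm(
                "Observations:\n" + "\n".join(observations)
                + "\nPropose one testable hypothesis that explains them."
            )
            # Verify: check the hypothesis against each observation.
            hits = sum(
                llm(f"Hypothesis: {hypothesis}\nObservation: {obs}\n"
                    "Is the observation consistent with the hypothesis? yes/no")
                .strip().lower().startswith("yes")
                for obs in observations
            )
            scored.append((hits / len(observations), hypothesis))
        return max(scored)    # (accuracy, best-scoring hypothesis)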

Scholarly Document Processing Workshop (@sdpworkshop):

Excited to share that our proposal for the 4th Workshop on Scholarly Document Processing (SDP) has been accepted at ACL 2024! Join our workshop on advancing NLP, text mining, and more to tackle the challenges of scholarly text processing. Stay tuned for updates!

Scholarly Document Processing Workshop (@sdpworkshop):

Check the CFP for the 4th SDP workshop at ACL 2024! We are accepting archival short/long papers and submissions on two shared tasks. If you're interested in addressing the challenges of processing scholarly text, come join us!🙌

CFP: sdproc.org/2024/cfp.html
Deadline: May 17

Dasha Herrmannova (@robodasha):

Hi! Happy to announce a competition on detecting automatically generated scientific papers at the Scholarly Document Processing Workshop @acl2024.

👉 Competition starting Apr 2
👉 Submit your systems by Apr 30
👉 Monetary prizes for top 3 systems
👉 More info at sdproc.org/2024/sharedtas…
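A natural first baseline for a task like this is a bag-of-words classifier over the paper text; here is a minimal scikit-learn sketch with placeholder data, since the actual task format comes from the organizers.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    texts = ["human-written abstract ...", "machine-generated abstract ..."]  # placeholders
    labels = [0, 1]   # 1 = automatically generated, 0 = human-written

    clf = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), max_features=50_000),
        LogisticRegression(max_iter=1000),
    )
    clf.fit(texts, labels)
    print(clf.predict(["this paper proposes a novel approach to ..."]))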

Jeremy Howard (@jeremyphoward):

Here is a much more detailed explanation of Q*:

Here is a detailed explanation of the Q-star energy-based model (EBM) ideas for dialog generation, written for an average undergraduate student:

**Energy-Based Model for Dialog Generation**

The Q-star project proposes using an
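In energy-based terms, generation turns into scoring: a learned energy function assigns low values to compatible (dialog, response) pairs, and decoding searches for a low-energy response instead of sampling token by token. A toy sketch of that selection step, with energy() as a crude stand-in for a learned model:

    import math

    def energy(dialog, response):
        # Stand-in for a learned energy model: lower = more compatible.
        # Crude word overlap, purely for illustration.
        d, r = set(dialog.lower().split()), set(response.lower().split())
        return -len(d & r) / math.sqrt(len(r) + 1)

    def pick_response(dialog, candidates):
        # Decoding as energy minimization over a candidate set.
        return min(candidates, key=lambda resp: energy(dialog, resp))

    print(pick_response(
        "where can I read about energy based models",
        ["Try the 2006 tutorial on energy based learning.", "I like pizza."],
    ))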

Barbara Plank (@barbara_plank):

Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora

Surangika Ranathunga, Nisansa De Silva, Velayuthan Menan, Aloka Fernando, Charitha Rathnayake

aclanthology.org/2024.eacl-long…

Low-Resource Paper Award

Graham Neubig (@gneubig):

Check out our new analysis framework for RAG systems!

RAGGED's goal is to allow researchers/practitioners to examine the interaction between:
* Retriever choice
* Reader choice
* Context selection
* Dataset/domain

and get insights about models/methods.
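A sketch of the kind of controlled sweep this implies, as a generic harness rather than RAGGED's actual API: hold the dataset fixed and cross retriever choice, reader choice, and the number of retrieved passages (the retriever and reader callables are hypothetical).

    from itertools import product

    def run_rag(retriever, reader, k, examples):
        # retriever(q, k) -> list of passages; reader(q, passages) -> answer string.
        hits = [
            float(gold.lower() in reader(q, retriever(q, k)).lower())
            for q, gold in examples
        ]
        return sum(hits) / len(hits)          # crude containment-based accuracy

    def sweep(retrievers, readers, ks, examples):
        # Cross all three axes to expose interactions between choices.
        return {
            (rn, mn, k): run_rag(r, m, k, examples)
            for (rn, r), (mn, m), k in product(retrievers.items(), readers.items(), ks)
        }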

Cameron R. Wolfe, Ph.D. (@cwolferesearch):

Now that Grok-1 has been released, it’s the perfect time to brush up on how Mixture-of-Experts (MoE) layers work in LLMs. Here’s a quick explainer…

TL;DR: When applied to transformer models, MoE layers have two primary components:

- Sparse MoE Layer: replaces dense
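A minimal PyTorch sketch of those two components, with illustrative dimensions and a top-2 router; this is a generic MoE layer, not Grok-1's code.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MoELayer(nn.Module):
        def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
            super().__init__()
            # Sparse MoE layer: several expert FFNs replace the one dense FFN.
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                              nn.Linear(d_ff, d_model))
                for _ in range(n_experts)
            )
            self.router = nn.Linear(d_model, n_experts)  # routes tokens to experts
            self.top_k = top_k

        def forward(self, x):                            # x: (tokens, d_model)
            gate = F.softmax(self.router(x), dim=-1)
            weights, idx = gate.topk(self.top_k, dim=-1) # top-k experts per token
            weights = weights / weights.sum(-1, keepdim=True)
            out = torch.zeros_like(x)
            for e, expert in enumerate(self.experts):
                tok, slot = (idx == e).nonzero(as_tuple=True)
                if tok.numel():                          # tokens routed to expert e
                    out[tok] += weights[tok, slot, None] * expert(x[tok])
            return out

    y = MoELayer()(torch.randn(10, 512))   # -> (10, 512)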

Frank van Harmelen (@FrankVanHarmele):

A course on Knowledge Graphs is nothing new, but a course on Knowledge Graphs from this set of teachers is an interesting signal. GenAI is rapidly becoming the best motivation for symbolic AI in a long time!

Andrew Gao (@itsandrewgao):

Here's your DEEP DIVE into Grok's architecture!
I just went through model.py for this 314B open-source behemoth with *no strings attached*.

👇🧵
