Tirthankar Ghosal (@TirthankarSlg)'s Twitter Profile
Tirthankar Ghosal

@TirthankarSlg

Scientist @ORNL #NLProc #LLMs #peerreview #SDProc Editor @SIGIRForum Org. #AutoMin2023 @SDProc @wiesp_nlp AC @IJCAIconf @emnlpmeeting Prevly @ufal_cuni @IITPAT

ID: 817603403677253633

Link: https://member.acm.org/~tghosal
Joined: 07-01-2017 05:26:56

3.4K Tweets

513 Followers

1.3K Following

naklecha (@naklecha):

Today, I'm excited to release a repository that implements Llama 3 from scratch: every matrix multiplication, from multi-head attention and positional encoding to every other layer in between, has been carefully unwrapped and explained. Have fun :)

github.com/naklecha/llama…
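For a taste of what such a walkthrough covers, here is a minimal NumPy sketch of a single attention head with rotary position embeddings; the shapes and names are illustrative, not the repo's actual code (the rotated dimensions are regrouped rather than interleaved, which leaves the attention scores unchanged since queries and keys get the same permutation).

    import numpy as np

    def rope(x, pos):
        # Rotary position embedding: rotate each pair of dims by a
        # position-dependent angle (pairs regrouped, not interleaved).
        d = x.shape[-1]
        theta = pos / (10000 ** (np.arange(0, d, 2) / d))   # (seq, d/2)
        x1, x2 = x[..., 0::2], x[..., 1::2]
        return np.concatenate([x1 * np.cos(theta) - x2 * np.sin(theta),
                               x1 * np.sin(theta) + x2 * np.cos(theta)], axis=-1)

    def attention_head(x, Wq, Wk, Wv):
        # x: (seq, d_model); Wq/Wk/Wv: (d_model, d_head)
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        pos = np.arange(x.shape[0])[:, None]
        q, k = rope(q, pos), rope(k, pos)
        scores = q @ k.T / np.sqrt(q.shape[-1])                  # (seq, seq)
        scores += np.triu(np.full(scores.shape, -np.inf), k=1)  # causal mask
        w = np.exp(scores - scores.max(-1, keepdims=True))      # stable softmax
        return (w / w.sum(-1, keepdims=True)) @ v               # (seq, d_head)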

Sebastian Raschka (@rasbt):

A suggestion for an effective 11-step LLM summer study plan:
1) Read* Chapters 1 and 2 on implementing the data loading pipeline (manning.com/books/build-a-… & github.com/rasbt/LLMs-fro…).
2) Watch Karpathy's video on training a BPE tokenizer from scratch (youtube.com/watch?v=zduSFx…).
3)
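To make step 2 concrete before watching: the core of BPE training is just counting adjacent token pairs and merging the most frequent one. A toy sketch (not Karpathy's code):

    from collections import Counter

    def train_bpe(ids, num_merges):
        # ids: list of integer tokens (e.g. raw UTF-8 bytes, values 0..255).
        merges, next_id = {}, 256
        for _ in range(num_merges):
            pairs = Counter(zip(ids, ids[1:]))
            if not pairs:
                break
            pair = max(pairs, key=pairs.get)        # most frequent adjacent pair
            merges[pair] = next_id
            out, i = [], 0
            while i < len(ids):                     # replace the pair everywhere
                if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
                    out.append(next_id); i += 2
                else:
                    out.append(ids[i]); i += 1
            ids, next_id = out, next_id + 1
        return merges

    print(train_bpe(list("abababcab".encode("utf-8")), num_merges=3))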

Cameron R. Wolfe, Ph.D. (@cwolferesearch):

Prompt engineering is one of the most rapidly evolving research topics in AI, but we can (roughly) group recent research on this topic into four categories…

(1) Reasoning: Simple prompting techniques are effective for many problems, but more sophisticated strategies are
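As a concrete instance of the reasoning category: chain-of-thought prompting, in its zero-shot form (Kojima et al., 2022) and its few-shot form (Wei et al., 2022). The snippet below only builds the prompt strings; the question is a placeholder.

    question = "A cafeteria had 23 apples. It used 20 and bought 6 more. How many are left?"

    # Zero-shot CoT: one appended sentence elicits intermediate reasoning steps.
    zero_shot_cot = f"{question}\nLet's think step by step."

    # Few-shot CoT: prepend a worked example so the model imitates the format.
    few_shot_cot = (
        "Q: Roger has 5 balls. He buys 2 cans of 3 balls each. How many balls now?\n"
        "A: Roger started with 5. 2 cans of 3 is 6. 5 + 6 = 11. The answer is 11.\n"
        f"Q: {question}\nA:"
    )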

Sasha Rush (@srush_nlp):

Talk: 'OLMo: Findings of Training an Open LM' by Hannaneh Hajishirzi (AI2) at the Open-Source Generative AI (OSGAI) workshop.

Extremely interesting overview of the 4 parts (Data, Training, Adaptation, Eval) of the OLMo open LLM project. Rare insight into how these processes work at scale.

youtube.com/watch?v=qFZbu2…

NIK (@ns123abc):

“So I asked Ilya Sutskever, OpenAI’s chief scientist, for a reading list. He gave me a list of like 40 research papers and said, ‘If you really learn all of these, you’ll know 90% of what matters today.’ And I did. I plowed through all those things and it all started sorting out

Min Choi (@minchoi):

Llama 3 just changed the LLM game.

People are finding wild use cases at GPT-4 level. There is a massive movement in the open source community.

10 examples (and ways to use Llama 3):
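One straightforward way to try Llama 3 yourself is through Hugging Face transformers; a minimal sketch, assuming access to the gated meta-llama/Meta-Llama-3-8B-Instruct checkpoint and a recent transformers version whose text-generation pipeline accepts chat-style messages.

    import torch
    from transformers import pipeline

    chat = pipeline(
        "text-generation",
        model="meta-llama/Meta-Llama-3-8B-Instruct",  # gated: requires approval
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    messages = [{"role": "user", "content": "Explain attention in two sentences."}]
    print(chat(messages, max_new_tokens=128)[0]["generated_text"])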

Thomas Wolf (@Thom_Wolf):

Llama 3 was trained on 15 trillion tokens of public data. But where can you find such datasets and recipes?

Here comes the first release of 🍷 FineWeb: a high-quality, large-scale filtered web dataset outperforming all current datasets of its scale. We trained 200+ ablation
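A minimal sketch of sampling the release, assuming it is published as HuggingFaceFW/fineweb on the Hugging Face Hub with a "text" field and streaming support (so the multi-terabyte corpus is not downloaded in full):

    from datasets import load_dataset

    fw = load_dataset("HuggingFaceFW/fineweb", split="train", streaming=True)
    for i, doc in enumerate(fw):
        print(doc["text"][:200].replace("\n", " "))   # peek at a few documents
        if i == 4:
            break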

Rosa Zhou (@qiaoyu_rosa):

🔥Thrilled to introduce HypoGeniC: Hypothesis Generation with Large Language Models 🔥
How can LLMs systematically propose and verify hypotheses based on observations for #ScientificDiscovery?
Read our paper to find out!
📄: arxiv.org/abs/2404.04326…
Details in 🧵 (1/n):
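The paper's algorithm is more involved, but the propose-then-verify loop it builds on can be sketched generically; llm below is a hypothetical callable that sends a prompt to any instruction-tuned model, and this is not the HypoGeniC implementation.

    def propose_and_verify(observations, llm, rounds=3):
        scored = []
        for _ in range(rounds):
            # Propose: ask the model for a candidate hypothesis given the data.
            hypothesis = llm(
                "Observations:\n" + "\n".join(observations)
                + "\nPropose one testable hypothesis that explains them."
            )
            # Verify: check the hypothesis against each observation.
            hits = sum(
                llm(f"Hypothesis: {hypothesis}\nObservation: {obs}\n"
                    "Is the observation consistent with the hypothesis? yes/no")
                .strip().lower().startswith("yes")
                for obs in observations
            )
            scored.append((hits / len(observations), hypothesis))
        return max(scored)    # (accuracy, best-scoring hypothesis)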

Scholarly Document Processing Workshop (@sdpworkshop):

Excited to share that our proposal for the 4th Workshop on Scholarly Document Processing (SDP) has been accepted at ACL 2024! Join our workshop on advancing NLP, text mining, and more to tackle the challenges of scholarly text processing. Stay tuned for updates!

Scholarly Document Processing Workshop (@sdpworkshop):

Check the CFP for the 4th SDP workshop at ACL 2024! We are accepting archival short/long papers and submissions on two shared tasks. If you're interested in addressing the challenges of processing scholarly text, come join us!🙌

CFP: sdproc.org/2024/cfp.html
Deadline: May 17

Dasha Herrmannova (@robodasha):

Hi! Happy to announce a competition on detecting automatically generated scientific papers at the Scholarly Document Processing Workshop @acl2024.

👉 Competition starting Apr 2
👉 Submit your systems by Apr 30
👉 Monetary prizes for top 3 systems
👉 More info at sdproc.org/2024/sharedtas…
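A natural first baseline for a task like this is a bag-of-words classifier over the paper text; here is a minimal scikit-learn sketch with placeholder data, since the actual task format comes from the organizers.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    texts = ["human-written abstract ...", "machine-generated abstract ..."]  # placeholders
    labels = [0, 1]   # 1 = automatically generated, 0 = human-written

    clf = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), max_features=50_000),
        LogisticRegression(max_iter=1000),
    )
    clf.fit(texts, labels)
    print(clf.predict(["this paper proposes a novel approach to ..."]))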

Jeremy Howard (@jeremyphoward):

Here is a much more detailed explanation of Q*:

Here is a detailed explanation of the Q-star energy-based model (EBM) ideas for dialog generation, written for an average undergraduate student:

**Energy-Based Model for Dialog Generation**

The Q-star project proposes using an
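In energy-based terms, generation turns into scoring: a learned energy function assigns low values to compatible (dialog, response) pairs, and decoding searches for a low-energy response instead of sampling token by token. A toy sketch of that selection step, with energy() as a crude stand-in for a learned model:

    import math

    def energy(dialog, response):
        # Stand-in for a learned energy model: lower = more compatible.
        # Crude word overlap, purely for illustration.
        d, r = set(dialog.lower().split()), set(response.lower().split())
        return -len(d & r) / math.sqrt(len(r) + 1)

    def pick_response(dialog, candidates):
        # Decoding as energy minimization over a candidate set.
        return min(candidates, key=lambda resp: energy(dialog, resp))

    print(pick_response(
        "where can I read about energy based models",
        ["Try the 2006 tutorial on energy based learning.", "I like pizza."],
    ))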

Barbara Plank (@barbara_plank):

Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora

Surangika Ranathunga, Nisansa De Silva, Velayuthan Menan, Aloka Fernando, Charitha Rathnayake

aclanthology.org/2024.eacl-long…

Low-Resource Paper Award

Graham Neubig (@gneubig):

Check out our new analysis framework for RAG systems!

RAGGED's goal is to allow researchers/practitioners to examine the interaction between:
* Retriever choice
* Reader choice
* Context selection
* Dataset/domain

and get insights about models/methods.
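A sketch of the kind of controlled sweep this implies, as a generic harness rather than RAGGED's actual API: hold the dataset fixed and cross retriever choice, reader choice, and the number of retrieved passages (the retriever and reader callables are hypothetical).

    from itertools import product

    def run_rag(retriever, reader, k, examples):
        # retriever(q, k) -> list of passages; reader(q, passages) -> answer string.
        hits = [
            float(gold.lower() in reader(q, retriever(q, k)).lower())
            for q, gold in examples
        ]
        return sum(hits) / len(hits)          # crude containment-based accuracy

    def sweep(retrievers, readers, ks, examples):
        # Cross all three axes to expose interactions between choices.
        return {
            (rn, mn, k): run_rag(r, m, k, examples)
            for (rn, r), (mn, m), k in product(retrievers.items(), readers.items(), ks)
        }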

Cameron R. Wolfe, Ph.D. (@cwolferesearch):

Now that Grok-1 has been released, it’s the perfect time to brush up on how Mixture-of-Experts (MoE) layers work in LLMs. Here’s a quick explainer…

TL;DR: When applied to transformer models, MoE layers have two primary components:

- Sparse MoE Layer: replaces dense
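A minimal PyTorch sketch of those two components, with illustrative dimensions and a top-2 router; this is a generic MoE layer, not Grok-1's code.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MoELayer(nn.Module):
        def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
            super().__init__()
            # Sparse MoE layer: several expert FFNs replace the one dense FFN.
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                              nn.Linear(d_ff, d_model))
                for _ in range(n_experts)
            )
            self.router = nn.Linear(d_model, n_experts)  # routes tokens to experts
            self.top_k = top_k

        def forward(self, x):                            # x: (tokens, d_model)
            gate = F.softmax(self.router(x), dim=-1)
            weights, idx = gate.topk(self.top_k, dim=-1) # top-k experts per token
            weights = weights / weights.sum(-1, keepdim=True)
            out = torch.zeros_like(x)
            for e, expert in enumerate(self.experts):
                tok, slot = (idx == e).nonzero(as_tuple=True)
                if tok.numel():                          # tokens routed to expert e
                    out[tok] += weights[tok, slot, None] * expert(x[tok])
            return out

    y = MoELayer()(torch.randn(10, 512))   # -> (10, 512)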

Frank van Harmelen (@FrankVanHarmele):

A course on Knowledge Graphs is nothing new, but a course on Knowledge Graphs from this set of teachers is an interesting signal. GenAI is rapidly becoming the best motivation for symbolic AI in a long time!

Andrew Gao (@itsandrewgao):

Here's your DEEP DIVE into Grok's architecture!
I just went through model.py for this 314B open-source behemoth with *no strings attached*.

👇🧵
