OpenMOSE (@_m0se_)

RWKV-LM-State-4bit

I implemented 4-bit quantization of the main weights using Bitsandbytes together with State-Tuning, and enabled differential output for the saved checkpoints.

Quantizing to 4 bits can reduce VRAM usage by about 40% 😀 (2B model @ ~10 GB)

github.com/OpenMOSE/RWKV-…
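
Not code from the linked repo, just a minimal sketch of the idea, assuming bitsandbytes' `Linear4bit` API (the `StateTunedBlock` wrapper and `save_state_diff` helper are hypothetical): the main weights sit frozen in NF4, only a small per-layer state tensor trains, and the "differential" checkpoint saves just the trainable tensors.

```python
# Sketch only: StateTunedBlock / save_state_diff are made-up names,
# not part of RWKV-LM-State-4bit.
import torch
import bitsandbytes as bnb

class StateTunedBlock(torch.nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # Main weight frozen and stored as 4-bit NF4 (~4x smaller than fp16).
        self.proj = bnb.nn.Linear4bit(
            dim, dim, bias=False,
            quant_type="nf4", compute_dtype=torch.bfloat16,
        )
        for p in self.proj.parameters():
            p.requires_grad = False
        # The only trainable part: a small per-layer state vector.
        self.state = torch.nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        return self.proj(x) + self.state

def save_state_diff(model: torch.nn.Module, path: str):
    # "Differential" checkpoint: keep only trainable tensors, skip the 4-bit base.
    diff = {n: p.detach().cpu() for n, p in model.named_parameters() if p.requires_grad}
    torch.save(diff, path)
```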

Roberto Luis Rodriguez (@rluis77)

new portal. Produced by RLuis77

“I Can Make Your Peoples DANCE”

YOU CANT??? Kendrick Lamar

How?🤔

What does that say?🤔🤷🏽‍♂️

REALLY???

THEY MY PPL’S…

The Drums Never Lie…Kendrick Lamar

NO QUANTIZING

STRAIGHT SOUL.

Benjamin Trent (@benwtrent)

A new parameter is added to the linear offset correction created when scalar quantizing for dot-product. This means we can optimize the quantization bucketing and that correction: elastic.co/search-labs/bl…
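
As a rough illustration of what the baseline "linear offset correction" does for dot products (toy NumPy sketch, not the Lucene/Elasticsearch code, and it does not include the new parameter the linked post adds):

```python
import numpy as np

def quantize(x, bits=8):
    # Affine scalar quantization: x ~ lo + scale * q
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / (2**bits - 1)
    q = np.round((x - lo) / scale).astype(np.int64)
    return q, lo, scale, int(q.sum())  # sum(q) feeds the correction terms

def corrected_dot(qx, lo_x, sx, sum_x, qy, lo_y, sy, sum_y, dim):
    # Integer dot product plus three cheap linear offset corrections.
    return (sx * sy * float(qx @ qy)
            + lo_x * sy * sum_y
            + lo_y * sx * sum_x
            + dim * lo_x * lo_y)

x = np.random.randn(128).astype(np.float32)
y = np.random.randn(128).astype(np.float32)
print(corrected_dot(*quantize(x), *quantize(y), dim=128), float(x @ y))  # close, up to quantization error
```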

OpenMOSE (@_m0se_)

Experimenting with quantizing the permanently frozen layers with Bitsandbytes to reduce VRAM usage.

On RWKV x060 7B, with LISA enabled on the last 4 layers, DeepSpeed stage-2 offload (ds2offload), and 1 layer active per step:

NonQuant: OOM
NF4 quant: around 19 GB

But it loses the speed advantage of TorchJIT...
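
Rough sketch of that schedule with made-up helper names (not OpenMOSE's trainer): the permanently frozen blocks stay NF4-quantized and never get gradients, while LISA flips `requires_grad` on one random full-precision block per step plus the always-on last layers.

```python
import random
import torch

def lisa_set_active(blocks, candidate_idx, always_on_idx, n_active=1):
    """Enable grads on n_active random candidate blocks plus the always-on ones;
    everything else (including the NF4-quantized frozen blocks) keeps grads off."""
    chosen = set(random.sample(candidate_idx, n_active)) | set(always_on_idx)
    for i, block in enumerate(blocks):
        active = i in chosen
        for p in block.parameters():
            p.requires_grad = active
    return chosen
```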

Christian Hernandez (@ChriStannnis)

Never felt more vindicated than hearing Danny Carey say he’s never tracked a Tool song to a metronome. I’m always against quantizing (for the most part) and a big advocate for organic sound and letting a groove breathe

Yorkie (@heyitsyorkie)

Pretty cool that I have my own fine-tuned model working in LM Studio now. Rather successful weekend learning about fine-tuning, quantizing models etc. Here's to more!

Burny — Effective Omni (@burny_tech)

Loop quantum gravity and string theory are the two main approaches that attempt to reconcile quantum mechanics and general relativity to develop a theory of quantum gravity. However, they have some key differences:

- LQG focuses on quantizing space-time itself, treating it as…
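
A concrete example of "quantizing space-time itself": in LQG, geometric operators have discrete spectra. The standard area spectrum (γ is the Barbero-Immirzi parameter, ℓ_P the Planck length, and the sum runs over the spin labels j_i of the links puncturing the surface) is:

```latex
A = 8 \pi \gamma \, \ell_P^{2} \sum_i \sqrt{j_i \left(j_i + 1\right)},
\qquad j_i \in \left\{ \tfrac{1}{2},\, 1,\, \tfrac{3}{2},\, \dots \right\}
```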

KD (@Reveur_7)

Just finished quantizing and uploading Mixtral 8x7B to Apple MLX. Enjoy!

huggingface.co/mlx-community/…
huggingface.co/mlx-community/…
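
For anyone wanting to do the same, a hedged sketch using mlx-lm's `convert` API (argument names and defaults vary across mlx-lm versions, and the Hugging Face repo id below is a placeholder, so treat this as an assumption rather than the exact command used here):

```python
# Hedged sketch: quantize a Hugging Face checkpoint for Apple MLX with mlx-lm.
from mlx_lm import convert

convert(
    "mistralai/Mixtral-8x7B-v0.1",     # placeholder hf_path
    mlx_path="mixtral-8x7b-4bit-mlx",  # local output directory
    quantize=True,
    q_bits=4,
    q_group_size=64,
)
```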

Rohan Paul (@rohanpaul_ai)

A nice example from the official Half-Quadratic Quantization (HQQ) implementation repo

HQQ’s `HQQLinear.quantize` and `HQQLinear.dequantize` methods have been modified to support FSDP training by viewing int dtype quantized weights as a selectable float dtype when quantizing, and…
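
The trick being referenced, in isolation (illustrative only, not the HQQ source): FSDP flat-shards parameters and expects a uniform floating dtype, so the packed integer weights are bit-cast to a float dtype for storage and cast back before dequantizing.

```python
import torch

# Stand-in for HQQ's packed integer quantized weights.
packed = torch.randint(0, 256, (1024, 64), dtype=torch.uint8)

# Store the same bytes under a float dtype so FSDP is happy (a bit-cast, not a value cast).
as_float = packed.view(torch.bfloat16)  # last dim's byte count must divide evenly

# Recover the original integer payload before dequantizing.
restored = as_float.view(torch.uint8)
assert torch.equal(restored, packed)
```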

Private LLM (@private_llm)

M Maarouf Quantizing the model's embedding layer hurts the model's perplexity, and by extension, not quantizing it improves perplexity. It's similar with Gemma 2B IT. For Phi-3-mini, it comes at the cost of a ~100 MB increase in memory footprint, which we feel is a fair trade-off.
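
Back-of-envelope for that trade-off (the dimensions below are Phi-3-mini's published config; the exact overhead depends on the baseline bit-width and packing, so this is only a ballpark):

```python
# Rough memory cost of the token-embedding matrix at different precisions.
vocab_size, hidden = 32064, 3072  # Phi-3-mini config (assumed here)

def embedding_mib(bits):
    return vocab_size * hidden * bits / 8 / 2**20

for bits in (16, 8, 4):
    print(f"{bits}-bit embeddings: {embedding_mib(bits):.0f} MiB")
# Holding the embeddings at higher precision than the 4-bit body costs roughly
# 50-150 MiB extra, in the same ballpark as the ~100 MB quoted above.
```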

cocktail peanut (@cocktailpeanut)

llama3-gradient first impressions

At least for the quantized llama3-gradient from ollama, the quality is pretty bad. Hallucinates like hell.

I did hear that quantizing llama3 generally hurts quality, and maybe that's why. Anyone tried the original one?

Some results:

Ozlem Peksoy Bishop 💥♻️🛸 (@opeksoy)

: continuums inquiries are precious :: our efforts of quantizing and categorizing everything artificially confuse our ‘understanding’ ::: fluidity as step one affords quite a lot especially when you free your perceptions of time limitations

Black Radioactive Boi 🚂☢ (@TokenOfTheMonth)

We as a society are COOKED if mfrs are coming on sayin quantizing a track is harder than rocket science like bro what is HAPPENING in our schools LMAOOO
