Rita Sevastjanova (@RSevastjanova)

Want to explore embedding properties? Here are some ideas for data preparation steps ✂️, interesting additional features 🍩, and some relevant references ✍️: embedding-framework.lingvis.io With Menna El-Assady 😊

Rita Sevastjanova (@RSevastjanova)

Our visualizations show that larger models, e.g., ChatGPT and Bard, are less biased than their smaller counterparts. Are the models themselves getting better, or is it the engineering steps used to adapt their outputs? 🤔 @melassady #ieeevis #vds 👉 prompt-comparison.lingvis.io
Rita Sevastjanova (@RSevastjanova)

What biases are encoded in texts generated by #LLMs? Our workspace helps to explore stereotypes encoded in prompt outputs and detect unexpected word associations. @melassady will present our work at @VisualDataSci (VDS at KDD and VIS) 🥳 Paper and demo: prompt-comparison.lingvis.io #ieeevis2023 @ieeevis
Thilo Spinner (@spinthil)

@enjalot @RSevastjanova Thanks a lot for your effort! 😊
I also figured that would be the correct CLIP version. However, regardless of whether I try the laion/CLIP-ViT-H-14-laion2B-s32B-b79K model from Hugging Face or your linked code with the open-clip-torch pip package, I get embeddings that differ from SDv2.1's.
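For context, here is a rough sketch of the two paths being compared; the prompt and the exact loading calls are illustrative assumptions, not code from this thread. Both routes return CLIP's pooled, projected text embedding, whereas Stable Diffusion conditions its U-Net on per-token hidden states, which is one reason the numbers won't line up with SDv2.1's text encoder.

```python
# Illustrative sketch, not the code discussed in the thread.
import torch
import open_clip
from transformers import CLIPTokenizer, CLIPTextModelWithProjection

prompt = ["a photo of an astronaut riding a horse"]  # hypothetical test prompt

# (a) open-clip-torch checkpoint, as referenced by the SD 2.x configs
oc_model, _, _ = open_clip.create_model_and_transforms(
    "ViT-H-14", pretrained="laion2b_s32b_b79k")
oc_tokens = open_clip.get_tokenizer("ViT-H-14")(prompt)
with torch.no_grad():
    oc_emb = oc_model.encode_text(oc_tokens)            # pooled + projected

# (b) the same weights from the Hugging Face Hub
hf_tok = CLIPTokenizer.from_pretrained("laion/CLIP-ViT-H-14-laion2B-s32B-b79K")
hf_model = CLIPTextModelWithProjection.from_pretrained(
    "laion/CLIP-ViT-H-14-laion2B-s32B-b79K")
hf_inputs = hf_tok(prompt, padding=True, return_tensors="pt")
with torch.no_grad():
    hf_emb = hf_model(**hf_inputs).text_embeds           # pooled + projected

# Both are sentence-level CLIP embeddings; SDv2.1 instead feeds per-token
# hidden states to the U-Net, so neither matches its conditioning tensor.
print(oc_emb.shape, hf_emb.shape)
```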
Ian Johnson 💻🔥 (@enjalot)

@spinthil @RSevastjanova Yeah, I forgot where I found the relevant line, but I made this little server to give me both the old CLIP embeddings for 1.x and the new ones for 2.x:
gist.github.com/enjalot/222895…

I made a 'full_encode' method that borrowed an extra piece from the transformers lib, I think.
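The gist itself is not reproduced here; the following is a hypothetical sketch of what a 'full_encode'-style helper could look like, assuming the diffusers checkpoint layout (tokenizer/ and text_encoder/ subfolders) and stand-in repo ids. It returns the per-token hidden states the U-Net consumes rather than CLIP's pooled embedding, and handles 1.x and 2.x checkpoints alike because each ships its own text encoder.

```python
# Hypothetical helper, inspired by (but not copied from) the gist above.
import torch
from transformers import CLIPTokenizer, CLIPTextModel

def full_encode(prompt: str, repo: str) -> torch.Tensor:
    """Return the (1, 77, dim) token-level conditioning tensor for an SD checkpoint."""
    tokenizer = CLIPTokenizer.from_pretrained(repo, subfolder="tokenizer")
    encoder = CLIPTextModel.from_pretrained(repo, subfolder="text_encoder")
    inputs = tokenizer(prompt, padding="max_length", max_length=77,
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = encoder(input_ids=inputs.input_ids)
    return out.last_hidden_state

# SD 1.x and 2.x ship different text towers (CLIP ViT-L/14 vs OpenCLIP ViT-H/14),
# so the hidden dimension differs: 768 vs 1024.
emb_v1 = full_encode("a corgi wearing sunglasses", "CompVis/stable-diffusion-v1-4")
emb_v2 = full_encode("a corgi wearing sunglasses", "stabilityai/stable-diffusion-2-1")
print(emb_v1.shape, emb_v2.shape)  # (1, 77, 768) and (1, 77, 1024)
```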
Thilo Spinner (@spinthil)

@enjalot @RSevastjanova Okay, by chance, I found that SDv2.1 uses the penultimate (second-to-last) layer. With that, I get the correct embeddings. 🎉
Thanks a lot for your hints; without experimenting with the FrozenOpenCLIPEmbedder, I would not have noticed this! 😌
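A sketch of what "penultimate layer" means in practice, using the transformers port of the same OpenCLIP checkpoint (the repo id and prompt are assumptions); the FrozenOpenCLIPEmbedder in the stablediffusion codebase does the equivalent through open-clip-torch with layer="penultimate".

```python
# Sketch: extracting the penultimate-layer states that SDv2.1 conditions on.
import torch
from transformers import CLIPTokenizer, CLIPTextModel

repo = "laion/CLIP-ViT-H-14-laion2B-s32B-b79K"  # the full 24-layer text tower
tokenizer = CLIPTokenizer.from_pretrained(repo)
encoder = CLIPTextModel.from_pretrained(repo)

inputs = tokenizer("a photo of an astronaut riding a horse",
                   padding="max_length", max_length=77,
                   truncation=True, return_tensors="pt")
with torch.no_grad():
    out = encoder(input_ids=inputs.input_ids, output_hidden_states=True)

last = out.hidden_states[-1]         # final layer: not what SDv2.1 uses
penultimate = out.hidden_states[-2]  # second-to-last layer: what SDv2.1 uses
# FrozenOpenCLIPEmbedder additionally applies the text tower's final layer norm
# on top of the penultimate states before handing them to the U-Net.
cond = encoder.text_model.final_layer_norm(penultimate)
print(cond.shape)  # (1, 77, 1024)
```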

Thilo Spinner (@spinthil)

@enjalot @RSevastjanova I feel like I'm missing something here...
Did you compare the embeddings between CLIP-ViT-H-14-laion2B-s32B-b79K and the SDv2.1 text encoder?

Elena Leuschner (@ElenaLeuschner)

There are so many people to thank! This has been so long in the making, and we hope we included a complete list in the paper 😊

An incomplete list (only people on Twitter): Ian Burton, Eva Thomann @evathomann.bsky.social, Ronny Patz, Mirko Heinzel, Laurin Friedrich, @kateweaverUT, Rita Sevastjanova

Rita Sevastjanova (@RSevastjanova)

Brendan Dolan-Gavitt @idkmyna40085148 If it is still relevant: RoBERTa (encoder-only) fine-tuned for a code classification task (including C++): huggingface.co/NTUYG/DeepSCC-… I would take the CLS token (here: <s>) embeddings from the second-to-last layer.
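A minimal sketch of that recipe with the transformers library; the model id below is a stand-in, since the DeepSCC link above is truncated, and the snippet is illustrative rather than an exact setup.

```python
# Sketch: <s> (CLS) embedding from the second-to-last layer of a RoBERTa model.
import torch
from transformers import AutoTokenizer, AutoModel

# Stand-in model id; substitute the fine-tuned DeepSCC checkpoint from the link above.
repo = "roberta-base"

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModel.from_pretrained(repo)

code = "int main() { return 0; }"  # example C++ snippet
inputs = tokenizer(code, return_tensors="pt", truncation=True)

with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# hidden_states = (embeddings, layer 1, ..., layer N); [-2] is the second-to-last
# layer, and index 0 along the sequence axis is the <s> (CLS) token.
cls_embedding = out.hidden_states[-2][:, 0]
print(cls_embedding.shape)  # (1, 768) for roberta-base
```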

Rita Sevastjanova (@RSevastjanova)

Brendan Dolan-Gavitt @idkmyna40085148 Yes. But I think the problem here is the input data (GPT-2 has never seen code during its training) rather than the model's type.
