Nikos Aletras (@nikaletras):

If you missed Huiyin Xue's poster yesterday on HashFormers, you can find the video/poster/slides here: underline.io/events/342/ses…

Huiyin Xue (@HuiyinXue):

MHE attention requires only a negligible number of additional parameters (3nd, where n is the number of attention heads and d the size of the head embeddings) compared to single-head attention, while MHA requires (3n^2 − 3n)d^2 − 3nd additional parameters.
[4/k]
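
To put those formulas in perspective, here is a minimal Python sketch that simply plugs example values into the two expressions quoted above; the choice of n = 12 heads and d = 64 (BERT-base-like) is an illustrative assumption, not a figure from the thread or the paper.

def mhe_extra_params(n: int, d: int) -> int:
    # Additional parameters of MHE over single-head attention: 3nd
    return 3 * n * d

def mha_extra_params(n: int, d: int) -> int:
    # Additional parameters of vanilla MHA: (3n^2 - 3n)d^2 - 3nd
    return (3 * n**2 - 3 * n) * d**2 - 3 * n * d

n, d = 12, 64  # assumed BERT-base-like setting: 12 heads, 64-dim head embeddings
print(f"MHE extra parameters: {mhe_extra_params(n, d):,}")  # 2,304
print(f"MHA extra parameters: {mha_extra_params(n, d):,}")  # 1,619,712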

Huiyin Xue (@HuiyinXue):

MHE is substantially more memory efficient than alternative attention mechanisms, while achieving a high predictive-performance retention ratio relative to vanilla MHA on several downstream tasks.
[3/k]
