Analysis of mean-field models arising from self-attention dynamics in transformer architectures with layer normalization

Martin Burger, Samira Kabri, Yury Korolev, Tim Roith, Lukas Weigand

Research output: Working paper / Preprint
