2 Comments
Daniel Popescu / ⧉ Pluralisk

Fascinating. Remember RNNs? This article nails why self-attention was such a relief.

Gonçalo Perdigão

Thanks, Daniel! Absolutely, it’s wild to think how much time we spent tuning RNNs and LSTMs to handle long dependencies. The elegance of self-attention really did feel like a breath of fresh air, both conceptually and computationally. Glad you enjoyed the read!