2020
  https://towardsdatascience.com/topic-modeling-with-bert-779f7db187e6
  https://nuancesprog.ru/p/11195/
  https://medium.com/read-a-paper/bert-read-a-paper-811b836141e9
  https://nuancesprog.ru/p/10597/

2019
  https://jalammar.github.io/a-visual-guide-to-using-bert-for-the-first-time/
  https://habr.com/ru/post/498144/

  Ubaidulaev - Transformers
    https://www.youtube.com/watch?v=tfGkuYkjDpI
    https://habr.com/ru/company/mipt/blog/462989/
    https://habr.com/ru/post/458992/
    https://medium.com/dair-ai/adapters-a-compact-and-extensible-transfer-learning-method-for-nlp-6d18c2399f62

  Bogdanov - Attention (ru)
    https://dev.by/news/dmitry-bogdanov
    https://www.youtube.com/watch?v=qKL9hWQQQic
    https://www.dropbox.com/s/1nk66rixz4ets03/Lecture%2012%20-%20Attention%20-%20annotated.pdf?dl=0

  Kurbanov - Attention, attention
    https://www.youtube.com/watch?v=q9svwVYduSo
    https://research.jetbrains.org/files/material/5c642f68c724e.pdf
    https://ai.googleblog.com/2019/01/transformer-xl-unleashing-potential-of.html
    https://habr.com/ru/post/436878/

2018
  Attention Is All You Need
    https://www.youtube.com/watch?v=iDulhoQ2pro
    https://arxiv.org/abs/1706.03762
    http://nlp.seas.harvard.edu/2018/04/03/attention.html
    https://mchromiak.github.io/articles/2017/Sep/12/Transformer-Attention-is-all-you-need/#.W-2AUeK-mUk
    https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html

  BERT
    https://habr.com/ru/company/neoflex/blog/589563/
    https://habr.com/ru/company/avito/blog/485290/
    https://medium.com/syncedreview/googles-albert-is-a-leaner-bert-achieves-sota-on-3-nlp-benchmarks-f64466dd583
    https://medium.com/huggingface/distilbert-8cf3380435b5
    http://www.nlp.town/blog/distilling-bert/
    https://towardsdatascience.com/deconstructing-bert-distilling-6-patterns-from-100-million-parameters-b49113672f77
    https://jalammar.github.io/illustrated-bert/
    https://habr.com/ru/post/487358/
    https://medium.com/dissecting-bert/dissecting-bert-appendix-the-decoder-3b86f66b0e5f
    https://medium.com/dissecting-bert/dissecting-bert-part2-335ff2ed9c73
    https://medium.com/dissecting-bert/dissecting-bert-part-1-d3c3d495cdb3
    https://www.infoq.com/news/2018/11/google-bert-nlp
    https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html
    https://www.nytimes.com/2018/11/18/technology/artificial-intelligence-language.html
    https://github.com/google-research/bert/

  BERT - Pre-training of Deep Bidirectional Transformers for Language Understanding
    https://www.youtube.com/watch?v=-9evrZnBorM
    https://arxiv.org/abs/1810.04805

  What Does BERT Look At? An Analysis of BERT's Attention
    https://arxiv.org/abs/1906.04341
    https://github.com/clarkkev/attention-analysis
    https://blog.einstein.ai/leveraging-language-models-for-commonsense/

  ALBERT
    https://ai.googleblog.com/2019/12/albert-lite-bert-for-self-supervised.html
    https://towardsdatascience.com/bert-explained-state-of-the-art-language-model-for-nlp-f8b21a9b6270
    https://www.reddit.com/r/MachineLearning/comments/9nfqxz/r_bert_pretraining_of_deep_bidirectional/

  seq2seq
    https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/

  LSTMs
    http://colah.github.io/posts/2015-08-Understanding-LSTMs/

  bert-tf2
    https://github.com/u10000129/bert_tf2

  fast-bert
    https://github.com/kaushaltrivedi/fast-bert
    https://medium.com/huggingface/introducing-fastbert-a-simple-deep-learning-library-for-bert-models-89ff763ad384