LoRA: Low-Rank Adaptation of Large Language Models

How does LoRA work?


LoRA freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture.
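In the paper's notation, for a pre-trained weight matrix $W_0 \in \mathbb{R}^{d \times k}$ the update is constrained to a low-rank decomposition, so the forward pass becomes

$$h = W_0 x + \Delta W x = W_0 x + BAx,$$

where $B \in \mathbb{R}^{d \times r}$, $A \in \mathbb{R}^{r \times k}$, and the rank $r \ll \min(d, k)$. Only $A$ and $B$ receive gradient updates; $W_0$ stays frozen.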

They use a random Gaussian initialization for $A$ and zero for $B$, so $\Delta W = BA$ is zero at the beginning of training.
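A minimal PyTorch sketch of this idea, assuming a LoRA wrapper around an existing linear layer (the class and variable names here are illustrative, not taken from the paper's released code):

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen pre-trained linear layer plus a trainable low-rank update BA."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        # Freeze the pre-trained weights; only A and B are trained.
        for p in self.base.parameters():
            p.requires_grad = False

        d_out, d_in = base.weight.shape
        # A: random Gaussian init, B: zeros, so ΔW = BA is zero at the start.
        self.lora_A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(d_out, r))
        # The paper scales the update by alpha / r.
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # h = W0 x + (alpha / r) * B A x
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```

In the paper's experiments, such adapters are applied to the attention projection matrices (e.g. $W_q$ and $W_v$) of each Transformer layer.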
