Posts

Showing posts from July, 2023

LLaMA 2 and its Symbolic Regression Explanation

On July 17, Meta announced LLaMA 2, a new family of AI models. LLaMA 2 is trained on a mix of publicly available data. According to Meta, LLaMA 2 performs significantly better than the previous generation of LLaMA models. Two flavors of the model were released: LLaMA 2 and LLaMA 2-Chat, a model fine-tuned for two-way conversations. Each flavor comes in three versions, with parameter counts ranging from 7 billion to 70 billion. Meta is also freely releasing the code and data behind the model for researchers to build upon and improve the technology. There are several ways to access LLaMA 2 for development work: you can download it from HuggingFace or access it via Microsoft Azure or Amazon SageMaker. For those interested in interacting with the LLaMA 2-Chat version, you can do so by visiting llama2.ai, a chatbot model demo hosted by the venture capital firm Andreessen Horowitz. This is the route I took to interact with LLaMA 2-Chat. Since I was reading an excellent paper on

Low Rank Adaptation (LoRA): Enhancing Fine-Tuning of LLMs

Pre-trained large language models (LLMs) are used in numerous natural language processing applications. These models perform well out of the box and can be fine-tuned for a desired downstream application. However, fine-tuning them for specific tasks is often challenging because of their very large number of parameters. To address this, a technique called Low Rank Adaptation (LoRA) has emerged, enabling efficient fine-tuning of LLMs. In this post, we will try to understand LoRA and its importance and application in fine-tuning LLMs. We will begin with the concept of the rank of a matrix, then look at matrix factorization, and finally turn to LoRA.

Rank of a Matrix

The rank of a matrix is the number of linearly independent rows (equivalently, columns) in the matrix. As an example, consider the following 4x4 matrix A:

A = [[2, 4, 6, 8], [1, 3, 5, 7], [4, 8, 12, 16], [3, 9, 15, 21]]

Looking at the first and third row of this matrix, we see that the third row
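
To make the idea of rank concrete, here is a minimal sketch that counts the linearly independent rows of the matrix A above using Gaussian elimination on exact fractions. The helper name `matrix_rank` is my own choice for illustration and is not from the post; in practice a library routine such as NumPy's `numpy.linalg.matrix_rank` would normally be used instead.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Return the rank of a matrix via Gaussian elimination on exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below position `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every row below the pivot row.
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# The 4x4 example matrix from the text.
A = [[2, 4, 6, 8], [1, 3, 5, 7], [4, 8, 12, 16], [3, 9, 15, 21]]

rank_A = matrix_rank(A)
# Check numerically whether row 3 is an exact multiple (2x) of row 1.
third_is_double_first = all(c == 2 * a for a, c in zip(A[0], A[2]))
print(rank_A, third_is_double_first)
```

Exact fractions are used here rather than floating point so that rows which are exact multiples of one another reduce to exactly zero, without needing a numerical tolerance.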