

Showing posts with the label Language Models

Difference Between Semi-Supervised Learning and Self-Supervised Learning

There are many styles of training machine learning models, from the familiar supervised and unsupervised learning to active learning, semi-supervised learning, and self-supervised learning. In this post, I will explain the difference between the semi-supervised and self-supervised styles of learning. To get started, let us first recap supervised learning, the most popular machine learning methodology for building predictive models. Supervised learning uses annotated, or labeled, data to train predictive models. A label attached to a data vector is simply the response that the predictive model should generate when given that data vector as input during training. For example, we label pictures of cats and dogs with the labels cat and dog to train a Cat versus Dog classifier. We assume that a large enough labeled training data set is available when building a classifier. When there are no labels attached to the training data, then the learning style is known as uns

Retrieval Augmented Generation: What is it and Why do we need it?

What is Retrieval Augmented Generation? Generative AI is currently garnering a lot of attention. While the responses provided by large language models (LLMs) are satisfactory in most situations, sometimes we want better-focused responses when employing LLMs in specific domains. Retrieval-augmented generation (RAG) offers one way to improve the output of generative AI systems. RAG enhances the LLMs' capabilities by providing them with additional knowledge context through information retrieval. Thus, RAG aims to combine the strengths of retrieval-based methods, which focus on selecting relevant information, and generation-based methods, which produce coherent and fluent text. RAG works in the following way: Retrieval: The process starts with retrieving relevant documents, passages, or pieces of information from a pre-defined corpus or database. These retrieved sources contain content that is related to the topic or context for which you want to gen
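The retrieval step described in this excerpt can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three-sentence corpus, the query, the helper names `retrieve` and `build_prompt`, and the choice of bag-of-words cosine similarity as the relevance score. Real RAG systems typically use learned embeddings and a vector index, and the generation step (sending the prompt to an LLM) is left out.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a text as lowercase word counts (a deliberately crude stand-in for embeddings)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, corpus, k=2):
    """Return the k passages from the corpus that score highest against the query."""
    query_vec = bag_of_words(query)
    ranked = sorted(
        corpus,
        key=lambda passage: cosine_similarity(query_vec, bag_of_words(passage)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, passages):
    """Prepend the retrieved passages to the question as additional context (hypothetical template)."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical pre-defined corpus and query
corpus = [
    "the warranty covers battery replacement for two years",
    "our office closes on public holidays",
    "battery replacement takes about one hour at a service center",
]
query = "how long is the battery warranty"

top_passages = retrieve(query, corpus, k=2)
prompt = build_prompt(query, top_passages)
```

The resulting `prompt` is what would be handed to the language model, so that generation is grounded in the retrieved passages rather than only in what the model learned during training.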

Exploring Large Language Models: Types and Applications

Large language models (LLMs) are currently all the craze. Who hasn't heard of ChatGPT, which can deliver all kinds of responses to user prompts, be it a recipe, suggestions for a vacation, or an essay on a topic for a term paper? It is all possible because of the underlying large language models. So what are large language models? How do these models work? What can we do with these models? Let's try to answer these questions without going into much technical detail. What are Large Language Models? We will begin by first trying to understand what a language model is. Think about using your cell phone for messaging. As you enter text, your cell phone tries to guess the word you are typing; see the figure below. Under the hood, a language model is computing probabilities for the next character/word and displaying the top three or five most probable characters/words. There are a few types of language models such as rule-based models, statistical language models, and the recurrent neura
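To make the cell-phone example concrete, here is a minimal sketch of a statistical language model that computes next-word probabilities and returns the top few candidates. It is a simple bigram (word-pair count) model, which is an assumption chosen for brevity and not necessarily what any phone keyboard uses; the toy corpus and the function names `train_bigram_model` and `predict_next` are likewise hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def predict_next(followers, word, k=3):
    """Return up to k (next_word, probability) pairs, most probable first."""
    counts = followers.get(word.lower())
    if not counts:
        return []  # the word never appeared with a successor in the training text
    total = sum(counts.values())
    return [(w, c / total) for w, c in counts.most_common(k)]


# Hypothetical toy training text
corpus = "see you soon . see you later . see you soon . see me later ."
model = train_bigram_model(corpus)

# Suggestions after typing "you": "soon" follows it twice, "later" once
suggestions = predict_next(model, "you")
```

Here `suggestions` ranks "soon" (probability 2/3) ahead of "later" (probability 1/3). Large language models follow the same predict-the-next-token idea, but replace the count table with a neural network trained on far more text.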