Skip to main content


Showing posts with the label ChatGPT

Claude 2: A New Member of the Growing Family of Large Language Models

AI has advanced rapidly in recent years, with large language models (LLMs) like ChatGPT creating enormous excitement. These models can generate remarkably human-like text albeit  with certain limitations. In this post, we'll look at a new member of the family of large language models, Anthropic's Claude 2 , and highlight some of its features. Claude 2 Overview Claude2 was released in February 2023.  Claude 2 utilizes a context window of approximately 4,000 tokens during conversations. This allows it to actively reference the last 1,000-2,000 words spoken in order to strengthen contextual awareness and continuity. The context window is dynamically managed, expanding or contracting slightly based on factors like conversation complexity. This context capacity exceeds ChatGPT's approximately 1,000 token window, enabling Claude 2 to sustain longer, more intricate dialogues while retaining appropriate context.  In addition to conversational context, Claude 2 can take in multiple

LLaMA 2 and its Symbolic Regression Explanation

On July 17, a new family of AI models, LLaMA 2 was announced by Meta. LLaMA 2 is trained on a mix of publicly available data. According to Meta LLaMA 2 performs significantly better than the previous generation of LLaMA models. Two flavors of the model: LLaMA 2 and LLaMA 2-Chat, a model fine tuned for two-way conversations, were released. Each flavor further has three versions with the parameters ranging from 7 billions to 70 billions. Meta is also freely releasing the code and data behind the model for  researchers to build upon and improve the technology. There are several ways to access LLaMA 2 for development work; you can download it from HuggingFace or access it via Microsoft Azure or Amazon SageMaker . For those interested in interacting with the LLaMA 2-Chat version, you can do so by visiting , a chatbot model demo hosted by the venture capitalist Andreessen Horowitz. This is the route I took to interact with LLaMA 2-Chat. Since I was reading an excellent paper on

Reinforcement Learning with Human Feedback: A Powerful Approach to AI Training

The unprecedented capabilities exhibited by the large language models (LLMs) such as ChatGPT and GPT-4 have created enormous excitement as well as concerns about the impact of AI on the society in near and far future. Behind the success of LLMs and AI in general lies among other techniques a learning approach called Reinforcement Learning with Human Feedback (RLHF). In this blog post, we will try to understand what RLHF is and why it offers a powerful approach to training AI models. However, before we do that, let's try to understand the concept of reinforcement learning (RL). What is Reinforcement Learning (RL)? RL, inspired by the principles of behavioral psychology, is a machine learning technique wherein the learner, called an agent , learns decision making by exploring an environment through a trial-and-error process to achieve its goal. Each action by the agent results in feedback in the form of a reward or punishment . While performing actions and receiving feedback, the a

In Bard's Own Words How is it Different from ChatGPT

Now that Google's Bard is available, I thought it will be fun to ask Bard its difference from ChatGPT. I think the response is pretty on target. What do you think? Please share your thoughts via comments.  

Exploring Large Language Models: Types and Applications

Large language models (LLMs) are currently the craze. Who hasn't heard of ChatGPT that can deliver all kinds of responses to user prompts, be a recipe or suggestions for vacation or an essay on a topic for a term paper. It is all possible because of the underlying large language models. So what are large language models? How do these models work? What can we do with these models? Let's try to answer these questions without going into much technical details. What are Large Language Models? We will begin by first trying to understand what is a language model. Think about using your cell phone for messaging. As you enter text, your cell phone tries to guess the word you are typing, see the figure below. Under the hood, a language model is computing probabilities for the next character/word and is displaying the top three or five most probable characters/words.  There are a few types of language models such as rule-based models, statistical language models, and the recurrent neura