Skip to main content


Embeddings Beyond Words: Intro to Sentence Embeddings

It wouldn't be an exaggeration to say that the recent advances in Natural Language Processing (NLP) technology can be, to a large extent, attributed to the use of very high-dimensional vectors for language representation. These high-dimensional, 764 dimensions is common, vector representations are called embeddings and are aimed at capturing semantic meaning and relationships between linguistic items. Although the idea of using vector representation for words has been around for many years, the interest in word embedding took a quantum jump with Tomáš Mikolov’s Word2vec algorithm in 2013. Since then, many methods for generating word embeddings, for example GloVe and BERT , have been developed. Before moving on further, let's see briefly how word embedding methods work. Word Embedding: How is it Performed? I am going to explain how word embedding is done using the Word2vec method. This method uses a linear encoder-decoder network with a single hidden layer. The input layer o

Claude 2: A New Member of the Growing Family of Large Language Models

AI has advanced rapidly in recent years, with large language models (LLMs) like ChatGPT creating enormous excitement. These models can generate remarkably human-like text albeit  with certain limitations. In this post, we'll look at a new member of the family of large language models, Anthropic's Claude 2 , and highlight some of its features. Claude 2 Overview Claude2 was released in February 2023.  Claude 2 utilizes a context window of approximately 4,000 tokens during conversations. This allows it to actively reference the last 1,000-2,000 words spoken in order to strengthen contextual awareness and continuity. The context window is dynamically managed, expanding or contracting slightly based on factors like conversation complexity. This context capacity exceeds ChatGPT's approximately 1,000 token window, enabling Claude 2 to sustain longer, more intricate dialogues while retaining appropriate context.  In addition to conversational context, Claude 2 can take in multiple

Difference Between Semi-Supervised Learning and Self-Supervised Learning

There are many styles of training machine learning models including the familiar supervised and unsupervised learning to active learning, semi-supervised learning and self-supervised learning. In this post, I will explain the difference between semi-supervised and self-supervised styles of learning. To get started, let us first recap what is  supervised learning, the most popular machine learning methodology to build predictive models. Supervised learning uses annotated or labeled data to train predictive models. A   label   attached to a data vector is nothing but the response that the predictive model should generate  for that data vector as input during the model training. For example, we will label pictures of cats and dogs with labels   cat   and   dog  to train  a Cat versus Dog classifier. We assume a large enough training data set with labels is available w hen building a classifier. When there are no labels attached to the training data, then the learning style is known as uns