Skip to main content


Showing posts with the label NLP

Embeddings Beyond Words: Intro to Sentence Embeddings

It wouldn't be an exaggeration to say that the recent advances in Natural Language Processing (NLP) technology can be, to a large extent, attributed to the use of very high-dimensional vectors for language representation. These high-dimensional, 764 dimensions is common, vector representations are called embeddings and are aimed at capturing semantic meaning and relationships between linguistic items. Although the idea of using vector representation for words has been around for many years, the interest in word embedding took a quantum jump with Tomáš Mikolov’s Word2vec algorithm in 2013. Since then, many methods for generating word embeddings, for example GloVe and BERT , have been developed. Before moving on further, let's see briefly how word embedding methods work. Word Embedding: How is it Performed? I am going to explain how word embedding is done using the Word2vec method. This method uses a linear encoder-decoder network with a single hidden layer. The input layer o

Exploring Large Language Models: Types and Applications

Large language models (LLMs) are currently the craze. Who hasn't heard of ChatGPT that can deliver all kinds of responses to user prompts, be a recipe or suggestions for vacation or an essay on a topic for a term paper. It is all possible because of the underlying large language models. So what are large language models? How do these models work? What can we do with these models? Let's try to answer these questions without going into much technical details. What are Large Language Models? We will begin by first trying to understand what is a language model. Think about using your cell phone for messaging. As you enter text, your cell phone tries to guess the word you are typing, see the figure below. Under the hood, a language model is computing probabilities for the next character/word and is displaying the top three or five most probable characters/words.  There are a few types of language models such as rule-based models, statistical language models, and the recurrent neura