

Showing posts from October, 2023

Mapping Nodes to Vectors: An Intro to Node Embedding

In an earlier post, I stated that the recent advances in Natural Language Processing (NLP) technology can, to a large extent, be attributed to the use of very high-dimensional vectors for language representation. These high-dimensional vector representations (768 dimensions is common) are called embeddings and are aimed at capturing semantic meaning and relationships between linguistic items. Given that graphs are everywhere, it is not surprising to see the ideas of word and sentence embeddings being extended to graphs in the form of node embeddings.

What are Node Embeddings? Node embeddings are encodings of the properties and relationships of nodes in a low-dimensional vector space. This enables nodes with similar properties or connectivity patterns to have similar vector representations. Using node embeddings can improve performance on various graph analytics tasks such as node classification, link prediction, and clustering.

Methods for Node Embeddings There are several…
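To make this concrete, here is a minimal DeepWalk-style sketch that learns node embeddings by running random walks over a graph and feeding them to Word2Vec. It assumes networkx and gensim are available; the example graph, walk settings, and embedding dimension are illustrative choices, not the method from the full post.

```python
# DeepWalk-style node embeddings: treat random walks as "sentences"
# and train a skip-gram Word2Vec model on them.
import random
import networkx as nx
from gensim.models import Word2Vec

G = nx.karate_club_graph()  # small built-in example graph

def random_walk(graph, start, length=10):
    """Return a random walk from `start` as a list of string tokens."""
    walk = [start]
    for _ in range(length - 1):
        neighbors = list(graph.neighbors(walk[-1]))
        if not neighbors:
            break
        walk.append(random.choice(neighbors))
    return [str(n) for n in walk]  # Word2Vec expects string tokens

# Several walks per node serve as the training corpus
walks = [random_walk(G, node) for node in G.nodes() for _ in range(20)]

# vector_size is the embedding dimension; sg=1 selects skip-gram
model = Word2Vec(walks, vector_size=64, window=5, min_count=0, sg=1, epochs=5)

# Nodes with similar connectivity patterns get similar vectors
print(model.wv.most_similar("0", topn=5))
```

Nodes that tend to co-occur on walks end up close together in the embedding space, which is what makes the vectors useful for classification, link prediction, and clustering.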

Google's Bard Can Code and Compute for You

Large language models (LLMs) continue to fascinate us with their capabilities to answer our questions, generate presentations and essays for us, and perform many other assorted tasks. These models are also good at generating code for user-specified tasks. However, almost all of them do not run the code for us; they simply give us the code that we can copy and execute. Recently, Google has given its large language model, Bard, computational capabilities as well. Bard thus not only provides the code but also executes it while answering the user's question. I wanted to check this feature of Bard. Below is what happened when I asked Bard a question that involved some computation. Bard not only generated the code for the entropy calculation and ran it, but also went on to explain entropy and its answer. Google characterizes computing by Bard in response to user questions as a "writing code on the fly" method. The company says, "So far, we've seen this method improve the accuracy of…"
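For reference, the kind of computation my question involved is ordinary Shannon entropy. Below is a minimal sketch in Python, assuming a discrete probability distribution; this is just an illustration of the calculation, not the code Bard actually produced.

```python
# Shannon entropy of a discrete distribution, in bits: H = -sum(p * log2(p))
import math

def shannon_entropy(probs):
    """Entropy over the nonzero probabilities of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.25, 0.25]))  # 1.5 bits
```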

Exploring Canonical Correlation Analysis (CCA): Uncovering Hidden Relationships

Canonical Correlation Analysis (CCA) is a statistical technique that enables us to uncover hidden associations between two sets of variables. Whether in psychology, economics, genetics, marketing, or machine learning, CCA proves to be a powerful tool for gaining valuable insights. In this blog post, we will try to understand CCA. But first, let's take a look at two sets of observations, X and Y, shown below. These two sets of observations are made on the same set of objects, and each column represents a different variable. Let's calculate the pairwise correlations between the column vectors of X and Y. The resulting correlation values should give us some insight into the relationship between the two sets of measurements. These values are shown below, where the entry at (i,j) represents the correlation between the i-th column of X and the j-th column of Y. The correlation values show moderate to almost no correlation between the columns of the two datasets, except a relatively higher…
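Since the X and Y matrices from the post are not reproduced in this excerpt, here is a minimal sketch of CCA using scikit-learn on synthetic data; the shapes, noise level, and number of components are illustrative assumptions.

```python
# CCA on synthetic paired datasets: find projections of X and Y whose
# resulting canonical variates are maximally correlated.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))                    # 100 objects, 4 variables
Y = X @ rng.normal(size=(4, 3)) + 0.5 * rng.normal(size=(100, 3))  # related to X, plus noise

cca = CCA(n_components=2)
cca.fit(X, Y)
X_c, Y_c = cca.transform(X, Y)                   # paired canonical variates

# The correlation between each pair of variates is a canonical correlation
for i in range(2):
    r = np.corrcoef(X_c[:, i], Y_c[:, i])[0, 1]
    print(f"Canonical correlation {i + 1}: {r:.3f}")
```

Unlike the pairwise column correlations discussed above, CCA looks for linear combinations of the columns, which is how it can reveal associations that the raw correlation matrix misses.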