

Graph Embedding with GraphSAGE

Graph embedding refers to learning low-dimensional representations of the nodes in a network; the representation encodes structural information and node features. Such a representation can then be used for downstream machine learning tasks such as node classification, link prediction, and visualization. These tasks often involve dynamic or evolving networks such as social networks and recommendation networks. Most graph embedding methods assume a fixed graph structure and are thus unsuitable for dynamic networks. The GraphSAGE embedding method overcomes this limitation by incorporating two changes, sampling and aggregating features, into the graph convolutional networks (GCNs) that are used for fixed-structure networks. These changes, explained below, not only make GraphSAGE computationally efficient enough to scale to very large graphs but also permit embeddings to be generated for nodes of a graph not seen before. This inductive capability to work with unsee
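As a rough illustration of the two ideas named above, sampling and aggregating, here is a minimal sketch in plain Python. It assumes a mean aggregator and leaves out the learned weight matrices and nonlinearity that an actual GraphSAGE layer applies; the toy graph, the feature values, and the helper names `sample_neighbors` and `sage_layer` are all hypothetical.

```python
import random

def sample_neighbors(graph, node, k, rng):
    """Return at most k neighbors of `node`, sampled without replacement."""
    nbrs = graph[node]
    if len(nbrs) <= k:
        return list(nbrs)
    return rng.sample(nbrs, k)

def sage_layer(graph, feats, node, k, rng):
    """Concatenate a node's own features with the mean of sampled neighbor features.

    Simplification (assumption): no trainable weights or activation are applied.
    """
    sampled = sample_neighbors(graph, node, k, rng)
    dim = len(feats[node])
    if sampled:
        agg = [sum(feats[n][d] for n in sampled) / len(sampled) for d in range(dim)]
    else:
        agg = [0.0] * dim
    return list(feats[node]) + agg

# Hypothetical toy graph as adjacency lists, with 2-dimensional node features.
graph = {"a": ["b", "c", "d"], "b": ["a", "c"], "c": ["a", "b"], "d": ["a"]}
feats = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0], "d": [0.0, 0.0]}
rng = random.Random(0)

emb_a = sage_layer(graph, feats, "a", k=2, rng=rng)

# A node that arrives later is embedded by the same function, with no retraining.
graph["e"] = ["a", "b"]
feats["e"] = [0.5, 0.5]
emb_e = sage_layer(graph, feats, "e", k=2, rng=rng)
print(emb_e)
```

Because the layer only needs a node's features and a sample of its neighbors, the late-arriving node `e` gets an embedding in the same way as the original nodes, which is the inductive behavior the post describes.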
Recent posts

An Intro to Graph Convolutional Networks

Graph neural networks (GNNs) are deep learning networks that operate on graph data. These networks are becoming increasingly popular because numerous real-world applications are easily modeled as graphs. Graphs are unlike the images, text, and time series typically used in deep learning models: they are of arbitrary size and have complex topological structure. We represent a graph as a set of nodes and edges. In many instances, each node is associated with a feature vector. The adjacency matrix of a graph defines the presence of edges between the nodes. The ordering of nodes in a graph is arbitrary. These factors make it hard to use existing deep learning architectures and call for an architecture suited to graphs as inputs. Permutation Invariance Architecture: Since the nodes in a graph are arbitrarily ordered, two different adjacency matrices may represent the same graph. So whatever architecture we plan for graph computation, it should be invariant to the ordering of nodes. This r
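The point about node ordering can be checked on a small example. The sketch below, in plain Python, relabels the nodes of a hypothetical 3-node path graph and shows that the adjacency matrix changes while a sum-based readout does not. The graph, the scalar features, and the helper names `permute_adjacency` and `sum_readout` are assumptions made for illustration, not something taken from the post.

```python
def permute_adjacency(adj, order):
    """Relabel nodes so that new node i is old node order[i]."""
    n = len(adj)
    return [[adj[order[i]][order[j]] for j in range(n)] for i in range(n)]

def sum_readout(adj, feats):
    """Sum over all nodes of (own feature + sum of neighbor features)."""
    n = len(adj)
    total = 0.0
    for i in range(n):
        neighbor_sum = sum(adj[i][j] * feats[j] for j in range(n))
        total += feats[i] + neighbor_sum
    return total

# Hypothetical path graph 0 - 1 - 2 with one scalar feature per node.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
feats = [1.0, 2.0, 4.0]

order = [1, 0, 2]  # swap the labels of nodes 0 and 1
adj_p = permute_adjacency(adj, order)
feats_p = [feats[k] for k in order]

matrices_differ = adj_p != adj
readouts_match = sum_readout(adj, feats) == sum_readout(adj_p, feats_p)
print(matrices_differ, readouts_match)
```

Summation ignores the order of its terms, so any readout built from sums over nodes and neighbors gives the same value for both labelings of the graph.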

Whose Model is Better?

You and your friend are training a neural network for classification. Both of you are using identical training data. The data has four classes: 40% of the examples are cat images, 10% are dog images, and 25% each are horse and sheep images. Since the deadline for the project is nearing, both of you decide to run only a few epochs and get to report writing. At the same time, the two of you have a friendly wager, with $10 going to whoever trains the better model. At the end of training, you find out that your model, Net1, is making 30% recognition errors and the resulting distribution of labels assigned to the training data is 25% for each of the four classes. As luck would have it, your friend's model, Net2, is also yielding a 30% error rate, but its assigned labels on the training set are distributed differently: 40% cats, 10% dogs, 10% horses, and 40% sheep. Since the error rates of the two models are identical, your friend declares a tie. You, on the other hand, insist that your model Net1 is slightly better

Mapping Nodes to Vectors: An Intro to Node Embedding

In an earlier post, I stated that the recent advances in Natural Language Processing (NLP) technology can be, to a large extent, attributed to the use of very high-dimensional vectors for language representation. These high-dimensional vector representations (768 dimensions is common) are called embeddings and are aimed at capturing semantic meaning and relationships between linguistic items. Given that graphs are everywhere, it is not surprising to see the ideas of word and sentence embeddings being extended to graphs in the form of node embeddings. What are Node Embeddings? Node embeddings are encodings of the properties and relationships of nodes in a low-dimensional vector space. This enables nodes with similar properties or connectivity patterns to have similar vector representations. Using node embeddings can improve performance on various graph analytics tasks such as node classification, link prediction, and clustering. Methods for Node Embeddings: There are seve

Google's Bard Can Code and Compute for You

Large language models (LLMs) continue to fascinate us with their ability to answer our questions, generate presentations and essays, and handle many other assorted tasks. These models are also good at generating code for user-specified tasks. However, almost none of them run the code for us; they simply give us code that we can copy and execute. Recently, Google has given its large language model, Bard, computational capabilities as well. Bard thus not only provides the code but also executes it while answering a user's questions. I wanted to check this feature of Bard. Below is what happened when I asked Bard a question that involved some computation. Bard not only generated the code for the entropy calculation and ran it, but also went on to explain entropy and its answer. Google characterizes computing by Bard in response to user questions as a "writing code on the fly" method. The company says, "So far, we've seen this method improve the accuracy of

Exploring Canonical Correlation Analysis (CCA): Uncovering Hidden Relationships

Canonical Correlation Analysis (CCA) is a statistical technique that enables us to uncover hidden associations between two sets of variables. Whether in psychology, economics, genetics, marketing, or machine learning, CCA proves to be a powerful tool for gaining valuable insights. In this blog post, we will try to understand CCA. But first let’s take a look at two sets of observations, X and Y, shown below. These two sets of observations are made on the same set of objects, and each column represents a different variable. Let’s calculate the pairwise correlations between the column vectors of X and Y. The resulting correlation values should give us some insight into the relationship between the two sets of measurements. These values are shown below, where the entry at (i,j) represents the correlation between the i-th column of X and the j-th column of Y. The correlation values show moderate to almost no correlation between the columns of the two datasets except a relatively higher
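The pairwise column correlation described above can be sketched in a few lines of plain Python. The matrices below are made-up stand-ins, not the X and Y from the post, and `pearson` and `cross_correlation` are hypothetical helper names; entry (i, j) of the result is the Pearson correlation between the i-th column of X and the j-th column of Y.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v)

def cross_correlation(X, Y):
    """Matrix whose (i, j) entry correlates column i of X with column j of Y."""
    cols_x = list(zip(*X))  # transpose rows into columns
    cols_y = list(zip(*Y))
    return [[pearson(cx, cy) for cy in cols_y] for cx in cols_x]

# Hypothetical data: 5 objects, 2 variables in each set (rows are objects).
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
Y = [[3, 5], [5, 1], [7, 4], [9, 2], [11, 3]]  # first column equals 2 * X[:, 0] + 1

C = cross_correlation(X, Y)
for row in C:
    print([round(value, 3) for value in row])
```

In this toy data the first column of Y is a linear function of the first column of X, so that entry of the matrix is 1 up to rounding, while the other entries are weaker.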

Embeddings Beyond Words: Intro to Sentence Embeddings

It wouldn't be an exaggeration to say that the recent advances in Natural Language Processing (NLP) technology can be, to a large extent, attributed to the use of very high-dimensional vectors for language representation. These high-dimensional vector representations (768 dimensions is common) are called embeddings and are aimed at capturing semantic meaning and relationships between linguistic items. Although the idea of using vector representations for words has been around for many years, interest in word embeddings took a quantum jump with Tomáš Mikolov’s Word2vec algorithm in 2013. Since then, many methods for generating word embeddings, for example GloVe and BERT, have been developed. Before moving on, let's briefly see how word embedding methods work. Word Embedding: How is it Performed? I am going to explain how word embedding is done using the Word2vec method. This method uses a linear encoder-decoder network with a single hidden layer. The input layer o