Word2Vec Galaxy

Embedding tool for high-dimensional word vectors and arithmetic operations

2025-07-18
Word2Vec, 3D Visualization, Vector Space

I built Word2Vec Galaxy to make abstract word embeddings feel real. It’s an interactive 3D tool that visualizes high-dimensional Word2Vec vectors, letting you explore semantic relationships and perform vector arithmetic with spatial intuition.

To learn how it works, see the blog post I wrote after working through the research papers.

Word2Vec Galaxy Dashboard

Interactive 3D visualization interface.

How Word2Vec Works

Word2Vec creates dense vector representations of words by learning from the contexts in which they appear in large text corpora. The model captures semantic similarity by positioning related words closer together in high-dimensional space. Word2Vec Galaxy uses Google's pre-trained Word2Vec model to explore these relationships visually.
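To make "closer together" concrete: closeness between word vectors is commonly measured with cosine similarity, the cosine of the angle between two vectors. The snippet below is a minimal sketch under that assumption, not code from the Word2Vec Galaxy repository. The three-dimensional `toy_vectors` are made up for illustration; real Word2Vec vectors have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors, for illustration only.
toy_vectors = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

sim_cat_dog = cosine_similarity(toy_vectors["cat"], toy_vectors["dog"])
sim_cat_car = cosine_similarity(toy_vectors["cat"], toy_vectors["car"])
print(f"cat~dog: {sim_cat_dog:.3f}, cat~car: {sim_cat_car:.3f}")
```

In this toy setup "cat" scores much higher against "dog" than against "car", which is the kind of structure a trained model is expected to show for related and unrelated words.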

Application Interface

Two-dimensional PCA projection of the 1000-dimensional Skip-gram vectors.

Key Features

  • 3D Word Vector Visualization: Explore semantically similar words in interactive 3D space using PCA dimensionality reduction
  • Interactive Controls: Intuitive sidebar controls for customizing visualizations and selecting parameters
  • Real-time Processing: Instant visualization updates with progress indicators for model loading
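The PCA step mentioned in the feature list can be sketched as follows. This is an illustrative version that assumes NumPy is installed and computes the projection directly with an SVD; the actual project may well use a library routine instead. The function name `pca_project` and the random `fake_embeddings` array are hypothetical stand-ins, not names from the repository.

```python
import numpy as np


def pca_project(vectors, n_components=3):
    """Project row vectors onto their top principal components."""
    X = np.asarray(vectors, dtype=float)
    X_centered = X - X.mean(axis=0)  # PCA requires mean-centered data
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ vt[:n_components].T


# Random stand-in for 20 word vectors of dimension 300 (assumed sizes).
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(20, 300))

coords_3d = pca_project(fake_embeddings)
print(coords_3d.shape)  # (20, 3)
```

Each row of `coords_3d` can then be used as an (x, y, z) position in a 3D scatter plot. Note that the projection is fitted to whichever words are currently selected, so the same word may land in a different position when the selection changes.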

Usage Examples

1. Similar Words Exploration

  1. Enter a word (e.g., "technology", "emotion", "animal")
  2. Adjust the number of similar words (5-50)
  3. Visualize the semantic neighborhood in 3D space
  4. Explore relationships through interactive rotation and zoom

3D Word Space Exploration

3D visualization of word clusters with color-coded similarity scores.
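The similar-words lookup behind these steps can be approximated as a ranking by cosine similarity. The sketch below assumes that approach and uses a hypothetical helper, `most_similar`, over made-up toy vectors; with a real pre-trained model, the lookup would normally be delegated to the embedding library rather than written by hand.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def most_similar(word, vectors, top_n=5):
    """Return up to top_n (word, score) pairs closest to `word`, best first."""
    unit = {w: normalize(v) for w, v in vectors.items()}
    query = unit[word]
    scores = [
        (other, sum(a * b for a, b in zip(query, vec)))
        for other, vec in unit.items()
        if other != word  # a word is trivially its own nearest neighbor
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Hypothetical toy vectors, for illustration only.
toy_vectors = {
    "animal": [0.9, 0.1, 0.1],
    "dog": [0.8, 0.3, 0.1],
    "cat": [0.8, 0.2, 0.2],
    "laptop": [0.1, 0.9, 0.2],
    "software": [0.1, 0.8, 0.4],
}

neighbors = most_similar("animal", toy_vectors, top_n=2)
print(neighbors)
```

The returned scores are the natural input for the color coding described in the caption above: each neighbor carries its similarity to the query word.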

2. Vector Arithmetic Operations

  • king - man + woman ≈ queen (Gender relationships)
  • walking - walk + run ≈ running (Verb tense patterns)

Vector Arithmetic Visualization

Vector arithmetic operation showing a classic analogy.
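An analogy such as king - man + woman can be worked through in code: combine the three vectors, then return the vocabulary word nearest to the result. The sketch below assumes cosine similarity as the distance measure and excludes the three input words from the candidates, a common convention since the inputs often sit closest to the result. The `analogy` helper and the toy vectors (with made-up axes for royalty, maleness, and femaleness) are hypothetical and chosen so the arithmetic is easy to follow by hand.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def analogy(a, b, c, vectors):
    """Solve 'a - b + c = ?' by nearest neighbor, skipping the input words."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# Hypothetical toy vectors; axes are [royalty, maleness, femaleness].
toy_vectors = {
    "king": [0.9, 0.9, 0.1],
    "queen": [0.9, 0.1, 0.9],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.1, 0.5, 0.5],
}

result = analogy("king", "man", "woman", toy_vectors)
print(result)  # queen
```

With real embeddings the result vector rarely lands exactly on a word, so the answer is the nearest neighbor rather than an exact match.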

Installation & Setup

For detailed installation and setup instructions, please refer to the instructions in the Word2Vec Galaxy repository.

References

  • T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient estimation of word representations in vector space," arXiv preprint arXiv:1301.3781, 2013.
  • T. Mikolov, I. Sutskever, K. Chen, G. Corrado, and J. Dean, "Distributed representations of words and phrases and their compositionality," Advances in Neural Information Processing Systems 26, 2013.
  • Gensim Word2Vec - Word2Vec model

For full code, examples, and configuration, see the GitHub Repository.