Monday, May 29, 2023

Stack Overflown

Nomic Atlas is an online tool for visualizing and exploring large datasets. It enables users to store, update and organize multi-million point datasets of unstructured text, images and embeddings. Atlas organizes the text and data into interactive maps which can then be explored in a web browser, using the map to run semantic searches of the uploaded data.

You can view an example of an interactive map created using Atlas in this Map of Stack Overflow Posts. This map organizes questions posted to Stack Overflow by frustrated programmers and visualizes the relationships between different topics on the site.

The map was created by using Vertex AI to generate embeddings of Stack Overflow posts. Embeddings are numerical representations that capture the meaning of a text. The embeddings are then used to calculate the similarity between different Stack Overflow posts. Because the map visualizes the relationships between different topics on Stack Overflow, it can be used to identify related topics, to find new topics to learn about, and to discover specific questions and answers posted on Stack Overflow.
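To give a rough idea of how embeddings can be compared, here is a minimal Python sketch. It assumes cosine similarity as the measure, which is a common choice but is not confirmed as the one used for this map. The three-number vectors and the post titles are made up for illustration; real embeddings from a service such as Vertex AI have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings (not real model output).
posts = {
    "python list sort": [0.9, 0.1, 0.0],
    "python dict sort": [0.8, 0.2, 0.1],
    "css center div": [0.0, 0.1, 0.9],
}

def most_similar(title):
    """Return the title of the other post whose embedding is closest."""
    query = posts[title]
    scores = [
        (other, cosine_similarity(query, vector))
        for other, vector in posts.items()
        if other != title
    ]
    return max(scores, key=lambda pair: pair[1])[0]

print(most_similar("python list sort"))
```

In this toy data the two Python questions score close to 1 against each other, while the CSS question scores close to 0 against both, which is the kind of signal a map can use to group related posts.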

If you like the Stack Overflow map then you might also like the Map of GitHub. The Map of GitHub is a network graph of over 400,000 GitHub projects. Each dot on this interactive map is a GitHub project, mapped based on the number of 'common stargazers'.

This map of GitHub projects is based on GitHub users' use of stars to save or like a repository. In simple terms, it connects two different repositories based on the number of users who have starred both repositories. In slightly more detailed terms, it organizes a database of 350 million stars awarded to repositories between 2020 and the end of March 2023 using a Jaccard Similarity algorithm.
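Jaccard similarity is simply the number of users who starred both repositories divided by the number of users who starred at least one of them. The short Python sketch below shows the calculation on invented data; the repository names and usernames are hypothetical, and this is an illustration of the measure, not the Map of GitHub's actual code.

```python
def jaccard_similarity(stargazers_a, stargazers_b):
    """Shared stargazers divided by all stargazers of either repository."""
    union = stargazers_a | stargazers_b
    if not union:
        # Two repositories with no stars at all have nothing in common.
        return 0.0
    return len(stargazers_a & stargazers_b) / len(union)

# Hypothetical repositories and the users who starred them.
stars = {
    "repo_one": {"ana", "ben", "cho", "dev"},
    "repo_two": {"ben", "cho", "dev", "eli"},
    "repo_three": {"fay"},
}

print(jaccard_similarity(stars["repo_one"], stars["repo_two"]))    # 3 shared / 5 total = 0.6
print(jaccard_similarity(stars["repo_one"], stars["repo_three"]))  # no shared users = 0.0
```

A score of 1 means exactly the same users starred both repositories, and a score of 0 means they have no stargazers in common.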
