Wikimedia, the nonprofit behind Wikipedia and sister sites like Wikimedia Commons and Wikidata, just made it easier for AI models to tap into its massive knowledge base.

Wikimedia Deutschland, the organization’s German chapter, released a new resource called the Wikidata Embedding Project. It takes the roughly 120 million open data points stored in Wikidata and converts them into a format that large language models can use more easily.

Even though Wikidata’s structured data is already machine-readable, it hasn’t been directly compatible with generative AI systems, which are built to work with natural language.

The new project translates Wikidata entries into vectors: lists of numbers that act as coordinates, positioned so that statements with related meanings sit close to one another.
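To make the idea of vectors as coordinates concrete, here is a minimal sketch in Python. The three statements, their three-number vectors, and the helper names `cosine_similarity` and `most_similar` are all made up for illustration; they are not taken from the Wikidata Embedding Project, whose real vectors would be far longer and produced by an embedding model. Cosine similarity is one common way to compare such vectors, assumed here rather than confirmed as the project's method.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors: invented numbers, not real embeddings.
toy_vectors = {
    "Berlin is the capital of Germany": [0.9, 0.1, 0.2],
    "Paris is the capital of France": [0.8, 0.2, 0.3],
    "A cat is a small mammal": [0.1, 0.9, 0.4],
}


def most_similar(query, vectors):
    """Return the statement whose vector is closest to the query's vector."""
    query_vec = vectors[query]
    scores = {
        text: cosine_similarity(query_vec, vec)
        for text, vec in vectors.items()
        if text != query  # skip comparing the query with itself
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print(most_similar("Berlin is the capital of Germany", toy_vectors))
```

In this toy setup the two statements about capital cities point in nearly the same direction, so they score as closer to each other than either does to the statement about cats.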

Think of it
