mashdigi-Technology, new products, interesting news, trends

Wikimedia's "Wikidata Embedding Project" makes its vast knowledge base more suitable for generative AI models
The project lowers the barrier to entry for small and medium-sized developers and reduces the risk that generative AI technology is monopolized by a few technology giants

Author: Mash Yang
2025-10-06
in Market dynamics, Life, network, software, Topics

As generative AI applications become increasingly popular, the quality and openness of knowledge sources are becoming key to driving innovation. Wikimedia has announced that, through the Wikidata Embedding Project, it will make its vast knowledge database more suitable for use in generative AI models, lower the barrier for small and medium-sized developers to adopt it, and reduce the risk that generative AI technology is monopolized by a few technology giants.

Wikimedia has previously structured its data through Wikidata, which holds approximately 120 million entries and is, in theory, easier for machines to read. However, because generative AI models work better with natural-language content than with raw structured data, Wikidata is difficult for them to use directly. The newly launched embedding project aims to convert Wikidata into a "vector" format that AI models can work with.

Vectorization maps words and concepts into a coordinate space in which distance reflects how closely related they are. For example, "dog" and "puppy" lie close together, while "dog" and "bank account" lie far apart or are effectively unrelated. This conversion allows AI to better capture the meaning and context of the data, thereby improving the accuracy of natural language processing.
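
To make the "dog" versus "bank account" comparison concrete, here is a minimal sketch in Python. The three-dimensional vectors are hand-picked, hypothetical values for illustration only; they are not taken from the Wikidata Embedding Project, and real embedding models produce vectors with hundreds or thousands of dimensions. Cosine similarity is used here as one common way to measure closeness, which is an assumption rather than something the project's announcement specifies.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
TOY_EMBEDDINGS = {
    "dog": [0.9, 0.8, 0.1],
    "puppy": [0.85, 0.9, 0.15],
    "bank account": [0.1, 0.05, 0.95],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


dog_puppy = cosine_similarity(TOY_EMBEDDINGS["dog"], TOY_EMBEDDINGS["puppy"])
dog_bank = cosine_similarity(TOY_EMBEDDINGS["dog"], TOY_EMBEDDINGS["bank account"])

print(f"dog vs puppy:        {dog_puppy:.3f}")
print(f"dog vs bank account: {dog_bank:.3f}")
```

With these toy values, the first score comes out much higher than the second, which is the numeric counterpart of "dog" and "puppy" being closer in the coordinate space.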

More importantly, AI training has often relied solely on static data, so models could not promptly reflect later updates to Wikipedia's content. Through this project, Wikidata also integrates a "RAG" (Retrieval-Augmented Generation) mechanism, enabling AI models to access the latest data in real time and significantly improving the timeliness and reliability of their answers.
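
The retrieval step of RAG can be sketched as follows. This is a toy illustration, not the project's implementation: `embed` is a hypothetical stand-in that counts words instead of calling a real embedding model, `ToyKnowledgeStore` is a hypothetical in-memory substitute for a vector database, and the entry IDs and facts are invented. The point it shows is that updating an entry in the store changes the context handed to the model on the very next question, with no retraining.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    cleaned = text.lower().replace(".", "").replace("?", "")
    return Counter(cleaned.split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyKnowledgeStore:
    """Hypothetical in-memory store mapping entry IDs to (text, vector)."""

    def __init__(self):
        self.entries = {}

    def upsert(self, entry_id, text):
        # Insert a new entry or overwrite an existing one with fresh text.
        self.entries[entry_id] = (text, embed(text))

    def retrieve(self, question, k=1):
        # Rank all entries by similarity to the question and keep the top k.
        q = embed(question)
        ranked = sorted(
            self.entries.values(),
            key=lambda entry: cosine(q, entry[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


def build_prompt(store, question, k=1):
    """Prepend retrieved context to the question before it reaches a model."""
    context = "\n".join(store.retrieve(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


store = ToyKnowledgeStore()
store.upsert("Q1", "The example city has a population of 100000.")
store.upsert("Q2", "The example river is 300 kilometres long.")

question = "What is the population of the example city?"
prompt_before = build_prompt(store, question)

# The knowledge base is edited; the next prompt picks up the new value.
store.upsert("Q1", "The example city has a population of 120000.")
prompt_after = build_prompt(store, question)

print(prompt_after)
```

In a real deployment the resulting prompt would be sent to a generative model; that call is left out here because it depends on the model provider.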

Wikimedia Germany emphasized in a press release that the project's core goal is to "enable AI models to access high-quality information to enhance the credibility of their outputs." They also noted that most AI systems currently rely on opaque, proprietary data, lacking transparency and verifiability. Opening up vectorized Wikidata will not only promote fairness in AI development but also help smaller teams reduce the development burden, preventing generative AI technology from being monopolized by a few tech giants.

In practice, vectorizing massive amounts of data requires extensive computing and storage resources, which is a challenge for small and medium-sized enterprises and independent developers. The Wikidata Embedding Project is a collaboration with German artificial intelligence startup Jina AI and IBM subsidiary DataStax. Jina AI will develop the vectorization system, while DataStax will store the data in its Astra DB vector database. This means developers can directly leverage Wikimedia's knowledge base for their applications without having to build complex infrastructure.

As Wikimedia Germany stated, "Powerful AI shouldn't be monopolized by a few companies." This project isn't just a technological upgrade; it's a declaration of open, collaborative AI development. As generative AI becomes more widespread, this open-source and shared model may become a key step in promoting a more diverse AI ecosystem.

Tags: AI, Wikidata Embedding Project, Wikimedia, Wikipedia, Artificial intelligence
Mash Yang

Founder and editor of mashdigi.com, and student of technology journalism.

Copyright © 2017 mashdigi.com
