Wikidata: A Free Collaborative Knowledgebase
蚳泚ïŒãã㯠Wikidata: A Free Collaborative Knowledgebase by Denny VrandeÄiÄ, Markus Krötzsch (in Communications of the ACM, 2014) ã®æ¥æ¬èªèš³ã§ãããŠã£ãããŒã¿ïŒWikidataïŒã¯2012幎ã«ã¯ããŸãã2015幎ã«å ¬åŒã« Freebase ïŒã®äžéšïŒãåŒãç¶ããšããããå€§èŠæš¡ç¥èããŒã¹ïŒãã¬ããžããŒã¹ïŒã®ã²ãšã€ã§ãããŠã£ãããã£ã¢ããã³ãã®å§åйãããžã§ã¯ããšãã©ã³ãã£ã¢ããŒã¹ãå ±æãã¢ã¯ãã£ãã«æŽæ°ããã€ã¥ããŠããããšãã²ãšã€ã®ç¹åŸŽã§ããæ¬çš¿ã¯ Communications of ACM ã«æ²èŒããã解説ã®ãããæ³å®èªè ã¯ããæè¡ããã®äººããšæããŸãããŠã£ãããŒã¿å ã«ã解説ã¯ãããããããŸãã®ã§ããããŠã芧ãã ããã
Wikidata: A Free Collaborative Knowledgebase by Denny Vrandečić, Markus Krötzsch (in Communications of the ACM, 2014)
http://dx.doi.org/10.1145/2629489 http://korrekt.org/page/Wikidata:_A_Free_Collaborative_Knowledgebase
License
The article text is published under the terms of Creative Commons CC-By 3.0. The "appropriate credit" required by the license (the "By" part) should have the form of a citation of the article mentioning at least the authors, original title, journal, and year. The article is published under the ACM's hybrid open access model and will also remain freely available from ACM servers. Image copyrights are separate: the Tower of Babel engraving by Isaac Basire of 1733 is out of copyright; the Reasonator screenshot (Fig. 3) includes media with various licenses and is considered Fair Use (Reasonator is created by Magnus Manske). The fonts used in the PDF are not free; we do not own any rights for them, and we do not release these fonts under CC-By.
Unnoticed by most of its readers, Wikipedia [1] continues to undergo dramatic changes, as its sister project Wikidata introduces a new multilingual "Wikipedia for data" (http://www.wikidata.org) to manage the factual information of the popular online encyclopedia. With Wikipedia's data becoming cleaned and integrated in a single location, opportunities arise for many new applications.
Originally conceived in 2001 as a mainly text-based resource, Wikipedia [3] has collected increasing amounts of structured data, including numbers, dates, coordinates, and many types of relationships, from family trees to the taxonomy of species. It has become a resource of enormous value, with potential applications across all areas of science, technology, and culture. This development is hardly surprising, given that Wikipedia is committed to "a world in which every single human being can freely share in the sum of all knowledge," according to its vision statement (https://wikimediafoundation.org/wiki/Vision). There is no question this must include data that can be searched, analyzed, and reused.
It may be surprising that Wikipedia does not provide direct access to most of this data, through either query services or downloadable data exports. Actual use of the data is rare and often restricted to specific pieces of information (such as geo-tags of Wikipedia articles used in Google Maps). The reason for this striking gap between vision and reality is that Wikipedia's data is buried in 30 million Wikipedia articles in 287 languages from which extraction is inherently very difficult.
This situation is unfortunate for anyone wanting to use the data but is also an increasing threat to Wikipedia's main goal of providing up-to-date, accurate, encyclopedic knowledge. The same information often appears in articles in many languages and in many articles within a single language. Population numbers for Rome, for example, can be found in English and Italian articles about Rome but also in the English article "Cities in Italy." The numbers are all different.
Wikidata aims to overcome such inconsistencies by creating new ways for Wikipedia to manage its data on a global scale; see the result at http://www.wikidata.org. The following essential design decisions characterize the Wikidata approach.
Open editing. As in Wikipedia, Wikidata allows every user to extend and edit the stored information, even without creating an account. A form-based interface makes editing easy.
Community control. Not only is the actual data controlled by the contributor community, so, too, is the schema of the data. Contributors edit the population number of Rome but also decide whether there is such a number in the first place.
Plurality. It would be naive to expect global agreement on the "true" data, since many facts are disputed or simply uncertain. Wikidata allows conflicting data to coexist and provides mechanisms to organize this plurality.
Secondary data. Wikidata gathers facts published in primary sources, together with references to these sources; for example, there is no "true population of Rome" but rather a "population of Rome as published by the city of Rome in 2011."
Multilingual data. Most data is not tied to a single language; numbers, dates, and coordinates have universal meaning, so labels like "Rome" and "population" are translated into many different languages. Wikidata is multilingual by design. While Wikipedia has independent editions for each language, there is only one Wikidata site.
Easy access. Wikidata's goal is to allow data to be used both in Wikipedia and in external applications. Data is exported through Web services in several formats, including JavaScript Object Notation, or JSON, and Resource Description Framework, or RDF. Data is published under legal terms that allow the widest possible reuse.
Continuous evolution. In the best tradition of Wikipedia, Wikidata grows with its community of editors and developers and the tasks they give it. Rather than develop a perfect system to be presented to the world in a couple of years, new features are deployed incrementally and as early as possible.
These properties characterize Wikidata as a specific kind of curated database [5].
以äžã®ç¹åŸŽããŠã£ãããŒã¿ãç¹å®ã®çš®é¡ã®ãã¥ã¬ãŒã·ã§ã³ãããããŒã¿ããŒã¹6ã«ããŠããã
Data in Wikipedia
The value of Wikipedia's data has long been obvious, with many efforts to use it. The Wikidata approach is to crowd-source data acquisition, allowing a global community to edit the data. This extends the traditional wiki approach of allowing users to edit a website. Wiki is a Hawaiian word for fast; Ward Cunningham, who created the first wiki in 1995, used it to emphasize that his website could be changed quickly [7].
The most popular such system is Semantic MediaWiki, or SMW [9], which extends MediaWiki, the software used to run Wikipedia [10], with data-management capabilities. SMW was originally proposed for Wikipedia but was quickly used on hundreds of other websites as well. Unlike Wikidata, SMW manages data as part of its textual content, thus hindering creation of a multilingual, single knowledgebase supporting all Wikimedia projects. Moreover, the data model of Wikidata is more elaborate than that of SMW, allowing users to capture more complex information. In spite of these differences, SMW has had a great influence on Wikidata, and the two projects share code for common tasks.
Other examples of free knowledgebase projects are OpenCyc and Freebase. OpenCyc is the free part of Cyc [13], which aims for a much more comprehensive and expressive representation of knowledge than Wikidata. OpenCyc is released under a free license and available to the public, but, unlike Wikidata, is not editable by the public. Freebase, acquired by Google in 2010, is an online platform that allows communities to manage structured data [14]. Objects in Freebase are classified by types that prescribe what kind of data an object can have; for example, Freebase classifies Einstein as a "musical artist" since it would otherwise not be possible to refer to recordings of his speeches. Wikidata supports the use of arbitrary properties on all objects. Other differences from Wikidata are related to multi-language support, source information, and the proprietary software used to run the site. The latter is critical for Wikipedia, which is committed to running on a fully open source software stack to allow all to fork, or copy and create one's own version of the project.
Other approaches to creating knowledgebases from Wikipedia have aimed at extracting data from Wikipedia, most notably DBpedia [17] and Yago [18], which extract information from Wikipedia categories and from infoboxes, the tabular summaries in the upper-right area of many Wikipedia articles. Additional mechanisms help improve extraction quality. Yago includes temporal and spatial context information, but neither DBpedia nor Yago extracts source information.
Wikipedia data, obtained from these projects or through custom extraction methods, has been used to improve object search in Google's Knowledge Graph (based on Freebase) and Facebook's Open Graph and in answering engines, including Wolfram Alpha [21], Evi [22], and IBM's Watson [23]. Wikipedia's geo-tags are also used by Google Maps. All these applications would benefit from up-to-date, machine-readable data exports; Google Maps, for example, shows India's Chennai district in the polar Kara Sea, next to Ushakov Island. Among these applications, Freebase and Evi are the only ones that also allow users to edit or to at least extend the data.
A Short History of Wikidata
Wikimedia launched Wikidata in October 2012. Initially, features were limited, with editors only able to create items and connect them to Wikipedia articles. In January 2013, three Wikipedias (first Hungarian, then Hebrew and Italian) began to connect to Wikidata. Meanwhile, the Wikidata community had already created more than three million items. The English Wikipedia followed in February, and all Wikipedias were connected to Wikidata in March.
Wikidata received input from more than 40,000 contributors as of February 2014. Since May 2013, it has continuously had more than 3,500 active contributors, those making at least five edits per month. These numbers make it one of the most active Wikimedia projects.
In March 2013, Wikimedia introduced Lua as a scripting language for automatically creating and enriching parts of articles (such as the infoboxes mentioned earlier). Lua scripts can access Wikidata, allowing Wikipedia editors to retrieve, process, and display data.
Many other features were introduced in 2013, and development is planned to continue for the foreseeable future.
Out of Many, One
The first challenge for the Wikidata community was to reconcile the 287 language editions of Wikipedia; for example, for Wikidata to be truly multilingual, the object representing "Rome" must be one and the same across all languages. Fortunately, Wikipedia already has a closely related mechanism: language links, displayed on the left side of each article, connecting articles in different languages. These links were created from user-edited text entries at the bottom of each article, leading to a quadratic number of links; for example, each of the 207 articles on Rome included a list of 206 links to all other articles on Rome, for a total of 42,642 lines of text. By the end of 2012, 66 of the 287 language editions of Wikipedia included more text for language links than for actual article content.
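The quadratic growth mentioned above is easy to check. The following minimal sketch compares per-article link lists with a single central record; the function names are hypothetical and only the figure of 207 articles comes from the text.

```python
def pairwise_link_lines(n_articles: int) -> int:
    """Lines of link text when every article lists every other article."""
    return n_articles * (n_articles - 1)


def central_link_entries(n_articles: int) -> int:
    """Entries needed when one central record lists each article once."""
    return n_articles


rome_articles = 207
print(pairwise_link_lines(rome_articles))   # 207 * 206 = 42642
print(central_link_entries(rome_articles))  # 207
```

Storing the links once turns a cost that grows with the square of the number of language editions into one that grows linearly.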
It would clearly be better to store and manage language links in a single location, and so this became Wikidata's first task. For every Wikipedia article, a page has now been created on Wikidata for managing links to related Wikipedia articles in all languages; these pages are called "items." Initially, only a limited amount of data could be stored for each item: a list of language links, a label, a list of aliases, and a one-line description. Labels, aliases, and descriptions can now be specified individually for up to 358 languages.
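As a rough illustration of what an early item held, the sketch below models the four pieces of data listed above. The class and field names, the wiki codes such as `enwiki`, the identifier `Q220`, and the label fallback rule are all assumptions made for the example, not Wikidata's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Item:
    """One page on the central site, shared by all language editions."""
    item_id: str
    site_links: Dict[str, str] = field(default_factory=dict)    # wiki code -> article title
    labels: Dict[str, str] = field(default_factory=dict)        # language -> label
    aliases: Dict[str, List[str]] = field(default_factory=dict) # language -> aliases
    descriptions: Dict[str, str] = field(default_factory=dict)  # language -> one-line text

    def language_links_for(self, wiki: str) -> Dict[str, str]:
        """Links an article on `wiki` would show: every other linked edition."""
        return {w: title for w, title in self.site_links.items() if w != wiki}

    def display_label(self, language: str, fallback: str = "en") -> str:
        """Label in the requested language; falling back is an assumed policy."""
        return self.labels.get(language) or self.labels.get(fallback) or self.item_id


rome = Item(
    item_id="Q220",  # assumed identifier
    site_links={"enwiki": "Rome", "itwiki": "Roma", "dewiki": "Rom"},
    labels={"en": "Rome", "it": "Roma"},
    aliases={"en": ["Eternal City"]},
    descriptions={"en": "capital of Italy"},
)
print(rome.language_links_for("enwiki"))
print(rome.display_label("fr"))
```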
The Wikidata community has created bots to move language links from Wikipedia to Wikidata, and more than 240 million links were removed from Wikipedia. Most language links displayed on Wikipedia are served from Wikidata. It is still possible to add custom links in an article, as needed in the rare cases where links are not bi-directional; some articles refer to more general articles in other languages, while Wikidata deliberately connects pages that cover the same subject. By importing language links, Wikidata gained a huge set of initial items "grounded" in actual Wikipedia pages.
Simple Data (Properties and Values)
To store structured data beyond text labels and language links, Wikidata uses a simple data model. Data is basically described through property-value pairs; for example, the item for "Rome" might have a property "population" with value "2,777,979." Properties are objects and have their own Wikidata pages with labels, aliases, and descriptions. Unlike items, however, these pages are not linked to Wikipedia articles.
On the other hand, property pages always specify a datatype that defines which type of values the property can have. "Population" is a number; "has father" relates to another Wikidata item; and "postal code" is a string. This information is important for providing adequate user interfaces and ensuring the validity of inputs. There are only a small number of datatypes, mainly quantity, item, string, date and time, geographic coordinates, and URL. Data is international, though its display may be language-dependent; for example, the number 1,003.5 is written "1.003,5" in German and "1 003,5" in French.
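The two ideas in this paragraph, datatype checks on input and language-dependent display of the same stored number, can be sketched as follows. The validator rules, the `Q`-prefixed item identifier format, and the separator table are assumptions for illustration; French output here uses a plain space for grouping and a decimal comma.

```python
from typing import Callable, Dict, Tuple


def _is_number(value: object) -> bool:
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def _is_item_id(value: object) -> bool:
    # Assumed identifier shape: "Q" followed by digits.
    return isinstance(value, str) and value[:1] == "Q" and value[1:].isdigit()


VALIDATORS: Dict[str, Callable[[object], bool]] = {
    "quantity": _is_number,
    "item": _is_item_id,
    "string": lambda value: isinstance(value, str),
}

# Each property declares exactly one datatype (examples from the text).
PROPERTY_DATATYPES: Dict[str, str] = {
    "population": "quantity",
    "has father": "item",
    "postal code": "string",
}

# (group separator, decimal separator) per language; illustrative subset.
SEPARATORS: Dict[str, Tuple[str, str]] = {
    "en": (",", "."),
    "de": (".", ","),
    "fr": (" ", ","),
}


def is_valid(prop: str, value: object) -> bool:
    """Check a value against the datatype declared for its property."""
    return VALIDATORS[PROPERTY_DATATYPES[prop]](value)


def format_number(value: float, language: str, decimals: int = 1) -> str:
    """Render one stored number with language-specific separators."""
    group_sep, decimal_sep = SEPARATORS[language]
    integer, _, fraction = f"{value:,.{decimals}f}".partition(".")
    integer = integer.replace(",", group_sep)
    return integer + decimal_sep + fraction if fraction else integer


print(is_valid("population", 2777979))
print(format_number(1003.5, "de"))
print(format_number(1003.5, "fr"))
```

The stored value never changes; only the rendering does, which is what allows a single site to serve every language.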
Not-So-Simple Data
Property-value pairs are too simple in many cases; for example, Wikipedia says the population of Rome was 2,651,040 "as of 2010" based on "estimations" published by the National Institute for Statistics, or Istat, in Italy (http://www.istat.it/); see Figure 1 for how Rome statistics can be represented in Wikidata. Even leaving source information aside, the information cannot be expressed easily in property-value pairs. One could use a property "estimated population in 2010" or create an item "Rome in 2010" to specify a value for its "estimated population." However, either solution is clumsy and impractical. As suggested by Figure 1, we would like the data to contain a property "as of" with value "2010" and a property "method" with value "estimation." These property-value pairs do not refer to Rome but to the assertion that Rome has a population of 2,651,040. We thus arrive at a model where the property-value pairs assigned to items can have additional subordinate property-value pairs we call "qualifiers."
Qualifiers can be used to state contextual information (such as the validity time for an assertion). They can also be used to encode ternary relations that elude the property-value model; for example, to say Meryl Streep played Margaret Thatcher in the movie The Iron Lady, one could add to the item of the movie a property "cast member" with value "Meryl Streep" and an additional qualifier "role = Margaret Thatcher."
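A minimal sketch of a statement carrying qualifiers, using the two examples from the text. The `Statement` class and the `role_of` helper are hypothetical names, and values are kept as plain strings for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Statement:
    """A property-value pair plus subordinate property-value pairs."""
    prop: str
    value: str
    qualifiers: Tuple[Tuple[str, str], ...] = ()

    def qualifier(self, name: str) -> Optional[str]:
        """Return the first qualifier value with this name, if any."""
        for key, val in self.qualifiers:
            if key == name:
                return val
        return None


# Qualifiers as context: they describe the assertion, not Rome itself.
rome_population = Statement(
    "population", "2,651,040",
    qualifiers=(("as of", "2010"), ("method", "estimation")),
)

# Qualifiers as a ternary relation: movie, cast member, role.
iron_lady: List[Statement] = [
    Statement("cast member", "Meryl Streep", (("role", "Margaret Thatcher"),)),
]


def role_of(statements: List[Statement], actor: str) -> Optional[str]:
    """Find the role an actor played, read from the 'role' qualifier."""
    for statement in statements:
        if statement.prop == "cast member" and statement.value == actor:
            return statement.qualifier("role")
    return None


print(rome_population.qualifier("as of"))
print(role_of(iron_lady, "Meryl Streep"))
```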
Such qualifiers illustrate why we adopted an extensible set of qualifiers instead of restricting ourselves to the most common qualifiers (such as for temporal information). Qualifiers in their current form are indeed an almost direct representation of data found in Wikipedia infoboxes. This solution resembles known approaches to representing context information [27][28]. It should not, however, be misunderstood as a workaround to represent relations of higher arity in graph-based data models, since Wikidata statements do not have a fixed (or even bounded) arity in this sense [29].
Wikidata also allows for two special types of statements: First, it is possible to specify that the value of a property is unknown; for example, one can say Ambrose Bierce's day of death is unknown rather than not say anything about it, clarifying he is certainly not among the living. Second, one can say a property has no value at all (such as in asserting Australia has no countries sharing its borders). It is important to distinguish this situation from the common case that information is simply incomplete. It would be wrong to treat these two cases as special values, as becomes clear when considering queries that ask for items sharing the same value for a property: if "no value" were itself a value, one would have to conclude that Australia and Iceland have a common neighbor.
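The pitfall described above can be made concrete. In this sketch, "unknown value" and "no value" are kinds of claim rather than values, so a shared-value query cannot match on them. All names are hypothetical, and the neighbor lists for Italy and Germany are deliberately abridged.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet


class Kind(Enum):
    VALUES = "values"
    UNKNOWN = "unknown value"
    NO_VALUE = "no value"


@dataclass(frozen=True)
class Claim:
    kind: Kind
    values: FrozenSet[str] = frozenset()


# "shares border with" claims; lists abridged for illustration.
borders: Dict[str, Claim] = {
    "Australia": Claim(Kind.NO_VALUE),
    "Iceland": Claim(Kind.NO_VALUE),
    "Italy": Claim(Kind.VALUES, frozenset({"France", "Switzerland", "Austria"})),
    "Germany": Claim(Kind.VALUES, frozenset({"France", "Switzerland", "Austria"})),
}


def share_value(a: str, b: str) -> bool:
    """True only if both items have at least one concrete value in common."""
    claim_a, claim_b = borders[a], borders[b]
    if claim_a.kind is not Kind.VALUES or claim_b.kind is not Kind.VALUES:
        return False
    return bool(claim_a.values & claim_b.values)


# The naive encoding stores "no value" as if it were a neighbor.
naive = {"Australia": {"no value"}, "Iceland": {"no value"}}
naive_result = bool(naive["Australia"] & naive["Iceland"])

print(share_value("Italy", "Germany"))      # both border France
print(share_value("Australia", "Iceland"))  # no common neighbor
print(naive_result)                         # naive encoding wrongly matches
```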
Further details on the Wikidata data model and its expression in Web Ontology Language in Resource Description Framework, or OWL/RDF, can be found in Erxleben et al. [33].
Citation Needed
Property assertions, possibly with qualifiers, provide a rich structure for expressing arbitrary claims. In Wikidata, every such claim can include a list of references to sources that support the claim. Including references agrees with Wikipedia's goal of being a secondary (or tertiary) source that does not publish its own research but rather gathers information published in other primary (or secondary) sources.
There are many ways to specify a reference, depending on whether it is a book, a curated database, a website, or something else entirely. Moreover, some sources may be represented by Wikidata items, while others are not. In this light, a reference is simply a list of property-value pairs, leaving the details of reference modeling to the community. Note that Wikidata does not automatically record provenance [35] but does provide for the structural representation of references.
Sources are also important as context information. Different sources often make contradictory claims, yet Wikidata intends to represent all views rather than choose one "true" claim. Combined with the context information provided by qualifiers (such as for temporal context), many statements could be stored about a single property (such as population). To help manage this plurality, Wikidata allows contributors to optionally mark statements as "preferred" (for the most relevant, current statements) or "deprecated" (for irrelevant or unverified statements). Deprecated statements may be useful to Wikidata editors, to record erroneous claims of certain sources or to keep statements that still need to be improved or verified. As with all Wikidata content, these classifications are subject to community-governed editorial processes, similar to those of Wikipedia [37].
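One way an application might use these marks is sketched below. The selection rule (ignore deprecated statements, then prefer "preferred" ones if any exist) is an assumption for illustration, not a rule stated in the article; the class names and the reference keys are hypothetical, and the deprecated figure is invented purely to show the mechanism.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Rank(Enum):
    PREFERRED = "preferred"
    NORMAL = "normal"
    DEPRECATED = "deprecated"


@dataclass
class Statement:
    prop: str
    value: int
    qualifiers: Dict[str, str] = field(default_factory=dict)
    # A reference is just a list of property-value pairs.
    references: List[Dict[str, str]] = field(default_factory=list)
    rank: Rank = Rank.NORMAL


def best_statements(statements: List[Statement], prop: str) -> List[Statement]:
    """Assumed policy: drop deprecated, then keep preferred if any exist."""
    candidates = [s for s in statements
                  if s.prop == prop and s.rank is not Rank.DEPRECATED]
    preferred = [s for s in candidates if s.rank is Rank.PREFERRED]
    return preferred or candidates


rome = [
    Statement("population", 2651040,
              qualifiers={"as of": "2010", "method": "estimation"},
              references=[{"publisher": "Istat", "url": "http://www.istat.it/"}],
              rank=Rank.PREFERRED),
    Statement("population", 2777979),
    # Invented, deliberately wrong value kept only to record an error.
    Statement("population", 2000, rank=Rank.DEPRECATED),
]

print([s.value for s in best_statements(rome, "population")])
```

All three statements stay stored side by side; the rank only guides which ones a consumer shows by default.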
Wikidata by the Numbers
Wikidata has grown significantly since its launch in October 2012; see the table here for key facts about its current content. It has also become the most edited Wikimedia project, with 150 to 500 edits per minute, or a half million per day, about three times as many as the English Wikipedia. Approximately 90% of these edits are made by bots that contributors created to automate tasks, yet almost one million edits per month are still made by humans. Figure 2a shows the number of human edits during 14-day intervals. We highlight contributions of power users with more than 10 or even 100,000 edits, respectively, as of February 2014, as they account for most of the variation. The increase in March 2013 marked the official announcement of the site.
Figure 2b shows the growth of Wikidata from its launch until February 2014. There were approximately 14.5 million items and 36 million language links. Essentially, every Wikipedia article is connected to a Wikidata item today, so these numbers grow slowly. In contrast, the number of labels, 45.6 million as of February 2014, continues to grow; there are more labels than Wikipedia articles. Almost 10 million items have statements, and more than 30 million statements were created using more than 900 different properties. As expected, property use is skewed; the most frequent property is "instance of" P31 (5.6 million uses) for classifying items; one of the least-frequent properties is P485 (133 uses), which connects a topic (such as Johann Sebastian Bach) with the institution that archives the topic (such as the Bach-Archiv in Leipzig).
The Web of Data
One promising development in Wikidata is the volunteer community's reuse and integration of external identifiers from existing databases and authority controls, including the International Standard Name Identifier, or ISNI, China Academic Library and Information System, or CALIS, International Air Transport Association, or IATA, MusicBrainz for albums and performers, and North Atlantic Basin's Hurricane Database, or HURDAT. These external IDs allow applications to integrate Wikidata with data from other sources that remain under the control of the original publisher.
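As a rough sketch of how an application might use such external IDs, the following joins Wikidata items with records from another publisher through a shared identifier. The record layout, the item IDs, and the flight figures are hypothetical placeholders; only the idea of joining on an external identifier (here an IATA-style code) comes from the text.

```python
def join_on_external_id(wikidata_items, external_records, id_property):
    """Pair each Wikidata item with the external record that shares its identifier."""
    by_external_id = {record["id"]: record for record in external_records}
    joined = []
    for item in wikidata_items:
        external_id = item["external_ids"].get(id_property)
        if external_id in by_external_id:
            joined.append((item["qid"], by_external_id[external_id]))
    return joined


# Hypothetical items; the "qid" values are placeholders, not real Wikidata IDs.
ITEMS = [
    {"qid": "Q_EXAMPLE_1", "external_ids": {"IATA": "AAA"}},
    {"qid": "Q_EXAMPLE_2", "external_ids": {"IATA": "BBB"}},
    {"qid": "Q_EXAMPLE_3", "external_ids": {}},
]

# Hypothetical data set that stays under the control of its original publisher.
FLIGHT_STATS = [
    {"id": "AAA", "daily_flights": 120},
    {"id": "CCC", "daily_flights": 45},
]

matches = join_on_external_id(ITEMS, FLIGHT_STATS, "IATA")
```

Neither data set has to be copied into the other: the identifier is the only thing both sides need to agree on.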
Wikidata is not the first project to reconcile identifiers and authority files from different sources. Others include the Virtual International Authority File, or VIAF, in the bibliographic domain,[39] GeoNames in the geographical domain,[40] and Freebase.[41] Wikidata is linked to many of these projects yet also differs in terms of scope, scale, editorial processes, and author community.
The collected data is exposed in various ways; for example, current per-item exports are available in JSON, XML, RDF, and several other formats. Full database dumps are created at intervals and supplemented by daily diffs. All data is licensed under a Creative Commons CC0 license, thus putting the data in the public domain.
Every Wikidata entity is identified by a unique URI (such as http://www.wikidata.org/entity/Q42 for item Q42, Douglas Adams). By resolving this URI, tools are able to obtain item data in the requested format (through content negotiation). This follows Linked Data standards for data publication,[45] making Wikidata part of the Semantic Web,[46][47] while supporting integration of other Semantic Web data sources with Wikidata.
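Content negotiation means the client names the format it wants in the HTTP `Accept` header and the server answers accordingly. The sketch below only builds such a request and does not send it; the helper name `build_entity_request` is hypothetical, and the choice of `application/json` as media type is an assumption based on the export formats listed above.

```python
import urllib.request

# URI pattern taken from the example in the text.
ENTITY_URI = "http://www.wikidata.org/entity/{qid}"


def build_entity_request(qid, media_type="application/json"):
    """Build (but do not send) a request for one entity in the given format."""
    return urllib.request.Request(
        ENTITY_URI.format(qid=qid),
        headers={"Accept": media_type},
    )


request = build_entity_request("Q42")
```

Passing `request` to `urllib.request.urlopen` would perform the actual lookup. Which media types the server honors should be checked against the current documentation.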
Wikidata Applications
The data in Wikidata lends itself to manifold applications on very different levels of data integration.
Language labels and descriptions. Wikidata provides labels and descriptions for many terms in different languages, which can be used to present information to international audiences. Unlike common dictionaries, Wikidata covers many named entities (such as for places, chemicals, plants, and specialist terms) that may be very difficult to translate. Many data-centric views can be translated trivially term by term (think maps, shopping lists, and ingredients of dishes on a menu), assuming all items are associated with suitable Wikidata IDs. The open source JavaScript library qLabel (http://googleknowledge.github.io/qLabel/) provides this functionality for any website.
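Term-by-term translation of this kind amounts to a label lookup keyed by item ID and language. The following is a minimal sketch with a hypothetical in-memory label table; the item IDs are placeholders rather than real Wikidata IDs, and this is not how qLabel itself is implemented.

```python
# Hypothetical label table: item ID -> language code -> label.
LABELS = {
    "Q_APPLE": {"en": "apple", "de": "Apfel", "ja": "りんご"},
    "Q_MILK": {"en": "milk", "de": "Milch"},
}


def translate_terms(item_ids, language, labels, fallback="en"):
    """Map each item ID to its label in `language`, falling back when missing."""
    translated = []
    for item_id in item_ids:
        by_language = labels.get(item_id, {})
        label = by_language.get(language) or by_language.get(fallback) or item_id
        translated.append(label)
    return translated


shopping_list = ["Q_APPLE", "Q_MILK"]
german = translate_terms(shopping_list, "de", LABELS)
japanese = translate_terms(shopping_list, "ja", LABELS)
```

The fallback to another language, and finally to the bare ID, is a design choice of this sketch: a list stored as IDs stays usable even where labels are still missing.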
Identifier reuse. Item IDs can be used as language-independent identifiers to facilitate data exchange and integration across application boundaries. Referring to Wikidata items, applications can provide unambiguous definitions for the terms they use that are also the entry points to a wealth of related information. Wikidata IDs thus resemble digital object identifiers, or DOIs, but emphasize (meta)data beyond online document locations and use another social infrastructure for ID assignment. Wikidata IDs are stable: IDs do not depend on language labels; items can be deleted, though IDs are never reused; and the links to other datasets and sites further increase stability. Besides providing a large collection of IDs, Wikidata also provides the means to support contributors in selecting the right ID by displaying labels and descriptions; external applications can use the same functionality through the same API.
Wikidata allows conflicting data to coexist and provides mechanisms to organize this plurality.
Accessing Wikidata. The information collected by Wikidata is interesting in its own right, and many applications can be built to access it more conveniently and effectively. Applications created as of early 2014 included generic data browsers like the one in Figure 3 and special-purpose tools, including two genealogy viewers, a tree of life, a table of the elements, and various mapping tools. Applications can use the Wikidata API to browse, query, and even edit data. If simple queries are not enough, a dedicated copy of the data is needed; that copy can be obtained from regular dumps and possibly be updated in real time by mirroring edits on Wikidata. The Wikidata Toolkit, an open source Java library (https://www.mediawiki.org/wiki/Wikidata_Toolkit), provides convenient access to the dumps.
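For simple lookups through the API, an application mostly needs to assemble a query URL. The sketch below builds one without sending it. The endpoint and the `wbgetentities` parameter names reflect the Wikibase web API as commonly documented, not anything stated in the article, so they should be checked against the current API documentation; the helper name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Wikidata web API.
API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def entity_query_url(ids, languages=("en",), props=("labels", "descriptions")):
    """Build a URL asking for selected properties of several entities as JSON."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(ids),
        "props": "|".join(props),
        "languages": "|".join(languages),
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


# Q42 and Q18507561 are the two item IDs mentioned in this article.
url = entity_query_url(["Q42", "Q18507561"], languages=("en", "ja"))
```

Several IDs are batched into one request by joining them with `|`, which `urlencode` escapes as `%7C`.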
Enriching applications. Many applications can be enriched by embedding information from Wikidata directly into their interfaces; for example, a music player might want to fetch the portrait of the artist just being played in the audio file. Unlike earlier uses of Wikipedia data (such as in Google Maps), application developers need not extract and maintain the data themselves. Such lightweight data access is particularly attractive for mobile apps. In other cases, application developers preprocess data to integrate it into their applications; for example, it would be easy to extract a file of all German cities, together with their regions and post-code ranges, that could then be used in an application. Such derived data can be used and redistributed online or in software under any license, even in commercial contexts.
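The preprocessing step mentioned above (extracting a file of cities with their regions and post-code ranges) could look roughly like this. The flat record layout is a simplifying assumption and does not match the real dump format, and all names and values are placeholders.

```python
import csv
import io


def extract_cities(records, country):
    """Select (city, region, postal codes) rows for all cities of one country."""
    rows = [
        (r["label"], r["region"], r["postal_codes"])
        for r in records
        if r["type"] == "city" and r["country"] == country
    ]
    return sorted(rows)


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["city", "region", "postal_codes"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical, already-flattened records derived from a dump.
RECORDS = [
    {"label": "City B", "type": "city", "country": "Germany",
     "region": "Region Y", "postal_codes": "10000-10999"},
    {"label": "City A", "type": "city", "country": "Germany",
     "region": "Region X", "postal_codes": "00001-00099"},
    {"label": "River R", "type": "river", "country": "Germany",
     "region": "Region X", "postal_codes": ""},
]

city_file = to_csv(extract_cities(RECORDS, "Germany"))
```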
Advanced analytics. Information in Wikidata can be further analyzed to derive new insights beyond what is already revealed on the surface. An important approach in this regard is logical reasoning, where information about general relationships is used to derive additional facts; for example, Wikidata's property "grandparent" is obsolete, since its value can be inferred from values of properties "father" and "mother." If an application developer is generally interested in ancestors, then a transitive closure must be computed. Such a closure is relevant for many hierarchical, spatial, and partonomical relations. Other types of advanced analytics include statistical evaluations of both the data and the incidental metadata collected in the system; for example, a researcher can readily analyze article coverage by language,[51] as well as the gender balance of persons described in Wikipedia articles.[52] As in Wikipedia, Wikidata provides plenty of material for researchers to study.
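Both inferences mentioned above are easy to sketch: grandparents are two steps along the parent relation, and ancestors are its transitive closure. The `PARENTS` mapping below is hypothetical sample data, standing in for the merged values of the "father" and "mother" properties.

```python
def grandparents(person, parents):
    """Derive grandparents by following the parent relation exactly twice."""
    return {gp for p in parents.get(person, ()) for gp in parents.get(p, ())}


def ancestors(person, parents):
    """Return all ancestors of person: the transitive closure of the parent relation."""
    found = set()
    stack = list(parents.get(person, ()))
    while stack:
        current = stack.pop()
        if current not in found:  # also guards against cycles in faulty data
            found.add(current)
            stack.extend(parents.get(current, ()))
    return found


# Hypothetical family data: person -> known parents.
PARENTS = {
    "child": ("father", "mother"),
    "father": ("grandfather", "grandmother"),
    "grandfather": ("great-grandfather",),
}
```

The same traversal applies to other hierarchical relations, such as "located in" or "part of", by swapping in a different mapping.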
These are only the most obvious approaches to exploiting the data, and many as-yet unforeseen uses should be expected. Wikidata is young, and its data is far from complete. We look forward to new and innovative applications due to Wikidata and its development as a knowledgebase.[55]
Prospects
Features still missing include support for complex queries, which is now under development. However, in trying to predict the future of Wikidata, the development team's plans are probably less important than one would expect; for example, the biggest open questions concern the evolution and interplay of the many Wikimedia communities. Will Wikidata earn their trust? How will each of them, with its own language and culture, access, share, and co-evolve the way Wikidata is structured? And how will Wikidata respond to the demands of the communities beyond Wikipedia?
The influence of the volunteer community extends to technical development of the website and its underlying software. Wikidata is based on an open development process that invites contributions, while the site itself provides many extension points for user-created add-ons. The community has designed and developed features (such as article badges for featured articles, image embedding, and multi-language editing). The community has also developed ways to enrich the semantics of properties by encoding (soft) constraints, as reflected in the guideline "Items should have no more than one birthplace." External tools gather this information, analyze the dataset for constraint violations, and publish the list of violations on Wikidata to allow editors to check if they are valid exceptions or errors.
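A soft constraint such as "no more than one birthplace" can be checked with a very small report generator. The sketch below is not one of the community's actual tools; the data layout and the item IDs are hypothetical.

```python
def single_value_violations(items, prop):
    """Return the IDs of items that have more than one distinct value for prop."""
    return sorted(
        item_id
        for item_id, claims in items.items()
        if len(set(claims.get(prop, ()))) > 1
    )


# Hypothetical items: item ID -> property -> list of values.
ITEMS = {
    "Q_EXAMPLE_A": {"birthplace": ["Q_PLACE_1"]},
    "Q_EXAMPLE_B": {"birthplace": ["Q_PLACE_1", "Q_PLACE_2"]},
    "Q_EXAMPLE_C": {},
}

report = single_value_violations(ITEMS, "birthplace")
```

Because the constraint is soft, the report is only a list for editors to review: an item listed here may be an error or a legitimate exception, for instance when sources disagree.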
These aspects of the Wikidata development process illustrate the close relationships among technical infrastructure, editorial processes, and content and the pivotal role the community plays in shaping Wikidata. However, the community is as dynamic as Wikidata itself, based not on status or membership but on the common goal of turning Wikidata into the most accurate, useful, and informative resource possible. This goal promises stability and continuity, even as it allows anyone to take part in defining the future of Wikidata.
Wikipedia is by all accounts one of the most important websites today, a legacy Wikidata must live up to. In only two years, Wikidata has already become an important platform for integrating information from many sources. In addition, it aggregates large amounts of incidental metadata about its own evolution and contribution to Wikipedia. Wikidata thus has the potential to be a major resource for both research and development of new and improved applications. Wikidata, the free knowledgebase anyone can edit, may thus bring us all one step closer to a world that freely shares in the sum of all knowledge.
Acknowledgments
The development team's work on Wikidata is funded through donations by the Allen Institute of Artificial Intelligence, Google, the Gordon and Betty Moore Foundation, and Yandex. Markus Krötzsch's research is supported by the German Research Foundation through the Data Integration and Access by Merging Ontologies and Databases, or DIAMOND, project (Emmy Noether grant KR 4381/1-1).
Wikidata entity ID
This article has subsequently received its own Wikidata item Q18507561, which can be used, e.g., within references on Wikidata.
[1] Ayers, P., Matthews, C., and Yates, B. How Wikipedia Works: And How You Can Be a Part of It. No Starch Press, San Francisco, CA, 2008.
[2] Ayers, P., Matthews, C., and Yates, B. How Wikipedia Works: And How You Can Be a Part of It. No Starch Press, San Francisco, CA, 2008.
[3] Ayers, P., Matthews, C., and Yates, B. How Wikipedia Works: And How You Can Be a Part of It. No Starch Press, San Francisco, CA, 2008.
[4] Ayers, P., Matthews, C., and Yates, B. How Wikipedia Works: And How You Can Be a Part of It. No Starch Press, San Francisco, CA, 2008.
[5] Peter Buneman, James Cheney, Wang-Chiew Tan, Stijn Vansummeren, Curated databases, Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 09-12, 2008, Vancouver, Canada http://doi.org/10.1145/1376916.1376918
[6] Peter Buneman, James Cheney, Wang-Chiew Tan, Stijn Vansummeren, Curated databases, Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 09-12, 2008, Vancouver, Canada http://doi.org/10.1145/1376916.1376918
[7] Bo Leuf, Ward Cunningham, The Wiki way: quick collaboration on the Web, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 2001
[8] Bo Leuf, Ward Cunningham, The Wiki way: quick collaboration on the Web, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 2001
[9] Markus Krötzsch, Denny Vrandečić, Max Völkel, Heiko Haller, Rudi Studer, Semantic Wikipedia, Web Semantics: Science, Services and Agents on the World Wide Web, v.5 n.4, p.251-261, December, 2007 http://doi.org/10.1016/j.websem.2007.09.001
[10] Daniel J. Barrett, MediaWiki, O'Reilly Media, Inc., 2008
[11] Daniel J. Barrett, MediaWiki, O'Reilly Media, Inc., 2008
[12] Markus Krötzsch, Denny Vrandečić, Max Völkel, Heiko Haller, Rudi Studer, Semantic Wikipedia, Web Semantics: Science, Services and Agents on the World Wide Web, v.5 n.4, p.251-261, December, 2007 http://doi.org/10.1016/j.websem.2007.09.001
[13] Douglas B. Lenat, R. V. Guha, Building Large Knowledge-Based Systems; Representation and Inference in the Cyc Project, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1989
[14] Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, Jamie Taylor, Freebase: a collaboratively created graph database for structuring human knowledge, Proceedings of the 2008 ACM SIGMOD international conference on Management of data, June 09-12, 2008, Vancouver, Canada http://doi.org/10.1145/1376616.1376746
[15] Douglas B. Lenat, R. V. Guha, Building Large Knowledge-Based Systems; Representation and Inference in the Cyc Project, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1989
[16] Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, Jamie Taylor, Freebase: a collaboratively created graph database for structuring human knowledge, Proceedings of the 2008 ACM SIGMOD international conference on Management of data, June 09-12, 2008, Vancouver, Canada http://doi.org/10.1145/1376616.1376746
[17] Christian Bizer, Jens Lehmann, Georgi Kobilarov, Sören Auer, Christian Becker, Richard Cyganiak, Sebastian Hellmann, DBpedia - A crystallization point for the Web of Data, Web Semantics: Science, Services and Agents on the World Wide Web, v.7 n.3, p.154-165, September, 2009
[18] Johannes Hoffart, Fabian M. Suchanek, Klaus Berberich, Gerhard Weikum, YAGO2: A spatially and temporally enhanced knowledge base from Wikipedia, Artificial Intelligence, 194, p.28-61, January, 2013 http://doi.org/10.1016/j.artint.2012.06.001
[19] Christian Bizer, Jens Lehmann, Georgi Kobilarov, Sören Auer, Christian Becker, Richard Cyganiak, Sebastian Hellmann, DBpedia - A crystallization point for the Web of Data, Web Semantics: Science, Services and Agents on the World Wide Web, v.7 n.3, p.154-165, September, 2009
[20] Johannes Hoffart, Fabian M. Suchanek, Klaus Berberich, Gerhard Weikum, YAGO2: A spatially and temporally enhanced knowledge base from Wikipedia, Artificial Intelligence, 194, p.28-61, January, 2013 http://doi.org/10.1016/j.artint.2012.06.001
[21] Wolfram Research. Wolfram Alpha (launched 2009); https://www.wolframalpha.com
[22] Tunstall-Pedoe, W. True Knowledge: Open-domain question answering using structured knowledge and inference. AI Magazine 31, 3 (Fall 2010), 80-92.
[23] Ferrucci, D.A., Brown, E.W., Chu-Carroll, J., Fan, J., Gondek, D., Kalyanpur, A., Lally, A., Murdock, J.W., Nyberg, E., Prager, J.M., Schlaefer, N., and Welty, C.A. Building Watson: An overview of the DeepQA project. AI Magazine 31, 3 (Fall 2010), 59-79.
[24] Wolfram Research. Wolfram Alpha (launched 2009); https://www.wolframalpha.com
[25] Tunstall-Pedoe, W. True Knowledge: Open-domain question answering using structured knowledge and inference. AI Magazine 31, 3 (Fall 2010), 80-92.
[26] Ferrucci, D.A., Brown, E.W., Chu-Carroll, J., Fan, J., Gondek, D., Kalyanpur, A., Lally, A., Murdock, J.W., Nyberg, E., Prager, J.M., Schlaefer, N., and Welty, C.A. Building Watson: An overview of the DeepQA project. AI Magazine 31, 3 (Fall 2010), 59-79.
[27] Guha, R.V., McCool, R., and Fikes, R. Contexts for the Semantic Web. In Proceedings of the Third International Semantic Web Conference, Vol. 3298 of LNCS, S.A. McIlraith, D. Plexousakis, and F. van Harmelen, Eds. (Hiroshima, Japan, Nov. 7-11). Springer, Berlin, 2004, 32-46.
[28] MacGregor, R.M. Representing reified relations in Loom. Journal of Experimental and Theoretical Artificial Intelligence 5, 2-3 (1993), 179-183.
[29] Noy, N. and Rector, A., Eds. Defining N-ary Relations on the Semantic Web. W3C Working Group Note, Apr. 12, 2006; http://www.w3.org/TR/swbp-n-aryRelations/
[30] Guha, R.V., McCool, R., and Fikes, R. Contexts for the Semantic Web. In Proceedings of the Third International Semantic Web Conference, Vol. 3298 of LNCS, S.A. McIlraith, D. Plexousakis, and F. van Harmelen, Eds. (Hiroshima, Japan, Nov. 7-11). Springer, Berlin, 2004, 32-46.
[31] MacGregor, R.M. Representing reified relations in Loom. Journal of Experimental and Theoretical Artificial Intelligence 5, 2-3 (1993), 179-183.
[32] Noy, N. and Rector, A., Eds. Defining N-ary Relations on the Semantic Web. W3C Working Group Note, Apr. 12, 2006; http://www.w3.org/TR/swbp-n-aryRelations/
[33] Erxleben, F., Günther, M., Krötzsch, M., Mendez, J., and Vrandečić, D. Introducing Wikidata to the Linked Data Web. In Proceedings of the 13th International Semantic Web Conference (Trentino, Italy, Oct. 19-23). Springer, Berlin, 2014.
[34] Erxleben, F., Günther, M., Krötzsch, M., Mendez, J., and Vrandečić, D. Introducing Wikidata to the Linked Data Web. In Proceedings of the 13th International Semantic Web Conference (Trentino, Italy, Oct. 19-23). Springer, Berlin, 2014.
[35] Luc Moreau, The Foundations for Provenance on the Web, Foundations and Trends in Web Science, v.2 n.2-3, p.99-241, February 2010 http://doi.org/10.1561/1800000010
[36] Luc Moreau, The Foundations for Provenance on the Web, Foundations and Trends in Web Science, v.2 n.2-3, p.99-241, February 2010 http://doi.org/10.1561/1800000010
[37] Ayers, P., Matthews, C., and Yates, B. How Wikipedia Works: And How You Can Be a Part of It. No Starch Press, San Francisco, CA, 2008.
[38] Ayers, P., Matthews, C., and Yates, B. How Wikipedia Works: And How You Can Be a Part of It. No Starch Press, San Francisco, CA, 2008.
[39] Bennett, R., Hengel-Dittrich, C., O'Neill, E.T., and Tillett, B.B. VIAF (Virtual International Authority File): Linking Die Deutsche Bibliothek and Library of Congress name authority files. In Proceedings of the World Library and Information Congress 72nd General Conference and Council (Seoul, South Korea, Aug. 20-24). IFLA, Den Haag, The Netherlands, 2006.
[40] Unxos GmbH. GeoNames (launched 2005); http://www.geonames.org
[41] Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, Jamie Taylor, Freebase: a collaboratively created graph database for structuring human knowledge, Proceedings of the 2008 ACM SIGMOD international conference on Management of data, June 09-12, 2008, Vancouver, Canada http://doi.org/10.1145/1376616.1376746
[42] Bennett, R., Hengel-Dittrich, C., O'Neill, E.T., and Tillett, B.B. VIAF (Virtual International Authority File): Linking Die Deutsche Bibliothek and Library of Congress name authority files. In Proceedings of the World Library and Information Congress 72nd General Conference and Council (Seoul, South Korea, Aug. 20-24). IFLA, Den Haag, The Netherlands, 2006.
[43] Unxos GmbH. GeoNames (launched 2005); http://www.geonames.org
[44] Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, Jamie Taylor, Freebase: a collaboratively created graph database for structuring human knowledge, Proceedings of the 2008 ACM SIGMOD international conference on Management of data, June 09-12, 2008, Vancouver, Canada http://doi.org/10.1145/1376616.1376746
[45] Bizer, C., Heath, T., and Berners-Lee, T. Linked data: The story so far. International Journal on Semantic Web and Information Systems 5, 3 (2009), 1-22.
[46] Berners-Lee, T., Hendler, J., and Lassila, O. The Semantic Web. Scientific American (May 2001), 96-101.
[47] Erxleben, F., Günther, M., Krötzsch, M., Mendez, J., and Vrandečić, D. Introducing Wikidata to the Linked Data Web. In Proceedings of the 13th International Semantic Web Conference (Trentino, Italy, Oct. 19-23). Springer, Berlin, 2014.
[48] Bizer, C., Heath, T., and Berners-Lee, T. Linked data: The story so far. International Journal on Semantic Web and Information Systems 5, 3 (2009), 1-22.
[49] Berners-Lee, T., Hendler, J., and Lassila, O. The Semantic Web. Scientific American (May 2001), 96-101.
[50] Erxleben, F., Günther, M., Krötzsch, M., Mendez, J., and Vrandečić, D. Introducing Wikidata to the Linked Data Web. In Proceedings of the 13th International Semantic Web Conference (Trentino, Italy, Oct. 19-23). Springer, Berlin, 2014.
[51] Scott A. Hale, Multilinguals and Wikipedia editing, Proceedings of the 2014 ACM conference on Web science, June 23-26, 2014, Bloomington, Indiana, USA http://doi.org/10.1145/2615569.2615684
[52] Klein, M. and Kyrios, A. VIAFbot and the integration of library data on Wikipedia. code{4}lib Journal 22 (Oct. 2013); http://journal.code4lib.org/articles/8964
[53] Scott A. Hale, Multilinguals and Wikipedia editing, Proceedings of the 2014 ACM conference on Web science, June 23-26, 2014, Bloomington, Indiana, USA http://doi.org/10.1145/2615569.2615684
[54] Klein, M. and Kyrios, A. VIAFbot and the integration of library data on Wikipedia. code{4}lib Journal 22 (Oct. 2013); http://journal.code4lib.org/articles/8964
[55] Denny Vrandečić, The Rise of Wikidata, IEEE Intelligent Systems, v.28 n.4, p.90-95, July 2013 http://doi.org/10.1109/MIS.2013.119
[56] Denny Vrandečić, The Rise of Wikidata, IEEE Intelligent Systems, v.28 n.4, p.90-95, July 2013 http://doi.org/10.1109/MIS.2013.119














