Wikidata: A platform for data integration and dissemination for the life sciences and beyond

Elvira Mitraka1, Andra Waagmeester2, Sebastian Burgstaller-Muehlbacher3, Lynn M. Schriml1, Andrew I. Su3, Benjamin M. Good3

1 University of Maryland School of Medicine, Baltimore, USA
{emitraka,lschriml}@som.umaryland.edu
2 Micelio, Antwerp, Belgium
andra@micelio.be
3 Department of Molecular and Experimental Medicine, Scripps Research Institute, La Jolla, USA
{sburgs,asu,bgood}@scripps.edu

Abstract.
Wikidata is an open, Semantic Web-compatible database that anyone can edit. This 'data commons' provides structured data for Wikipedia articles and other applications. Every article on Wikipedia has a hyperlink to an editable item in this database. This unique connection to the world's largest community of volunteer knowledge editors could help make Wikidata a key hub within the greater Semantic Web. The life sciences, as ever, face crucial challenges in disseminating and integrating knowledge. Our group is addressing these issues by populating Wikidata with the seeds of a foundational semantic network linking genes, drugs and diseases. Using this content, we are enhancing Wikipedia articles to both increase their quality and recruit human editors to expand and improve the underlying data. We encourage the community to join us as we collaboratively create what can become the most used and most central semantic data resource for the life sciences and beyond.

Keywords: Wikidata, Wikipedia, Linked Data, Semantic Web, Crowdsourcing, Knowledge Management

1 Stone Data Soup

In the Stone Soup folktale [1], a group of hungry travelers arrives in a village whose inhabitants are unwilling to share their food. With a kettle of water and a stone, the travelers manage to pique the curiosity of the villagers. That curiosity finally spawns a collaborative effort to make a great soup. This story is now used to express the power of crowdsourcing and collaborative projects [2], such as Wikipedia, where many individuals each make small contributions but collectively produce something larger than the sum of its parts. Wikidata extends this collaborative model to the Web of data [3]. In this article we describe Wikidata and the ways that this open public platform can take a central role in data sharing and management for the life science community.

2 Wikidata and Wikipedia

Wikipedia is among the most visited sites on the Internet. Articles about medical topics were viewed more than 4.88 billion times in 2013, a number on par with http://nih.gov and significantly greater than WebMD [4]. This incredibly important resource, created through volunteer labor, is now tightly coupled to Wikidata, an open, Semantic Web-compatible database that anyone can edit [3]. Wikipedia infoboxes (the tables of data often appearing on the right side of articles) can now render content stored in Wikidata, and each Wikipedia article now has a direct link to the corresponding Wikidata item, thus encouraging the collaborative editing of the data (Fig. 1).

Fig. 1. Wikidata provides a centralized resource for structured data. Applications including, but not limited to, Wikipedia can now read and write to Wikidata.

Infoboxes provide the bridge between machine-readable structured data and the unstructured text that forms the main body of each article. Since 2008, the Gene Wiki project has automatically created and maintained the infoboxes for around 10000 articles about human genes [5]. Now, this initiative is focused on generating a foundation of biomedical knowledge in Wikidata that will be used to improve infobox content on Wikipedia and help drive new applications. To date, we have loaded Wikidata with items about 56451 human and 73086 mouse genes from NCBI Gene [6], 6562 concepts in the Disease Ontology [7], and 1830 FDA-approved drugs. This initial data load generated Wikidata items for these key biomedical concepts, mapped them to Wikipedia articles and linked them to the corresponding identifiers in authoritative public databases. The identifier-level connections to the source databases ensure that Wikidata content can be easily integrated into the existing Web of biomedical data. Moreover, the provenance of all Wikidata claims can be assessed through inspection of the supporting references. The data is kept up to date by periodically running 'bots' that propagate changes from authoritative sources to Wikidata. When conflicts arise from human edits to Wikidata items, these are flagged for manual review. The next phase of the project will stitch these concepts into a richly interconnected semantic network.
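
The update cycle described above (propagate changes from the source, but flag values that a human has edited) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual bot code: the `Claim` record layout, the `plan_updates` function and the `item-N` identifiers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """One stored property value plus who last set it (hypothetical layout)."""
    value: str
    last_editor: str  # the bot's name, or a human user name


def plan_updates(source, wikidata, bot_name="bot"):
    """Decide which source values to write and which to send to manual review.

    source:   {(item_id, prop): value} taken from the authoritative database
    wikidata: {(item_id, prop): Claim} as currently stored
    Returns (to_write, to_review).
    """
    to_write = {}
    to_review = []
    for key, new_value in source.items():
        current = wikidata.get(key)
        if current is None:
            to_write[key] = new_value      # nothing stored yet: create it
        elif current.value == new_value:
            continue                       # already up to date
        elif current.last_editor == bot_name:
            to_write[key] = new_value      # bot-written value: safe to refresh
        else:
            # A human changed this value: do not overwrite, flag it instead.
            to_review.append((key, current.value, new_value))
    return to_write, to_review
```

The design choice worth noting is that a disagreement with a human edit never results in an automatic overwrite; it only produces an entry in the review list.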

3 Taking a sip of the data soup – Wikidata and the Semantic Web

The first application to use Wikidata extensively is Wikipedia, but this could be the tip of the iceberg. To give a preview of what Wikidata could become, it is useful to briefly examine its closest ancestor, DBpedia. The DBpedia project mines content from Wikipedia by parsing infoboxes, maps this content to its own ontology, and provides access to this data in the form of a large RDF database available both for bulk download and SPARQL query. While DBpedia enables interesting queries on its own, its most important function is as a global linking hub for the Semantic Web [8]. In comparison to DBpedia, Wikidata has a number of advantages. First, it can be edited directly and changes are reflected in real time. Second, it does not require any parsing because all data is managed in a database from the outset. Third, it contains large amounts of content that is not present in Wikipedia, such as items for every mouse gene. Finally, its query API supports not only queries along its asserted knowledge graph, but also along references, qualifiers and even edit histories. These additional capabilities, viewed in light of the success of the DBpedia project, portend a vital future for Wikidata in the context of the Semantic Web.
Within the biomedical domain, useful queries are already possible as a result of the 'single-pot' nature of Wikidata. For example, it is possible to use Wikidata's SPARQL endpoint (https://query.wikidata.org/) to answer questions such as "what clinically relevant drug-drug interactions are known for the drug methadone (CHEMBL651)" [9]. Importantly, the data used to answer this query came from two groups working completely independently. Our 'drug_bot' bot added the CHEMBL identifiers (as well as many other identifiers) while another bot developed by a team at the Medical University of Vienna added the drug-drug interactions [10]. This happened without any direct coordination between our groups. This kind of serendipitous, automatic, cross-continental data integration is the primary goal of the Semantic Web, but is not yet commonplace.
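
As an illustration of the methadone question, the sketch below builds a SPARQL query and, optionally, sends it to the Wikidata Query Service. It is not the query referenced in [9], which may differ. The property identifiers are assumptions about the current Wikidata data model rather than details taken from this article: P592 is assumed to hold the ChEMBL ID and P769 the 'significant drug interaction' statement. The function names and the User-Agent string are illustrative.

```python
import json
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_interaction_query(chembl_id):
    """Return a SPARQL query listing drugs that interact with the given drug.

    Assumed properties: wdt:P592 = ChEMBL ID, wdt:P769 = significant
    drug interaction.
    """
    if not chembl_id.isalnum():
        # Refuse anything that could break out of the quoted literal.
        raise ValueError("ChEMBL identifiers are expected to be alphanumeric")
    return (
        "SELECT ?drug ?other ?otherLabel WHERE {\n"
        f'  ?drug wdt:P592 "{chembl_id}" .\n'
        "  ?drug wdt:P769 ?other .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}"
    )


def run_query(query):
    """Send the query to the endpoint and return the JSON result bindings."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        ENDPOINT + "?" + params,
        headers={"User-Agent": "wikidata-sparql-sketch/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]
```

Calling `run_query(build_interaction_query("CHEMBL651"))` would issue the request; what comes back depends on the content of Wikidata at that moment. Both triple patterns match statements in the same graph, even though the statements were contributed by different teams.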
The key beauty and main challenge of the Semantic Web is its distributed nature. In order for this kind of integration to happen in the absence of a centralized resource like Wikidata, several major hurdles would need to be cleared. First, both teams would need to know enough about the fairly complex stack of semantic technologies to provide their data as RDF through a stable, public SPARQL endpoint. Second, they would have to work with overlapping identifier systems. Third, the would-be consumer of their data would need to discover both of their endpoints and be sophisticated enough with SPARQL to identify and issue the appropriate distributed query. All of this is possible and can work, but it is not easy. By integrating data in a centralized, single community pot, Wikidata provides a platform that addresses each of these problems. Data providers do not have to set up and maintain their own SPARQL endpoint, a challenge that very few teams have succeeded at doing for any length of time [11]. By virtue of working in the same database, it is far less likely (though not impossible) for independent teams to generate and publish different identifiers, as the first step in working with Wikidata is to query it to see what is already there. Finally, the challenge of finding a relevant endpoint is negated when there is only one. Note that Wikidata can be queried using SPARQL or the Wikidata Query Language [12].

4 Many Cooks...

The fact that Wikidata is one centralized, community resource immediately surfaces the challenges incurred in any collaborative ontology development process. In Wikidata, the 'ontology' corresponds to its collection of linking properties used to describe items. A new property in Wikidata has to be proposed for community discussion and is only created after a consensus regarding the value of the property and its relation to existing properties has been established. For those used to controlling their own data and data models, this process can feel tedious. But this same fundamental process must be undertaken in any attempt at data integration. The fact that it happens up front, when data is first being loaded, should help to keep the data consistent and reduce the downstream identifier and ontological mapping problems that continue to plague bioinformatics.
Imagine the power of combining the structured data in Wikidata, the high accessibility and dedicated community of Wikipedia and the knowledge of the scientific community. Contemplate further that all of this data is freely available and accessible through a stable query interface and a robust read/write API. This makes important, high-quality information easily accessible to anyone and opens up scientific knowledge for public scrutiny. Further, the built-in provenance tracking can provide detailed chains of evidence to support or refute each claim, and all of this can be discussed using the many social tools, such as 'talk pages' for every data item, baked into the MediaWiki infrastructure. Aside from creating useful ways to disseminate data, this sociotechnical structure provides a framework for the broad community to relay feedback to the original data owners. Even at this early stage of the project, this process has already led to improvements in source data. For example, in the Disease Ontology the term 'Ollier disease' had the synonym 'Maffucci syndrome'. Upon importing the Disease Ontology into Wikidata, members of the Wikidata community pointed out that the two terms, though putative synonyms, linked to two different extant Wikidata items. Upon closer review it was determined that these two terms represent two different, albeit closely related, diseases, leading to the creation of a new term in the Disease Ontology. As Wikidata expands, it is to be expected that additional differences in representation between it and other knowledge resources will surface. These will first be triaged by the Wikidata community to check for errors and, if consensus is achieved that there is an error in the original source, this will be relayed for consideration. In this way, the Wikidata community can become the 'many eyes' that make all ontology bugs shallow.
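
The 'Ollier disease' case suggests a check that could in principle be run at import time: flag any ontology term whose declared synonym already resolves to a different existing item. The sketch below is a hypothetical illustration of that idea, not a description of how the discrepancy was actually found (the text above attributes that to community members). The function name, the label index and the placeholder item identifiers are all assumptions.

```python
def find_synonym_conflicts(terms, label_index):
    """Find synonyms that resolve to a different item than their term.

    terms:       {term_label: [synonym, ...]} from the imported ontology
    label_index: {lower-cased label: item_id} for items that already exist
    Returns a list of (term, synonym, term_item, synonym_item) tuples.
    """
    conflicts = []
    for label, synonyms in terms.items():
        term_item = label_index.get(label.lower())
        if term_item is None:
            continue  # the term has no existing item, so nothing can clash
        for synonym in synonyms:
            synonym_item = label_index.get(synonym.lower())
            if synonym_item is not None and synonym_item != term_item:
                # Two labels that the ontology treats as one concept
                # point at two distinct existing items.
                conflicts.append((label, synonym, term_item, synonym_item))
    return conflicts
```

For instance, with `terms = {"Ollier disease": ["Maffucci syndrome"]}` and an index that maps the two labels to two different placeholder items, the function reports one conflict for human review.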

5 ...Can Make a Delicious Soup

We can create a powerful commons of biomedical knowledge by building on established resources and the dedicated community to connect genes, proteins, drugs, diseases, phenotypes and symptoms. Wikipedia will be the first application to use the content in Wikidata, but certainly not the last. The fire is ready and the pot is starting to heat up. Some villagers are already peeking out of their windows ready to join us around the pot, but it will take the effort of the whole community to make a delicious biomedical data soup. We invite you to join us in this effort.

References

1. History of the Stone Soup Story from 1720 to now. Available from: http://www.stonesoup.com/history-of-the-stone-soup-story-from-1720-to-now/.
2. Taylor, J. The Stone Soup of Data. 2007 8 May; Available from: https://km.aifb.kit.edu/ws/ckc2007/StoneSoup-www2007.pdf.
3. Vrandečić, D. and M. Krötzsch, Wikidata: A Free Collaborative Knowledgebase, in Communications of the ACM. 2014, ACM. p. 78-85.
4. Heilman, J.M. and A.G. West, Wikipedia and medicine: quantifying readership, editors, and the significance of natural language. J Med Internet Res, 2015. 17(3): p. e62.
5. Huss, J.W., 3rd, et al., A gene wiki for community annotation of gene function. PLoS Biol, 2008. 6(7): p. e175.
6. Brown, G.R., et al., Gene: a gene-centered information resource at NCBI. Nucleic Acids Res, 2015. 43(Database issue): p. D36-42.
7. Kibbe, W.A., et al., Disease Ontology 2015 update: an expanded and updated database of human diseases for linking biomedical knowledge through disease data. Nucleic Acids Res, 2015. 43(Database issue): p. D1071-8.
8. Bizer, C., et al., DBpedia - A crystallization point for the Web of Data. Web Semantics: Science, Services and Agents on the World Wide Web, 2009. 7(3): p. 154-165.
9. Get all the drug-drug interactions for Methadone based on its CHEMBL id CHEMBL651. 2015 [cited 2015 Sep. 14]; Available from: https://bitbucket.org/sulab/wikidatasparqlexamples/overview#markdown-header-get-all-the-drug-drug-interactions-for-methadone-based-on-its-chembl-id-chembl651.
10. Pfundner, A., et al., Utilizing the Wikidata system to improve the quality of medical content in Wikipedia in diverse languages: a pilot study. J Med Internet Res, 2015. 17(5): p. e110.
11. Buil-Aranda, C., et al. SPARQL Web-Querying Infrastructure: Ready for Action? in 12th International Semantic Web Conference. 2013. Sydney, Australia.
12. Wikidata Query Editor. [cited 2015]; Available from: https://wdq.wmflabs.org/wdq/.
