SOFTWARE  Open Access

NanoDJ: a Dockerized Jupyter notebook for interactive Oxford Nanopore MinION sequence manipulation and genome assembly

Héctor Rodríguez-Pérez1, Tamara Hernández-Beeftink1, José M. Lorenzo-Salazar2, José L. Roda-García3, Carlos J. Pérez-González4, Marcos Colebrook3* and Carlos Flores1,2,5*

Abstract
Background: The Oxford Nanopore Technologies (ONT) MinION portable sequencer makes it possible to use cutting-edge genomic technologies in the field and the academic classroom.
Results: We present NanoDJ, a Jupyter notebook integration of tools for simplified manipulation and assembly of DNA sequences produced by ONT devices. It integrates basecalling, read trimming and quality control, simulation and plotting routines with a variety of widely used aligners and assemblers, including procedures for hybrid assembly.
Conclusions: Through Jupyter-facilitated access to self-explanatory contents of applications and the interactive visualization of results, as well as its distribution as a Docker software container, NanoDJ aims to simplify ONT DNA sequence analysis and make it more reproducible. The NanoDJ package code, documentation and installation instructions are freely available at https://github.com/genomicsITER/NanoDJ.
Keywords: Genome analysis, Nanopore sequencing, Jupyter, Docker

Background
It has never been so easy and affordable to access and utilize the genetic variation of any organism for any purpose. This has been driven by the continuous development of high-throughput DNA sequencing technologies, most commonly known as Next Generation Sequencing (NGS). A key improvement is the possibility of obtaining long single-molecule sequences with the fast and cost-efficient technology released by Oxford Nanopore Technologies (ONT) and the marketing in 2014 of the MinION, a portable, pocket-size, nanopore-based NGS platform [1]. Since then, several algorithms and software tools have flourished specifically for ONT sequence data. Despite its size, the MinION provides multi-kilobase reads with a throughput comparable to other benchtop sequencers in the market (1–10 Gbases by 2017), and therefore still requires efficient and integrated bioinformatics tools to facilitate the widespread use of the technology.

MinION has shown promise in distinct applications [2]. Because of its low cost, laptop operability, and USB-powered compact design, cutting-edge NGS technology is no longer necessarily tied to the established idea of a large, expensive machine that must be located in a centralized sequencing center or on a laboratory bench. As a consequence, the utility of MinION in field experiments, moving from sample to answer on site, has been demonstrated in infectious disease studies [3, 4], off-Earth genome sequencing [5], and species identification in extreme environments [6–8], among others. Leveraging MinION capabilities in the academic classroom is a natural extension of these field studies to facilitate the education of undergraduate and graduate students in genomics [9].

To date, there is no specific software solution aimed at facilitating ONT sequence analyses by integrating capabilities for data manipulation, sequence comparison and assembly, either in field experiments or for educational purposes to help the learning of genomics [9]. We have developed NanoDJ, an interactive collection of Jupyter notebooks that integrates a variety of software, advanced computer code, and plain contextual explanations. In addition, NanoDJ is distributed as a Docker software container to simplify the installation of dependencies and improve the reproducibility of results.

© The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

* Correspondence: mcolesan@ull.edu.es; cflores@ull.edu.es
Héctor Rodríguez-Pérez and Tamara Hernández-Beeftink contributed equally to this work.
3 Departamento de Ingeniería Informática y de Sistemas, Universidad de La Laguna, Santa Cruz de Tenerife, Spain
1 Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Universidad de La Laguna, Santa Cruz de Tenerife, Spain
Full list of author information is available at the end of the article

Rodríguez-Pérez et al. BMC Bioinformatics (2019) 20:234
https://doi.org/10.1186/s12859-019-2860-z
Implementation
NanoDJ is distributed as a Docker container that underlies a set of Jupyter notebooks, a format that is increasingly popular in the life sciences because it greatly facilitates the interactive exploration of data [10] and that has recently been integrated into the widely used Galaxy portal [11]. The Docker container allows NanoDJ to run as an isolated, self-contained package that can be executed seamlessly across a wide range of computing platforms [12], with a negligible impact on execution performance [13]. NanoDJ integrates diverse applications (Additional file 1: Table S1) organized into 12 notebooks grouped into three sections (Fig. 1; Table 1). Main results are presented as embedded objects. In addition, one of the notebooks was conceived for educational purposes: it sets a particularly simple problem and includes low-level explanations. To facilitate the use of the educational notebook while bypassing the installation of Docker and NanoDJ, a lightweight version of this notebook and small sets of ONT reads can be used from a web browser through Binder (https://mybinder.org) in the NanoDJ GitHub repository. In addition, as part of the CyVerse project (https://www.cyverse.org/), NanoDJ has been incorporated into VICE, a visual and interactive computing environment that facilitates training in ONT data analysis. We illustrate the versatility of NanoDJ in distinct scenarios by providing results from four case studies (Additional file 1: Text S1).
Input, basecalling, and simulations
Input data can be a list of FAST5 files from previously basecalled runs (e.g. a Metrichor output) or event-level signal data to be basecalled using the latest ONT caller. The user can also simulate reads with NanoSim and pre-computed model parameters. This possibility is important in different scenarios, such as helping to design an experiment or bypassing technical difficulties in academic setups [9].
Summary, quality control and filtering
For either a simulated or an empirical run, the user will obtain summary data and plots reporting the read length distribution, GC content vs. length, and read length vs. quality score (when available). If barcodes were used in the experiment, Porechop can be used for demultiplexing, barcode trimming and filtering out reads.
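The per-read quantities behind such summaries (length, GC content and quality score) can be illustrated with a minimal Python sketch. This is not NanoDJ's code: the function names `parse_fastq`, `read_summary` and `filter_reads`, the thresholds, and the toy reads are all hypothetical, and the sketch assumes plain 4-line FASTQ records with Phred+33 quality encoding. It also takes a simple arithmetic mean of Phred scores, whereas some ONT tools average error probabilities instead.

```python
from statistics import mean


def parse_fastq(lines):
    """Yield (read_id, sequence, quality_string) from 4-line FASTQ records."""
    nonblank = [line.strip() for line in lines if line.strip()]
    # Step through complete 4-line records; a truncated trailing record is ignored.
    for i in range(0, len(nonblank) - 3, 4):
        header, seq, _plus, qual = nonblank[i:i + 4]
        yield header[1:].split()[0], seq, qual


def read_summary(seq, qual):
    """Return length, GC fraction and mean Phred score (offset 33) of one read."""
    seq = seq.upper()
    length = len(seq)
    gc = (seq.count("G") + seq.count("C")) / length if length else 0.0
    mean_q = mean(ord(c) - 33 for c in qual) if qual else None
    return {"length": length, "gc": gc, "mean_q": mean_q}


def filter_reads(records, min_length=0, min_mean_q=0):
    """Keep reads that meet both the length and the mean-quality thresholds."""
    kept = []
    for read_id, seq, qual in records:
        stats = read_summary(seq, qual)
        if stats["length"] >= min_length and (stats["mean_q"] or 0) >= min_mean_q:
            kept.append((read_id, stats))
    return kept


# Toy FASTQ content (made-up reads, for illustration only).
toy_fastq = [
    "@read1 toy", "ACGTGGCC", "+", "IIIIIIII",
    "@read2 toy", "ATAT", "+", "!!!!",
]
records = list(parse_fastq(toy_fastq))
for read_id, seq, qual in records:
    print(read_id, read_summary(seq, qual))
print(filter_reads(records, min_length=5, min_mean_q=10))
```

Collecting `length`, `gc` and `mean_q` over all reads gives the data needed for the three plots described above. Real runs would normally rely on the tools bundled in the notebooks rather than on a hand-written parser.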
Genome assembly and comparison
Depending on the application, sequence data can be aligned against reference sequences or used for genome assembly using diverse methods. Alignment is performed against either one (BWA and Rebaler) or multiple (BLAST) reference sequences, generating BAM files for downstream applications (e.g., variant identification) or information on species composition.

Fig. 1 Simplified scheme of all NanoDJ functionalities

Alternatively, the user may opt for a de novo assembly. NanoDJ allows the use of some of the best-performing algorithms (Canu, Flye, and Miniasm), or the combination of ONT reads with reads obtained from second-generation NGS platforms for a hybrid assembly (Unicycler and MaSuRCA). Hybrid assembly provides more effective assemblies and a reduced error rate compared to assemblies based only on ONT reads [14]. NanoDJ includes the possibility of contig correction (Racon, Nanopolish, and Pilon). Assemblies can be evaluated with the embedded version of QUAST, and represented with Bandage.
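As an illustration of the kind of contiguity statistics used when comparing assemblies, the following Python sketch computes the contig count, total length, largest contig and N50 from a list of contig lengths. It is a hedged example and not QUAST's implementation: the function names `n50` and `assembly_summary` and the contig lengths are hypothetical. N50 is taken here as the length of the contig at which the cumulative length, summed from the longest contig downwards, first reaches half of the total assembly length.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (0 if it is empty)."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0


def assembly_summary(contig_lengths):
    """Collect a few basic contiguity statistics for one assembly."""
    lengths = list(contig_lengths)
    return {
        "contigs": len(lengths),
        "total_length": sum(lengths),
        "largest_contig": max(lengths, default=0),
        "n50": n50(lengths),
    }


# Made-up contig lengths for two hypothetical assemblies of the same sample.
assemblies = {
    "assembly_a": [100, 80, 50, 30, 20],
    "assembly_b": [200, 40, 40],
}
for name, lengths in assemblies.items():
    print(name, assembly_summary(lengths))
```

With similar total lengths, the assembly with fewer contigs and a higher N50 is the more contiguous one, although contiguity alone says nothing about correctness. Reference-based checks of the kind QUAST also performs address that side.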
Limitations and future directions
For non-expert users, NanoDJ would have been easier to use had it been conceived as an online application. However, our main objective was to integrate major tools for the analysis of ONT sequences in an interactive software environment that facilitates learning the basics behind ONT sequence analysis while providing a useful tool for professionals. Providing it as a Dockerized solution keeps the focus on the use of the tool by sparing the user the burden of installing all dependencies. At the moment, NanoDJ is set up for the analysis of small genomes and targeted NGS studies, and it focuses on the primary and secondary analysis of DNA sequences. The integration of tools for variant identification and tertiary analysis (annotation of variants or sequence elements, interpretation, etc.) [15, 16], as well as for epigenetics [17] and direct RNA sequencing [18], will be the focus of further developments of NanoDJ.
Conclusions
We present NanoDJ as an integrated Jupyter-based toolbox distributed as a Docker software container to facilitate ONT sequence analysis. NanoDJ is best suited for the analyses of small genomes and targeted NGS studies. We anticipate that the Jupyter notebook-based structure will simplify further developments in other applications.
Availability and requirements
Project name: NanoDJ
Project home page: https://github.com/genomicsITER/NanoDJ
Operating system(s): Windows, Linux, Mac OS
Programming language: Bash/Python
Other requirements: Docker installation
License: GPL
Any restrictions to use by non-academics: None

Additional file
Additional file 1: Table S1. Applications integrated in NanoDJ. Text S1. Testing on case study datasets. Table S2. Datasets for illustrative uses of NanoDJ. Table S3. Comparison of de novo assemblies using different inputs or with an assembly corrector. Table S4. Comparison of three de novo assemblers in a high-coverage ONT dataset. Table S5. Comparison of results from two hybrid de novo assemblers. Figure S1. Human mitochondrial DNA variant representation against the reference sequence. Table S6. Source of mitochondrial DNA genomes, simulations and classification results. (DOCX 1544 kb)

Acknowledgements
Not applicable

Funding
This research was funded by the Instituto de Salud Carlos III (grants PI14/00844 and PI17/00610), the Spanish Ministry of Science, Innovation and Universities (grant RTC-2017-6471-1; MINECO/AEI/FEDER, UE), the Spanish Ministry of Economy and Competitiveness (grant MTM2016–74877-P), which were co-financed by the European Regional Development Funds 'A way of making Europe' from the European Union, Area Tenerife 2030 from Cabildo de Tenerife (CGIEU0000219140), and by the agreement OA17/008 with Instituto Tecnológico y de Energías Renovables (ITER) to strengthen scientific and technological education, training, research, development and innovation in Genomics, Personalized Medicine and Biotechnology. The funding entities had no role in the design of the study, analysis, interpretation of data or in manuscript writing.
Availability of data and materials
All data generated or analysed during this study are included in this published article and its supplementary information files. Raw reads from MinION and Illumina are available from the SRA database (accession number(s) PRJNA451111, PRJNA451107).

Authors' contributions
HRP scripted and tested the software, and contributed to data analysis; THB was involved in data analysis and interpretation; JLS was involved in data analysis; JRG revised and tested the software and revised the manuscript; CPG was involved in visualization, data analysis and revised the manuscript; MC conceived the project, revised and tested the software, and revised the manuscript; CF conceived the project, designed the software, interpreted the data, and critically revised the manuscript. All authors have read and approved the final manuscript.
Table 1 Summary of NanoDJ notebooks

Name                            Functionality
0.0_QualityControl.ipynb        Evaluate quality control and sequence handling
1.0_Basecalling.ipynb           Translate the events or the raw electrical signal from an ONT sequencer (FAST5 format) into a DNA sequence to obtain a FASTA or a FASTQ file
1.1_Trim+Demux.ipynb            Perform sequence trimming and demultiplexing
2.0_DeNovo_Canu-Miniasm.ipynb   De novo assembly with Canu or Miniasm, and polish with Racon and Pilon
3.0_DeNovo_Canu+polish.ipynb    Nanopolish modules to improve the Canu assembly
4.0_DeNovo_Flye.ipynb           De novo assembly with Flye software
5.0_DeNovo_Hybrid.ipynb         Perform de novo assembly of Nanopore reads in conjunction with Illumina reads using MaSuRCA and/or Unicycler software
6.0_AssemblyCompare.ipynb       Compare distinct assembly results based on QUAST software
7.0_SimulateReads.ipynb         Obtain simulated reads made with NanoSim software and the NanoSim-H fork with precomputed models
8.0_Alignment.ipynb             Reference-based assembly using either BWA, BLAST or Rebaler software
9.0_AssemblyGraph.ipynb         Assembly graph visualization
Educational.ipynb               Perform basecalling (with Albacore), quality control steps, and a BLAST-based classification of the reads (for educational purposes)

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
The authors declare that they have no competing interests.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Author details
1 Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Universidad de La Laguna, Santa Cruz de Tenerife, Spain.
2 Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain.
3 Departamento de Ingeniería Informática y de Sistemas, Universidad de La Laguna, Santa Cruz de Tenerife, Spain.
4 Departamento de Matemáticas, Estadística e Investigación Operativa, Universidad de La Laguna, Santa Cruz de Tenerife, Spain.
5 CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain.

Received: 22 June 2018 Accepted: 29 April 2019

References
1. Brown CG, Clarke J. Nanopore development at Oxford Nanopore. Nat Biotechnol. 2016;34:810–1.
2. Jain M, Olsen HE, Paten B, Akeson M. The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol. 2016;17:239.
3. Quick J, Loman NJ, Duraffour S, et al. Real-time, portable genome sequencing for Ebola surveillance. Nature. 2016;530:228–32.
4. Faria NR, Quick J, Claro IM, Thézé J, de Jesus JG, Giovanetti M, Kraemer MUG, Hill SC, Black A, da Costa AC, Franco LC, Silva SP, Wu C-H, Raghwani J, Cauchemez S, du Plessis L, Verotti MP, de Oliveira WK, Carmo EH, Coelho GE, Santelli ACFS, Vinhal LC, Henriques CM, Simpson JT, Loose M, Andersen KG, Grubaugh ND, Somasekar S, Chiu CY, Muñoz-Medina JE, Gonzalez-Bonilla CR, Arias CF, Lewis-Ximenez LL, Baylis SA, Chieppe AO, Aguiar SF, Fernandes CA, Lemos PS, Nascimento BLS, Monteiro HAO, Siqueira IC, de Queiroz MG, de Souza TR, Bezerra JF, Lemos MR, Pereira GF, Loudal D, Moura LC, Dhalia R, França RF, Magalhães T, Marques ET Jr, Jaenisch T, Wallau GL, de Lima MC, Nascimento V, de Cerqueira EM, de Lima MM, Mascarenhas DL, Neto JPM, Levin AS, Tozetto-Mendoza TR, Fonseca SN, Mendes-Correa MC, Milagres FP, Segurado A, Holmes EC, Rambaut A, Bedford T, Nunes MRT, Sabino EC, Alcantara LCJ, Loman NJ, Pybus OG. Establishment and cryptic transmission of Zika virus in Brazil and the Americas. Nature. 2017;546:406–10.
5. Castro-Wallace SL, Chiu CY, John KK, Stahl SE, Rubins KH, McIntyre ABR, Dworkin JP, Lupisella ML, Smith DJ, Botkin DJ, Stephenson TA, Juul S, Turner DJ, Izquierdo F, Federman S, Stryke D, Somasekar S, Alexander N, Yu G, Mason CE, Burton AS. Nanopore DNA sequencing and genome assembly on the International Space Station. Sci Rep. 2017;7:18022.
6. Johnson SS, Zaikova E, Goerlitz DS, Bai Y, Tighe SW. Real-time DNA sequencing in the Antarctic dry valleys using the Oxford Nanopore sequencer. J Biomol Tech. 2017;28(1):2–7.
7. Pomerantz A, Peñafiel N, Arteaga A, Bustamante L, Pichardo F, Coloma LA, Barrio-Amoros CL, Salazar-Valenzuela D, Prost S. Real-time DNA barcoding in a remote rainforest using nanopore sequencing. Gigascience. 2018;7(4):giy033.
8. Menegon M, Cantaloni C, Rodriguez-Prieto A, Centomo C, Abdelfattah A, Rossato M, Bernardi M, Xumerle L, Loader S, Delledonne M. On site DNA barcoding by nanopore sequencing. PLoS One. 2017;12:e0184741.
9. Zaaijer S, Columbia University Ubiquitous Genomics 2015 class, Erlich Y. Using mobile sequencers in an academic classroom. Elife. 2016;5:e14258.
10. Almugbel R, Hung LH, Hu J, Almutairy A, Ortogero N, Tamta Y, Yeung KY. Reproducible Bioconductor workflows using browser-based interactive notebooks and containers. J Am Med Inform Assoc. 2018;25:4–12.
11. Grüning BA, Rasche E, Rebolledo-Jaramillo B, Eberhard C, Houwaart T, Chilton J, Coraor N, Backofen R, Taylor J, Nekrutenko A. Jupyter and Galaxy: easing entry barriers into complex data analyses for biomedical researchers. PLoS Comput Biol. 2017;13:e1005425.
12. Boettiger C. An introduction to Docker for reproducible research. Oper Syst Rev. 2015;49:71–9.
13. Di Tommaso P, Palumbo E, Chatzou M, Prieto P, Heuer ML, Notredame C. The impact of Docker containers on the performance of genomic pipelines. PeerJ. 2015;3:e1273.
14. Wick RR, Judd LM, Gorrie CL, Holt KE. Completing bacterial genome assemblies with multiplex MinION sequencing. Microb Genom. 2017;3:e000132.
15. Cook DE, Valle-Inclan JE, Pajoro A, Rovenich H, Thomma B, Faino L. Long-read annotation: automated eukaryotic genome annotation based on long-read cDNA sequencing. Plant Physiol. 2019;179:38–54.
16. Sedlazeck FJ, Rescheneder P, Smolka M, Fang H, Nattestad M, von Haeseler A, Schatz MC. Accurate detection of complex structural variations using single-molecule sequencing. Nat Methods. 2018;15:461–8.
17. Stoiber MH, Quick JF, Egan R, Lee JE, Celniker SE, Neely RK, Loman NJ, Pennacchio LA, Brown JO. De novo identification of DNA modifications enabled by genome-guided Nanopore signal processing. bioRxiv. https://doi.org/10.1101/094672.
18. Garalde DR, Snell EA, Jachimowicz D, Sipos B, Lloyd JH, Bruce M, Pantic N, Admassu T, James P, Warland A, Jordan M, Ciccone J, Serra S, Keenan J, Martin S, McNeill L, Wallace EJ, Jayasinghe L, Wright C, Blasco J, Young S, Brocklebank D, Juul S, Clarke J, Heron AJ, Turner DJ. Highly parallel direct RNA sequencing on an array of nanopores. Nat Methods. 2018;15:201–6.