Scheubert et al. Journal of Cheminformatics 2013, 5:12
http://www.jcheminf.com/content/5/1/12

REVIEW Open Access

Computational mass spectrometry for small molecules

Kerstin Scheubert1*, Franziska Hufsky1,2 and Sebastian Böcker1

*Correspondence: kerstin.scheubert@uni-jena.de
1 Chair of Bioinformatics, Friedrich Schiller University, Ernst-Abbe-Platz 2, Jena, Germany
Full list of author information is available at the end of the article

© 2013 Scheubert et al.; licensee Chemistry Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

The identification of small molecules from mass spectrometry (MS) data remains a major challenge in the interpretation of MS data.
This review covers the computational aspects of identifying small molecules, from the identification of a compound by searching a reference spectral library, to the structural elucidation of unknowns.
In detail, we describe the basic principles and pitfalls of searching mass spectral reference libraries.
Determining the molecular formula of the compound can serve as a basis for subsequent structural elucidation; consequently, we cover different methods for molecular formula identification, focussing on isotope pattern analysis.
We then discuss automated methods to deal with mass spectra of compounds that are not present in spectral libraries, and provide an insight into de novo analysis of fragmentation spectra using fragmentation trees.
In addition, this review briefly covers the reconstruction of metabolic networks using MS data.
Finally, we list available software for different steps of the analysis pipeline.

Keywords: Mass spectrometry, Metabolomics, Spectral library, Molecular formula identification, Structure elucidation, Fragmentation trees, Networks

Introduction

Mass spectrometry (MS) is a key analytical technology for detecting and identifying small biomolecules such as metabolites [1-3].
It is orders of magnitude more sensitive than nuclear magnetic resonance (NMR).
Several analytical techniques have been developed, most notably gas chromatography MS (GC-MS) and liquid chromatography MS (LC-MS).
Both analytical setups have their advantages and disadvantages, see Section "Experimental setups" for details.
In recent years, it has been recognized that one of the most important aspects of small molecule MS is the automated processing of the resulting data.
In this review, we will cover the development of computational methods for small molecule mass spectrometry during the last decades.
Here, the term "small molecule" refers to all small biomolecules excluding peptides.
Obviously, our review cannot be complete: In particular, we will not cover the "early years" of computational mass spectrometry of small molecules.
The first rule-based approaches for predicting fragmentation patterns, as well as explaining experimental mass spectra with the help of a molecular structure, were developed as part of the DENDRAL project that started back in 1965 [4-7]; see also Chapter 7 of [8].
Citing Gasteiger et al. [9]: "However, it is sad to say that, in the end, the DENDRAL project failed in its major objective of automatic structure elucidation by mass spectral data, and research was discontinued."
We will not cover methods that deal with processing the raw data, such as de-noising and peak picking, as this is beyond the scope of our review; see Section "Software packages" for a list of available software packages for this task.
Furthermore, we do not cover the problem of aligning two or more LC-MS or GC-MS runs [10-13].
Finally, we will not cover computational methods that deal with the chromatography part of the analysis, such as predicting retention indices [14,15].
Structure confirmation of an unknown organic compound is always performed with a set of independent methods, in particular NMR.
The term "structure elucidation" usually refers to full de novo structure identification of a compound, including stereochemical assignments.
It is commonly believed that structure elucidation is impossible using MS techniques alone, at least without using strong background information.
We will not cover this aspect, but concentrate on the information that MS experiments can give.
"Computational mass spectrometry" deals with the development of computational methods for the automated analysis of MS data.
Over the last two decades, much research has been focused on methods for analyzing proteomics MS data, with literally hundreds of articles being published in scientific journals [16-21].
The proteomics field has benefited tremendously from this development; often only the use of these automated methods enables high-throughput proteomics experiments.
Computational methods for the analysis of proteins and peptides, as well as DNA and RNA [22,23], glycans [24-26], or synthetic polymers [27,28] are also part of computational mass spectrometry, but outside the scope of this review.
Finally, disclosing methods is important for reproducible science.
Thus, we will also not cover "anecdotal" computational MS where an automated method is mentioned in a paper, but no details of the method are provided.
Review of reviews

Existing reviews on computational MS for small molecules usually focus on a much narrower area of the field, such as raw data processing [29], metabolomics databases and laboratory information management systems [30], or metabolite identification through reference libraries [31].
Other reviews simply list available tools for processing the data without discussing the individual approaches [32].
A broad overview of experimental as well as theoretical structure elucidation techniques for small molecules using mass spectrometry is given in [33].
Methods specific for qualitative and quantitative metabolomics using LC-MS/MS are covered in [34].
Methods specific for metabolite profiling by GC-MS are covered in [35].
An overview of isotope pattern simulation is given in [36].
Annotation and identification of small molecules from fragmentation spectra using database search as well as de novo interpretation techniques is covered in [37].
For a general introduction to metabolomics and metabolomic profiling see [2,3,38]; for recent work in the field see [39].
Experimental setups

Analysis of small molecules by GC-MS is usually performed using Electron Ionization (EI).
Historically, EI is the oldest ionization technique for small-molecule investigations.
Because the ionization energy is kept constant at 70 eV, the resulting fragment-rich mass spectra are, in general, consistent across instruments, and specific for each compound.
A major disadvantage of mass spectra obtained under EI conditions is the low-abundance or missing molecular ion peak; as a consequence, the mass of the compound is often unknown.
GC-MS requires that an analyte is volatile and thermally stable.
For non-volatile analytes such as polar compounds, chemical derivatization has to be performed.
Recently, LC-MS has been increasingly used for the analysis of small molecules.
Here, compounds are fragmented using tandem MS, for example by Collision Induced Dissociation (CID).
This has the advantage that the mass of all molecular ions is known, which is particularly beneficial for the de novo approaches discussed below.
Unfortunately, tandem mass spectra are not as reproducible as EI spectra, in particular across different instruments or even instrument types [40].
Furthermore, using different collision energies can make tandem mass spectra hard to compare.
Comparing spectra from different instrument types, only 64–89% of the spectra pairs match with more than 60% identity, depending on the instrument pair [41].
Finally, tandem mass spectra usually contain far fewer fragments than EI fragmentation spectra.
Chemical derivatization can dramatically increase the sensitivity and specificity of LC-MS for less polar compounds [42].
Several methods have been proposed to create more reproducible and informative tandem MS spectra.
For example, to increase the number of fragments, tandem MS spectra are often recorded at more than one fragmentation energy.
Alternatively, "CID voltage ramping" continuously increases the fragmentation energy during a single acquisition [43].
Also, some progress has been made to normalize fragmentation energies across instruments and instrument types [40,44,45].
Besides the two "standard" experimental setups described above, many other setups have been developed: This includes "alternative" ionization techniques such as Matrix-Assisted Laser Desorption/Ionization [46], Atmospheric Pressure Chemical Ionization [47], Atmospheric Pressure Photoionization [48], and Desorption Electrospray Ionization [49].
Also several chromatographic methods such as High Performance LC [50] and Ultra High Performance LC (UHPLC) [51] have been developed.
In particular, a sensitive capillary UHPLC shows good results in lipid identification [52].
Covering the details of these modified setups is far beyond the scope of this review.
From the computational side, we can usually classify these modified setups with regard to the two "standard" setups: For example, is the mass of the molecular ion known (LC-MS/MS) or unknown (GC-EI-MS)? Is the fragmentation spectrum rich (GC-EI-MS) or sparse (LC-MS)? What is the mass accuracy of the measurement (see below)?
Given that new MS technologies and experimental setups are constantly being developed, we see it as a prerequisite for a "good" method from computational MS that it is not targeted at one particular experimental setup.
Note, though, that the effort required for adapting a method can differ significantly: For example, methods for identifying molecular formulas from isotope patterns (see Section "Molecular formula identification") can be applied to any experimental setup where isotope patterns are recorded.
In contrast, rule-based prediction of fragmentation spectra (see Section "In silico fragmentation spectrum prediction") requires expert-curated "learning" of fragmentation rules.
Many methods for the computational analysis of small molecule MS that go beyond the straightforward library search require that masses in the mass spectra are measured with an appropriate mass accuracy.
It appears that this mass accuracy is much more important for the computational analysis than the often-reported resolving power of MS instruments.
Historically, GC-MS is often performed on instruments with relatively poor mass accuracy (worse than 100 ppm, parts per million).
In contrast, LC-MS and tandem MS are often performed on instrumental platforms (such as Orbitrap or orthogonal Quadrupole Time-of-Flight MS) that result in a much better mass accuracy, often 10 ppm or better.
This refers to the mass accuracy that we can expect in everyday use of the instrument, not to the "anecdotal mass accuracy" of a single measurement [53].
It must be understood, though, that this is not a fundamental problem of GC-MS; in fact, GC-MS measurements of high mass accuracy are increasingly reported in the literature [54-56].
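Since mass accuracy in ppm recurs throughout the following sections, a minimal sketch of how a relative mass deviation and the corresponding search window can be computed may be helpful. The function names `ppm_error` and `mass_window` are illustrative and not taken from any of the cited tools.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass deviation in parts per million (ppm)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def mass_window(mz: float, ppm: float) -> tuple:
    """Lower and upper m/z bound for a symmetric tolerance given in ppm."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


if __name__ == "__main__":
    # A deviation of 1 mDa at m/z 100 corresponds to about 10 ppm.
    print(ppm_error(100.001, 100.0))
    # At 100 ppm the window around m/z 200 is 20 mDa wide on each side.
    print(mass_window(200.0, 100.0))
```

Note that the same ppm value translates into a wider absolute window at higher masses, which is one reason why candidate lists grow with compound mass.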
Reporting standards for metabolomics analysis

For metabolomics to mature, the lack of standards for presenting and exchanging data needs to be addressed.
MIAMET (Minimum Information About a METabolomics experiment) [57] suggests reporting standards regarding experimental design, sample preparation, metabolic profiling design and measurements.
ArMet [58] is a data model that allows a formal description of the full experimental context.
The Metabolomics Standards Initiative (MSI) [59] develops guidelines and standards for sharing high-quality, structured data following the work of the proteomics community.
The Data Analysis Working Group (DAWG) [60] as part of the MSI proposed reporting standards for metabolomics studies that include a reporting vocabulary and will help in reproducing these studies and drawing conclusions from the resulting data.
The Chemical Analysis Working Group (CAWG) established confidence levels for the identification of non-novel chemical compounds [61], ranging from level 1 for a rigorous identification based on independent measurements of authentic standards, to unidentified signals at level 4.
The NIH Metabolomics Fund recently supported an initiative to create a repository that enforces the submission of metadata.
Data storage and spectral libraries

To allow data-driven development of algorithms for small molecule identification, mass spectrometric reference datasets must be made publicly available via reference databases.
Examples of such databases include MassBank [62,63], METLIN [64,65], Madison Metabolomics Consortium Database (MMCD) [1], Golm Metabolome Database (GMD) [66], the Platform for RIKEN Metabolomics (PRiMe) [67], or MeltDB [68].
Unfortunately, making experimental data available is much less common in the metabolomics and small-molecule research community than it is in proteomics or genomics.
For example, several of the above-mentioned databases do not allow for the batch download of the database.
Citing [69], "to make full use of research data, the bioscience community needs to adopt technologies and reward mechanisms that support interoperability and promote the growth of an open 'data commoning' culture."
Possibly, the MetaboLights database that is part of the ISA (Investigation, Study, Assay) commons framework can fill this gap.
Note that the PubChem database allows free access to more than 35 million molecular structures, and this includes batch download of the data.
Besides the open (or partly open) libraries mentioned above, there exist two important commercial libraries: The National Institute of Standards and Technology (NIST) mass spectral library (version 11) contains EI spectra of more than 200 000 compounds; the Wiley Registry (9th edition) contains EI spectra of almost 600 000 unique compounds.
For comparison, the GMD [66] contains EI fragmentation mass spectra of about 1 600 compounds; and the FiehnLib library contains EI spectra for more than 1 000 metabolites [70].
The size of tandem MS libraries is still small compared to EI libraries (see Figure 1).
The NIST 11 contains collision cell spectra for about 4 000 compounds.
The Wiley Registry of Tandem Mass Spectral Data [71,72] comprises positive and negative mode spectra of more than 1 200 compounds.
As for EI spectra, both databases are commercially available.
As even the commercial libraries are small, there have been several attempts to make tandem mass spectra publicly available.
METLIN [64] contains high resolution tandem mass spectra for more than 10 000 metabolites for diagnostics and pharmaceutical biomarker discovery and allows users to build a personalized metabolite database from its content [73].
MassBank [62,63] is a public repository with more than 30 000 spectra of about 4 000 compounds collected from different consortium members.
The MMCD [1] is a hub for NMR and MS spectral data containing about 2 000 mass spectra from the literature collected under defined conditions.
Some databases address specific research interests.
The Human Metabolome DB [74,75] comprises reference MS-MS spectra for more than 2 500 metabolites found in the human body.
The Platform for RIKEN Metabolomics (PRiMe) [67,76] collects MSn spectra for research on plant metabolomics.

Figure 1 Number of EI spectra (top) and tandem mass spectra (bottom) in NIST and Wiley Registry from 2000 until 2011.
Searching spectral libraries

The usual approach for identification of a metabolite is looking it up in a spectral library.
Database search requires a similarity or distance function for spectrum matching.
The most fundamental scorings are the "peak count" family of measures that basically count the number of matching peaks.
A slightly more complex variant is taking the dot product of the two spectra, taking into account peak intensities.
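The two scoring families just described can be sketched in a few lines. The following is a minimal illustration, not the scoring of any particular library search program: spectra are assumed to be lists of `(m/z, intensity)` pairs, peaks are paired greedily within an absolute tolerance `tol` (an illustrative parameter), and the dot product is normalized to a cosine.

```python
import math


def match_peaks(query, reference, tol=0.01):
    """Pair each query peak with the closest unused reference peak within tol."""
    pairs, used = [], set()
    for q_mz, q_int in query:
        best = None
        for j, (r_mz, _) in enumerate(reference):
            if j in used or abs(q_mz - r_mz) > tol:
                continue
            if best is None or abs(q_mz - r_mz) < abs(q_mz - reference[best][0]):
                best = j
        if best is not None:
            used.add(best)
            pairs.append((q_int, reference[best][1]))
    return pairs


def peak_count(query, reference, tol=0.01):
    """Number of peaks shared by both spectra."""
    return len(match_peaks(query, reference, tol))


def cosine_score(query, reference, tol=0.01):
    """Dot product of matched intensities, normalized by both spectrum norms."""
    dot = sum(q * r for q, r in match_peaks(query, reference, tol))
    norm_q = math.sqrt(sum(i * i for _, i in query))
    norm_r = math.sqrt(sum(i * i for _, i in reference))
    if norm_q == 0 or norm_r == 0:
        return 0.0
    return dot / (norm_q * norm_r)


if __name__ == "__main__":
    a = [(100.0, 1.0), (150.0, 2.0)]
    b = [(100.005, 3.0), (200.0, 1.0)]
    print(peak_count(a, b), cosine_score(a, b))
```

Greedy pairing is the simplest choice; an optimal assignment of peaks would be needed if several peaks fall within the tolerance of each other.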
Establishing confidence is the more difficult part of compound identification using library search [31].
False negative identifications occur if the spectrum of the query compound differs from the spectrum in the library, for example due to contaminations, noise (especially in low signal spectra), or different collision energies (CID).
A reliable identification of a compound depends on the uniqueness of its spectrum, but the presence and intensity of peaks across spectra is highly correlated, as these depend on the non-random distribution of molecular (sub-)structures.
Therefore, structurally related compounds generally have similar mass spectra.
Hence, false positive hits may hint at correct "class identifications", see Section "Searching for similar compounds" below.
Unlike in proteomics, False Discovery Rates (FDR) cannot be estimated as no appropriate decoy databases can be constructed.
Usually, confidence in search results must be manually assessed by the user, based on the search algorithm used and the quality of spectrum and library [77].
Another method that overcomes this limitation is the calculation of fragmentation trees from fragmentation spectra, see Section "Fragmentation trees" below.
For a review on using spectral libraries for compound identification, see [31].
Electron ionization fragmentation spectra

To compare EI mass spectra, a huge number of scorings (or similarity measures) have been developed over the years.
In 1971, the Hertz similarity index was introduced [78], representing the weighted average ratio of the two spectra.
The Probability Based Matching (PBM) [79,80] takes into account that some peaks are more informative than others.
Atwater et al. [81] statistically evaluated the effects of several parameters on the PBM system, to provide a quantitative measure of the predicted reliability of the match.
SISCOM [82] encodes spectra by selecting the most informative peaks within homologous ion series.
Computing the dot product cosine of two mass spectra (that is, the dot product of the spectra after normalizing each to unit length; its inverse cosine gives the angle between them) was used in the INCOS data system [83].
Stein and Scott [84] evaluated normalized Euclidean distances [85], PBM, Hertz similarity index, and dot product for searching EI databases.
Among these, they found the dot product to perform best.
They proposed a composite search algorithm that optimizes the cosine score by varying the scaling and mass weighting of the peak intensities.
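To make the idea of scaling and mass weighting concrete, the sketch below raises each intensity to a power and multiplies it by a power of the peak mass before computing the cosine. This is only an illustration of the general principle: the exponent values are arbitrary placeholders and are not the optimized parameters reported in [84], and the remaining components of the composite algorithm are not reproduced. Spectra are assumed to be dictionaries mapping nominal m/z to intensity.

```python
import math


def weighted_cosine(query, reference, intensity_power=0.5, mass_power=1.0):
    """Cosine score after re-weighting each peak as intensity**a * mz**b.

    intensity_power and mass_power are illustrative placeholders that would
    normally be tuned on a reference library.
    """
    def weigh(spectrum):
        return {mz: (inten ** intensity_power) * (mz ** mass_power)
                for mz, inten in spectrum.items()}

    q, r = weigh(query), weigh(reference)
    # Only peaks present in both spectra contribute to the dot product.
    dot = sum(q[mz] * r[mz] for mz in q.keys() & r.keys())
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in r.values())))
    return dot / norm if norm > 0 else 0.0


if __name__ == "__main__":
    spectrum_a = {43: 100.0, 58: 30.0, 71: 5.0}
    spectrum_b = {43: 80.0, 58: 45.0}
    print(weighted_cosine(spectrum_a, spectrum_b))
```

With `intensity_power=1.0` and `mass_power=0.0` the function reduces to the plain cosine score; a positive mass exponent gives more weight to the rarer, and typically more informative, high-mass peaks.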
Koo et al. [86] introduced novel composite similarity measures that integrate wavelet and Fourier transform coefficients, but found only a slight improvement over cosine correlation or the composite similarity measure.
Kim et al. [87] showed how to find optimal weight factors for fragment masses using a reference library.
Regarding the differentiation between true and bogus hits in the database, not much progress has been made: Probabilistic indicators of correct identifications using "match factors" were introduced in [88].
Jeong et al. [89] used an empirical Bayes model to improve the accuracy of identifications and gave a false positive estimate.
For this purpose, a competition score was added to the similarity score, based on the similarity score to other spectra in the library.
Tandem mass spectra

We noted above that LC-MS/MS is much less reproducible than fragmentation by GC-MS (see Figure 2).
Reliable library identifications can be achieved when a spectrum is acquired under the same conditions as the reference spectrum [90].
For each compound, libraries must contain tandem mass spectra at different collision energies and replicates on different instruments, to allow for an effective identification [91].
For example, Oberacher and coworkers [71,72,92] presented an inter-instrument and inter-laboratory tandem mass spectral reference library obtained using multiple fragmentation energy settings.
For searching in tandem mass spectral libraries it is possible to start with a precursor ion mass filtering with a specific m/z or mDa range.
In case the actual compound is not in the database, it can be beneficial to omit this filtering step.
This may reveal valuable information about structurally similar compounds [92].
Subsequently, similar approaches as for EI mass spectra can be applied, such as PBM [79,80] or dot product cosine [84,93].
Again, intensities can be weighted using peak masses [62,63].
The scoring in [92] extends the common peak count.
Zhou et al. [94] proposed a support vector machine (SVM)-based spectral matching algorithm to combine multiple similarity measures.
Hansen and Smedsgaard [95] used the Jeffrey-Matusita distance [96] to find a unique correspondence between the peaks in the two spectra.
X-Rank replaces peak intensities by their rank, then estimates the probability that a peak in the query spectrum matches a peak in the reference spectrum based on these ranks [97].
Oberacher et al. [71,72] tackled the problem of low reproducibility of metabolite CID fragmentation using a dynamic intensity cut-off, counting neutral losses, and optimizing the scoring formula.
To improve running times, the database can be filtered using the most intense peaks and user-defined constraints [98].

Figure 2 Inter-instrument comparability of dixyrazine-specific tandem mass spectra collected on different instrumental platforms. Panels (a) to (g); x-axis: m/z, y-axis: relative signal intensity [%]. Figure provided by Herbert Oberacher, compare to Figure one in Oberacher et al. [71].
Molecular formula identification

One of the most basic, but nevertheless highly important, steps when analyzing an unknown compound is to determine its molecular formula, often referred to as the "elemental composition" of the compound.
Common approaches first compute candidate molecular formulas using a set of potential elements.
The six elements most abundant in metabolites are carbon (C), hydrogen (H), nitrogen (N), oxygen (O), phosphorus (P), and sulfur (S) [99].
For each candidate molecular formula, an isotope pattern is simulated and compared to the measured one, to determine the best matching molecular formula.
For this purpose, high mass accuracy is required and is nowadays available from a multitude of MS platforms.
The molecular formula of the compound can serve as a basis for subsequent structure elucidation.
Some software packages for molecular formula identification using isotope patterns are summarized in Table 1.
Table 1 Software for the three basic steps of molecular formula identification using isotope patterns

Decomposing monoisotopic peaks
  Decomp [100,101]: for arbitrary alphabets of elements; requires only little memory; swift in practice
  SIRIUS [102,103]: implementing Decomp approach for MS; decomposing real-valued masses
  "Seven Golden Rules" [104]: to filter molecular formulas

Simulating isotope patterns
  IsoPro [105]: multinomial expansion to predict "center masses"; memory- and time-consuming
  Mercury [106]: pruning by probability thresholds and/or mass range; reduced memory and time consumption; reduced accuracy of the predictions
  Emass [107] & SIRIUS [102]: iterative (stepwise) computation of isotope pattern; probability-weighted center masses; probabilities and masses are updated as atoms are added
  IsoDalton [108]: models the folding procedure as a Markov process
  BRAIN [109]: Newton-Girard theorem and Viète's formulae to calculate intensities and masses
  Fourier [110]: 2D Fast Fourier Transform that splits up the calculation in a coarse and a fine structure; running time improvement for large compounds

Scoring candidate compounds
  SigmaFit: commercial software by Bruker Daltonics
  SIRIUS [102]: Bayesian statistics for scoring intensities and masses of the isotope pattern
  MZmine [111]: simple scoring based only on intensities

*Recommended tools.
Different from the above, some authors propose to use molecular structure databases to determine the candidate molecular formulas [112].
This "simplifies" the problem as the search space is severely restricted; but only those molecular formulas can be determined where a compound is available in the structure database.
We will therefore ignore this somewhat arbitrary restriction of the search space.
In the following, we assume that elements are unlabeled or only partially labeled.
If certain elements are (almost) completely labeled by heavy isotopes such as 13C, and both the unlabeled and the labeled compound are present, this allows us to directly "read" the number of atoms from the spectrum using the mass difference.
We will come back to this particular type of data in Section "Isotope labeling".
Decomposing monoisotopic peaks

Here, "decomposing a peak" refers to finding all molecular formulas (over the fixed alphabet of elements) that are sufficiently close to the measured peak mass.
Robertson and Hamming [113] and Dromey and Foyster [114] proposed a naïve search tree algorithm for this purpose.
One can show that the running time of this algorithm depends linearly on $m^{k-1}$, where $m$ is the mass of the peak we want to decompose and $k$ is the number of elements [102].
This means that doubling the peak mass we want to decompose will increase the running time of the algorithm 32-fold for the alphabet of elements CHNOPS ($k = 6$, hence $2^{k-1} = 32$).
Hence, running time can easily get prohibitive, in particular if we consider larger alphabets of elements, or have to perform many decompositions.
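A naive search tree of the kind described above can be sketched as follows. This is a minimal illustration written for this text, not the code of [113] or [114]: it enumerates the counts of the first $k-1$ elements and computes the admissible counts of the last element directly, which is where the $m^{k-1}$ behaviour comes from. The monoisotopic masses are rounded values, and the function name `decompose` and the default tolerance are assumptions.

```python
import math

# Rounded monoisotopic masses in Da (illustrative precision).
MONO_MASS = {"C": 12.0, "H": 1.007825, "N": 14.003074,
             "O": 15.994915, "P": 30.973762, "S": 31.972071}


def decompose(mass, ppm=5.0, elements="CHNOPS"):
    """Return all element-count dicts whose mass lies within +/- ppm of mass."""
    tol = mass * ppm / 1e6
    lo, hi = mass - tol, mass + tol
    results = []

    def search(idx, current, counts):
        m = MONO_MASS[elements[idx]]
        if idx == len(elements) - 1:
            # Last element: solve for the admissible counts instead of looping.
            n_min = max(0, math.ceil((lo - current) / m))
            n_max = math.floor((hi - current) / m)
            for n in range(n_min, n_max + 1):
                results.append(dict(zip(elements, counts + [n])))
            return
        n = 0
        while current + n * m <= hi:
            search(idx + 1, current + n * m, counts + [n])
            n += 1

    search(0, 0.0, [])
    return results


if __name__ == "__main__":
    # Neutral monoisotopic mass of C6H12O6 with the rounded masses above.
    for formula in decompose(180.06339, ppm=5.0):
        print(formula)
```

The sketch applies no chemical plausibility checks, so the output typically contains formulas that cannot correspond to real molecules.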
In 1989, Fürst et al. [115] proposed a faster decomposition algorithm which, unfortunately, is limited to the four elements CHNO.
In 2005, Böcker and Lipták [100,101] presented an algorithm that works for arbitrary alphabets of elements, requires only little memory, and is swift in practice.
Initially developed for decomposing integer masses, this algorithm was later adapted to real-valued masses [102,103,116].
Decomposing alone is not sufficient to exclude enough possible molecular formulas in higher mass regions even with very high mass accuracy [117].
Kind and Fiehn [104] proposed "Seven Golden Rules" to filter molecular formulas based on chemical considerations.
However, for larger masses, many molecular formulas pass these rules.
As the monoisotopic mass of a compound is insufficient to determine its molecular formula, we can use the measured isotope pattern of the compound to rank all remaining molecular formula candidates.
Kind and Fiehn [117] estimated that mass spectrometers capable of 3 ppm accuracy and 2% error for isotopic abundances can outperform mass spectrometers with hypothetical mass accuracy of 0.1 ppm that do not include isotopic information.
We therefore now consider the problems of simulating and matching isotope patterns.
Simulating isotope patterns

Due to the limited resolution of most MS instruments, the isotopic variants are not fully separated in the spectra but pooled in mass bins of approximately 1 Da length.
This is called the aggregated isotopic distribution [36], and in the following we will refer to it as "isotope pattern".
Most elements have several naturally occurring isotopes.
Combining elements into a molecular formula also means combining their isotope distributions into an isotope distribution of the entire compound.
Masses of all isotopes are known with very high precision [118,119].
This is, to a much lesser extent and with certain exceptions, also true for the natural abundances of these isotopes on earth [120].
(For example, the abundances of boron isotopes vary strongly.)
We can therefore simulate the theoretical isotope pattern of a molecular formula, and compare the simulated distribution to the measured pattern of a compound.
See Valkenborg et al. [36] for an introduction.
The intensity of a peak in an isotope pattern is the superposition of all isotope variants' abundances that have identical nominal mass (nucleon number) [36].
In the early 1960s, mass accuracy of MS instruments was relatively low.
Thus, the first approaches for simulating isotope patterns ignored the exact mass of the isotope peaks, and concentrated solely on isotope peak intensities, that is, the isotope distribution [121].
In 1991, Kubinyi [122] suggested a very efficient algorithm for this problem, based on convoluting isotope distributions of "hyperatoms".
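The intensity-only simulation can be illustrated by convolving element distributions. The sketch below is written in the spirit of the hyperatom idea, assuming it amounts to building the distribution of n atoms from repeatedly squared blocks (1, 2, 4, 8, ... atoms); it is not a reimplementation of [122]. The abundance values are rounded natural abundances, and `max_peaks` is an illustrative truncation parameter.

```python
# Rounded natural isotope abundances, indexed by extra nucleons (+0, +1, +2).
ABUNDANCE = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "N": [0.99636, 0.00364],
    "O": [0.99757, 0.00038, 0.00205],
}


def convolve(a, b, max_peaks=10):
    """Convolve two distributions, keeping only the first max_peaks entries."""
    out = [0.0] * min(len(a) + len(b) - 1, max_peaks)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < len(out):
                out[i + j] += x * y
    return out


def element_power(dist, n, max_peaks=10):
    """Distribution of n atoms of one element via repeated squaring."""
    result, square = [1.0], dist
    while n > 0:
        if n & 1:
            result = convolve(result, square, max_peaks)
        square = convolve(square, square, max_peaks)
        n >>= 1
    return result


def isotope_distribution(formula, max_peaks=10):
    """Peak intensities by nominal mass offset for a dict such as {'C': 6}."""
    pattern = [1.0]
    for element, count in formula.items():
        block = element_power(ABUNDANCE[element], count, max_peaks)
        pattern = convolve(pattern, block, max_peaks)
    return pattern


if __name__ == "__main__":
    print(isotope_distribution({"C": 6, "H": 12, "O": 6}, max_peaks=4))
```

Truncating to the first `max_peaks` entries does not change the retained values, because each entry only depends on entries with smaller or equal index.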
As instruments with improved mass accuracy became commercially available, focus shifted towards also predicting masses of isotope peaks, named "center masses" by Roussis and Proulx [123].
For this purpose, methods based on polynomial [124] and multinomial expansion [105,125] were developed.
IsoPro is an implementation of [105] by M. W. Senko.
Unfortunately, these expansion approaches are very memory- and time-consuming.
Pruning by probability thresholds or mass range or both was introduced to reduce memory and time consumption; but this comes at the price of reduced accuracy of the predictions [106,126-128].
The approach of [106] was implemented in the software package Mercury.
Starting in 2004, methods that use an iterative (stepwise) computation of isotope patterns were developed [107,116,123].
These algorithms are similar in spirit to the early algorithms for computing peak intensities [121,122].
But for the new algorithms, probabilities and masses of isotope peaks are updated as atoms are added.
This results in probability-weighted center masses.
Two implementations are Emass [107] and SIRIUS [102].
To speed up computations, both approaches combine this with a smart Russian multiplication scheme, similar to Kubinyi [122].
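The stepwise update of probabilities and center masses can be sketched as a folding operation on lists of `(probability, center mass)` pairs, combined with repeated doubling. This is a simplified illustration of the general idea and not the code of Emass or SIRIUS; isotope masses and abundances are rounded, only C, H and O are included, and `max_peaks` is an illustrative truncation parameter.

```python
# (abundance, isotope mass in Da) per nominal mass offset; rounded values.
ISOTOPES = {
    "C": [(0.9893, 12.0), (0.0107, 13.003355)],
    "H": [(0.999885, 1.007825), (0.000115, 2.014102)],
    "O": [(0.99757, 15.994915), (0.00038, 16.999132), (0.00205, 17.999160)],
}


def fold(a, b, max_peaks=8):
    """Fold two patterns; each new peak gets a probability-weighted mass."""
    size = min(len(a) + len(b) - 1, max_peaks)
    prob = [0.0] * size
    weighted_mass = [0.0] * size  # accumulates probability * mass
    for i, (pa, ma) in enumerate(a):
        for j, (pb, mb) in enumerate(b):
            if i + j < size:
                prob[i + j] += pa * pb
                weighted_mass[i + j] += pa * pb * (ma + mb)
    return [(p, wm / p if p > 0 else 0.0) for p, wm in zip(prob, weighted_mass)]


def add_atoms(pattern, element, count, max_peaks=8):
    """Add count atoms of element using doubling of the element pattern."""
    power = ISOTOPES[element]
    while count > 0:
        if count & 1:
            pattern = fold(pattern, power, max_peaks)
        power = fold(power, power, max_peaks)
        count >>= 1
    return pattern


def isotope_pattern(formula, max_peaks=8):
    """List of (probability, center mass) for a dict such as {'C': 6, 'H': 12}."""
    pattern = [(1.0, 0.0)]
    for element, count in formula.items():
        pattern = add_atoms(pattern, element, count, max_peaks)
    return pattern


if __name__ == "__main__":
    for prob, mass in isotope_pattern({"C": 6, "H": 12, "O": 6}, max_peaks=4):
        print(f"{mass:.5f}  {prob:.5f}")
```

Note how the center mass of the +1 peak is a mixture of the 13C, 2H and 17O contributions, weighted by how likely each of them is.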
Later approaches model the folding procedure as a Markov process [108,129,130].
IsoDalton implements the approach of Snider [108].
All approaches have in common that a truncation mechanism must be applied due to the exponential growth of states.
In 2012, Claesen et al. [109] applied the Newton-Girard theorem and Viète's formulae to calculate the intensities and masses of an isotope pattern.
This method is implemented in the software tool BRAIN.
They compared their method against five other software tools: IsoPro, Mercury, Emass, NeutronCluster [131], and IsoDalton.
In this evaluation, BRAIN outperformed all other software tools but Emass in mass accuracy of the isotope peaks.
Running times were comparable for BRAIN, Emass, Mercury, and NeutronCluster, whereas IsoPro and IsoDalton required much higher computation times.
Later, Böcker [132] showed that SIRIUS and BRAIN have practically identical quality of results and running times for simulating isotope patterns.
The currently fastest algorithm was presented by Fernandez-de-Cossio Diaz and Fernandez-de-Cossio [110].
This algorithm improves on earlier work where a 2D Fast Fourier Transform is applied that splits up the calculation in a coarse and a fine structure [133].
Fourier [110] shows a significantly better performance than BRAIN and, hence, Emass and SIRIUS.
It must be noted, though, that this running time improvement is only relevant for large compounds: The smallest compound considered in [109,110,132] has mass above 1 000 Da, and significant running time differences for Fourier are observed only for compounds with mass above 10 kDa.
For compounds of mass above, say, 50 kDa the problem of simulating isotope patterns becomes somewhat meaningless: The abundances of isotope species are known with limited precision, and vary depending on where a sample is taken.
These small deviations in the isotopic distribution of elements cause huge deviations in the aggregated distribution, if the compound is sufficiently large [134].
For the efficient and accurate simulation of isotope patterns of small compounds, it is recommended to use one of the approaches behind Fourier [110], BRAIN [109], Emass [107], or SIRIUS [102].
Scoring candidate compounds by comparing isotope patterns

Decomposing the monoisotopic peak can result in a large number of candidate molecular formulas that match the measured mass within the mass accuracy [117].
We can rank these candidates based on evaluating their simulated isotope patterns.
For each candidate molecular formula, the isotope distribution is simulated and compared with the measured one.
The best matching formula is considered to be the correct molecular formula of the compound.
See Figure 3.
Initially, mass spectrometers were limited in mass accuracy and resolution.
Therefore, the first attempts at scoring isotope patterns only considered the intensity of the isotopic peaks but not their masses.
Kind and Fiehn [117] calculated a root mean square error for the differences between measured and theoretical isotopic intensities.
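An intensity-only scoring of this kind can be sketched as follows. The normalization to unit sum and the truncation to the shorter pattern are assumptions made for this illustration; the exact procedure of [117] is not reproduced here, and the function names are hypothetical.

```python
import math


def normalize(intensities):
    """Scale intensities so that they sum to one."""
    total = sum(intensities)
    return [i / total for i in intensities]


def isotope_rmse(measured, theoretical):
    """Root mean square error between two normalized isotope patterns."""
    n = min(len(measured), len(theoretical))
    m = normalize(measured[:n])
    t = normalize(theoretical[:n])
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(m, t)) / n)


def rank_candidates(measured, candidates):
    """Sort candidate names by increasing RMSE against the measured pattern."""
    return sorted(candidates, key=lambda name: isotope_rmse(measured, candidates[name]))


if __name__ == "__main__":
    measured = [100.0, 6.5, 0.4]
    # Hypothetical simulated patterns for two candidate formulas.
    candidates = {"candidate_A": [1.0, 0.065, 0.004],
                  "candidate_B": [1.0, 0.2, 0.05]}
    print(rank_candidates(measured, candidates))
```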
Stoll et al. [135] filtered candidates using double-bond equivalents and number of valences, then ranked candidates based on correlating the isotope distributions [136].
Commercial software for the same purpose was also provided by instrument vendors, such as SigmaFit by Bruker Daltonics.
Tal-Aviv [137] targets GC-MS EI data using a supersonic molecular beam, which results in highly abundant molecular ions.
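The double-bond equivalent filter mentioned above for [135] rests on a standard formula, which the following sketch evaluates for formulas over C, H, N, O and S. It assumes the usual valences (C 4, N 3, O and S 2, H 1) and neutral molecules; the plausibility criterion in `is_plausible` is a common convention and not necessarily the exact rule used in [135].

```python
def double_bond_equivalents(formula):
    """Rings plus double bonds: DBE = C - H/2 + N/2 + 1 (O and S do not contribute)."""
    c = formula.get("C", 0)
    h = formula.get("H", 0)
    n = formula.get("N", 0)
    return c - h / 2 + n / 2 + 1


def is_plausible(formula):
    """For a neutral molecule the DBE is expected to be a non-negative integer."""
    dbe = double_bond_equivalents(formula)
    return dbe >= 0 and dbe == int(dbe)


if __name__ == "__main__":
    print(double_bond_equivalents({"C": 6, "H": 6}))            # benzene
    print(double_bond_equivalents({"C": 6, "H": 12, "O": 6}))   # glucose
    print(is_plausible({"C": 6, "H": 15}))
```

For ions measured in the mass spectrometer, half-integer values occur as well, so the integer check would have to be adapted to the ion type.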
Böcker et al. [102] introduced SIRIUS, first suggested in [116].
Here, both the intensities and masses of the isotope pattern are used to score candidate molecular formulas using Bayesian statistics: The authors estimate the likelihood of a particular molecular formula to produce the observed data.
For a dataset of 86 compounds measured on an oa-TOF MS instrument, the correct formula was identified in more than 91% of the cases.
Ipsen et al. [138] developed a method to determine confidence regions for isotope patterns, tailored towards TOF MS data.
They exploit the fact that the rate of ion arrivals at the detector plate is governed by the Poisson distribution.
A test on three compounds showed that the method rejects about 70% of the candidate formulas (for pooled data) but keeps the true formula, at the 5% significance level.

Figure 3 Metabolite identification pipeline based on elemental composition calculation, isotope pattern scoring and subsequent database queries. Figure redrawn from Kind and Fiehn [117].
Isotope labeling

Labeling compounds with isotope-enriched elements such as 13C or 15N helps to identify the correct molecular formula.
The shift in the mass spectrum between the unlabeled compound and the labeled compound indicates the number of atoms of the labeled element in the compound.
Once the number of atoms for the labeled elements is known, the number of possible molecular formulas is significantly reduced.
Rodgers et al. [139] showed that enrichment with 99% 13C isotopes reduces the number of possible molecular formulas for an 851 Da phospholipid from 394 to one.
Hegeman et al. [140] used isotopic labeling for metabolite identification.
They improved the discriminating power by labeling with 13C and 15N isotopes.
Giavalisco et al. [141] additionally labeled compounds with 34S isotopes.
By this, the number of carbon, nitrogen as well as sulfur atoms can be determined upfront, and the number of potential molecular formulas that we have to consider is reduced considerably.
Baran et al. [142] applied this approach to untargeted metabolite profiling and showed its potential to uniquely identify molecular formulas.
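Reading the number of labeled atoms from the mass shift is a one-line calculation, sketched below under the assumption of (almost) complete labeling and a known charge state. The constants are rounded isotope mass differences; the function name is illustrative and not taken from any of the cited studies.

```python
# Mass difference between heavy and light isotope in Da (rounded).
C13_SHIFT = 13.003355 - 12.0
N15_SHIFT = 15.000109 - 14.003074


def atom_count_from_shift(unlabeled_mz, labeled_mz, shift_per_atom, charge=1):
    """Estimate the number of labeled atoms from the observed m/z shift."""
    delta_mass = (labeled_mz - unlabeled_mz) * abs(charge)
    return round(delta_mass / shift_per_atom)


if __name__ == "__main__":
    # Hypothetical singly charged ion whose fully 13C-labeled form is 6.0201 higher.
    print(atom_count_from_shift(181.0707, 187.0908, C13_SHIFT))
```

The resulting count can then be used as a fixed constraint when decomposing the monoisotopic mass, which removes one dimension from the search.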
Other approaches for molecular formula identification

Tandem or multiple-stage MS can give additional information about the molecular formula of the intact compound: We can exclude any candidate molecular formula of the compound if, for one of the fragment (product ion) peaks, we cannot find a sub-formula that explains this peak [143-146].
Unfortunately, such approaches are susceptible to noisy data.
For this reason, Konishi and coworkers [143,144] suggested using only product ions below a certain threshold, e.g., 200 Da, that have a unique decomposition.
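The exclusion principle described above can be sketched as a brute-force check: a candidate formula is kept only if every fragment mass is matched by some sub-formula of it. This is a simplified illustration that treats fragment masses as neutral masses (ignoring charge carriers and the electron mass) and uses an absolute tolerance `tol`; it does not reproduce the methods of [143-146], and the function names are hypothetical.

```python
from itertools import product

# Rounded monoisotopic masses in Da.
MONO_MASS = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}


def has_subformula(candidate, fragment_mass, tol=0.002):
    """True if some sub-formula of candidate matches fragment_mass within tol."""
    elements = list(candidate)
    ranges = [range(candidate[el] + 1) for el in elements]
    for counts in product(*ranges):
        mass = sum(n * MONO_MASS[el] for el, n in zip(elements, counts))
        if abs(mass - fragment_mass) <= tol:
            return True
    return False


def filter_candidates(candidates, fragment_masses, tol=0.002):
    """Keep candidates for which every fragment mass has an explanation."""
    return [c for c in candidates
            if all(has_subformula(c, m, tol) for m in fragment_masses)]


if __name__ == "__main__":
    candidates = [{"C": 2, "H": 6, "O": 1}, {"C": 1, "H": 2, "O": 2}]
    # A fragment of mass 29.039125 (C2H5) is only compatible with the first candidate.
    print(filter_candidates(candidates, [29.039125]))
```

As noted in the text, a single noise peak without any valid sub-formula would wrongly eliminate the true candidate, which motivates restricting the check to selected peaks.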
Pluskal et al. [111] combined matching isotope patterns with filtering based on the molecular formulas of product ions.
For 79% of the 48 compounds considered, they identified the correct molecular formula.
There exist commercial tools that follow the same line of thought: For example, SmartFormula3D [146] (commercial, Bruker Daltonics) appears to implement a similar approach.
Pluskal et al. [111] also evaluated their new, simple scoring of isotope patterns against SIRIUS [102], and reported that it performs better.
Fragmentation trees, which were initially introduced to compute molecular formulas [147], generalize this concept.
For each potential molecular formula of the intact compound, a fragmentation tree and its score are computed.
Potential molecular formulas of the compound are then sorted with respect to this score.
Rasche et al. [148] combined this with isotope pattern analysis [102], and for the 79 considered compounds measured on two instruments, they could identify the correct molecular formula in all cases.
For more details on fragmentation trees, see Section "Fragmentation trees" below.
All of the above approaches assume that only the monoisotopic peak is selected for dissociation. Selecting a non-monoisotopic peak can reveal valuable information about the molecular formulas of the product ions. Singleton et al [149] developed an approach to predict the expected isotope pattern for tandem mass spectra for precursor ions that contain only one element with one heavy isotope. Rockwood et al [150] generalized this and developed an algorithm that can be applied to arbitrary precursor ions. It is based on the convolution of isotope distributions of the product ion and the loss. Again, comparing theoretical and experimental isotope patterns sheds light on the correct product ion formula. Ramaley and Herrera [151] modified the algorithm from [149] to apply it to arbitrary precursor ions; results are comparable to [150].
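The role of the convolution can be made concrete with a small Python sketch. It is a strong simplification and not the published algorithm of [150]: isotope distributions are given at nominal mass resolution, only carbon is considered, and all function names are our own. The idea is that if the precursor isotopologue with k additional neutrons is isolated, the product ion carries i of them exactly when the neutral loss carries the remaining k - i.

```python
# Natural abundances of 12C and 13C.
CARBON = [0.9893, 0.0107]


def convolve(p, q):
    """Discrete convolution of two isotope distributions."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def element_pattern(abundances, n):
    """Isotope distribution of n atoms of one element (n-fold convolution)."""
    pattern = [1.0]
    for _ in range(n):
        pattern = convolve(pattern, abundances)
    return pattern


def product_ion_pattern(product, loss, k):
    """Normalized product ion pattern when precursor isotopologue M+k is isolated.

    `product` and `loss` are the isotope distributions of the product ion
    and of the neutral loss; entry i of the result is the probability that
    the product ion carries i of the k additional neutrons.
    """
    raw = []
    for i in range(k + 1):
        j = k - i
        if i < len(product) and j < len(loss):
            raw.append(product[i] * loss[j])
        else:
            raw.append(0.0)
    total = sum(raw)
    return [x / total for x in raw]


# Hypothetical example: product ion with 6 carbons, loss with 3 carbons,
# isolation of the M+1 peak of the precursor.
product_dist = element_pattern(CARBON, 6)
loss_dist = element_pattern(CARBON, 3)
pattern = product_ion_pattern(product_dist, loss_dist, 1)
```

In this example the single 13C atom sits in the product ion with probability 6/9, so the M+1 to M+0 ratio of the product ion directly reflects how the carbon atoms are split between product ion and loss.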
Rogers et al [152] used information on potential metabolic pathways to identify the correct molecular formula. If there is a putative chemical transformation between two molecular formulas, these formulas get a better score than other explanations of the peak. This not only improves molecular formula identification, but can potentially be used to reconstruct biochemical networks. See Section "Network reconstruction" for details.
Identifying the unknowns
To yield information beyond the compound mass and molecular formula, the analyte is usually fragmented, and fragmentation mass spectra are recorded. Using spectral comparison, one can identify huge numbers of metabolites that are cataloged in libraries. However, where the compound is unknown, comparing the spectrum obtained to a spectral library will result in imprecise or incorrect hits, or no hits at all [33,35,99]. The limited capability for metabolite identification has been named one of the major difficulties in metabolomics [117]. Manual analysis of unidentified spectra is cumbersome and requires expert knowledge. Therefore, automated methods to deal with mass spectra of unknown unknowns (that is, "unexpected" compounds that are not present in spectral libraries [31]) are required. Some approaches for analyzing fragmentation mass spectra of unknown unknowns are summarized in Table 2.
Table 2 Approaches for analyzing fragmentation mass spectra of unknown unknowns, that is, "unexpected" compounds that are not present in spectral libraries [31]
Approach | Description | Software
Searching for similar compounds | searching for similar spectra in a library, assuming that spectral similarity is based on structural similarity | NIST MS Interpreter [153]
Mass spectral classifiers | predicting substructures or compound classes by learning spectral classifiers | FingerID [169]
In silico fragmentation: rule-based spectrum prediction | predicting spectra by applying fragmentation rules to known molecular structures | MassFrontier, ACD/MS Fragmenter, MOLGEN-MS [196]
In silico fragmentation: combinatorial fragmentation | mapping the fragmentation spectrum to the compound structure to explain the peaks | MetFrag [179]
Fragmentation trees | computing a fragmentation tree that explains the peaks; aligning fragmentation trees to find similar compounds | SIRIUS [147,221]
Searching for similar compounds
In case a database does not contain the sample compound, an obvious approach is to search for similar spectra, assuming that spectral similarity is based on structural similarity of the compounds. Back in 1978, Damen et al [82] already suggested that SISCOM can also be used to detect structural similarities such as common substructures. The NIST MS Interpreter [153] for EI spectra uses a nearest-neighbor approach to generate substructure information. A library search provides a list of similar spectra. Structural features of the unknown compound, such as aromatic rings or carbonyl groups, are deduced from common structural features of the hits. Demuth et al [154] proposed a similar approach, and evaluated whether spectral similarity is correlated with structural similarity of a compound. Based on this evaluation, they proposed a threshold for spectral similarity that supposedly yields hit lists with significantly similar structures. For multiple MS data, Sheldon et al [155] used precursor ion fingerprints (PIF) and spectral trees for finding similar compounds and utilized previously characterized ion structures for the structural elucidation of the unknown compounds.
Mass spectral classifiers
Another natural approach to deal with mass spectra of compounds that cannot be found in a spectral library is to find patterns in the fragmentation spectra of reference compounds, and to use the detected patterns for the automated interpretation of the unidentified spectrum. Initially, this was accompanied by knowledge about the fragmentation processes; but this applies only for fragmentation by EI, whereas fragmentation by CID is less reproducible and not completely understood [156].
To characterize an unknown compound, we have to come up with "classifiers" that assign the unknown to a certain class: such classes can be based on the presence or absence of certain substructures, or more general structural properties of the compound. As EI fragmentation is already well understood, many mass spectral classifiers have been provided to date. Already in 1969, Venkataraghavan et al [157] presented an automated approach "to identify the general nature of the compound and its functional groups." The Self-Training Interpretive and Retrieval System (STIRS) [158] mixes a rule-based approach with some early machine learning techniques to obtain structural information from related EI spectra. Further, STIRS can predict the nominal molecular mass of an unknown compound, even if the molecular ion peak is missing from the EI spectrum. Scott and coworkers [159-161] proposed an improved method for estimating the nominal molecular mass of a compound. Using pattern recognition, the compound is classified, and class-specific rules are applied to estimate the molecular mass. Structural descriptors (that is, fragments of a certain integral mass) have been used to retrieve compound classes for many decades [162]. The Varmuza feature-based classification approach for EI spectra [163] uses a set of mass spectral classifiers to recognize the presence/absence of 70 substructures and structural properties in the compound. This approach is integrated into MOLGEN-MS and AMDIS. For example, Schymanski et al [164] combined mass spectral classifiers with methods for structure generation (see Section "Molecular isomer generators") to interpret EI spectra, using classifiers from MOLGEN-MS and the NIST05 software. Further MS classifiers for substructures are provided in [165,166]. Hummel et al [167] used structural features to subdivide the Golm Metabolome Database into several classes. They proposed a decision tree-based prediction of the most frequent substructures, based on mass spectral features and retention index information, for classification of unknown metabolites into different compound classes. In 2011, Tsugawa et al [168] used Soft Independent Modeling of Class Analogy (SIMCA) to build multiple class models. However, back in 1996, Varmuza and Werther [163] observed that SIMCA (which is based on Principal Component Analysis) performed worst among all investigated methods.
Whereas all of the above methods are targeted towards GC-MS and EI fragmentation, few methods target LC-MS and CID fragmentation. A novel approach by Heinonen et al [169] predicts molecular properties of the unknown metabolite from the mass spectrum using a support vector machine, then uses these predicted properties for matching against molecular structure databases such as KEGG (Kyoto Encyclopedia of Genes and Genomes) and PubChem (see Figure 4). In this way, we can replace the small spectral libraries by the much larger structure databases. Using QqQ MS data and searching the smaller KEGG database, they could identify the correct molecular structure in about 65% of the cases, from an average of 25 candidates.
Figure 4 Predicting chemical properties (molecule fingerprints) from tandem MS data using a support vector machine (SVM) as done by Heinonen et al [169]. The predicted fingerprints are used to search a molecular structure database for metabolite identification. Figure redrawn from Heinonen et al [169].
Molecular isomer generators
Molecular isomer generators such as MOLGEN [170-172], SMOG [173], and Assemble [174] have helped with the structural elucidation of unknowns for many years [175,176]. Recently, the open source software OMG was introduced [177]. Molecular isomer generators enumerate all molecular structures that are chemically sound, for a given molecular formula or mass. In addition, the space of generated structures can be constrained by the presence or absence of certain substructures, see Section "Mass spectral classifiers". An overview on generating structural formulas is given by Kerber et al [172]. Enumerating all possible isomers allows us to overcome the boundaries of database searching: simply generate all molecular structures corresponding to the parent mass or molecular formula, and use the output of the structure generator as a "private database". Unfortunately, this approach is only valid for relatively small compounds (say, up to 100 Da): for molecular formula C8H6N2O with mass 146 Da there exist 109,240,025 different molecular structures [172].
In silico fragmentation spectrum prediction
In silico fragmentation aims to explain "what you see" in a fragmentation spectrum of a metabolite. Initially, this was targeted at a manual interpretation of fragmentation spectra; but recently, this approach has been increasingly used for an automated analysis [178,179]. Here, searching in spectral libraries is replaced by searching in molecular structure databases. We mentioned above that spectral libraries are (and will be) several orders of magnitude smaller than molecular structure databases: for example, the CAS Registry of the American Chemical Society and PubChem currently contain about 25 million compounds each. We can also use molecular structure generators (see "Molecular isomer generators") to create a "private database". However, whereas structure generators can enumerate millions of structures in a matter of seconds, it is already a hard problem to rank the tens or hundreds of molecular structures found in molecular structure databases for a particular parent mass [178,179].
In silico fragmentation has been successfully applied to compounds with consistent fragmentation patterns, such as lipids [180], oligosaccharides [181], glycans [182], peptides [183-185] or non-cyclic alkanes and alkenes [186]. However, general fragmentation prediction of arbitrary small molecules remains an active field of research, due to the structural diversity of metabolites and the complexity of their fragmentation patterns.
Basically, there are two types of in silico fragmentation methods. Rule-based fragmenters are based on fragmentation rules that were extracted from the MS literature over the years. Combinatorial fragmenters use a bond disconnection approach to dissect a compound into hypothetical fragments.
Rule-based fragmenters
Although much is known about EI fragmentation, it is a hard ionization technique that can result in very complex rearrangements and fragmentation events [187] which are hard to predict. For tandem MS, the fragmentation behavior of small molecules under varying fragmentation energies is not completely understood [156], and has been investigated in many studies to find general fragmentation rules [188,189]. MassFrontier (see below) currently contains the largest fragmentation library, manually curated from several thousand publications [33].
The first rule-based approaches for predicting fragmentation patterns and explaining experimental mass spectra with the help of a molecular structure were developed as part of the DENDRAL project. For example, Gray et al [190] introduced CONGEN that predicts mass spectra of given molecular structures using general models of fragmentation, as well as class-specific fragmentation rules. Intensities for EI spectra were modeled with equations found by multiple linear regression analysis of experimental spectra and molecular descriptors [191].
Gasteiger et al [9] introduced MASSIMO (MAss Spectra SIMulatOr) to automatically derive knowledge about mass spectral reaction types directly from experimental mass spectra. Part of MASSIMO is the Fragmentation and Rearrangement ANalyZer (FRANZ) that requires a set of structure-spectrum pairs as input. The MAss Spectrum SImulation System (MASSIS) [192-194] combines cleavage knowledge (McLafferty rearrangement, retro-Diels-Alder reaction, neutral losses, oxygen migration), functional groups, small fragments (end-point and pseudo end-point fragments) and fragment-intensity relationships for simulating electron ionization spectra.
Unfortunately, these three software packages were neither sufficiently validated nor made publicly available. As a consequence, they were never used or applied by the broad community and should be considered with caution.
MassFrontier (HighChem, Ltd. Bratislava, Slovakia; versions after 5.0 available from Thermo Scientific, Waltham, USA) contains fragmentation reactions collected from mass spectrometry literature. Besides predicting a spectrum from a molecular structure, it can also explain a measured fragmentation spectrum. The ACD/MS Fragmenter (Advanced Chemistry Labs, Toronto, Canada) can only interpret a given fragmentation spectrum using a known molecular structure [195]. Initially, these programs were designed for the prediction and interpretation of fragmentation by EI, but recently, there has been a tendency to interpret tandem MS data with these programs, too. Both programs are commercial, and no algorithmic details have been published. A third commercial tool is MOLGEN-MS [196,197] that uses general mass spectral fragmentation rules but can also accept additional fragmentation mechanisms.
For the interpretation of tandem mass spectra, Hill et al [178] proposed a "rule-based identification pipeline". First, they retrieved candidate molecular structures from PubChem using exact mass. Next, MassFrontier 4 was used to predict the tandem mass spectra of the candidates, which were matched to the measured spectrum, counting the number of common peaks. In this way, a rule-based fragmenter can be used to search in a molecular structure database.
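The two generic steps of such a pipeline, candidate retrieval by exact mass and ranking by the number of common peaks, can be sketched as follows. The sketch does not reproduce the pipeline of [178]: the predicted spectra are assumed to be supplied by some external fragmenter, the toy database and all masses are hypothetical, and the tolerances (10 ppm, 0.01 Da) are arbitrary illustrative choices.

```python
def within_ppm(measured, theoretical, ppm):
    """True if `measured` lies within a relative window around `theoretical`."""
    return abs(measured - theoretical) <= theoretical * ppm * 1e-6


def retrieve_candidates(database, precursor_mass, ppm=10.0):
    """Names of all database entries whose mass matches the precursor mass."""
    return [
        name for name, mass in database.items()
        if within_ppm(precursor_mass, mass, ppm)
    ]


def count_common_peaks(measured, predicted, tol=0.01):
    """Count measured peaks that match a predicted peak (each used once)."""
    unused = sorted(predicted)
    count = 0
    for mz in sorted(measured):
        for i, p in enumerate(unused):
            if abs(mz - p) <= tol:
                count += 1
                del unused[i]
                break
    return count


def rank_candidates(measured, predicted_spectra, tol=0.01):
    """Sort candidates by decreasing number of common peaks."""
    scores = {
        name: count_common_peaks(measured, peaks, tol)
        for name, peaks in predicted_spectra.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data; the predicted peak lists stand in for the output
# of a rule-based fragmenter.
database = {"A": 272.0685, "B": 272.0690, "C": 272.1000}
candidates = retrieve_candidates(database, 272.0684)
predicted = {"A": [153.018, 147.044, 119.049], "B": [153.018, 100.0]}
ranking = rank_candidates([153.0182, 147.0441, 119.0491], predicted)
```

Counting common peaks ignores intensities and treats all peaks as equally informative, which is one reason why ranking tens or hundreds of candidates remains difficult.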
Pelander et al [198] used ACD/MS Fragmenter for drug metabolite screening by tandem MS. For the simulation of EI fragmentation spectra, Schymanski et al [195] compared the three commercial programs, and indicated that at the time of evaluation, mass spectral fragment prediction for structure elucidation was still far from daily practical usability. The authors also noted that ACD Fragmenter "should be used with caution to assess proposed structures [...] as the ranking results are very close to that of a random number generator."
Later, Kumari et al [199] implemented a pipeline for EI spectra integrating MassFrontier that is similar to the one for tandem MS data [178], but integrates retention time prediction. They retrieved candidate structures from PubChem using molecular formulas predicted from the isotope pattern [104]. They filtered molecular structures using Kovats retention index prediction [15]. Using MassFrontier 6 for spectrum prediction, the correct structure was reported within the top 5 hits in 73% of the cases.
It is worth mentioning that rule-based systems did not have much success in proteomics: there, it was apparent from the very beginning that, in view of the huge search space, only optimization- and combinatorics-based methods can be successful.
Combinatorial fragmenters
The problem with rule-based fragmenters is that even the best commercial systems cover only a tiny part of the rules that should be known. New rules are constantly discovered and have to be added to the fragmentation rule databases. However, not all of these rules necessarily apply to a newly discovered compound.
Sweeney [200] observed that many compounds can be described in a modular format, that is, by substructures which account for most of the fragments observed in the fragmentation spectrum (see Figure 5). Combinatorial fragmenters use bond disconnection to explain the peaks in the observed fragmentation spectrum. Fragments resulting from structural rearrangements are initially not covered by this approach. Usually, such rearrangements have to be individually "woven" into the combinatorial optimization; this is often complicated and done only for a few, particularly important rearrangements. Note that handling rearrangement reactions is problematic for both combinatorial and rule-based methods [200-202].
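A minimal sketch of systematic bond disconnection is given below. It is not the algorithm of any of the tools discussed in this section: the molecule is modeled as a graph whose nodes are heavy atoms with their attached hydrogens, at most two bonds are removed, and charge, hydrogen shifts and rearrangements are ignored. All names and the ethanol-like toy molecule are illustrative assumptions.

```python
from itertools import combinations


def components(atoms, bonds):
    """Connected components of a molecular graph as frozensets of atom ids."""
    adjacency = {a: set() for a in atoms}
    for a, b in bonds:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, result = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adjacency[node] - comp)
        seen |= comp
        result.append(frozenset(comp))
    return result


def enumerate_fragments(atom_masses, bonds, max_cuts=2):
    """All connected fragments obtained by removing up to `max_cuts` bonds."""
    fragments = set()
    for k in range(1, max_cuts + 1):
        for removed in combinations(range(len(bonds)), k):
            kept = [b for i, b in enumerate(bonds) if i not in removed]
            fragments.update(components(atom_masses, kept))
    # Cutting ring bonds may leave the molecule intact; drop that case.
    fragments.discard(frozenset(atom_masses))
    return fragments


def explain_peaks(peaks, atom_masses, bonds, tol=0.01, max_cuts=2):
    """Map each explainable peak to the fragments (atom id lists) matching it."""
    fragments = enumerate_fragments(atom_masses, bonds, max_cuts)
    explained = {}
    for mz in peaks:
        hits = sorted(
            sorted(f) for f in fragments
            if abs(sum(atom_masses[a] for a in f) - mz) <= tol
        )
        if hits:
            explained[mz] = hits
    return explained


# Toy molecule CH3-CH2-OH; node masses include the attached hydrogens.
atom_masses = {0: 15.023475, 1: 14.015650, 2: 17.002740}
bonds = [(0, 1), (1, 2)]
explained = explain_peaks([31.018, 50.0], atom_masses, bonds)
```

The number of bond subsets grows exponentially with the number of allowed cuts, which illustrates why exhaustive approaches become slow for larger compounds and why heuristics are used in practice.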
EPIC (elucidation of product ion connectivity) [201] was the first software using systematic bond disconnection and ranking of the resulting substructures. It was tested only against two hand-annotated spectra from the literature and is not publicly available. The Fragment iDentificator (FiD) [202,203] enumerates all possible fragment candidates using a Mixed Integer Linear Programming approach, and ranks the candidates according to the cost of cleaving a fragment. Due to the computational complexity of the underlying problem [204], running times can be prohibitive even for medium-size compounds.
The most recent approach is MetFrag [179], a somewhat greedy heuristic to match molecular structures to measured spectra that makes no attempt to create a mechanistically correct prediction of the fragmentation processes. It is therefore fast enough to screen dozens to thousands of candidates retrieved from compound databases, and to subsequently rank them by the agreement between measured and in silico fragments (see Figure 6). On the same test set that was used by Hill et al [178], MetFrag performed better than the commercial MassFrontier 4. MetFrag predictions were included in the recent METLIN database release [65]. MetFrag has also been extended to analyze EI fragmentation [205]. Recently, Gerlich and Neumann [206] introduced MetFusion that combines the MetFrag approach with a similarity fingerprint to re-rank the molecular structures. Other experimental measures such as retention indices or drift time can be used for candidate filtering [205,207]. Ridder et al [208] presented a closely related approach for substructure prediction using multistage MS data.
Figure 5 Modular structure of xemilofiban. Figure redrawn from Sweeney [200].
One problem of combinatorial fragmenters is how to choose the costs for cleaving edges (bonds) in the molecular structure graph. For this, MetFrag uses bond dissociation energies whereas "unit weights" are used in [208]. Kangas et al [180] used machine learning to find bond cleavage rates. Their In silico identification software (ISIS) currently works only for lipids and does not model rearrangements of atoms and bonds. Different from the other approaches, ISIS simulates the spectrum of a given lipid, and does not require experimental data to do so.
Consensus structure approaches
Many of the above-mentioned techniques are rather complementary, yielding different information on the unknown compound. Combining the different results can therefore greatly improve the identification rates. For EI fragmentation data, [205] used consensus scoring to select candidates. These structural candidates are generated using molecular formula and substructure information retrieved from MOLGEN-MS and MetFrag, and further characteristics (e.g., retention behavior). Ludwig et al [209] proposed a greedy heuristic to find the characteristic substructure that is "embodied" in a list of database search results; see also Section "Fragmentation trees".
Nonribosomal peptides
Usually, the structure of small molecules cannot be deduced from the genomic sequence. However, for particular molecules such as nonribosomal peptides (NRPs) a certain predictability has been established [210]. NRPs are excellent lead compounds for the development of novel pharmaceutical agents such as antibiotics, immunosuppressors, or antiviral and antitumor agents [211]. They differ from ribosomal peptides in that they can have non-linear structures (for example, cyclic or tree-like) and may contain non-standard amino acids [211]. This increases the number of possible building blocks from 20 to several hundreds, and certain amino acid masses are not even known in advance. For this reason, common approaches for sequencing ribosomal peptides using tandem mass spectrometry are not applicable to NRPs. For cyclic peptides, fragmentation steps beyond tandem MS are required, as tandem MS simply results in the linearization of the cyclic peptide. Nevertheless, NRPs are structurally much more restricted than the vast variety of metabolites known from plants or microbes. Computational methods for de novo sequencing and dereplication of NRPs have been established [17,211-214]. Unfortunately, these computational methods rely on the "polymeric character" of NRPs and, hence, cannot be generalized for analyzing other classes of metabolites.
Figure 6 MetFrag web interface with an example spectrum from Naringenin. Searching KEGG as compound library with a 10 ppm window returns 15 hits, and the correct molecule is ranked at first position.
Fragmentation trees
If we want to assign molecular formulas to the precursor and product ions, we may use the formula of the precursor to filter bogus explanations of the product ions, and vice versa. This fact has been exploited repeatedly, see for example [111,146] and Section "Molecular formula identification" above. This is only the most simplistic description of the fragmentation process: it is obvious that all product ions must be fragments of the precursor; but what is the dependency between the fragments? In fact, MS experts have drawn fragmentation diagrams for decades. For this task, the MS expert usually has to know the molecular structure of the compound and its tandem MS fragmentation spectrum.
Fragmentation trees must not be confused with spectral trees for multiple stage mass spectrometry [155], or the closely related multistage mass spectral trees of Rojas-Cherto et al [145] (referred to as "fragmentation trees" in [145,215,216]). Spectral trees are a formal representation of the MS setup and describe the relationship between the MSn spectra, but do not contain any additional information. We stress that all computational approaches described below target tandem MS, unless explicitly stated otherwise.
To compute a fragmentation tree, we need neither spectral libraries nor molecular structure databases; this implies that this approach can target "true unknowns" that are not contained in any molecular structure database. Böcker and Rasche [147] introduced fragmentation trees (see Figure 7) to find the molecular formula of an unknown, without using databases: here, the highest-scoring fragmentation tree for each molecular formula candidate is used as the score of the molecular formula itself. Only later were fragmentation trees conceived as a means of structural elucidation [148]. Algorithmic aspects of computing fragmentation trees were considered in [217]. Hufsky et al [56] computed fragmentation trees from EI fragmentation spectra with high mass accuracy, and used this to identify the molecular ion peak and the molecular formula of compounds. Fragmentation trees computed from both tandem MS [148] and EI fragmentation data [218] were found to be of good "structural quality" by expert evaluation. Finally, Scheubert et al [219,220] computed fragmentation trees from multiple MS data.
To further process fragmentation trees, Rasche et al [221] introduced fragmentation tree alignments to cluster unknown compounds, to predict chemical similarity, and to find structurally similar compounds in a spectral library using FT-BLAST (Fragmentation Tree Basic Local Alignment Search Tool). FT-BLAST also offers the possibility to identify bogus hits using a decoy database, allowing the user to report results for a pre-defined False Discovery Rate. Faster algorithms for the computationally demanding alignment of fragmentation trees were presented in [222]. FT-BLAST results were parsed for "characteristic substructures" in [209]. Rojas-Cherto et al [215] presented a related approach for the comparison of multistage mass spectral trees, based on transforming the trees into binary fingerprints and comparing these fingerprints using the Tanimoto score (Jaccard index). This was applied for metabolite identification in [216].
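For reference, the Tanimoto score of two binary fingerprints is the number of bits set in both, divided by the number of bits set in at least one of them. The short Python sketch below represents a fingerprint as the set of indices of its one-bits; how the trees are converted into fingerprints in [215] is not modeled, and the library entries are hypothetical.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto score (Jaccard index) of two fingerprints given as bit sets."""
    bits_a, bits_b = set(bits_a), set(bits_b)
    union = bits_a | bits_b
    if not union:
        # Convention chosen here: two empty fingerprints score zero.
        return 0.0
    return len(bits_a & bits_b) / len(union)


def rank_library(query_bits, library):
    """Sort library entries by decreasing similarity to the query."""
    scored = [(name, tanimoto(query_bits, bits)) for name, bits in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints.
library = {"X": {1, 2, 3}, "Y": {2, 3, 4}, "Z": {9}}
ranking = rank_library({1, 2, 3}, library)
```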
Aligning fragmentation trees is similar in spirit to the feature tree comparison of Rarey and Dixon [223]. Feature trees were computed from the molecular structure of a known compound, and represent hydrophobic fragments and functional groups of the compound, and the way these groups are linked together.
Network reconstruction
Network elucidation based on mass spectrometry data is a wide field. On the one hand, detailed information like quantitative fluxes of the network is obtained by metabolic flux analysis. Here, based on isotope labeled compounds, the flux proceeding from these compounds can be tracked. On the other hand, measured metabolites can be mapped onto a known network. This can elucidate distinct metabolic pathways that are differentially "used" depending on environmental conditions. Both of these variants require previously known metabolic network graphs. In this section, we will only cover the pure de novo reconstruction of networks from metabolite mass spectrometry data.
Figure 7 Fragmentation tree of phenylalanine computed from tandem MS data.
The reconstruction of networks solely from metabolic mass spectrometry data is a very young field of research. It can be subdivided into two main approaches: either the network reconstruction is based on metabolite level correlation of multiple mutant and wild type samples, or on data from only one sample by using information on common reactions or similarity between metabolites.
A first approach that used metabolite mass spectrometry data of multiple expressed samples was introduced by Fiehn et al [224]. Their method clusters metabolic phenotypes, for example by principal component analysis (PCA). In contrast, Arkin et al [225] and Kose et al [226] developed a method that does not group samples but metabolites with correlating intensity across all samples. Metabolites of a group have a similar concentration behavior in all samples. This leads to the assumption that the metabolites of a group are probably somehow connected in a metabolic network. As the concentrations of metabolites taken from plants with identical genotype and grown under uniform conditions still show variability, this approach can also be used if no multiple mutant genotypes are available [227]. The disadvantage of this simple approach is that it results in very dense networks that cover not only direct reactions but also indirect ones. Krumsiek et al. 2011 [228] suggested applying Gaussian graphical models to such data. Gaussian graphical models are able to capture only direct correlations, while indirect correlations are not taken into account.
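The difference between direct and indirect correlation can be illustrated with partial correlations. The sketch below uses the first-order partial correlation of two metabolites given a single third one; a full Gaussian graphical model conditions on all remaining metabolites at once (via the inverse covariance matrix), which is not implemented here. The intensity values are hypothetical and constructed so that metabolite c depends on a only through b.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical intensities of three metabolites in eight samples,
# forming a chain a -> b -> c.
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [1.5, 1.5, 3.5, 3.5, 4.5, 6.5, 6.5, 8.5]
c = [1.8, 1.2, 3.2, 3.8, 4.8, 6.2, 6.2, 8.8]

plain_ac = pearson(a, c)                   # high, although a and c are not adjacent
direct_ac = partial_correlation(a, c, b)   # close to zero once b is accounted for
direct_ab = partial_correlation(a, b, c)   # stays clearly positive
```

A network built from the plain correlations would contain the edge between a and c, whereas thresholding the partial correlations keeps only the edges a-b and b-c.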
In 2006, Breitling et al [229] reconstructed networks based on high-resolution mass spectrometry data of only one dataset. They inferred accurate mass differences between all measured metabolites. These mass differences give evidence of biochemical transformations between the metabolites and allow the reconstruction of a network. Rogers et al [152] used a similar approach on the molecular formula level to assign better molecular formulas to metabolites (see Section "Other approaches for molecular formula identification").
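A mass difference network of this kind can be sketched as follows. The short list of transformations, the tolerance of 0.002 Da and the metabolite masses are illustrative assumptions; the transformation set actually used in [229] is not reproduced here.

```python
# A few common biochemical transformations and their mass differences in Da.
TRANSFORMATIONS = {
    "hydroxylation (+O)": 15.994915,
    "methylation (+CH2)": 14.015650,
    "hydrogenation (+H2)": 2.015650,
}


def mass_difference_network(masses, transformations=TRANSFORMATIONS, tol=0.002):
    """Connect metabolites whose mass difference matches a transformation.

    Returns a list of edges (lighter metabolite, heavier metabolite, label).
    """
    edges = []
    items = sorted(masses.items(), key=lambda item: item[1])
    for i, (name_a, mass_a) in enumerate(items):
        for name_b, mass_b in items[i + 1:]:
            diff = mass_b - mass_a
            for label, delta in transformations.items():
                if abs(diff - delta) <= tol:
                    edges.append((name_a, name_b, label))
    return edges


# Hypothetical accurate masses of four measured metabolites.
masses = {"M1": 180.0634, "M2": 194.0790, "M3": 196.0583, "M4": 250.0000}
edges = mass_difference_network(masses)
```

The tolerance has to reflect the mass accuracy of the instrument: with a wide tolerance, unrelated transformations with similar mass differences can no longer be told apart.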
Watrous et al [230] used additional information from spectral alignments of tandem MS data to determine structural similarity between the metabolites. Two structurally similar metabolites are supposed to be connected in the network (see Figure 8). They found the compound thanamycin in Pseudomonas sp. SH-C52, which has an antifungal effect and protects sugar beet plants from infections by specific soil-borne fungi.
Figure 8 Using spectral alignment of tandem MS data to generate a molecular network (panels: tandem MS spectra, spectral similarity scoring, network generation). The thickness of the edges indicates the similarity between the spectra. Figure redrawn from Watrous et al [230].
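The principle of building a network from pairwise spectral similarity can be sketched with a plain cosine score on binned peaks. This is a simplification: the spectral alignment used in [230] is not implemented, and the bin width, the threshold of 0.7 and the toy spectra are assumptions made for illustration.

```python
import math


def bin_spectrum(peaks, bin_width=0.01):
    """Sum the intensities of (m/z, intensity) peaks falling into the same bin."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=0.01):
    """Cosine of the angle between two binned spectra."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_network(spectra, threshold=0.7):
    """Edges (name, name, score) for all pairs scoring at least `threshold`."""
    names = sorted(spectra)
    edges = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = cosine_similarity(spectra[first], spectra[second])
            if score >= threshold:
                edges.append((first, second, score))
    return edges


# Hypothetical tandem mass spectra as lists of (m/z, relative intensity).
spectra = {
    "A": [(100.0, 1.0), (150.0, 0.5)],
    "B": [(100.0, 1.0), (150.0, 0.4)],
    "C": [(200.0, 1.0)],
}
edges = similarity_network(spectra)
```

The edge scores can be used directly as edge weights, corresponding to the edge thickness in Figure 8.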
Software packages
Several open source, or at least freely available, software packages assist with processing and analyzing GC-MS metabolomics data. The freely available AMDIS [231] is the most widely used method for extracting individual component spectra (mass spectral deconvolution) from GC-MS data. MathDAMP [232] helps with the identification and visualization of differences between complex metabolite profiles. TagFinder [233,234] supports the quantitative analysis of GC-MS-based metabolite profiling experiments. The MetaboliteDetector [235] detects and subsequently identifies metabolites and allows for the analysis of high-throughput data. TargetSearch [236] iteratively corrects and updates retention time indices for searching and identifying metabolites. Metab [237] is an R package that automates the pipeline for analysis of metabolomics GC-MS datasets processed by AMDIS. PyMS [238] comprises several functions for processing raw GC-MS data, such as noise smoothing, baseline correction, peak detection, peak deconvolution, peak integration, and peak alignment. ADAP-GC 2.0 [239] helps with the deconvolution of coeluting metabolites, aligns components across samples and exports their qualitative and quantitative information. Castillo et al. 2011 [240] developed a tool to process GC×GC-TOF-MS data.
For LC-MS data, XCMS [13] enables retention time alignment, peak detection and peak matching. XCMS2 [241] additionally searches LC-MS/MS data against METLIN and also provides structural information for unknown metabolites. It also allows for the correction of mass calibration gaps [242] caused by regular switches between the analyte and a standard reference compound. XCMS Online [243] is the web-based version of the software. AStream [244] enables the detection of outliers and redundant peaks by intensity correlation and retention time, as well as isotope detection. MetSign [245] provides several bioinformatics tools for raw data deconvolution, metabolite putative assignment, peak list alignment, normalization, statistical significance tests, unsupervised pattern recognition, and time course analysis. CAMERA [246] is designed to post-process XCMS feature lists and integrates algorithms to extract compound spectra, annotate peaks, and propose compound masses in complex data. MetExtract [247] detects peaks corresponding to metabolites by chromatographic characteristics and isotope labeling. IDEOM [248] filters and detects peaks based on XCMS [13] and mzMatch.R [249], enables noise filtering based on [249,250] and allows for database matching and further statistics. Brodsky et al [251] presented a method for evaluating individual peaks in an LC-MS spectrum, based on replicate samples.
For both GC-MS and LC-MS data, MZmine [252] and MZmine 2 [253] allow for data visualization, peak identification and peak list alignment. MET-IDEA [254] proceeds from complex raw data files to a complete data matrix. MetAlign [255] is capable of baseline correction, peak picking, as well as spectral alignment.
To compare the power of these software packages, an independent validation would be desirable. But up to now, no such comparison exists. One reason is the limited amount of freely available mass spectra, see Section "Conclusion". Another reason is that some of the packages are developed for special experimental setups or instruments, and have to be adapted for other data, which makes an independent validation difficult.
Conclusion
No computational de novo method is able to elucidate the structure of a metabolite solely from mass spectral data. Such methods can only reduce the search space or give hints to the structure or class of the compound.
Computational mass spectrometry of small molecules is, at least compared to proteomics, still very much in a developmental state. This may be surprising, as methods development started out many years before computational mass spectrometry for proteins and peptides came into the focus of bioinformatics and cheminformatics research [183-185]. But since then, methods development in computational proteomics has proliferated [16-21] and long surpassed that in metabolomics and small molecule research. To a great extent, this can be attributed to the fact that freely sharing data and benchmark test sets has become a tradition in proteomics, providing developers of novel computational methods with the required input for training and evaluation of their methods. In metabolomics, a comparative evaluation of methods is very limited due to restricted data sharing. Recently, a first benchmark test for small molecules was provided as part of the CASMI challenge (see Endnote a). CASMI is a contest in which GC-MS and LC-MS data is released to the public, and the computational mass spectrometry community is invited to identify the compounds. Results and methods will be published in a special issue of the Open Access MDPI journal Metabolites. This is a first step towards reliable evaluation of different computational methods for the identification of small molecules.
Lately, the importance of computational methods has gained more attention in small molecule research: citing Kind and Fiehn [33], "the ultimate success of structure elucidation of small molecules lies in better software programs and the development of sophisticated tools for data evaluation." With the advent of novel computational approaches [169,206,207], searching spectral libraries may be replaced by searching molecular structure databases within the next five to ten years. Beyond molecular databases, only few approaches aim at overcoming the limits of the "known universe of organic chemistry" [256], one example being fragmentation trees [56,148,221].
Endnote
a Critical Assessment of Small Molecule Identification, http://casmi-contest.org/.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
KS wrote the Sections "Molecular formula identification" and "Network reconstruction". FH wrote the Sections "Searching spectral libraries" and "Identifying the unknowns". SB wrote the Section "Fragmentation trees". All authors read and approved the final manuscript.
Acknowledgements
K Scheubert was funded by Deutsche Forschungsgemeinschaft (BO 1910/10). F Hufsky was supported by the International Max Planck Research School Jena.
Author details
1 Chair of Bioinformatics, Friedrich Schiller University, Ernst-Abbe-Platz 2, Jena, Germany. 2 Max Planck Institute for Chemical Ecology, Beutenberg Campus, Jena, Germany.
Received: 23 November 2012 Accepted: 1 February 2013 Published: 1 March 2013
References
1. Cui Q, Lewis IA, Hegeman AD, Anderson ME, Li J, Schulte CF, Westler WM, Eghbalnia HR, Sussman MR, Markley JL: Metabolite identification via the Madison Metabolomics Consortium Database. Nat Biotechnol 2008, 26(2):162–164.
2. Last RL, Jones AD, Shachar-Hill Y: Towards the plant metabolome and beyond. Nat Rev Mol Cell Biol 2007, 8:167–174.
3. Patti GJ, Yanes O, Siuzdak G: Innovation: Metabolomics: The apogee of the omics trilogy. Nat Rev Mol Cell Biol 2012, 13(4):263–269.
4. Lederberg J: Topological mapping of organic molecules. Proc Natl Acad Sci USA 1965, 53(1):134–139.
5. Lederberg J: How DENDRAL was conceived and born. In ACM Conf. on the History of Medical Informatics, History of Medical Informatics archive; 1987:5–19.
6. Mun IK, McLafferty FW: Computer methods of molecular structure elucidation from unknown mass spectra. In Supercomputers in Chemistry, ACS Symposium Series, chapter 9; 1981:117–124.
7. Smith DH, Gray NA, Nourse JG, Crandell CW: The DENDRAL project: Recent advances in computer-assisted structure elucidation. Anal Chim Acta 1981, 133(4):471–497.
8. November JA: Digitizing Life: The Introduction of Computers to Biology and Medicine. PhD thesis. Princeton, USA: Princeton University; 2006.
9. Gasteiger J, Hanebeck W, Schulz KP: Prediction of mass spectra from structural information. J Chem Inf Comput Sci 1992, 32(4):264–271.
10. Bylund D, Danielsson R, Malmquist G, Markides KE: Chromatographic alignment by warping and dynamic programming as a pre-processing tool for PARAFAC modelling of liquid chromatography-mass spectrometry data. J Chromatogr A 2002, 961:237–244.
11. Jeong J, Shi X, Zhang X, Kim S, Shen C: Model-based peak alignment of metabolomic profiling from comprehensive two-dimensional gas chromatography mass spectrometry. BMC Bioinformatics 2012, 13:27.
12. Lommen A, Kools HJ: MetAlign 3.0: performance enhancement by efficient use of advances in computer hardware. Metabolomics 2012, 8(4):719–726.
13. Smith CA, Want EJ, O'Maille G, Abagyan R, Siuzdak G: XCMS: Processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. Anal Chem 2006, 78(3):779–787.
14. Garkani-Nejad Z, Karlovits M, Demuth W, Stimpfl T, Vycudilik W, Jalali-Heravi M, Varmuza K: Prediction of gas chromatographic retention indices of a diverse set of toxicologically relevant compounds. J Chromatogr A 2004, 1028(2):287–295.
15. Stein SE, Babushok VI, Brown RL, Linstrom PJ: Estimation of Kovats retention indices using group contributions. J Chem Inf Model 2007, 47(3):975–980.
16. Cox J, Mann M: MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol 2008, 26(12):1367–1372.
17. Bandeira N, Pham V, Pevzner P, Arnott D, Lill JR: Automated de novo protein sequencing of monoclonal antibodies. Nat Biotechnol 2008, 26(12):1336–1338.
18. Liu G, Zhang J, Larsen B, Stark C, Breitkreutz A, Lin ZY, Breitkreutz BJ, Ding Y, Colwill K, Pasculescu A, Pawson T, Wrana JL, Nesvizhskii AI, Raught B, Tyers M, Gingras AC: ProHits: Integrated software for mass spectrometry-based interaction proteomics. Nat Biotechnol 2010, 28(10):1015–1017.
19. Fusaro VA, Mani DR, Mesirov JP, Carr SA: Prediction of high-responding peptides for targeted protein assays by mass spectrometry. Nat Biotechnol 2009, 27(2):190–198.
20. Elias JE, Gibbons FD, King OD, Roth FP, Gygi SP: Intensity-based protein identification by machine learning from a library of tandem mass spectra. Nat Biotechnol 2004, 22(2):214–219.
21. Mann M: Comparative analysis to guide quality improvements in proteomics. Nat Methods 2009, 6(10):717–719.
22. Böcker S: Sequencing from compomers: Using mass spectrometry for DNA de-novo sequencing of 200+ nt. J Comput Biol 2004, 11(6):1110–1134.
23. Böcker S: Simulating multiplexed SNP discovery rates using base-specific cleavage and mass spectrometry. Bioinformatics 2007, 23(2):e5–e12.
24. Böcker S, Kehr B, Rasche F: Determination of glycan structure from tandem mass spectra. IEEE/ACM Trans Comput Biol Bioinform 2011, 8(4):976–986.
25. Goldberg D, Bern MW, Li B, Lebrilla CB: Automatic determination of O-glycan structure from fragmentation spectra. J Proteome Res 2006, 5(6):1429–1434.
26. Goldberg D, Bern MW, North SJ, Haslam SM, Dell A: Glycan family analysis for deducing N-glycan topology from single MS. Bioinformatics 2009, 25(3):365–371.
27. Baumgaertel A, Scheubert K, Pietsch B, Kempe K, Crecelius AC, Böcker S, Schubert US: Analysis of different synthetic homopolymers by the use of a new calculation software for tandem mass spectra. Rapid Commun Mass Spectrom 2011, 25(12):1765–1778.
28. Thalassinos K, Jackson AT, Williams JP, Hilton GR, Slade SE, Scrivens JH: Novel software for the assignment of peaks from tandem mass spectrometry spectra of synthetic polymers. J Am Soc Mass Spectrom 2007, 18(7):1324–1331.
29. Katajamaa M, Oresic M: Data processing for mass spectrometry-based metabolomics. J Chromatogr A 2007, 1158(1-2):318–328.
30. Wishart DS: Current progress in computational metabolomics. Brief Bioinform 2007, 8(5):279–293.
31. Stein SE: Mass spectral reference libraries: An ever-expanding resource for chemical identification. Anal Chem 2012, 84(17):7274–7282.
32. Han J, Datla R, Chan S, Borchers CH: Mass spectrometry-based technologies for high-throughput metabolomics. Bioanalysis 2009, 1(9):1665–1684.
33. Kind T, Fiehn O: Advances in structure elucidation of small molecules using mass spectrometry. Bioanal Rev 2010, 2(1-4):23–60.
34. Xiao JF, Zhou B, Ressom HW: Metabolite identification and quantitation in LC-MS/MS-based metabolomics. Trends Analyt Chem 2012, 32:1–14.
35. Fiehn O: Extending the breadth of metabolite profiling by gas chromatography coupled to mass spectrometry. Trends Analyt Chem 2008, 27(3):261–269.
36. Valkenborg D, Mertens I, Lemière F, Witters E, Burzykowski T: The isotopic distribution conundrum. Mass Spectrom Rev 2012, 31(1):96–109.
37. Neumann S, Böcker S: Computational mass spectrometry for metabolomics – a review. Anal Bioanal Chem 2010, 398(7):2779–2788.
38. Fernie AR, Trethewey RN, Krotzky AJ, Willmitzer L: Metabolite profiling: From diagnostics to systems biology. Nat Rev Mol Cell Biol 2004, 5(9):763–769.
39. Sweedler JV: Metabolomics in analytical chemistry. Anal Chem 2012, 84(14):5833.
40. Champarnaud E, Hopley C: Evaluation of the comparability of spectra generated using a tuning point protocol on twelve electrospray ionisation tandem-in-space mass spectrometers. Rapid Commun Mass Spectrom 2011, 25(8):1001–1007.
41. Bristow AWT, Webb KS, Lubben AT, Halket J: Reproducible product-ion tandem mass spectra on various liquid chromatography/mass spectrometry instruments for the development of spectral libraries. Rapid Commun Mass Spectrom 2004, 18(13):1447–1454.
42. Halket JM, Waterman D, Przyborowska AM, Patel RKP, Fraser PD, Bramley PM: Chemical derivatization and mass spectral libraries in metabolic profiling by GC/MS and LC/MS/MS. J Exp Bot 2005, 56(410):219–243.
43. Goodley P: Maximizing MS/MS fragmentation in the ion trap using CID voltage ramping. Technical Report 5988-0704EN, Agilent Technologies, 2007.
44. Hopley C, Bristow T, Lubben A, Simpson A, Bull E, Klagkou K, Herniman J, Langley J: Towards a universal product ion mass spectral library – reproducibility of product ion spectra across eleven different mass spectrometers. Rapid Commun Mass Spectrom 2008, 22(12):1779–1786.
45. Palit M, Mallard G: Fragmentation energy index for universalization of fragmentation energy in ion trap mass spectrometers for the analysis of chemical weapon convention related chemicals by atmospheric pressure ionization-tandem mass spectrometry analysis. Anal Chem 2009, 81(7):2477–2485.
46. Knochenmuss R, Zenobi R: MALDI ionization: the role of in-plume processes. Chem Rev 2003, 103(2):441–452.
47. Kostiainen R, Kotiaho T, Kuuranne T, Auriola S: Liquid chromatography/atmospheric pressure ionization-mass spectrometry in drug metabolism studies. J Mass Spectrom 2003, 38(4):357–372.
48. Marchi I, Rudaz S, Veuthey JL: Atmospheric pressure photoionization for coupling liquid-chromatography to mass spectrometry: a review. Talanta 2009, 78(1):1–18.
49. Takats Z, Wiseman JM, Gologan B, Cooks RG: Mass spectrometry sampling under ambient conditions with desorption electrospray ionization. Science 2004, 306(5695):471–473.
50. Horvath CG, Lipsky SR: Use of liquid ion exchange chromatography for the separation of organic compounds. Nature 1966, 211(5050):748–749.
51. MacNair JE, Lewis KC, Jorgenson JW: Ultrahigh-pressure reversed-phase liquid chromatography in packed capillary columns. Anal Chem 1997, 69(6):983–989.
52. Gao X, Zhang Q, Meng D, Isaac G, Zhao R, Fillmore TL, Chu RK, Zhou J, Tang K, Hu Z, Moore RJ, Smith RD, Katze MG, Metz TO: A reversed-phase capillary ultra-performance liquid chromatography-mass spectrometry (UPLC-MS) method for comprehensive top-down/bottom-up lipid profiling. Anal Bioanal Chem 2012, 402(9):2923–2933.
53. Zubarev R, Mann M: On the proper use of mass accuracy in proteomics. Mol Cell Proteomics 2007, 6(3):377–381.
54. Cajka T, Hajslova J, Lacina O, Mastovska K, Lehotay SJ: Rapid analysis of multiple pesticide residues in fruit-based baby food using programmed temperature vaporiser injection-low-pressure gas chromatography-high-resolution time-of-flight mass spectrometry. J Chromatogr A 2008, 1186(1-2):281–294.
55. Hernandez F, Portoles T, Pitarch E, Lopez FJ: Gas chromatography coupled to high-resolution time-of-flight mass spectrometry to analyze trace-level organic compounds in the environment, food safety and toxicology. Trends Analyt Chem 2011, 30(2):388–400.
56. Hufsky F, Rempt M, Rasche F, Pohnert G, Böcker S: De novo analysis of electron impact mass spectra using fragmentation trees. Anal Chim Acta 2012, 739:67–76.
57. Bino RJ, Hall RD, Fiehn O, Kopka J, Saito K, Draper J, Nikolau BJ, Mendes P, Roessner-Tunali U, Beale MH, Trethewey RN, Lange BM, Wurtele ES, Sumner LW: Potential of metabolomics as a functional genomics tool. Trends Plant Sci 2004, 9(9):418–425.
58. Jenkins H, Hardy N, Beckmann M, Draper J, Smith AR, Taylor J, Fiehn O, Goodacre R, Bino RJ, Hall R, Kopka J, Lane GA, Lange BM, Liu JR, Mendes P, Nikolau BJ, Oliver SG, Paton NW, Rhee S, Roessner-Tunali U, Saito K, Smedsgaard J, Sumner LW, Wang T, Walsh S, Wurtele ES, Kell DB: A proposed framework for the description of plant metabolomics experiments and their results. Nat Biotechnol 2004, 22(12):1601–1606.
59. Board Members MSI, Sansone SA, Fan T, Goodacre R, Griffin JL, Hardy NW, Kaddurah-Daouk R, Kristal BS, Lindon J, Mendes P, Morrison N, Nikolau B, Robertson D, Sumner LW, Taylor C, van der Werf M, van Ommen B, Fiehn O: The metabolomics standards initiative. Nat Biotechnol 2007, 25(8):846–848.
60. Goodacre R, Broadhurst D, Smilde AK, Kristal BS, Baker JD, Beger R, Bessant C, Connor S, Capuani G, Craig A, Ebbels T, Kell DB, Manetti C, Newton J, Paternostro G, Somorjai R, Sjöström M, Trygg J, Wulfert F: Proposed minimum reporting standards for data analysis in metabolomics. Metabolomics 2007, 3:231–241.
61. Sumner LW, Amberg A, Barrett D, Beale M, Beger R, Daykin C, Fan T, Fiehn O, Goodacre R, Griffin JL, Hankemeier T, Hardy N, Harnly J, Higashi R, Kopka J, Lane A, Lindon JC, Marriott P, Nicholls A, Reily M, Thaden J, Viant MR: Proposed minimum reporting standards for chemical analysis. Metabolomics 2007, 3(3):211–221.
62. Horai H, Arita M, Nishioka T: Comparison of ESI-MS spectra in MassBank database. In Proc. of Conference on BioMedical Engineering and Informatics (BMEI 2008), volume 2; 2008:853–857.
63. Horai H, Arita M, Kanaya S, Nihei Y, Ikeda T, Suwa K, Ojima Y, Tanaka K, Tanaka S, Aoshima K, Oda Y, Kakazu Y, Kusano M, Tohge T, Matsuda F, Sawada Y, Hirai MY, Nakanishi H, Ikeda K, Akimoto N, Maoka T, Takahashi H, Ara T, Sakurai N, Suzuki H, Shibata D, Neumann S, Iida T, Tanaka K, Funatsu K, et al: MassBank: A public repository for sharing mass spectral data for life sciences. J Mass Spectrom 2010, 45(7):703–714.
64. Smith CA, O'Maille G, Want EJ, Qin C, Trauger SA, Brandon TR, Custodio DE, Abagyan R, Siuzdak G: METLIN: A metabolite mass spectral database. Ther Drug Monit 2005, 27(6):747–751.
65. Tautenhahn R, Cho K, Uritboonthai W, Zhu Z, Patti GJ, Siuzdak G: An accelerated workflow for untargeted metabolomics using the METLIN database. Nat Biotechnol 2012, 30(9):826–828.
66. Kopka J, Schauer N, Krueger S, Birkemeyer C, Usadel B, Bergmüller E, Dörmann P, Weckwerth W, Gibon Y, Stitt M, Willmitzer L, Fernie AR, Steinhauser D: GMD@CSB.DB: The Golm Metabolome Database. Bioinformatics 2005, 21(8):1635–1638.
67. Akiyama K, Chikayama E, Yuasa H, Shimada Y, Tohge T, Shinozaki K, Hirai MY, Sakurai T, Kikuchi J, Saito K: PRIMe: A website that assembles tools for metabolomics and transcriptomics. In Silico Biol 2008, 8(3-4):339–345.
68. Neuweger H, Albaum SP, Dondrup M, Persicke M, Watt T, Niehaus K, Stoye J, Goesmann A: MeltDB: A software platform for the analysis and integration of metabolomics experiment data. Bioinformatics 2008, 24(23):2726–2732.
69. Sansone SA, Rocca-Serra P, Field D, Maguire E, Taylor C, Hofmann O, Fang H, Neumann S, Tong W, Amaral-Zettler L, Begley K, Booth T, Bougueleret L, Burns G, Chapman B, Clark T, Coleman LA, Copeland J, Das S, de Daruvar A, de Matos P, Dix I, Edmunds S, Evelo CT, Forster MJ, Gaudet P, Gilbert J, Goble C, Griffin JL, Jacob D, et al: Toward interoperable bioscience data. Nat Genet 2012, 44(2):121–126.
70. Kind T, Wohlgemuth G, Lee DY, Lu Y, Palazoglu M, Shahbaz S, Fiehn O: FiehnLib: Mass spectral and retention index libraries for metabolomics based on quadrupole and time-of-flight gas chromatography/mass spectrometry. Anal Chem 2009, 81(24):10038–10048.
71. Oberacher H, Pavlic M, Libiseller K, Schubert B, Sulyok M, Schuhmacher R, Csaszar E, Köfeler HC: On the inter-instrument and inter-laboratory transferability of a tandem mass spectral reference library: 1. Results of an Austrian multicenter study. J Mass Spectrom 2009, 44(4):485–493.
72. Oberacher H, Pavlic M, Libiseller K, Schubert B, Sulyok M, Schuhmacher R, Csaszar E, Köfeler HC: On the inter-instrument and the inter-laboratory transferability of a tandem mass spectral reference library: 2. Optimization and characterization of the search algorithm. J Mass Spectrom 2009, 44(4):494–502.
73. Sana TR, Roark JC, Li X, Waddell K, Fischer SM: Molecular formula and METLIN Personal Metabolite Database matching applied to the identification of compounds generated by LC/TOF-MS. J Biomol Tech 2008, 19(4):258–266.
74. Wishart DS, Tzur D, Knox C, Eisner R, Guo AC, Young N, Cheng D, Jewell K, Arndt D, Sawhney S, Fung C, Nikolai L, Lewis M, Coutouly MA, Forsythe I, Tang P, Shrivastava S, Jeroncic K, Stothard P, Amegbey G, Block D, Hau DD, Wagner J, Miniaci J, Clements M, Gebremedhin M, Guo N, Zhang Y, Duggan GE, MacInnis GD, et al: HMDB: The human metabolome database. Nucleic Acids Res 2007, 35(suppl 1):D521–526.
75. Wishart DS, Knox C, Guo AC, Eisner R, Young N, Gautam B, Hau DD, Psychogios N, Dong E, Bouatra S, Mandal R, Sinelnikov I, Xia J, Jia L, Cruz JA, Lim E, Sobsey CA, Shrivastava S, Huang P, Liu P, Fang L, Peng J, Fradette R, Cheng D, Tzur D, Clements M, Lewis A, Souza AD, Zuniga A, Dawe M, et al: HMDB: A knowledgebase for the human metabolome. Nucleic Acids Res 2009, 37:D603–D610.
76. Matsuda F, Yonekura-Sakakibara K, Niida R, Kuromori T, Shinozaki K, Saito K: MS/MS spectral tag based annotation of non-targeted profile of plant secondary metabolites. Plant J 2008, 57(3):555–577.
77. Sparkman OD: Evaluating electron ionization mass spectral library search results. J Am Soc Mass Spectrom 1996, 7(4):313–318.
78. Hertz HS, Hites RA, Biemann K: Identification of mass spectra by computer-searching a file of known spectra. Anal Chem 1971, 43(6):681–691.
79. McLafferty F, Hertel R, Villwock R: Computer identification of mass spectra: VI. Probability based matching of mass spectra: Rapid identification of specific compounds in mixtures. Org Mass Spectrom 1974, 9(7):690–702.
80. McLafferty FW, Zhang MY, Stauffer DB, Loh SY: Comparison of algorithms and databases for matching unknown mass spectra. J Am Soc Mass Spectrom 1998, 9(1):92–95.
81. Atwater BL, Stauffer DB, McLafferty FW, Peterson DW: Reliability ranking and scaling improvements to the probability based matching system for unknown mass spectra. Anal Chem 1985, 57(4):899–903.
82. Damen H, Henneberg D, Weimann B: SISCOM – a new library search system for mass spectra. Anal Chim Acta 1978, 103:289–302.
83. Sokolow S, Karnofsky J, Gustafson P: The Finnigan library search programs. Finnigan Application Report 2, Finnigan Corp., 1978.
84. Stein SE, Scott DR: Optimization and testing of mass spectral library search algorithms for compound identification. J Am Soc Mass Spectrom 1994, 5(9):859–866.
85. Rasmussen GT, Isenhour TL, Marshall JC: Mass spectral library searches using ion series data compression. J Chem Inf Comput Sci 1979, 19(2):98–104.
86. Koo I, Zhang X, Kim S: Wavelet- and Fourier-transform-based spectrum similarity approaches to compound identification in gas chromatography/mass spectrometry. Anal Chem 2011, 83(14):5631–5638.
87. Kim S, Koo I, Wei X, Zhang X: A method of finding optimal weight factors for compound identification in gas chromatography-mass spectrometry. Bioinformatics 2012, 28(8):1158–1163.
88. Stein SE: Estimating probabilities of correct identification from results of mass spectral library searches. J Am Soc Mass Spectrom 1994, 5(4):316–323.
89. Jeong J, Shi X, Zhang X, Kim S, Shen C: An empirical Bayes model using a competition score for metabolite identification in gas chromatography mass spectrometry. BMC Bioinformatics 2011, 12:392.
90. Josephs JL, Sanders M: Creation and comparison of MS/MS spectral libraries using quadrupole ion trap and triple-quadrupole mass spectrometers. Rapid Commun Mass Spectrom 2004, 18(7):743–759.
91. Milman BL: Towards a full reference library of MSn spectra. Testing of a library containing 3126 MS2 spectra of 1743 compounds. Rapid Commun Mass Spectrom 2005, 19(19):2833–2839.
92. Pavlic M, Libiseller K, Oberacher H: Combined use of ESI-QqTOF-MS and ESI-QqTOF-MS/MS with mass-spectral library search for qualitative analysis of drugs. Anal Bioanal Chem 2006, 386(1):69–82.
93. Wan KX, Vidavsky I, Gross ML: Comparing similar spectra: From similarity index to spectral contrast angle. J Am Soc Mass Spectrom 2002, 13(13):85–88.
94. Zhou B, Cheema AK, Ressom HW: SVM-based spectral matching for metabolite identification. Conf Proc IEEE Eng Med Biol Soc 2010, 2010:756–759.
95. Hansen ME, Smedsgaard J: A new matching algorithm for high resolution mass spectra. J Am Soc Mass Spectrom 2004, 15:1173–1180.
96. Matusita K: Decision rule, based on the distance, for the classification problem. Ann Inst Statist Math 1956, 8(1):67–77.
97. Mylonas R, Mauron Y, Masselot A, Binz PA, Budin N, Fathi M, Viette V, Hochstrasser DF, Lisacek F: X-rank: A robust algorithm for small molecule identification using tandem mass spectrometry. Anal Chem 2009, 81(18):7604–7610.
98. Gergov M, Weinmann W, Meriluoto J, Uusitalo J, Ojanperä I: Comparison of product ion spectra obtained by liquid chromatography/triple-quadrupole mass spectrometry for library search. Rapid Commun Mass Spectrom 2004, 18(10):1039–1046.
99. Issaq HJ, Van QN, Waybright TJ, Muschik GM, Veenstra TD: Analytical and statistical approaches to metabolomics research. J Sep Sci 2009, 32(13):2183–2199.
100. Böcker S, Liptak Zs: Efficient mass decomposition. In Proc. of ACM Symposium on Applied Computing (ACM SAC 2005). New York: ACM press; 2005:151–157.
101. Böcker S, Liptak Zs: A fast and simple algorithm for the Money Changing Problem. Algorithmica 2007, 48(4):413–432.
102. Böcker S, Letzel M, Liptak Zs, Pervukhin A: SIRIUS: Decomposing isotope patterns for metabolite identification. Bioinformatics 2009, 25(2):218–224.
103. Böcker S, Liptak Zs, Martin M, Pervukhin A, Sudek H: DECOMP – from interpreting mass spectrometry peaks to solving the money changing problem. Bioinformatics 2008, 24(4):591–593.
104. Kind T, Fiehn O: Seven golden rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry. BMC Bioinformatics 2007, 8:105.
105. Yergey JA: A general approach to calculating isotopic distributions for mass spectrometry. Int J Mass Spectrom Ion Phys 1983, 52(2–3):337–349.
106. Rockwood AL, Van Orden SL: Ultrahigh-speed calculation of isotope distributions. Anal Chem 1996, 68:2027–2030.
107. Rockwood AL, Haimi P: Efficient calculation of accurate masses of isotopic peaks. J Am Soc Mass Spectrom 2006, 17(3):415–419.
108. Snider RK: Efficient calculation of exact mass isotopic distributions. J Am Soc Mass Spectrom 2007, 18(8):1511–1515.
109. Claesen J, Dittwald P, Burzykowski T, Valkenborg D: An efficient method to calculate the aggregated isotopic distribution and exact center-masses. J Am Soc Mass Spectrom 2012, 23(4):753–763.
110. Fernandez-de-Cossio Diaz J, Fernandez-de-Cossio J: Computation of isotopic peak center-mass distribution by Fourier transform. Anal Chem 2012, 84(16):7052–7056.
111. Pluskal T, Uehara T, Yanagida M: Highly accurate chemical formula prediction tool utilizing high-resolution mass spectra, MS/MS fragmentation, heuristic rules, and isotope pattern matching. Anal Chem 2012, 84(10):4396–4403.
112. Matsuda F, Shinbo Y, Oikawa A, Hirai MY, Fiehn O, Kanaya S, Saito K: Assessment of metabolome annotation quality: A method for evaluating the false discovery rate of elemental composition searches. PLoS One 2009, 4(10):e7490.
113. Robertson AL, Hamming MC: MASSFORM: a computer program for the assignment of elemental compositions to high resolution mass spectral data. Biomed Mass Spectrom 1977, 4(4):203–208.
114. Dromey RG, Foyster GT: Calculation of elemental compositions from high resolution mass spectral data. Anal Chem 1980, 52(3):394–398.
115. Fürst A, Clerc JT, Pretsch E: A computer program for the computation of the molecular formula. Chemom Intell Lab Syst 1989, 5:329–334.
116. Böcker S, Letzel M, Liptak Zs, Pervukhin A: Decomposing metabolomic isotope patterns. In Proc. of Workshop on Algorithms in Bioinformatics (WABI 2006), volume 4175 of Lect Notes Comput Sci. Berlin: Springer; 2006:12–23.
117. Kind T, Fiehn O: Metabolomic database annotations via query of elemental compositions: Mass accuracy is insufficient even at less than 1 ppm. BMC Bioinformatics 2006, 7(1):234.
118. Wieser ME: Atomic weights of the elements 2005 (IUPAC technical report). Pure Appl Chem 2006, 78(11):2051–2066.
119. Audi G, Wapstra A, Thibault C: The AME2003 atomic mass evaluation (ii): Tables, graphs, and references. Nucl Phys A 2003, 729:129–336.
120. de Laeter JR, Böhlke JK, Bievre PD, Hidaka H, Peiser HS, Rosman KJR, Taylor PDP: Atomic weights of the elements. Review 2000 (IUPAC technical report). Pure Appl Chem 2003, 75(6):683–800.
121. Biemann K: Mass Spectrometry: Organic Chemical Applications. New York: McGraw-Hill; 1962.
122. Kubinyi H: Calculation of isotope distributions in mass spectrometry: A trivial solution for a non-trivial problem. Anal Chim Acta 1991, 247:107–119.
123. Roussis SG, Proulx R: Reduction of chemical formulas from the isotopic peak distributions of high-resolution mass spectra. Anal Chem 2003, 75(6):1470–1482.
124. Yamamoto H, McCloskey JA: Calculations of isotopic distribution in molecules extensively labeled with heavy isotopes. Anal Chem 1977, 49(2):281–283.
125. Hsu CS: Diophantine approach to isotopic abundance calculations. Anal Chem 1984, 56(8):1356–1361.
126. Rockwood AL: Relationship of Fourier transforms to isotope distribution calculations. Rapid Commun Mass Spectrom 1995, 9:103–105.
127. Rockwood AL, Van Orden SL, Smith RD: Rapid calculation of isotope distributions. Anal Chem 1995, 67:2699–2704.
128. Rockwood AL, Van Orden SL, Smith RD: Ultrahigh resolution isotope distribution calculations. Rapid Commun Mass Spectrom 1996, 10:54–59.
129. Li L, Kresh JA, Karabacak NM, Cobb JS, Agar JN, Hong P: A hierarchical algorithm for calculating the isotopic fine structures of molecules. J Am Soc Mass Spectrom 2008, 19(12):1867–1874.
130. Li L, Karabacak NM, Cobb JS, Wang Q, Hong P, Agar JN: Memory-efficient calculation of the isotopic mass states of a molecule. Rapid Commun Mass Spectrom 2010, 24(18):2689–2696.
131. Olson MT, Yergey AL: Calculation of the isotope cluster for polypeptides by probability grouping. J Am Soc Mass Spectrom 2009, 20(2):295–302.
132. Böcker S: Comment on "An efficient method to calculate the aggregated isotopic distribution and exact center-masses" by Claesen et al. J Am Soc Mass Spectrom 2012, 23(10):1826–1827.
133. Fernandez-de-Cossio J: Computation of the isotopic distribution in two dimensions. Anal Chem 2010, 82(15):6726–6729.
134. Claesen J, Dittwald P, Burzykowski T, Valkenborg D: Reply to the comment on: "An efficient method to calculate the aggregated isotopic distribution and exact center-masses" by Claesen et al. J Am Soc Mass Spectrom 2012, 23(10):1828–1829.
135. Stoll N, Schmidt E, Thurow K: Isotope pattern evaluation for the reduction of elemental compositions assigned to high-resolution mass spectral data from electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry. J Am Soc Mass Spectrom 2006, 17(12):1692–1699.
136. Tong H, Bell D, Tabei K, Siegel MM: Automated data massaging, interpretation, and e-mailing modules for high throughput open access mass spectrometry. J Am Soc Mass Spectrom 1999, 10(11):1174–1187.
137. Alon T, Amirav A: Isotope abundance analysis methods and software for improved sample identification with supersonic gas chromatography/mass spectrometry. Rapid Commun Mass Spectrom 2006, 20(17):2579–2588.
138. Ipsen A, Want EJ, Ebbels TMD: Construction of confidence regions for isotopic abundance patterns in LC/MS data sets for rigorous determination of molecular formulas. Anal Chem 2010, 82(17):7319–7328.
139. Rodgers RP, Blumer EN, Hendrickson CL, Marshall AG: Stable isotope incorporation triples the upper mass limit for determination of elemental composition by accurate mass measurement. J Am Soc Mass Spectrom 2000, 11(10):835–840.
140. Hegeman AD, Schulte CF, Cui Q, Lewis IA, Huttlin EL, Eghbalnia H, Harms AC, Ulrich EL, Markley JL, Sussman MR: Stable isotope assisted assignment of elemental compositions for metabolomics. Anal Chem 2007, 79(1):6912–6921.
141. Giavalisco P, Li Y, Matthes A, Eckhardt A, Hubberten HM, Hesse H, Segu S, Hummel J, Köhl K, Willmitzer L: Elemental formula annotation of polar and lipophilic metabolites using 13C, 15N and 34S isotope labelling, in combination with high-resolution mass spectrometry. Plant J 2011, 68(2):364–376.
142. Baran R, Bowen BP, Bouskill NJ, Brodie EL, Yannone SM, Northen TR: Metabolite identification in Synechococcus sp. PCC 7002 using untargeted stable isotope assisted metabolite profiling. Anal Chem 2010, 82(21):9034–9042.
143. Jarussophon S, Acoca S, Gao JM, Deprez C, Kiyota T, Draghici C, Purisima E, Konishi Y: Automated molecular formula determination by tandem mass spectrometry (MS/MS). Analyst 2009, 134(4):690–700.
144. Konishi Y, Kiyota T, Draghici C, Gao JM, Yeboah F, Acoca S, Jarussophon S, Purisima E: Molecular formula analysis by an MS/MS/MS technique to expedite dereplication of natural products. Anal Chem 2007, 79(3):1187–1197.
145. Rojas-Cherto M, Kasper PT, Willighagen EL, Vreeken RJ, Hankemeier T, Reijmers TH: Elemental composition determination based on MSn. Bioinformatics 2011, 27:2376–2383.
146. Zurek G, Krebs I, Götz S, Scheible H, Laufer S, Kammerer B, Albrecht W: A software solution automatically assigns formulae for construction of fragmentation pathways accelerating drug elucidation with ESI-TOF. LCGC Eur Appl Book 2008, 7:31–33.
147. Böcker S, Rasche F: Towards de novo identification of metabolites by analyzing tandem mass spectra. Bioinformatics 2008, 24:I49–I55.
148. Rasche F, Svatoš A, Maddula RK, Böttcher C, Böcker S: Computing fragmentation trees from tandem mass spectrometry data. Anal Chem 2011, 83(4):1243–1251.
149. Singleton KE, Cooks RG, Wood KV: Utilization of natural isotopic abundance ratios in tandem mass spectrometry. Anal Chem 1983, 55(4):762–764.
150. Rockwood AL, Kushnir MM, Nelson GJ: Dissociation of individual isotopic peaks: Predicting isotopic distributions of product ions in MSn. J Am Soc Mass Spectrom 2003, 14:311–322.
151. Ramaley L, Herrera LC: Software for the calculation of isotope patterns in tandem mass spectrometry. Rapid Commun Mass Spectrom 2008, 22(17):2707–2714.
152. Rogers S, Scheltema RA, Girolami M, Breitling R: Probabilistic assignment of formulas to mass peaks in metabolomics experiments. Bioinformatics 2009, 25(4):512–518.
153. Stein SE: Chemical substructure identification by mass spectral library searching. J Am Soc Mass Spectrom 1995, 6(8):644–655.
154. Demuth W, Karlovits M, Varmuza K: Spectral similarity versus structural similarity: Mass spectrometry. Anal Chim Acta 2004, 516(1-2):75–85.
155. Sheldon MT, Mistrik R, Croley TR: Determination of ion structures in structurally related compounds using precursor ion fingerprinting. J Am Soc Mass Spectrom 2009, 20(3):370–376.
156. Werner E, Heilier JF, Ducruix C, Ezan E, Junot C, Tabet JC: Mass spectrometry for the identification of the discriminating signals from metabolomics: Current status and future trends. J Chromatogr B 2008, 871(2):143–163.
157. Venkataraghavan R, McLafferty FW, van Lear GE: Computer-aided interpretation of mass spectra. Org Mass Spectrom 1969, 2(1):1–15.
158. Kwok KS, Venkataraghavan R, McLafferty FW: Computer-aided interpretation of mass spectra. III. Self-training interpretive and retrieval system. J Am Chem Soc 1973, 95(13):4185–4194.
159. Scott DR: Pattern recognition/expert system for mass spectra of volatile toxic and other organic compounds. Anal Chim Acta 1992, 265:43–54.
160. Scott DR: Rapid and accurate method for estimating molecular weights of organic compounds from low resolution mass spectra. Chemometr Intell Lab 1992, 16(3):193–202.
161. Scott DR, Levitsky A, Stein SE: Large scale evaluation of a pattern recognition/expert system for mass spectral molecular weight estimation. Anal Chim Acta 1993, 278:137–147.
162. Henneberg D, Weimann B, Zalfen U: Computer-aided interpretation of mass spectra using databases with spectra and structures. I. Structure searches. Org Mass Spectrom 1993, 28:198–206.
163. Varmuza K, Werther W: Mass spectral classifiers for supporting systematic structure elucidation. J Chem Inf Comput Sci 1996, 36(2):323–333.
164. Schymanski EL, Meinert C, Meringer M, Brack W: The use of MS classifiers and structure generation to assist in the identification of unknowns in effect-directed analysis. Anal Chim Acta 2008, 615(2):136–147.
165. Xiong Q, Zhang Y, Li M: Computer-assisted prediction of pesticide substructure using mass spectra. Anal Chim Acta 2007, 593(2):199–206.
166. Zhang L, Liang Y, Chen A: Selection of neutral losses and characteristic ions for mass spectral classifier. Analyst 2009, 134(8):1717–1724.
167. Hummel J, Strehmel N, Selbig J, Walther D, Kopka J: Decision tree supported substructure prediction of metabolites from GC-MS profiles. Metabolomics 2010, 6(2):322–333.
168. Tsugawa H, Tsujimoto Y, Arita M, Bamba T, Fukusaki E: GC/MS based metabolomics: Development of a data mining system for metabolite identification by using soft independent modeling of class analogy (SIMCA). BMC Bioinformatics 2011, 12:131.
169. Heinonen M, Shen H, Zamboni N, Rousu J: Metabolite identification and molecular fingerprint prediction via machine learning. Bioinformatics 2012, 28(18):2333–2341.
170. Kerber A, Laue R, Moser D: Ein Strukturgenerator für molekulare Graphen. Anal Chim Acta 1990, 235:221–228.
171. Benecke C, Grund R, Hohberger R, Kerber A, Laue R, Wieland T: MOLGEN+, a generator of connectivity isomers and stereoisomers for molecular structure elucidation. Anal Chim Acta 1995, 314:141–147.
172. Kerber A, Laue R, Meringer M, Rücker C: Molecules in silico: The generation of structural formulae and its applications. J Comput Chem Japan 2004, 3(3):85–96.
173. Molchanova MS, Shcherbukhin VV, Zefirov NS: Computer generation of molecular structures by the SMOG program. J Chem Inf Comput Sci 1996, 36(4):888–899.
174. Fontana P, Pretsch E: Automatic spectra interpretation, structure generation, and ranking. J Chem Inf Comput Sci 2002, 42(3):614–619.
175. Gray NAB, Buchs A, Smith DH, Djerassi C: Computer assisted structural interpretation of mass spectral data. Helv Chim Acta 1981, 64(2):458–470.
176. Faulon JL: Stochastic generator of chemical structure: (1) Application to the structure elucidation of large molecules. J Chem Inf Comput Sci 1994, 34:1204–1218.
177. Peironcely JE, Rojas-Cherto M, Fichera D, Reijmers T, Coulier L, Faulon JL, Hankemeier T: OMG: open molecule generator. J Cheminform 2012, 4(1):21.
178. Hill DW, Kertesz TM, Fontaine D, Friedman R, Grant DF: Mass spectral metabonomics beyond elemental formula: Chemical database querying by matching experimental with computational fragmentation spectra. Anal Chem 2008, 80(14):5574–5582.
179. Wolf S, Schmidt S, Müller-Hannemann M, Neumann S: In silico fragmentation for computer assisted identification of metabolite mass spectra. BMC Bioinformatics 2010, 11:148.
180. Kangas LJ, Metz TO, Isaac G, Schrom BT, Ginovska-Pangovska B, Wang L, Tan L, Lewis RR, Miller JH: In silico identification software (ISIS): A machine learning approach to tandem mass spectral identification of lipids. Bioinformatics 2012, 28(13):1705–1713.
181. Kameyama A, Nakaya S, Ito H, Kikuchi N, Angata T, Nakamura M, Ishida HK, Narimatsu H: Strategy for simulation of CID spectra of N-linked oligosaccharides toward glycomics. J Proteome Res 2006, 5(4):808–814.
182. Zhang H, Singh S, Reinhold VN: Congruent strategies for carbohydrate sequencing. 2. FragLib: An MSn spectral library. Anal Chem 2005, 77(19):6263–6270.
183. Chen T, Kao MY, Tepel M, Rush J, Church GM: A dynamic programming approach to de novo peptide sequencing via tandem mass spectrometry: Society for Industrial and Applied Mathematics; 2000.
184. Dančík V, Addona TA, Clauser KR, Vath JE, Pevzner PA: De novo peptide sequencing via tandem mass spectrometry: A graph-theoretical approach. In Proc. of Research in Computational Molecular Biology (RECOMB 1999):135–144.
185. Taylor JA, Johnson RS: Sequence database searches via de novo peptide sequencing by tandem mass spectrometry. Rapid Commun Mass Spectrom 1997, 11:1067–1075.
186. Jalali-Heravi M, Fatemi M: Simulation of mass spectra of noncyclic alkanes and alkenes using artificial neural network. Anal Chim Acta 2000, 415(1-2):95–103.
187. Cooks RG: Bond formation upon electron-impact. Org Mass Spectrom 1969, 2(5):481–519.
188. Bandu ML, Watkins KR, Bretthauer ML, Moore CA, Desaire H: Prediction of MS/MS data. 1. A focus on pharmaceuticals containing carboxylic acids. Anal Chem 2004, 76(6):1746–1753.
189. Klagkou K, Pullen F, Harrison M, Organ A, Firth A, Langley GJ: Approaches towards the automated interpretation and prediction of electrospray tandem mass spectra of non-peptidic combinatorial compounds. Rapid Commun Mass Spectrom 2003, 17(11):1163–1168.
190. Gray NAB, Carhart RE, Lavanchy A, Smith DH, Varkony T, Buchanan BG, White WC, Creary L: Computerized mass spectrum prediction and ranking. Anal Chem 1980, 52(7):1095–1102.
191. Clark HA, Jurs PC: Simulation of mass spectral intensities by regression analysis of calculated structural characteristics. Anal Chim Acta 1981, 132:75–88.
192. Chen H, Fan B, Xia H, Petitjean M, Yuan S, Panaye A, Doucet JP: MASSIS: A mass spectrum simulation system. 1. Principle and method. Eur J Mass Spectrom (Chichester, Eng) 2003, 9(3):175–186.
193. Chen H, Fan B, Petitjean M, Panaye A, Doucet JP, Li F, Xia H, Yuan S: MASSIS: a mass spectrum simulation system. 2: Procedures and performance. Eur J Mass Spectrom (Chichester, Eng) 2003, 9(5):445–457.
194. Fan B, Chen H, Petitjean M, Panaye A, Doucet JP, Xia H, Yuan S: New strategy of mass spectrum simulation based on reduced and concentrated knowledge databases. Spectrosc Lett 2005, 38(2):145–170.
195. Schymanski EL, Meringer M, Brack W: Matching structures to mass spectra using fragmentation patterns: Are the results as good as they look? Anal Chem 2009, 81(9):3608–3617.
196. Kerber A, Laue R, Meringer M, Varmuza K: MOLGEN-MS: Evaluation of low resolution electron impact mass spectra with MS classification and exhaustive structure generation. Adv Mass Spectrom 2001, 15:939–940.
197. Kerber A, Meringer M, Rücker C: CASE via MS: Ranking structure candidates by mass spectra. Croat Chem Acta 2006, 79(3):449–464.
198. Pelander A, Tyrkkö E, Ojanperä I: In silico methods for predicting metabolism and mass fragmentation applied to quetiapine in liquid chromatography/time-of-flight mass spectrometry urine drug screening. Rapid Commun Mass Spectrom 2009, 23(4):506–514.
199. Kumari S, Stevens D, Kind T, Denkert C, Fiehn O: Applying in-silico retention index and mass spectra matching for identification of unknown metabolites in accurate mass GC-TOF mass spectrometry. Anal Chem 2011, 83(15):5895–5902.
200. Sweeney DL: Small molecules as mathematical partitions. Anal Chem 2003, 75(20):5362–5373.
201. Hill AW, Mortishire-Smith RJ: Automated assignment of high-resolution collisionally activated dissociation mass spectra using a systematic bond disconnection approach. Rapid Commun Mass Spectrom 2005, 19:3111–3118.
202.
HeinonenM,RantanenA,Mielik¨ainenT,Pitk¨anenE,KokkonenJ,RousuJ:Abinitiopredictionofmolecularfragmentsfromtandemmassspectrometrydata.
InProc.
ofGermanConferenceonBioinformatics(GCB2006),volumeP-83ofLectureNotesinInformatics:40–53.
203.
HeinonenM,RantanenA,Mielik¨ainenT,KokkonenJ,KiuruJ,KetolaRA,RousuJ:FiD:Asoftwareforabinitiostructuralidenticationofproductionsfromtandemmassspectrometricdata.
RapidCommunMassSpectrom2008,22(19):3043–3052.
204.
B¨ockerS,RascheF,SteijgerT:Annotatingfragmentationpatterns.
InProc.
ofWorkshoponAlgorithmsinBioinformatics(WABI2009),volume5724ofLectNotesComputSci.
Berlin:Springer;2009:13–24.
205.
SchymanskiEL,GallampoisCMJ,KraussM,MeringerM,NeumannS,SchulzeT,WolfS,BrackW:ConsensusstructureelucidationcombiningGC/EI-MS,structuregenerationandcalculatedproperties.
AnalChem2012,84(7):3287–3295.
206.
GerlichM,NeumannS:MetFusion:Integrationofcompoundidenticationstrategies.
JMassSpectrom2013,48(3):291–8.
207.
MenikarachchiLC,CawleyS,HillDW,HallLM,HallL,LaiS,WilderJ,GrantDF:MolFind:AsoftwarepackageenablingHPLC/MS-basedidenticationofunknownchemicalstructures.
AnalChem2012,84(21):9388–9394.
208.
RidderL,vanderHooftJJJ,VerhoevenS,deVosRCH,vanSchaikR,VervoortJ:Substructure-basedannotationofhigh-resolutionmultistageMSnspectraltrees.
RapidCommunMassSpectrom2012,26(20):2461–2471.
209.
LudwigM,HufskyF,ElshamyS,B¨ockerS:Findingcharacteristicsubstructuresformetaboliteclasses.
InProc.
ofGermanConferenceonBioinformatics(GCB2012),volume26ofOpenAccessSeriesinInformatics(OASIcs);2012:23–38.
SchlossDagstuhl–Leibniz-ZentrumfuerInformatik.
210.
BodeHB,M¨ullerR:Theimpactofbacterialgenomicsonnaturalproductresearch.
AngewChemIntEdEngl2005,44:6828–6846.
Scheubertetal.
JournalofCheminformatics2013,5:12Page23of24http://www.
jcheminf.
com/content/5/1/12211.
BandeiraN,NgJ,MeluzziD,LiningtonRG,DorresteinP,PevznerPA:Denovosequencingofnonribosomalpeptides.
InProc.
ofResearchinComputationalMolecularBiology(RECOMB2008),volume4955ofLectNotesBioinform.
Berlin:Springer;2008:181–195.
212.
LiuWT,NgJ,MeluzziD,BandeiraN,GutierrezM,SimmonsTL,SchultzAW,LiningtonRG,MooreBS,GerwickWH,PevznerPA,DorresteinPC:Interpretationoftandemmassspectraobtainedfromcyclicnonribosomalpeptides.
AnalChem2009,81:4200–4209.
213.
NgJ,BandeiraN,LiuWT,GhassemianM,SimmonsTL,GerwickWH,LiningtonR,DorresteinPC,PevznerPA:Dereplicationanddenovosequencingofnonribosomalpeptides.
NatMethods2009,6(8):596–599.
214.
MohimaniH,YangYL,LiuWT,HsiehPW,DorresteinPC,PevznerPA:Sequencingcyclicpeptidesbymultistagemassspectrometry.
Proteomics2011,11(18):3642–3650.
215.
Rojas-ChertoM,PeironcelyJE,KasperPT,vanderHooftJJJ,deVosRCH,VreekenRJ,HankemeierT,ReijmersTH:Metaboliteidenticationusingautomatedcomparisonofhigh-resolutionmultistagemassspectraltrees.
AnalChem2012,84(13):5524–5534.
216.
KasperPT,Rojas-ChertoM,MistrikR,ReijmersT,HankemeierT,VreekenRJ:Fragmentationtreesforthestructuralcharacterisationofmetabolites.
RapidCommunMassSpectrom2012,26(19):2275–2286.
217.
RaufI,RascheF,NicolasF,B¨ockerS:Findingmaximumcolorfulsubtreesinpractice.
InProc.
ofResearchinComputationalMolecularBiology(RECOMB2012),volume7262ofLectNotesComputSci.
Berlin:Springer;2012:213–223.
218.
HufskyF,B¨ockerS:Comparingfragmentationtreesfromelectronimpactmassspectrawithannotatedfragmentationpathways.
InProc.
ofGermanConferenceonBioinformatics(GCB2012),volume26ofOpenAccessSeriesinInformatics(OASIcs);2012:12–22.
SchlossDagstuhl-Leibniz-ZentrumfuerInformatik.
219.
ScheubertK,HufskyF,RascheF,B¨ockerS:Computingfragmentationtreesfrommetabolitemultiplemassspectrometrydata.
InProc.
ofResearchinComputationalMolecularBiology(RECOMB2011),volume6577ofLectNotesComputSci.
Berlin:Springer;2011:377–391.
220.
ScheubertK,HufskyF,RascheF,B¨ockerS:Computingfragmentationtreesfrommetabolitemultiplemassspectrometrydata.
JComputBiol2011,18(11):1383–1397.
221.
RascheF,ScheubertK,HufskyF,ZichnerT,KaiM,SvatoˇsA,B¨ockerS:Identifyingtheunknownsbyaligningfragmentationtrees.
AnalChem2012,84(7):3417–3426.
222.
HufskyF,D¨uhrkopK,RascheF,ChimaniM,B¨ockerS:Fastalignmentoffragmentationtrees.
Bioinformatics2012,28:i265—i273.
223.
RareyM,DixonJS:Featuretrees:Anewmolecularsimilaritymeasurebasedontreematching.
JComputAidedMolDes1998,12(5):471–490.
224.
FiehnO,KopkaJ,D¨ormannP,AltmannT,TretheweyRN,WillmitzerL:Metaboliteprolingforplantfunctionalgenomics.
NatBiotechnol2000,18(11):1157–1161.
225.
ArkinA,ShenP,RossJ:Atestcaseofcorrelationmetricconstructionofareactionpathwayfrommeasurements.
Science1997,277(5330):1275–1279.
226.
KoseF,WeckwerthW,LinkeT,FiehnO:Visualizingplantmetabolomiccorrelationnetworksusingclique-metabolitematrices.
Bioinformatics2001,17(12):1198–1208.
227.
SteuerR,KurthsJ,FiehnO,WeckwerthW:Observingandinterpretingcorrelationsinmetabolomicnetworks.
Bioinformatics2003,19(8):1019–1026.
228.
KrumsiekJ,SuhreK,IlligT,AdamskiJ,TheisFJ:Gaussiangraphicalmodelingreconstructspathwayreactionsfromhigh-throughputmetabolomicsdata.
BMCSystBiol2011,5:21.
229.
BreitlingR,RitchieS,GoodenoweD,StewartML,BarrettMP:AbinitiopredictionofmetabolicnetworksusingFouriertransformmassspectrometrydata.
Metabolomics2006,2(3):155–164.
230.
WatrousJ,RoachP,AlexandrovT,HeathBS,YangJY,KerstenRD,vanderVoortM,PoglianoK,GrossH,RaaijmakersJM,MooreBS,LaskinJ,BandeiraN,DorresteinPC:Massspectralmolecularnetworkingoflivingmicrobialcolonies.
ProcNatlAcadSciUSA2012,109(26):E1743—E1752.
231.
SteinSE:Anintegratedmethodforspectrumextractionandcompoundidenticationfromgaschromatography/massspectrometrydata.
JAmSocMassSpectrom1999,10(8):770–781.
232.
BaranR,KochiH,SaitoN,SuematsuM,SogaT,NishiokaT,RobertM,TomitaM:MathDAMP:Apackagefordierentialanalysisofmetaboliteproles.
BMCBioinformatics2006,7:530.
233.
LuedemannA,StrassburgK,ErbanA,KopkaJ:TagFinderforthequantitativeanalysisofgaschromatography–massspectrometry(GC-MS)-basedmetaboliteprolingexperiments.
Bioinformatics2008,24(5):732–737.
234.
LuedemannA,vonMalotkyL,ErbanA,KopkaJ:TagFinder:Preprocessingsoftwareforthengerprintingandtheprolingofgaschromatography-massspectrometrybasedmetabolomeanalyses.
MethodsMolBiol2012,860:255–286.
235.
HillerK,HangebraukJ,J¨agerC,SpuraJ,SchreiberK,SchomburgD:MetaboliteDetector:ComprehensiveanalysistoolfortargetedandnontargetedGC/MSbasedmetabolomeanalysis.
AnalChem2009,81(9):3429–3439.
236.
Cuadros-InostrozaA,CaldanaC,RedestigH,KusanoM,LisecJ,Pena-CortesH,WillmitzerL,HannahMA:TargetSearch–aBioconductorpackagefortheecientpreprocessingofGC-MSmetaboliteprolingdata.
BMCBioinformatics2009,10:428.
237.
AggioR,Villas-BoasSG,RuggieroK:Metab:AnRpackageforhigh-throughputanalysisofmetabolomicsdatageneratedbyGC-MS.
Bioinformatics2011,27(16):2316–2318.
238.
O'CallaghanS,DesouzaDP,IsaacA,WangQ,HodkinsonL,OlshanskyM,ErwinT,AppelbeB,TullDL,RoessnerU,BacicA,McConvilleMJ,LikicVA:PyMS:APythontoolkitforprocessingofgaschromatography–massspectrometry(GC-MS)data.
Applicationandcomparativestudyofselectedtools.
BMCBioinformatics2012,13(1):115.
239.
NiY,QiuY,JiangW,SuttlemyreK,SuM,ZhangW,JiaW,DuX:ADAP-GC2.
0:DeconvolutionofcoelutingmetabolitesfromGC/TOF-MSdataformetabolomicsstudies.
AnalChem2012,84(15):6619–6629.
240.
CastilloS,MattilaI,MiettinenJ,OreˇsiˇcM,Hy¨otyl¨ainenT:Dataanalysistoolforcomprehensivetwo-dimensionalgaschromatography/time-of-ightmassspectrometry.
AnalChem2011,83(8):3058–3067.
241.
BentonHP,WongDM,TraugerSA,SiuzdakG:XCMS2:Processingtandemmassspectrometrydataformetaboliteidenticationandstructuralcharacterization.
AnalChem2008,80(16):6382–6389.
242.
BentonHP,WantEJ,EbbelsTMD:Correctionofmasscalibrationgapsinliquidchromatography-massspectrometrymetabolomicsdata.
Bioinformatics2010,26(19):2488–2489.
243.
TautenhahnR,PattiGJ,RinehartD,SiuzdakG:XCMSOnline:Aweb-basedplatformtoprocessuntargetedmetabolomicdatas.
AnalChem2012,84(11):5035–5039.
244.
AlonsoA,Juli`aA,BeltranA,VinaixaM,DazM,IbanezL,CorreigX,MarsalS:AStream:AnRpackageforannotatingLC/MSmetabolomicdata.
Bioinformatics2011,27(9):1339–1340.
245.
WeiX,SunW,ShiX,KooI,WangB,ZhangJ,YinX,TangY,BogdanovB,KimS,ZhouZ,McClainC,ZhangX:MetSign:Acomputationalplatformforhigh-resolutionmassspectrometry-basedmetabolomics.
AnalChem2011,83(20):7668–7675.
246.
KuhlC,TautenhahnR,B¨ottcherC,LarsonTR,NeumannS:CAMERA:Anintegratedstrategyforcompoundspectraextractionandannotationofliquidchromatography/massspectrometrydatasets.
AnalChem2012,84(1):283–289.
247.
BueschlC,KlugerB,BerthillerF,LirkG,WinklerS,KrskaR,SchuhmacherR:MetExtract:Anewsoftwaretoolfortheautomatedcomprehensiveextractionofmetabolite-derivedLC/MSsignalsinmetabolomicsresearch.
Bioinformatics2012,28(5):736–738.
248.
CreekDJ,JankevicsA,BurgessKEV,BreitlingR,BarrettMP:IDEOM:AnExcelinterfaceforanalysisofLC-MSbasedmetabolomicsdata.
Bioinformatics2012.
249.
ScheltemaRA,JankevicsA,JansenRC,SwertzMA,BreitlingR:PeakML/mzMatch:Aleformat,Javalibrary,Rlibrary,andtool-chainformassspectrometrydataanalysis.
AnalChem2011,83(7):2786–2793.
250.
BrownM,WedgeDC,GoodacreR,KellDB,BakerPN,KennyLC,MamasMA,NeysesL,DunnWB:Automatedworkowsforaccuratemass-basedputativemetaboliteidenticationinLC/MS-derivedmetabolomicdatasets.
Bioinformatics2011,27(8):1108–1112.
Scheubertetal.
JournalofCheminformatics2013,5:12Page24of24http://www.
jcheminf.
com/content/5/1/12251.
BrodskyL,MoussaieA,ShahafN,AharoniA,RogachevI:EvaluationofpeakpickingqualityinLC-MSmetabolomicsdata.
AnalChem2010,82(22):9177–9187.
252.
KatajamaaM,MiettinenJ,OresicM:MZmine:Toolboxforprocessingandvisualizationofmassspectrometrybasedmolecularproledata.
Bioinformatics2006,22(5):634–636.
253.
PluskalT,CastilloS,Villar-BrionesA,OresicM:MZmine2:Modularframeworkforprocessing,visualizing,andanalyzingmassspectrometry-basedmolecularproledata.
BMCBioinformatics2010,11:395.
254.
BroecklingCD,ReddyIR,DuranAL,ZhaoX,SumnerLW:MET-IDEA:Dataextractiontoolformassspectrometry-basedmetabolomics.
AnalChem2006,78(13):4334–4341.
255.
LommenA:MetAlign:Interface-driven,versatilemetabolomicstoolforhyphenatedfull-scanmassspectrometrydatapreprocessing.
AnalChem2009,81(8):3079–3086.
256.
LipkusAH,YuanQ,LucasKA,FunkSA,BarteltWF,SchenckRJ,TrippeAJ:Structuraldiversityoforganicchemistry:AscaoldanalysisoftheCASregistry.
JOrgChem2008,73(12):4443–4451.
doi:10.1186/1758-2946-5-12
Cite this article as: Scheubert et al.: Computational mass spectrometry for small molecules. Journal of Cheminformatics 2013, 5:12.
