Tools for Analyzing Talk
Part 1: The CHAT Transcription Format

Brian MacWhinney
Carnegie Mellon University
January 22, 2021
https://doi.org/10.21415/3mhn-0z89

When citing the use of TalkBank and CHILDES facilities, please use this reference to the last printed version of the CHILDES manual:

MacWhinney, B. (2000). The CHILDES Project: Tools for Analyzing Talk. 3rd Edition. Mahwah, NJ: Lawrence Erlbaum Associates.

This allows us to track usage of the programs and data systematically through scholar.google.com.

Contents

1 Introduction
2 The CHILDES Project
  2.1 Impressionistic Observation
  2.2 Baby Biographies
  2.3 Transcripts
  2.4 Computers
  2.5 Connectivity
3 From CHILDES to TalkBank
  3.1 Three Tools
  3.2 Shaping CHAT
  3.3 Building CLAN
  3.4 Constructing the Database
  3.5 Dissemination
  3.6 Funding
  3.7 How to Use These Manuals
  3.8 Changes
4 Principles
  4.1 Computerization
  4.2 Words of Caution
    4.2.1 The Dominance of the Written Word
    4.2.2 The Misuse of Standard Punctuation
    4.2.3 Working With Video
  4.3 Problems With Forced Decisions
  4.4 Transcription and Coding
  4.5 Three Goals
5 minCHAT
  5.1 minCHAT – the Form of Files
  5.2 minCHAT – Words and Utterances
  5.3 Analyzing One Small File
  5.4 Next Steps
  5.5 Checking Syntactic Accuracy
6 Corpus Organization
  6.1 File Naming
  6.2 Metadata
  6.3 The Documentation File
7 File Headers
  7.1 Hidden Headers
  7.2 Initial Headers
  7.3 Participant-Specific Headers
  7.4 Constant Headers
  7.5 Changeable Headers
8 Words
  8.1 The Main Line
  8.2 Basic Words
  8.3 Special Form Markers
  8.4 Unidentifiable Material
  8.5 Incomplete and Omitted Words
  8.6 Standardized Spellings
    8.6.1 Letters
    8.6.2 Compounds and Linkages
    8.6.3 Capitalization and Acronyms
    8.6.4 Numbers and Titles
    8.6.5 Kinship Forms
    8.6.6 Shortenings
    8.6.7 Assimilations and Cliticizations
    8.6.8 Communicators and Interjections
    8.6.9 Spelling Variants
    8.6.10 Colloquial Forms
    8.6.11 Dialectal Variations
    8.6.12 Baby Talk
    8.6.13 Word separation in Japanese
    8.6.14 Abbreviations in Dutch
9 Utterances
  9.1 One Utterance or Many
  9.2 Satellite Markers
  9.3 Discourse Repetition
  9.4 C-Units, sentences, utterances, and run-ons
  9.5 Retracing
  9.6 Basic Utterance Terminators
  9.7 Separators
  9.8 Tone Direction
  9.9 Prosody Within Words
  9.10 Local Events
    9.10.1 Simple Events
    9.10.2 Interposed Word
    9.10.3 Complex Local Events
    9.10.4 Pauses
    9.10.5 Long Events
  9.11 Special Utterance Terminators
  9.12 Utterance Linkers
10 Scoped Symbols
  10.1 Audio and Video Time Marks
  10.2 Paralinguistic and Duration Scoping
  10.3 Explanations and Alternatives
  10.4 Retracing, Overlap, and Clauses
  10.5 Error Marking
  10.6 Initial and Final Codes
11 Dependent Tiers
  11.1 Standard Dependent Tiers
  11.2 Synchrony Relations
12 CHAT-CA Transcription
13 Disfluency Transcription
14 Transcribing Aphasic Language
15 Arabic and Hebrew Transcription
16 Specific Applications
  16.1 Code-Switching
  16.2 Elicited Narratives and Picture Descriptions
  16.3 Written Language
  16.4 Sign and Speech
17 Speech Act Codes
  17.1 Interchange Types
  17.2 Illocutionary Force Codes
18 Error Coding
  18.1 Word level error codes
    18.1.1 Phonological errors [*p]
    18.1.2 Semantic errors [*s]
    18.1.3 Neologisms [*n]
    18.1.4 Morphological errors [*m:a]
    18.1.5 Dysfluencies [*d]
    18.1.6 Missing Words
    18.1.7 General Considerations
  18.2 Utterance level error coding (post-codes)
References

1 Introduction

This electronic edition of the CHAT manual is being continually revised to keep pace with the growing interests of the language research communities served by TalkBank and CHILDES. The first three editions were published in 1990, 1995, and 2000 by Lawrence Erlbaum Associates. After 2000, we switched to the current electronic publication format. However, in order to easily track usage through systems such as Google Scholar, we ask that users cite the version of the manual published in 2000 when using data and programs in their published work. This is the citation:

MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk. 3rd edition. Mahwah, NJ: Lawrence Erlbaum Associates.

In its earlier version, this manual focused exclusively on the use of the programs for child language data in the context of the CHILDES system (https://childes.talkbank.org). However, beginning in 2001 with support from NSF, we introduced the concept of TalkBank (https://talkbank.org) to include a wide variety of language databases. These now include:

1. AphasiaBank (https://aphasia.talkbank.org) for language in aphasia,
2. ASDBank (https://asd.talkbank.org) for language in autism,
3. BilingBank (https://biling.talkbank.org) for the study of bilingualism and code-switching,
4. CABank (https://ca.talkbank.org) for Conversation Analysis, including the large SCOTUS corpus,
5. CHILDES (https://childes.talkbank.org) for child language acquisition,
6. ClassBank (https://class.talkbank.org) for studies of language in the classroom,
7. DementiaBank (https://dementia.talkbank.org) for language in dementia,
8. FluencyBank (https://fluency.talkbank.org) for the study of childhood fluency development,
9. HomeBank (https://homebank.talkbank.org) for daylong recordings in the home,
10. PhonBank (https://phonbank.talkbank.org) for the study of phonological development,
11. RHDBank (https://rhd.talkbank.org) for language in right hemisphere damage,
12. SamtaleBank (https://samtalebank.talkbank.org) for Danish conversations,
13. SLABank (https://slabank.talkbank.org) for second language acquisition, and
14. TBIBank (https://tbi.talkbank.org) for language in traumatic brain injury.

The current manual maintains some of the earlier emphasis on child language, particularly in the first sections, while extending the treatment to these further areas and formats in terms of new codes and several new sections.

We are continually adding corpora to each of these separate collections. As of 2018, the text database was 800 MB in size, with an additional 5 TB of media. All of the data in TalkBank are freely open to downloading and analysis, with the exception of the data in the clinical language banks, which are open to clinical researchers using passwords. The CLAN program and the related morphosyntactic taggers are all free and open-sourced through GitHub.

Fortunately, all of these different language banks make use of the same transcription format (CHAT) and the same set of programs (CLAN). This means that, although most of the examples in this manual rely on data from the CHILDES database, the principles extend easily to data in all of the TalkBank repositories.

TalkBank is the largest open repository of data on spoken language. All of the data in TalkBank are transcribed in the CHAT format, which is compatible with the CLAN programs. Using conversion programs available inside CLAN (see the CLAN manual for details), transcripts in CHAT format can be automatically converted into the formats required for Praat (praat.org), Phon (phonbank.talkbank.org), ELAN (tla.mpi.nl/tools/elan), CoNLL, ANVIL (anvil-software.org), EXMARaLDA (exmaralda.org), LIPP (ihsys.com), SALT (saltsoftware.com), LENA (lenafoundation.org), Transcriber (trans.sourceforge.net), and ANNIS (corpus-tools.org/ANNIS).

TalkBank databases and programs have been used widely in the research literature. CHILDES, which is the oldest and most widely recognized of these databases, has been used in over 7000 published articles. PhonBank has been used in 480 articles and AphasiaBank has been used in 212 presentations and publications. In general, the longer a database has been available to researchers, the more the use of that database has become integrated into the basic research methodology and publication history of the field.

Metadata for the transcripts and media in these various TalkBank databases have been entered into the two major systems for accessing linguistic data: OLAC and VLO (Virtual Language Observatory). Each transcript and media file has been assigned a PID (permanent ID) using the Handle System (www.handle.net), and each corpus has received an ISBN and DOI (digital object identifier) number.

For ten of the languages in the database, we provide automatic morphosyntactic analysis using a series of programs built into CLAN. These languages are Cantonese, Chinese, Dutch, English, French, German, Hebrew, Japanese, Italian, and Spanish. The codes produced by these programs could eventually be harmonized with the GOLD ontology. In addition, we can compute a dependency grammar analysis for each of these ten languages. As a result of these efforts, TalkBank has been recognized as a Center in the CLARIN network (clarin.eu) and has received the Data Seal of Approval (datasealofapproval.org). TalkBank data have also been included in the SketchEngine corpus tool (sketchengine.co.uk).

2 The CHILDES Project

Language acquisition research thrives on data collected from spontaneous interactions in naturally occurring situations. You can turn on a tape recorder or videotape, and, before you know it, you will have accumulated a library of dozens or even hundreds of hours of naturalistic interactions. But simply collecting data is only the beginning of a much larger task, because the process of transcribing and analyzing naturalistic samples is extremely time-consuming and often unreliable. In this first volume, we will present a set of computational tools designed to increase the reliability of transcriptions, automate the process of data analysis, and facilitate the sharing of transcript data. These new computational tools have brought about revolutionary changes in the way that research is conducted in the child language field. In addition, they have equally revolutionary potential for the study of second-language learning, adult conversational interactions, sociological content analyses, and language recovery in aphasia. Although the tools are of wide applicability, this volume concentrates on their use in the child language field, in the hope that researchers from other areas can make the necessary analogies to their own topics.

Before turning to a detailed examination of the current system, it may be helpful to take a brief historical tour over some of the major highlights of earlier approaches to the collection of data on language acquisition. These earlier approaches can be grouped into five major historical periods.

2.1 Impressionistic Observation

The first attempt to understand the process of language development appears in a remarkable passage from The Confessions of St. Augustine (1952). In this passage, Augustine claims that he remembered how he had learned language:

This I remember; and have since observed how I learned to speak. It was not that my elders taught me words (as, soon after, other learning) in any set method; but I, longing by cries and broken accents and various motions of my limbs to express my thoughts, that so I might have my will, and yet unable to express all I willed or to whom I willed, did myself, by the understanding which Thou, my God, gavest me, practise the sounds in my memory. When they named anything, and as they spoke turned towards it, I saw and remembered that they called what they would point out by the name they uttered. And that they meant this thing, and no other, was plain from the motion of their body, the natural language, as it were, of all nations, expressed by the countenance, glances of the eye, gestures of the limbs, and tones of the voice, indicating the affections of the mind as it pursues, possesses, rejects, or shuns. And thus by constantly hearing words, as they occurred in various sentences, I collected gradually for what they stood; and, having broken in my mouth to these signs, I thereby gave utterance to my will. Thus I exchanged with those about me these current signs of our wills, and so launched deeper into the stormy intercourse of human life, yet depending on parental authority and the beck of elders.

Augustine's outline of early word learning drew attention to the role of gaze, pointing, intonation, and mutual understanding as fundamental cues to language learning. Modern research in word learning (Bloom, 2000) has supported every point of Augustine's analysis, as well as his emphasis on the role of children's intentions. In this sense, Augustine's somewhat fanciful recollection of his own language acquisition remained the high water mark for child language studies through the Middle Ages and even the Enlightenment. Unfortunately, the method on which these insights were grounded depends on our ability to actually recall the events of early childhood – a gift granted to very few of us.

2.2 Baby Biographies

Charles Darwin provided much of the inspiration for the development of the second major technique for the study of language acquisition. Using note cards and field books to track the distribution of hundreds of species and subspecies in places like the Galapagos and Indonesia, Darwin was able to collect an impressive body of naturalistic data in support of his views on natural selection and evolution. In his study of gestural development in his son, Darwin (1877) showed how these same tools for naturalistic observation could be adapted to the study of human development. By taking detailed daily notes, Darwin showed how researchers could build diaries that could then be converted into biographies documenting virtually any aspect of human development. Following Darwin's lead, scholars such as Ament (1899), Preyer (1882), Gvozdev (1949), Szuman (1955), Stern & Stern (1907), Kenyeres (1926, 1938), and Leopold (1939, 1947, 1949a, 1949b) created monumental biographies detailing the language development of their own children.

Darwin's biographical technique also had its effects on the study of adult aphasia. Following in this tradition, studies of the language of particular patients and syndromes were presented by Low (1931), Pick (1913), Wernicke (1874), and many others.

2.3 Transcripts

The limits of the diary technique were always quite apparent. Even the most highly trained observer could not keep pace with the rapid flow of normal speech production. Anyone who has attempted to follow a child about with a pen and a notebook soon realizes how much detail is missed and how the note-taking process interferes with the ongoing interactions.

The introduction of the tape recorder in the late 1950s provided a way around these limitations and ushered in the third period of observational studies. The effect of the tape recorder on the field of language acquisition was very much like its effect on ethnomusicology, where researchers such as Alan Lomax (Parrish, 1996) were suddenly able to produce high quality field recordings using this new technology. This period was characterized by projects in which groups of investigators collected large data sets of tape recordings from several subjects across a period of 2 or 3 years. Much of the excitement in the 1960s regarding new directions in child language research was fueled directly by the great increase in raw data that was possible through use of tape recordings and typed transcripts.

This increase in the amount of raw data had an additional, seldom discussed, consequence. In the period of the baby biography, the final published accounts closely resembled the original database of note cards. In this sense, there was no major gap between the observational database and the published database. In the period of typed transcripts, a wider gap emerged. The size of the transcripts produced in the 60s and 70s made it impossible to publish the full corpora. Instead, researchers were forced to publish only high-level analyses based on data that were not available to others. This led to a situation in which the raw empirical database for the field was kept only in private stocks, unavailable for general public examination. Comments and tallies were written into the margins of ditto master copies and new, even less legible copies, were then made by thermal production of new ditto masters. Each investigator devised a project-specific system of transcription and project-specific codes. As we began to compare hand-written and typewritten transcripts, problems in transcription methodology, coding schemes, and cross-investigator reliability became more apparent.

Recognizing this problem, Roger Brown took the lead in attempting to share his transcripts from Adam, Eve, and Sarah (Brown, 1973) with other researchers. These transcripts were typed onto stencils and mimeographed in multiple copies. The extra copies were lent to and analyzed by a wide variety of researchers. In this model, researchers took their copy of the transcript home, developed their own coding scheme, applied it (usually by making pencil markings directly on the transcript), wrote a paper about the results and, if very polite, sent a copy to Roger. Some of these reports (Moerk, 1983) even attempted to disprove the conclusions drawn from those data by Brown himself!

During this early period, the relations between the various coding schemes often remained shrouded in mystery. A fortunate consequence of the unstable nature of coding systems was that researchers were very careful not to throw away their original data, even after it had been coded.
Brown himself commented on the impending transition to computers in this passage (Brown, 1973, p. 53):

It is sensible to ask and we were often asked, "Why not code the sentences for grammatically significant features and put them on a computer so that studies could readily be made by anyone?" My answer always was that I was continually discovering new kinds of information that could be mined from a transcription of conversation and never felt that I knew what the full coding should be. This was certainly the case and indeed it can be said that in the entire decade since 1962 investigators have continued to hit upon new ways of inferring grammatical and semantic knowledge or competence from free conversation. But, for myself, I must, in candor, add that there was also a factor of research style. I have little patience with prolonged "tooling up" for research. I always want to get started. A better scientist would probably have done more planning and used the computer. He can do so today, in any case, with considerable confidence that he knows what to code.

With the experience of three more decades of computerized analysis behind us, we now know that the idea of reducing child language data to a set of codes and then throwing away the original data is simply wrong. Instead, our goal must be to computerize the data in a way that allows us to continually enhance it with new codes and annotations. It is fortunate that Brown preserved his transcript data in a form that allowed us to continue to work on it. It is unfortunate, however, that the original audiotapes were not kept.

2.4 Computers

Just as these data analysis problems were coming to light, a major technological opportunity was emerging in the shape of the powerful, affordable microcomputer. Microcomputer word-processing systems and database programs allowed researchers to enter transcript data into computer files that could then be easily duplicated, edited, and analyzed by standard data-processing techniques. In 1981, when the Child Language Data Exchange System (CHILDES) Project was first conceived, researchers basically thought of computer systems as large notepads. Although researchers were aware of the ways in which databases could be searched and tabulated, the full analytic and comparative power of the computer systems themselves was not yet fully understood.

Rather than serving only as an "archive" or historical record, a focus on a shared database can lead to advances in methodology and theory. However, to achieve these additional advances, researchers first needed to move beyond the idea of a simple data repository. At first, the possibility of utilizing shared transcription formats, shared codes, and shared analysis programs shone only as a faint glimmer on the horizon, against the fog and gloom of handwritten tallies, fuzzy dittos, and idiosyncratic coding schemes. Slowly, against this backdrop, the idea of a computerized data exchange system began to emerge.

It was against this conceptual background that CHILDES (the name uses a one-syllable pronunciation) was conceived. The origin of the system can be traced back to the summer of 1981 when Dan Slobin, Willem Levelt, Susan Ervin-Tripp, and Brian MacWhinney discussed the possibility of creating an archive for typed, handwritten, and computerized transcripts to be located at the Max-Planck-Institut für Psycholinguistik in Nijmegen. In 1983, the MacArthur Foundation funded meetings of developmental researchers in which Elizabeth Bates, Brian MacWhinney, Catherine Snow, and other child language researchers discussed the possibility of soliciting MacArthur funds to support a data exchange system. In January of 1984, the MacArthur Foundation awarded a two-year grant to Brian MacWhinney and Catherine Snow for the establishment of the Child Language Data Exchange System. These funds provided for the entry of data into the system and for the convening of a meeting of an advisory board. Twenty child language researchers met for three days in Concord, Massachusetts and agreed on a basic framework for the CHILDES system, which Catherine Snow and Brian MacWhinney would then proceed to implement.

2.5 Connectivity

Since 1984, when the CHILDES Project began in earnest, the world of computers has gone through a series of remarkable revolutions, each introducing new opportunities and challenges. The processing power of the home computer now dwarfs the power of the mainframe of the 1980s; new machines are now shipped with built-in audiovisual capabilities; and devices such as CD-ROMs and optical disks offer enormous storage capacity at reasonable prices. This new hardware has now opened up the possibility for multimedia access to digitized audio and video from links inside the written transcripts. In effect, a transcript is now the starting point for a new exploratory reality in which the whole interaction is accessible from the transcript. Although researchers have just now begun to make use of these new tools, the current shape of the CHILDES system reflects many of these new realities. In the pages that follow, you will learn about how we are using this new technology to provide rapid access to the database and to permit the linkage of transcripts to digitized audio and video records, even over the Internet.

3 From CHILDES to TalkBank

Beginning in 2001, with support from an NSF Infrastructure grant, we began the extension of the CHILDES database concept to a series of additional fields listed in the Introduction. These extensions have led to the need for additional features in the CHAT coding system to support CA notation, phonological analysis, and gesture coding. As we develop new tools for each of these areas and increase the interoperability between tools, the power of the system continues to grow. As a result, we can now refer to this work as the TalkBank Project.

3.1 Three Tools

The reasons for developing a computerized exchange system for language data are immediately obvious to anyone who has produced or analyzed transcripts. With such a system, we can:

1. automate the process of data analysis,
2. obtain better data in a consistent, fully-documented transcription system, and
3. provide more data for more children from more ages, speaking more languages.

The TalkBank Project has addressed each of these goals by developing three separate, but integrated, tools. The first tool is the CHAT transcription and coding format. The second tool is the CLAN analysis program, and the third tool is the database. These three tools are like the legs of a three-legged stool. The transcripts in the database have all been put into the CHAT transcription system. The CLAN program is designed to make full use of the CHAT format to facilitate a wide variety of searches and analyses. Many research groups are now using the CLAN programs to enter new data sets. Eventually, these new data sets will be available to other researchers as a part of the growing TalkBank databases. In this way, CHAT, CLAN, and the database function as an integrated set of tools.

There are manuals for each of these TalkBank tools.

1. Part 1 of the TalkBank manual, which you are now reading, describes the conventions and principles of CHAT transcription.
2. Part 2 describes the use of the basic CLAN computer programs that you can use to transcribe, annotate, and analyze language interactions.
3. Part 3 describes the use of additional CLAN programs for morphosyntactic analysis.
4. The final section of the manuals, which describes the contents of the databases, is broken out as a collection of index and documentation files on the web.

For example, if you want to survey the shape of the Dutch child language corpora, you first go to https://childes.talkbank.org. There is also a link to that site from the overall index at https://talkbank.org/. From that homepage you click on "Index to Corpora" and then Dutch. From there, you might want to read about the contents of the CLPF corpus for early phonological development in Dutch. You then click on the CLPF link and it takes you to the fuller corpus description with photos from the contributors. From links on that page you can either browse the corpus, download the transcripts, or download the media.

In addition to these basic manual resources, there are these further facilities for learning CHAT and CLAN, all of which can be downloaded from the talkbank.org and childes.talkbank.org server sites:

1. Nan Bernstein Ratner and Shelley Brundage have contributed a manual designed specifically for clinical practitioners called the SLP's Guide to CLAN.
2. There are versions of the manuals in Japanese and Chinese.
3. Davida Fromm has produced a series of screencasts describing how to use basic features of CLAN.

3.2 Shaping CHAT

We received a great deal of extremely helpful input during the years between 1984 and 1988 when the CHAT system was being formulated. Some of the most detailed comments came from George Allen, Elizabeth Bates, Nan Bernstein Ratner, Giuseppe Cappelli, Annick De Houwer, Jane Desimone, Jane Edwards, Julia Evans, Judi Fenson, Paul Fletcher, Steven Gillis, Kristen Keefe, Mary MacWhinney, Jon Miller, Barbara Pan, Lucia Pfanner, Kim Plunkett, Kelley Sacco, Catherine Snow, Jeff Sokolov, Leonid Spektor, Joseph Stemberger, Frank Wijnen, and Antonio Zampolli. Comments developed in Edwards (1992) were useful in shaping core aspects of CHAT. George Allen (1988) helped develop the UNIBET and PHONASCII systems. The workers in the LIPPS Group (LIPPS, 2000) have developed extensions of CHAT to cover code-switching phenomena. Adaptations of CHAT to deal with data on disfluencies are developed in Bernstein-Ratner, Rooney, and MacWhinney (1996). The exercises in the CLAN manual are based on materials originally developed by Barbara Pan for Chapter 2 of Sokolov & Snow (1994).

In the period between 2001 and 2004, we converted much of the CHILDES system to work with the new XML Internet data format. This work was begun by Romeo Anghelache and completed by Franklin Chen. Support for this major reformatting and the related tightening of the CHAT format came from the NSF TalkBank Infrastructure project, which involved a major collaboration with Steven Bird and Mark Liberman of the Linguistic Data Consortium.

3.3 Building CLAN

The CLAN program is the brainchild of Leonid Spektor. Ideas for particular analysis commands came from several sources. Bill Tuthill's HUM package provided ideas about concordance analyses. The SALT system of Miller & Chapman (1983) provided guidelines regarding basic practices in transcription and analysis. Clifton Pye's PAL program provided ideas for the MODREP and PHONFREQ commands. Darius Clynes ported CLAN to the Macintosh. Jeffrey Sokolov wrote the CHIP program. Mitzi Morris designed the MOR analyzer using specifications provided by Roland Hauser of Erlangen University. Norio Naka and Susanne Miyata developed a MOR rule system for Japanese; and Monica Sanz-Torrent helped develop the MOR system for Spanish. Julia Evans provided recommendations for the design of the audio and visual capabilities of the editor. Johannes Wagner and Spencer Hazel helped show us how we could modify CLAN to permit transcription in the Conversation Analysis framework. Steven Gillis provided suggestions for aspects of MODREP. Christophe Parisse built the POST and POSTTRAIN programs (Parisse & Le Normand, 2000). Brian Richards contributed the VOCD program (Malvern, Richards, Chipere, & Durán, 2004). Julia Evans helped specify TIMEDUR and worked on the details of DSS. Catherine Snow designed CHAINS, KEYMAP, and STATFREQ. Nan Bernstein Ratner specified aspects of PHONFREQ and plans for additional programs for phonological analysis.

3.4 Constructing the Database

The primary reason for the success of the TalkBank databases has been the generosity of over 300 researchers who have contributed their corpora. Each of these corpora represents hundreds, often thousands, of hours spent in careful collection, transcription, and checking of data. All researchers in child language should be proud of the way researchers have generously shared their valuable data with the whole research community. The growing size of the database for language impairments, adult aphasia, and second-language acquisition indicates that these related areas have also begun to understand the value of data sharing.

Many of the corpora contributed to the system were transcribed before the formulation of CHAT. In order to create a uniform database, we had to reformat these corpora into CHAT. Jane Desimone, Mary MacWhinney, Jane Morrison, Kim Roth, Kelley Sacco, Lillian Jarold, Anthony Kelly, Andrew Yankes, and Gergely Sikuta worked many long hours on this task. Steven Gillis, Helmut Feldweg, Susan Powers, and Heike Behrens supervised a parallel effort with the German and Dutch data sets.

Because of the continually changing shape of the programs and the database, keeping this manual up to date has been an ongoing activity. In this process, I received help from Mike Blackwell, Julia Evans, Kris Loh, Mary MacWhinney, Lucy Hewson, Kelley Sacco, and Gergely Sikuta. Barbara Pan, Jeff Sokolov, and Pam Rollins also provided a reading of the final draft of the 1995 version of the manual.

3.5 Dissemination

Since the beginning of the project, Catherine Snow has continually played a pivotal role in shaping policy, building the database, organizing workshops, and determining the shape of CHAT and CLAN. Catherine Snow collaborated with Jeffrey Sokolov, Pam Rollins, and Barbara Pan to construct a series of tutorial exercises and demonstration analyses that appeared in Sokolov & Snow (1994). Those exercises form the basis for similar tutorial sections in the current manual. Catherine Snow has contributed six major corpora to the database and has conducted CHILDES workshops in a dozen countries.

Several other colleagues have helped disseminate the CHILDES system through workshops, visits, and Internet facilities. Hidetosi Sirai established a CHILDES file server mirror at Chukyo University in Japan and Steven Gillis established a mirror at the University of Antwerp. Steven Gillis, Kim Plunkett, Johannes Wagner, and Sven Strömqvist helped propagate the CHILDES system at universities in Northern and Central Europe. Susanne Miyata has brought together a vital group of child language researchers using CHILDES to study the acquisition of Japanese and has supervised the translation of the current manual into Japanese. In Italy, Elena Pizzuto organized symposia for developing the CHILDES system and has supervised the translation of the manual into Italian. Magdalena Smoczynska in Krakow and Wolfgang Dressler in Vienna have helped new researchers who are learning to use CHILDES for languages spoken in Eastern Europe. Miquel Serra has supported a series of CHILDES workshops in Barcelona. Zhou Jing organized a workshop in Nanjing and Chien-ju Chang organized a workshop in Taipei.

The establishment and promotion of additional segments of TalkBank now relies on a wide array of inputs. Yvan Rose has spearheaded the creation of PhonBank. Nan Bernstein Ratner has led the development of FluencyBank. Audrey Holland, Davida Fromm, and Margie Forbes have worked to create AphasiaBank. Johannes Wagner has created SamtaleBank and segments of CABank. Jerry Goldman developed the SCOTUS segment of CABank. Roy Pea contributed to the development of ClassBank. Within each of these communities, scores of other scholars have helped with donations of corpora, analyses, and ideas.

3.6 Funding

From 1984 to 1988, the John D. and Catherine T. MacArthur Foundation supported the CHILDES Project. In 1988, the National Science Foundation provided an equipment grant that allowed us to put the database on the Internet and on CD-ROMs. Since 1989, the CHILDES project has been supported by an ongoing grant from the National Institutes of Health (NICHD). In 1998, the National Science Foundation Linguistics Program provided additional support to improve the programs for morphosyntactic analysis of the database. In 1999, NSF funded the TalkBank project. In 2002, NSF provided support for the development of the GRASP system for parsing of the corpora. In 2002, NIH provided additional support for the development of PhonBank for child language phonology and AphasiaBank for the study of communication in aphasia. Currently (2017), NICHD is providing support for CHILDES and PhonBank; NIDCD provides support for AphasiaBank and FluencyBank; NSF provides support for HomeBank and FluencyBank; and NEH provides support for LangBank. Beginning in 2014, TalkBank also became a member of the CLARIN federation (clarin.eu), a system designed to coordinate resources for language computation in the Humanities and Social Sciences.

3.7 How to Use These Manuals

Each of the three parts of the TalkBank system is described in separate sections of the TalkBank manual. The CHAT manual describes the conventions and principles of CHAT transcription. The CLAN manual describes the use of the editor and the analytic commands. The database manual is a set of over a dozen smaller documents, each describing a separate segment of the database.

To learn the TalkBank system, you should begin by downloading and installing the CLAN program. Next, you should download and start to read the current manual (CHAT Manual) and the CLAN manual (Part 2 of the TalkBank manual). Before proceeding too far into the CHAT manual, you will want to walk through the tutorial section at the beginning of the CHAT manual. After finishing the tutorial, try working a bit with each of the CLAN commands to get a feel for the overall scope of the system. You can then learn more about CHAT by transcribing a small sample of your data in a short test file. Run the CHECK program at frequent intervals to verify the accuracy of your coding. Once you have finished transcribing a small segment of your data, try out the various analysis programs you plan to use, to make sure that they provide the types of results you need for your work.

If you are primarily interested in analyzing data already stored in TalkBank, you do not need to learn the CHAT transcription format in much detail and you will only need to use the editor to open and read files. In that case, you may wish to focus your efforts on learning to use the CLAN programs. If you plan to transcribe new data, then you also need to work with the current manual to learn to use CHAT.

Teachers will also want to pay particular attention to the sections of the CLAN manual that present a tutorial introduction. Using some of the examples given there, you can construct additional materials to encourage students to explore the database to test out particular hypotheses.

The TalkBank system was not intended to address all issues in the study of language learning, or to be used by all students of spontaneous interactions. The CHAT system is comprehensive, but it is not ideal for all purposes. The programs are powerful, but they cannot solve all analytic problems. It is not the goal of TalkBank to provide facilities for all research endeavors or to force all research into some uniform mold. On the contrary, the programs are designed to offer support for alternative analytic frameworks. For example, the editor now supports the various codes of Conversation Analysis (CA) format, as alternatives and supplements to CHAT format. Moreover, we have developed programs that convert between CHAT format and other common formats, because we know that users often need to run analyses in these other formats.

3.8 Changes
The TalkBank tools have been extensively tested for ease of application, accuracy, and reliability.
However, change is fundamental to any research enterprise.
Researchers are constantly pursuing better ways of coding and analyzing data.
It is important that the tools keep pace with these changing requirements.
For this reason, there will be revisions to CHAT, the programs, and the database as long as the TalkBank Project is active.

4 Principles
The CHAT system provides a standardized format for producing computerized transcripts of face-to-face conversational interactions.
These interactions may involve children and parents, doctors and patients, or teachers and second-language learners.
Despite the differences between these interactions, there are enough common features to allow for the creation of a single general transcription system.
The system described here is designed for use with both normal and disordered populations.
It can be used with learners of all types, including children, second-language learners, and adults recovering from aphasic disorders.
The system provides options for basic discourse transcription as well as detailed phonological and morphological analysis.
The system bears the acronym "CHAT," which stands for Codes for the Human Analysis of Transcripts.
CHAT is the standard transcription system for the TalkBank and CHILDES (Child Language Data Exchange System) Projects.
All of the transcripts in the TalkBank databases are in CHAT format.
What makes CHAT particularly powerful is the fact that files transcribed in CHAT can also be analyzed by the CLAN programs that are described in the CLAN manual, which is an electronic companion piece to this manual.
The CLAN programs can track a wide variety of structures, compute automatic indices, and analyze morphosyntax.
Moreover, because all CHAT files can now also be translated to a highly structured form of XML (a language used for text documents on the web), they are now also compatible with a wide range of other powerful computer programs such as ELAN, Praat, EXMARaLDA, Phon, Transcriber, and so on.
The TalkBank system has had a major impact on the study of child language.
At the time of the last monitoring in 2016, there were over 7000 published articles that had made use of the programs and database.
In 2016, the size of the database had grown to over 110 million words, making it by far the largest database of conversational interactions available anywhere.
The total number of researchers who have joined as members across the length of the project is now over 5000.
Of course, not all of these people are making active use of the tools at all times.
However, it is safe to say that, at any given point in time, well over 100 groups of researchers around the world are involved in new data collection and transcription using the CHAT system.
Eventually the data collected in these various projects will all be contributed to the database.

4.1 Computerization
Public inspection of experimental data is a crucial prerequisite for serious scientific progress.
Imagine how genetics would function if every experimenter had his or her own individual strain of peas or drosophila and refused to allow them to be tested by other experimenters.
What would happen in geology, if every scientist kept his or her own set of rock specimens and refused to compare them with those of other researchers?
In some fields the basic phenomena in question are so clearly open to public inspection that this is not a problem.
The basic facts of planetary motion are open for all to see, as are the basic facts underlying Newtonian mechanics.
Unfortunately, in language studies, a free and open sharing and exchange of data has not always been the norm.
In earlier decades, researchers jealously guarded their field notes from a particular language community or subject type, refusing to share them openly with the broader community.
Various justifications were given for this practice.
It was sometimes claimed that other researchers would not fully appreciate the nature of the data or that they might misrepresent crucial patterns.
Sometimes, it was claimed that only someone who had actually participated in the community or the interaction could understand the nature of the language and the interactions.
In some cases, these limitations were real and important.
However, all such restrictions on the sharing of data inevitably impede the progress of the scientific study of language learning.
Within the field of language acquisition studies it is now understood that the advantages of sharing data outweigh the potential dangers.
The question is no longer whether data should be shared, but rather how they can be shared in a reliable and responsible fashion.
The computerization of transcripts opens up the possibility for many types of data sharing and analysis that otherwise would have been impossible.
However, the full exploitation of this opportunity requires the development of a standardized system for data transcription and analysis.

4.2 Words of Caution
Before examining the CHAT system, we need to consider some dangers involved in computerized transcriptions.
These dangers arise from the need to compress a complex set of verbal and nonverbal messages into the extremely narrow channel required for the computer.
In most cases, these dangers also exist when one creates a typewritten or handwritten transcript.
Let us look at some of the dangers surrounding the enterprise of transcription.

4.2.1 The Dominance of the Written Word
Perhaps the greatest danger facing the transcriber is the tendency to treat spoken language as if it were written language.
The decision to write out stretches of vocal material using the forms of written language can trigger a variety of theoretical commitments.
As Ochs (1979) showed so clearly, these decisions will inevitably turn transcription into a theoretical enterprise.
The most difficult bias to overcome is the tendency to map every form spoken by a learner – be it a child, an aphasic, or a second-language learner – onto a set of standard lexical items in the adult language.
Transcribers tend to assimilate nonstandard learner strings to standard forms of the adult language.
For example, when a child says "put on my jamas," the transcriber may instead enter "put on my pajamas," reasoning unconsciously that "jamas" is simply a childish form of "pajamas."
This type of regularization of the child form to the adult lexical norm can lead to misunderstanding of the shape of the child's lexicon.
For example, it could be the case that the child uses "jamas" and "pajamas" to refer to two very different things (Clark, 1987; MacWhinney, 1989).
There are two types of errors possible here.
One involves mapping a learner's spoken form onto an adult form when, in fact, there was no real correspondence.
This is the problem of overnormalization.
The second type of error involves failing to map a learner's spoken form onto an adult form when, in fact, there is a correspondence.
This is the problem of undernormalization.
The goal of transcribers should be to avoid both the Scylla of overnormalization and the Charybdis of undernormalization.
Steering a course between these two dangers is no easy matter.
A transcription system can provide devices to aid in this process, but it cannot guarantee safe passage.
Transcribers also often tend to assimilate the shape of sounds spoken by the learner to the shapes that are dictated by morphosyntactic patterns.
For example, Fletcher (1985) noted that both children and adults generally produce "have" as "uv" before main verbs.
As a result, forms like "might have gone" assimilate to "might uv gone."
Fletcher believed that younger children have not yet learned to associate the full auxiliary "have" with the contracted form.
If we write the children's forms as "might have," we then end up mischaracterizing the structure of their lexicon.
To take another example, we can note that, in French, the various endings of the verb in the present tense are distinguished in spelling, whereas they are homophonous in speech.
If a child says /mɑ̃ʒ/ "eat," are we to transcribe it as first person singular mange, as second person singular manges, or as the imperative mange?
If the child says /mɑ̃ʒe/, should we transcribe it as the infinitive manger, the participle mangé, or the second person formal mangez?
CHAT deals with these problems in three ways.
First, it uses IPA as a uniform way of transcribing discourse phonetically.
Second, the editor allows the user to link the digitized audio record of the interaction directly to the transcript.
This is the system called "sonic CHAT."
With these sonic CHAT links, it is possible to double-click on a sentence and hear its sound immediately.
Having the actual sound produced by the child directly available in the transcript takes some of the burden off of the transcription system.
However, whenever computerized analyses are based not on the original audio signal but on transcribed orthographic forms, one must continue to understand the limits of transcription conventions.
Third, for those who wish to avoid the work involved in IPA transcription or sonic CHAT, there is a system for using nonstandard lexical forms, so that the form "might (h)ave" would be universally recognized as the spelling of "might of", the contracted form of "might have."
More extreme cases of phonological variation can be annotated as in this example: popo [: hippopotamus].

4.2.2 The Misuse of Standard Punctuation
Transcribers have a tendency to write out spoken language with the punctuation conventions of written language.
Written language is organized into clauses and sentences delimited by commas, periods, and other marks of punctuation.
Spoken language, on the other hand, is organized into tone units clustered about a tonal nucleus and delineated by pauses and tonal contours (Crystal, 1969, 1979; Halliday, 1966, 1967, 1968).
Work on the discourse basis of sentence production (Chafe, 1980; Jefferson, 1984) has demonstrated a close link between tone units and ideational units.
Retracings, pauses, stress, and all forms of intonational contours are crucial markers of aspects of the utterance planning process.
Moreover, these features also convey important sociolinguistic information.
Without special markings or conventions, there is no way to directly indicate these important aspects of interactions.

4.2.3 Working With Video
Whatever form a transcript may take, it will never contain a fully accurate record of what went on in an interaction.
A transcript of an interaction can never fully replace an audiotape, because an audio recording of the interaction will always be more accurate in terms of preserving the actual details of what transpired.
By the same token, an audio recording can never preserve as much detail as a video recording with a high-quality audio track.
Audio recordings record none of the nonverbal interactions that often form the backbone of a conversational interaction.
Hence, they systematically exclude a source of information that is crucial for a full interpretation of the interaction.
Although there are biases involved even in a video recording, it is still the most accurate record of an interaction that we have available.
For those who are trying to use transcription to capture the full detailed character of an interaction, it is imperative that transcription be done from a video recording which should be repeatedly consulted during all phases of analysis.
When the CLAN editor is used to link transcripts to audio recordings, we refer to this as sonic CHAT.
When the system is used to link transcripts to video recordings, we refer to this as video CHAT.
The CLAN manual explains how to link digital audio and video to transcripts.

4.3 Problems With Forced Decisions
Transcription and coding systems often force the user to make difficult distinctions.
For example, a system might make a distinction between grammatical ellipsis and ungrammatical omission.
However, it may often be the case that the user cannot decide whether an omission is grammatical or not.
In that case, it may be helpful to have some way of blurring the distinction.
CHAT has certain symbols that can be used when a categorization cannot be made.
It is important to remember that many of the CHAT symbols are entirely optional.
Whenever you feel that you are being forced to make a distinction, check the manual to see whether the particular coding choice is actually required.
If it is not required, then simply omit the code altogether.

4.4 Transcription and Coding
It is important to recognize the difference between transcription and coding.
Transcription focuses on the production of a written record that can lead us to understand, albeit only vaguely, the flow of the original interaction.
Transcription must be done directly off an audiotape or, preferably, a videotape.
Coding, on the other hand, is the process of recognizing, analyzing, and taking note of phenomena in transcribed speech.
Coding can often be done by referring only to a written transcript.
For example, the coding of parts of speech can be done directly from a transcript without listening to the audiotape.
For other types of coding, such as speech act coding, it is imperative that coding be done while watching the original videotape.
The CHAT system includes conventions for both transcription and coding.
When first learning the system, it is best to focus on learning how to transcribe.
The CHAT system offers the transcriber a large array of coding options.
Although few transcribers will need to use all of the options, everyone needs to understand how basic transcription is done on the "main line."
Additional coding is done principally on the secondary or "dependent" tiers.
As transcribers work more with their data, they will include further options from the secondary or "dependent" tiers.
However, the beginning user should focus first on learning to correctly use the conventions for the main line.
The manual includes several sample transcripts to help the beginner in learning the transcription system.

4.5 Three Goals
Like other forms of communication, transcription systems are subjected to a variety of communicative pressures.
The view of language structure developed by Slobin (1977) sees structure as emerging from the pressure of three conflicting charges or goals.
On the one hand, language is designed to be clear.
On the other hand, it is designed to be processible by the listener and quick and easy for the speaker.
Unfortunately, ease of production often comes in conflict with clarity of marking.
The competition between these three motives leads to a variety of imperfect solutions that satisfy each goal only partially.
Such imperfect and unstable solutions characterize the grammar and phonology of human language (Bates & MacWhinney, 1982).
Only rarely does a solution succeed in fully achieving all three goals.
Slobin's view of the pressures shaping human language can be extended to analyze the pressures shaping a transcription system.
In many regards, a transcription system is much like any human language.
It needs to be clear in its markings of categories, and still preserve readability and ease of transcription.
However, transcripts address rather different audiences.
One audience is the human audience of transcribers, analysts, and readers.
The other audience is the digital computer and its programs.
To deal with these two audiences, a system for computerized transcription needs to achieve the following goals:

Clarity: Every symbol used in the coding system should have some clear and definable real-world referent.
Symbols that mark particular words should always be spelled in a consistent manner.
Symbols that mark particular conversational patterns should refer to consistently observable patterns.
Codes must steer between the Scylla of overnormalization and the Charybdis of undernormalization discussed earlier.
Distinctions must avoid being either too fine or too coarse.
Another way of looking at clarity is through the notion of systematicity.
Codes, words, and symbols must be used in a consistent manner across transcripts.
Ideally, each code should always have a unique meaning independent of the presence of other codes or the particular transcript in which it is located.
If interactions are necessary, as in hierarchical coding systems, these interactions need to be systematically described.

Readability: Just as human language needs to be easy to process, so transcripts need to be easy to read.
This goal often runs directly counter to the first goal.
In the TalkBank system, we have attempted to provide a variety of CHAT options that will allow a user to maximize the readability of a transcript.
We have also provided CLAN tools that will allow a reader to suppress the less readable aspects of a transcript when the goal of readability is more important than the goal of clarity of marking.

Ease of data entry: As distinctions proliferate within a transcription system, data entry becomes increasingly difficult and error-prone.
There are two ways of dealing with this problem.
One method attempts to simplify the coding scheme and its categories.
The problem with this approach is that it sacrifices clarity.
The second method attempts to help the transcriber by providing computational aids.
The CLAN programs follow this path.
They provide systems for the automatic checking of transcription accuracy, methods for the automatic analysis of morphology and syntax, and tools for the semiautomatic entry of codes.
However, the basic process of transcription has not been automated and remains the major task during data entry.

5 minCHAT
CHAT provides both basic and advanced formats for transcription and coding.
The basic level of CHAT is called minCHAT.
New users should start by learning minCHAT.
This system looks much like other intuitive transcription systems that are in general use in the fields of child language and discourse analysis.
However, eventually users will find that there is something they want to be able to code that goes beyond minCHAT.
At that point, they should move on to learning the additional features of CHAT that are relevant for the type of work they are doing.

5.1 minCHAT – the Form of Files
There are several minimum standards for the form of a minCHAT file.
These standards must be followed for the CLAN commands to run successfully on CHAT files:
1. Every line must end with a carriage return.
2. The first line in the file must be an @Begin header line.
3. The second line in the file must be an @Languages header line. The languages entered here use a three-letter ISO 639-3 code, such as "eng" for English.
4. The third line must be an @Participants header line listing three-letter codes for each participant, the participant's name, and the participant's role.
5. After the @Participants header comes a set of @ID headers providing further details for each speaker. These will be inserted automatically for you when you run CHECK using escape-L.
6. The last line in the file must be an @End header line.
7. Lines beginning with * indicate what was actually said. These are called "main lines." Each main line should code one and only one utterance. When a speaker produces several utterances in a row, code each with a new main line.
8. After the asterisk on the main line comes a three-letter code in uppercase letters for the participant who was the speaker of the utterance being coded. After the three-letter code comes a colon and then a tab.
9. What was actually said is entered starting in the ninth column.
10. Lines beginning with the % symbol can contain codes and commentary regarding what was said. They are called "dependent tier" lines. The % symbol is followed by a three-letter code in lowercase letters for the dependent tier type, such as "pho" for phonology; a colon; and then a tab. The text of the dependent tier begins after the tab.
11. Continuations of main lines and dependent tier lines begin with a tab which is inserted automatically by the CLAN editor.
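
The form rules above can be sketched as a small Python function. This is only an illustration of the rules as listed here and is not the CHECK program; the function name `check_minchat_form` and the `SAMPLE` text are hypothetical, and the sketch assumes that speaker and tier codes are exactly three letters, as minCHAT requires.

```python
import re

# Hidden headers that may come before @Begin (see the chapter on file headers).
HIDDEN = ("@UTF8", "@PID", "@ColorWords", "@Window", "@Font")
MAIN = re.compile(r"^\*[A-Z]{3}:\t\S")   # *CHI:<tab>text
DEP = re.compile(r"^%[a-z]{3}:\t\S")     # %com:<tab>text


def check_minchat_form(text):
    """Return a list of form problems; an empty list means none were found."""
    lines = text.splitlines()
    while lines and lines[0].startswith(HIDDEN):
        lines.pop(0)                      # skip hidden headers
    if not lines:
        return ["file is empty"]
    problems = []
    # Rules 2-4: the first three visible lines are fixed headers.
    for i, prefix in enumerate(["@Begin", "@Languages:", "@Participants:"]):
        if i >= len(lines) or not lines[i].startswith(prefix):
            problems.append(f"line {i + 1}: expected {prefix}")
    # Rule 6: the file closes with @End.
    if lines[-1] != "@End":
        problems.append("last line must be @End")
    # Rules 7-11: every line is a header, main line, dependent tier,
    # or a continuation line that begins with a tab.
    for n, line in enumerate(lines, start=1):
        if line.startswith("*"):
            if not MAIN.match(line):
                problems.append(f"line {n}: malformed main line")
        elif line.startswith("%"):
            if not DEP.match(line):
                problems.append(f"line {n}: malformed dependent tier")
        elif not line.startswith(("@", "\t")):
            problems.append(f"line {n}: must begin with @, *, %, or a tab")
    return problems


SAMPLE = (
    "@Begin\n"
    "@Languages:\teng\n"
    "@Participants:\tCHI Ross Child, FAT Brian Father\n"
    "*CHI:\twhy isn't Mommy coming?\n"
    "%com:\tMother usually picks Ross up around 4 PM.\n"
    "*FAT:\tdon't worry.\n"
    "@End\n"
)
print(check_minchat_form(SAMPLE))
```

Line numbers in the messages count visible lines only, after any hidden headers have been skipped. A sketch like this is useful for quick screening of files produced outside the editor, but CHECK remains the reference for accuracy.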

5.2 minCHAT – Words and Utterances
In addition to these minimum requirements for the form of the file, there are certain minimum ways in which utterances and words should be written on the main line:
1. Utterances must end with an utterance terminator. The basic utterance terminators are the period, the exclamation mark, and the question mark. These can be preceded by a space, but the space is not required.
2. Commas can be used as needed to mark phrasal junctions, but they are not used by the programs and have no sharp prosodic definition.
3. Use uppercase letters only for proper nouns and the word "I." Do not use uppercase letters for the first words of sentences. This will facilitate the identification of proper nouns.
4. To facilitate recognition of proper nouns and avoid misspellings, words should not contain capital letters except at their beginning. Words should not contain numbers, unless these mark tones.
5. Unintelligible words with an unclear phonetic shape should be transcribed as xxx.
6. If you wish to note the phonological form of an incomplete or unintelligible phonological string, write it out with an ampersand, as in &guga.
7. Incomplete words can be written with the omitted material in parentheses, as in (be)cause and (a)bout.
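
As a rough illustration of rules 1 and 4, the following Python sketch screens the text of a single main line, that is, the material after the speaker code and tab. The function name `check_utterance` is hypothetical. The sketch flags every digit, so a transcript that uses numbers to mark tones would need a looser test, and it makes no attempt to cover the rest of CHAT.

```python
import re

TERMINATORS = (".", "!", "?")


def check_utterance(utterance):
    """Return a list of problems found in the text of one main line."""
    problems = []
    text = utterance.rstrip()
    # Rule 1: the utterance must end with a terminator.
    if not text.endswith(TERMINATORS):
        problems.append("missing utterance terminator")
    body = text.rstrip(".!?").rstrip()
    for word in body.split():
        core = word.strip(",")
        # Rule 4: capitals are allowed only at the beginning of a word.
        if re.search(r"[A-Z]", core[1:]):
            problems.append(f"capital letter inside word: {core}")
        # Rule 4: words should not contain numbers (tone marks aside).
        if re.search(r"\d", core):
            problems.append(f"digit inside word: {core}")
    return problems


print(check_utterance("why isn't Mommy coming?"))
print(check_utterance("she'll be here soon"))
```

Forms such as xxx, &guga, and (be)cause pass through this check unchanged, because none of them contains an internal capital letter or a digit.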
Here is a sample that illustrates these principles.
This file is syntactically correct and uses the minimum number of CHAT conventions while still maintaining compatibility with the CLAN commands.
@Begin
@Languages:	eng
@Participants:	CHI Ross Child, FAT Brian Father
@ID:	eng|macwhinney|CHI|2;10.10||||Target_Child|||
@ID:	eng|macwhinney|FAT|35;2.||||Father|||
*CHI:	why isn't Mommy coming?
%com:	Mother usually picks Ross up around 4 PM.
*FAT:	don't worry.
*FAT:	she'll be here soon.
*CHI:	good.
@End

5.3 Analyzing One Small File
For researchers who are just now beginning to use CHAT and CLAN, there is one single suggestion that can potentially save literally hundreds of hours of wasted time.
The suggestion is to transcribe and analyze one single small file completely and perfectly before launching a major effort in transcription and analysis.
The idea is that you should learn just enough about minCHAT and minCLAN to see your path through these four crucial steps:
1. entry of a small set of your data into a CHAT file,
2. successful running of the CHECK command inside the editor to guarantee accuracy in your CHAT file,
3. development of a series of codes that will interface with the particular CLAN commands most appropriate for your analysis, and
4. running of the relevant CLAN commands, so that you can be sure that the results you will get will properly test the hypotheses you wish to develop.
If you go through these steps first, you can guarantee in advance the successful outcome of your project.
You can avoid ending up in a situation in which you have transcribed hundreds of hours of data in a way that does not match correctly with the input requirements for CLAN.

5.4 Next Steps
After having learned minCHAT, you are ready to learn the basics of CLAN.
To do this, you will want to work through the first chapters of the CLAN manual focusing in particular on the CLAN tutorial.
These chapters will take you up to the level of minCLAN, which corresponds to the minCHAT level.
Once you have learned minCHAT and minCLAN, you are ready to move on to learning the rest of the system.
You should next work through the chapters on words, utterances, and scoped symbols.
Depending on the shape of your particular project, you may then need to study additional chapters in this manual.
For people working on large projects that last many months, it is a good idea to eventually read all of the current manual, although some sections that seem less relevant to the project can be skimmed.

5.5 Checking Syntactic Accuracy
Each CLAN command runs a very superficial check to see if a file conforms to minCHAT.
This check looks only to see that each line begins with either @, *, %, a tab, or a space.
This is the minimum that the CLAN commands must have to function.
However, the correct functioning of many of the functions of CLAN depends on adherence to further standards for minCHAT.
In order to make sure that a file matches these minimum requirements for correct analysis through CLAN, researchers should run each file through the CHECK program.
The CHECK command can be run directly inside the editor, so that you can verify the accuracy of your transcription as you are producing it.
CHECK will detect errors such as failure to start lines with the correct symbols, use of incorrect speaker codes, or missing @Begin and @End symbols.
CHECK can also be used to find errors in CHAT coding beyond those discussed in this chapter.
Using CHECK is like brushing your teeth.
It may be hard at first to remember to use the command, but the more you use it the easier it becomes and the better the final results.

6 Corpus Organization

6.1 File Naming
Each TalkBank database consists of a collection of corpora, organized into larger folders by languages and language groups.
For example, there is a top-level folder called Romance in which one finds subfolders for Spanish, French, and other Romance languages.
Within the Spanish folder, there are then dozens of further folders, each of which has a single corpus.
Within a corpus, files may be further grouped by individual children or groups of children.
For longitudinal corpora, we recommend that file names use the age of the child followed by a letter if there are several recordings from a given day.
For example, the transcript from the fourth taping session when the child was 2;03.22 would be called 20322d.cha.
It is better to use ages for file names, rather than dates or other material.
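
The naming recommendation can be sketched in a few lines of Python. The helper name `age_to_filename` is hypothetical, and the sketch assumes that ages are written as years;months.days, that months and days are padded to two digits, and that years are left unpadded, as the 20322d.cha example suggests.

```python
import re
import string


def age_to_filename(age, session=None):
    """Build a name such as 20322d.cha from an age written as years;months.days.

    session is the 1-based number of the recording on that day; it is
    turned into a letter (1 -> a, 4 -> d) and omitted when None.
    """
    match = re.fullmatch(r"(\d+);(\d{1,2})\.(\d{1,2})", age)
    if match is None:
        raise ValueError(f"not a years;months.days age: {age!r}")
    years, months, days = (int(part) for part in match.groups())
    letter = ""
    if session is not None:
        if not 1 <= session <= 26:
            raise ValueError("session must be between 1 and 26")
        letter = string.ascii_lowercase[session - 1]
    return f"{years}{months:02d}{days:02d}{letter}.cha"


print(age_to_filename("2;03.22", session=4))
```

Padding months and days keeps the files of one child in chronological order when they are sorted by name.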

6.2 Metadata
Increasingly, researchers rely on Internet systems to locate and retrieve language data and resources.
There are currently several systems designed to facilitate this process and we have adapted the indexing and registration of materials in the CHILDES and TalkBank systems to provide information that can be incorporated into these systems.
The two systems designed specifically to deal with linguistic data are OLAC (Online Language Archives Community at www.language-archives.org) and VLO (Virtual Language Observatory at vlo.clarin.eu).
These systems allow researchers to search for whole corpora or single files, using terms such as Cantonese, video, gesture, or aphasia.
In order to publish or register TalkBank data within these systems, we create a 0metadata.cdc file at the top level of each corpus in TalkBank.
Some of the fields in this metadata file are designed for indexing in OLAC and some are designed for the CMDI system used by VLO and the related facility called The Language Archive (tla.mpi.nl).
Because of the highly specific nature of the terms and the software used for regular harvesting and publication of these data, we do not require users to create the 0metadata.cdc files.
The following table explains what keywords are expected within each field of these files.
The first fields listed are for OLAC and the later ones are for CMDI.
For CMDI, the values unknown and unspecified are also available for most of the fields.

| Field | Example | Values |
|---|---|---|
| CMDI_PID: | 11312/c-00041631-1 | Set by Handle Server system |
| Title: | Bilingual Aarsen Bos Corpus | open |
| Creator: | Aarssen, Jeroen | open |
| Creator: | Bos, Petra | open |
| Subject: | child language development | |
| Subject.olac:linguistic-field: | language_acquisition | |
| Subject.olac:language: | nld | ISO-639 |
| Subject.olac:language: | tur | ISO-639 |
| Subject.olac:language: | ara | ISO-639 |
| Subject.childes:participant: | age="4-10" | open |
| Description: | | open |
| Publisher: | TalkBank | open |
| Contributor: | Aarssen, Jeroen | open |
| Date: | 2004-03-30 | YEAR-MM-DD |
| Type: | Text | text, video, |
| Type.olac:linguistic-type: | primary_text | lexicon, primary_text, language_description |
| Type.olac:discourse-type: | dialogue | dialogue, drama, formulaic, ludic, oratory, narrative, procedural, report, singing, unintelligible speech |
| Format: | | |
| Identifier: | 1-59642-132-0 | ISBN |
| Language: | | ISO-639 |
| Relation: | | open |
| Coverage: | | open |
| Rights: | | open |
| IMDI_Genre: | discourse | |
| IMDI_Interactivity: | interactive (default) | interactive, non-interactive, semi-interactive |
| IMDI_PlanningType: | spontaneous (default) | spontaneous, semi-spontaneous, planned |
| IMDI_Involvement: | non-elicited (default) | elicited, non-elicited, no-observer |
| IMDI_SocialContext: | family (default) | family, private, public, controlled environment, talkshow, shopping, face_to_face, lecture, legal, religious, sports, tutorial, classroom, medical work, meeting, clinic, telechat, phonecall, computer, constructed |
| IMDI_EventStructure: | conversation (default) | monologue, dialogue, conversation, not a natural format |
| IMDI_Task: | unspecified (default) | open |
| IMDI_Modalities: | spoken (default) | unknown, unspecified, spoken, written, music notation, gestures, pointing-gestures, signs, eye-gaze, facial-expressions, emotional-state, haptic, song, instrumental music, other |
| IMDI_Subject: | unspecified (default) | open |
| IMDI_EthnicGroup: | unspecified (default) | open |
| IMDI_RecordingConditions: | unspecified (default) | open |
| IMDI_AccessAvailability: | open access (default) | open |
| IMDI_Continent: | Europe | Dublin Core |
| IMDI_Country: | Netherlands | Dublin Core |
| IMDI_ProjectDescription: | | open |
| IMDI_MediaFileDescription: | | open |
| IMDI_WrittenResourceSubType: | | open |

For the CMDI/IMDI/VLO/CLARIN system, there must be a cmdi.xml file for each transcript.
To create these several thousand files, we use a CLAN program that takes the information from the 0metadata.cdc files and from the header lines in each transcript.
The information in the @ID field is particularly important in this process.
It also relies on the fact that we use an isomorphic file system for indexing media files.
Fortunately, users do not need to concern themselves with all these many additional technical details.

6.3 The Documentation File
CHAT files typically record a conversational sample collected from a particular set of speakers on a particular day.
Sometimes researchers study a small set of children repeatedly over a long period of time.
Corpora created using this method are referred to as longitudinal studies.
For such studies, it is best to break up CHAT files into one collection for each child.
This can be done just by creating file names that begin with the three-letter code for the child, as in lea001.cha or eve15.cha.
Each collection of files from the children involved in a given study constitutes a corpus.
A corpus can also be composed of a group of files from different groups of speakers when the focus is on a cross-sectional sampling of larger numbers of language learners from various age groups.
In either case, each corpus should have a documentation file.
This "readme" file should contain a basic set of facts that are indispensable for the proper interpretation of the data by other researchers.
The minimum set of facts that should be in each readme file are the following.

Acknowledgments. There should be a statement that asks the user to cite some particular reference when using the corpus.
For example, researchers using the Adam, Eve, and Sarah corpora from Roger Brown and his colleagues are asked to cite Brown (1973).
In addition, all users can cite this current manual as the source for the TalkBank system in general.

Restrictions. If the data are being contributed to TalkBank, contributors can set particular restrictions on the use of their data.
For example, researchers may ask that they be sent copies of articles that make use of their data.
Many researchers have chosen to set no limitations at all on the use of their data.

Warnings. This documentation file should also warn other researchers about limitations on the use of the data.
For example, if an investigator paid no attention to correct transcription of speech errors, this should be noted.

Pseudonyms. The readme file should also include information on whether informants gave informed consent for the use of their data and whether pseudonyms have been used to preserve informant anonymity.
In general, real names should be replaced by pseudonyms.
Anonymization is not necessary when the subject of the transcriptions is the researcher's own child, as long as the child grants permission for the use of the data.

History. There should be detailed information on the history of the project.
How was funding obtained? What were the goals of the project? How was data collected? What was the sampling procedure? How was transcription done? What was ignored in transcription? Were transcribers trained? Was reliability checked? Was coding done? What codes were used? Was the material computerized? How?

Codes. If there are project-specific codes, these should be described.

Biographical data. Where possible, extensive demographic, dialectological, and psychometric data should be provided for each informant.
There should be information on topics such as age, gender, siblings, schooling, social class, occupation, previous residences, religion, interests, friends, and so forth.
Information on where the parents grew up and the various residences of the family is particularly important in attempting to understand sociolinguistic issues regarding language change, regionalism, and dialect.
Without detailed information about specific dialect features, it is difficult to know whether these particular markers are being used throughout the language or just in certain regions.

Situational descriptions. The readme file should include descriptions of the contexts of the recordings, such as the layout of the child's home and bedroom or the nature of the activities being recorded.
Additional specific situational information should be included in the @Situation and @Comment fields in each file.

7 File Headers
The three major components of a CHAT transcript are the file headers, the main tier, and the dependent tiers.
In this chapter we discuss creating the first major component – the file headers.
A computerized transcript in CHAT format begins with a series of "header" lines, which tell us about things such as the date of the recording, the names of the participants, the ages of the participants, the setting of the interaction, and so forth.
A header is a line of text that gives information about the participants and the setting.
All headers begin with the "@" sign.
Some headers require nothing more than the @ sign and the header name.
These are "bare" headers such as @Begin or @New Episode.
However, most headers require that there be some additional material.
This additional material is called an "entry."
Headers that take entries must have a colon, which is then followed by one or two tabs and the required entry.
By default, tabs are usually understood to be placed at eight-character intervals.
The material up to the colon is called the "header name."
In the example following, "@Media" and "@Date" are both header names:
@Media:	abe88, video
@Date:	25-JAN-1983
The text that follows the header name is called the "header entry."
Here, "abe88, video" and "25-JAN-1983" are the header entries.
The header name and the header entry together are called the "header line."
The header line should never have a punctuation mark at the end.
In CHAT, only utterances actually spoken by the subjects receive final punctuation.
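
The terms header name and header entry can be made concrete with a short Python sketch. The function name `split_header` is hypothetical, and the sketch is not a routine taken from CLAN.

```python
def split_header(line):
    """Split a header line into (header name, header entry).

    Bare headers such as @Begin have no colon and no entry,
    so the entry is returned as None.
    """
    if not line.startswith("@"):
        raise ValueError("headers begin with the @ sign")
    # Only the first colon separates name from entry, so entries
    # that themselves contain colons are kept intact.
    name, colon, entry = line.partition(":")
    if not colon:
        return line.rstrip(), None
    return name, entry.strip()


print(split_header("@Media:\tabe88, video"))
print(split_header("@Date:\t25-JAN-1983"))
print(split_header("@Begin"))
```

Stripping the entry removes the one or two tabs that follow the colon, whichever number the editor has inserted.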
This chapter presents a set of headers that researchers have considered important.
Except for the @Begin, @Languages, @Participants, @ID, and @End headers, none of the headers are required and you should feel free to use only those headers that you feel are needed for the accurate documentation of your corpus.

7.1 Hidden Headers
CHAT uses five types of headers: hidden, initial, participant-specific, constant, and changeable.
In the editor, CHAT files appear to begin with the @Begin header.
However, there are actually five hidden headers that appear before this header.
These headers are @UTF8, @PID, @ColorWords, @Window, and @Font, which appear in that order.
All are optional, except for the @UTF8 header.

@UTF8
All files in the database use this header to mark the fact that they are encoded in UTF8.
If the file was produced outside of CLAN and this header is missing, CLAN will complain and ask the user to verify whether the file should be read in UTF8.
Often this means that the user should run the CP2UTF program to convert the file to UTF8.
@PID
This hidden header follows after the @UTF8 header, and it declares the value of the transcript for the Handle System (www.handle.net) that allows for persistent identification of the location of digital objects. These numbers are then further processed using the CMDI metadata scheme for publication and harvesting over the web through the CLARIN (www.clarin.eu) schema that creates access through TLA (The Language Archive; https://tla.mpi.nl/) and the VLO (Virtual Language Observatory; https://vlo.clarin.eu), as well as parallel methods from OLAC (Online Language Archives Community). These values can be entered into any system that resolves PIDs to locate the required resource.

For example, one of the files from the MacWhinney corpus has the number 11312/c-00044068-1, which refers to the CMDI metadata file for that transcript. The metadata can be located at http://hdl.handle.net/11312/c-00044068-1. If you change the -1 to -2, then it refers to the transcript itself. If you change the -1 to -3, it refers to the media, if that exists.

There are also PID numbers in the 0metadata.cdc file that accompanies each corpus. When those numbers end in -1, they refer to the CMDI file associated with the corpus. If you change that -1 to -2, it refers to the .zip file that you can download for the corpus.
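The suffix convention for transcript PIDs described above (-1 for metadata, -2 for the transcript, -3 for the media) can be sketched as follows. The function name pid_url and the target labels are hypothetical; the resolver address is the one given in the text.

```python
HANDLE_RESOLVER = "http://hdl.handle.net/"

# Suffixes for transcript-level PIDs, as described in the text above.
SUFFIX_FOR_TARGET = {"metadata": "1", "transcript": "2", "media": "3"}


def pid_url(pid: str, target: str = "metadata") -> str:
    """Build a resolver URL for a transcript PID, swapping the final suffix."""
    base, dash, _old_suffix = pid.rpartition("-")
    if not dash:
        raise ValueError(f"PID has no suffix: {pid!r}")
    return f"{HANDLE_RESOLVER}{base}-{SUFFIX_FOR_TARGET[target]}"


print(pid_url("11312/c-00044068-1", "transcript"))
```

Corpus-level PIDs in 0metadata.cdc use a different meaning for -2 (the .zip file), so this mapping applies only to the PIDs stored in transcripts.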
@ColorWords
This hidden header stores the color values that users create when using the Color Keywords dialog.

@Window
This hidden header stores information about the position of the transcript window on the computer screen and the location of the last place being edited. This line is useful during the development of a new corpus. However, it is removed when files are added to the permanent TalkBank databases.

@Font
This header is used to set the default font for the file. When this header is not present, CLAN uses the Arial Unicode font. None of the transcripts in TalkBank use this header, because all of them assume that the font is Arial Unicode.
7.2 Initial Headers

CHAT has seven initial headers. The first six of these (@Begin, @Languages, @Participants, @Options, @ID, and @Media) appear in this order as the first lines of the file. The last one, @End, appears at the end of the file as the last line.

@Begin
This header is always the first visible header placed at the beginning of the file. It is needed to guarantee that no material has been lost at the beginning of the file. This is a "bare" header that takes no entry and uses no colon.
@Languages:
This is the second visible header; it tells the programs which language is being used in the dialogues. Here is an example of this line for a bilingual transcript using Swedish and Portuguese.

@Languages:	swe, por

The language codes come from the international ISO 639-3 standard. For the languages currently in the database, these three-letter codes and extended codes are used:

Language    Code       Language     Code    Language     Code
Afrikaans   afr        German       deu     Polish       pol
Arabic      ara        Greek        ell     Portuguese   por
Basque      eus        Hebrew       heb     Punjabi      pan
Cantonese   zho-yue    Hungarian    hun     Romanian     ron
Catalan     cat        Icelandic    isl     Russian      rus
Chinese     zho        Indonesian   ind     Sesotho      sot
Cree        crl        Irish        gle     Spanish      spa
Croatian    hrv        Italian      ita     Swahili      swa
Czech       ces        Japanese     jpn     Swedish      swe
Danish      dan        Javanese     jav     Tagalog      tag
Dutch       nld        Kannada      kan     Taiwanese    zho-min
English     eng        Kikuyu       kik     Tamil        tam
Estonian    est        Korean       kor     Thai         tha
Farsi       fas        Lithuanian   lit     Turkish      tur
Finnish     fin        Norwegian    nor     Vietnamese   vie
French      fra        Welsh        cym
Galician    glg        Yiddish      yid

We continually update this list, and CLAN relies on a file in the lib/fixes directory called ISO-639.cut that lists the current languages.
There are special conditions for certain languages. For example, tone languages like Cantonese, Mandarin, and Thai are allowed to have Romanized word forms that include tone numbers. In addition, Chinese words in non-Roman characters can use numbers to disambiguate homonyms.

In multilingual corpora, several codes can be combined on the @Languages line. The first code given is for the language used most frequently in the transcript. Individual utterances in a second or third most frequent language can be marked with precodes, as in this example:

*CHI:	[- eng] this is my juguete@s .

In this example, Spanish is the most frequent language, but the particular sentence is marked as English. The @Languages header lists spa for Spanish, and then eng for English. Within this English sentence, the use of a Spanish word is then marked as @s. When the @s is used in the main body of the transcript without the [- eng], then it indicates a shift to English, rather than to Spanish. Please see the section on code-switching annotation for further details on the use of these codes for interactions with code-switching.
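The interaction between the @Languages line and an utterance-level precode can be sketched as below. This is an illustrative reading of the rules above, not CLAN's implementation: the helper names are hypothetical, and the pattern for the precode (a bracket, a hyphen, optional space, a language code) is an assumption based on the single example given.

```python
import re
from typing import List

# Assumed shape of a language precode such as "[- eng]" at the start of an utterance.
PRECODE = re.compile(r"^\[-\s*([a-z]{3}(?:-[a-z]+)?)\]\s*")


def parse_languages(entry: str) -> List[str]:
    """Return the language codes from an @Languages entry, most frequent first."""
    return [code.strip() for code in entry.split(",") if code.strip()]


def utterance_language(text: str, languages: List[str]) -> str:
    """Language of one utterance: the precode if present, else the first listed code."""
    match = PRECODE.match(text)
    if match:
        return match.group(1)
    return languages[0]


langs = parse_languages("spa, eng")
print(utterance_language("[- eng] this is my juguete@s .", langs))
```

Resolving the language of an individual @s word (which switches to the other listed language) is left out of this sketch.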
@Participants:
This is the third visible header. Like the @Begin and @Languages headers, it is obligatory. It lists all of the speakers within the file. The format for this header is XXX Name Role, XXX Name Role, XXX Name Role. XXX stands for the three-letter speaker ID. Here is an example of a completed @Participants header line:

@Participants:	SAR Sue_Day Target_Child, CAR Carol Mother

Participants are identified by three elements: their speaker ID, their name, and their role.

Speaker ID. The speaker ID is usually composed of three letters. The code may be based either on the participant's name, as in *ROS or *BIL, or on her role, as in *CHI or *MOT. In corpora studying single children, the form *CHI should always be used for the Target_Child, as in this example.

@Participants:	CHI Mark Target_Child, MOT Mary Mother

Several different Target_Child participants can be indicated as *CH1, *CH2, *CH3. However, if one is primary, it should be *CHI. There are several CHILDES corpora that use first name abbreviations for target children, because they are studies of siblings or group sessions. These corpora include FernAguado, Koine, Becasesno, Palasis, Luque, Weissenborn, Gathercole, Guilfoyle, Garvey, Evans, Levy, Navracsics, MCF, and Ionin.
Name and/or Specific Role. The speaker's name can be placed in the field after the 3-letter ID. However, this field can be omitted, particularly if it is important to deidentify the data. If CLAN finds only a three-letter ID and a role, it will assume that the name has been omitted. In order to preserve anonymity, it is often useful to include a pseudonym for the name, because the pseudonym will also be used in the body of the transcript. For CLAN to correctly parse the participants line, multiple-word name definitions such as "Sue Day" need to be joined in the form "Sue_Day." Instead of putting in the name, you can put in a specific role, such as Maternal_Grandmother. The name can be combined with the specific role in this way:

@Participants:	ROS Rose_Maternal_Grandmother Grandparent

Standard Role.
After the ID and name, the last field gives the standard role of the speaker. There is a fixed set of standard roles specified in the depfile.cut file used by CHECK. You will also see this same list of possible roles in the "role" segment of the "ID Headers" dialog box. If one of these standard roles does not work, it would be best to use one of the generic age-related roles, like Adult, Child, or Teenager. Further details regarding the specific role can be put in the place of the name in the field before the role, as in these examples:

@Participants:	TBO Toll_Booth_Operator Adult, AIR Airport_Attendant Adult, SI1 First_Sibling Sibling, SI2 Second_Sibling Sibling, COM Computer_Talk Media

Note that the terms in the second field, such as Toll_Booth_Operator or Second_Sibling, must be written as a single word by using underscores to link separate words.

The following is a list of the roles currently in depfile.cut. Although we occasionally add more roles, we try to limit this by using the following standard roles:

Target_Child: Use of this role is very important for CHILDES and PhonBank transcripts, because it allows users to search and analyze the output from the children who are the focus of many of the studies.
Target_Adult: This role serves a similar function to Target_Child by making it clear which speaker was at the focus of the data collection.
Child: This role is used mostly in transcripts studying large groups of children, when it is not easy to determine whether a child is a boy or girl or perhaps a relative.
Mother: This should be the mother of the Target_Child.
Father: This should be the father of the Target_Child.
Brother: This should be a brother of the Target_Child.
Sister: This should be a sister of the Target_Child.
Sibling: This should be a sibling of the Target_Child.
Grandfather: This should be the grandfather of the Target_Child. Further details such as Paternal_Grandfather can be placed into the Specific Role field.
Grandmother: This should be the grandmother of the Target_Child. Further details such as Paternal_Grandmother can be placed into the Specific Role field.
Relative: This role is designed to include all other relations, including Aunt, Uncle, Cousin, Father_in_Law, etc., which can then be entered into the Specific Role field.
Participant: This is the generic role for adult participants in interviews and other conversations. Usually, these are coded as having a Participant and an Investigator. Other forms of this role include Patient, Informant, and Subject, which can be listed in the Specific Role field or else just omitted.
Investigator: Other terms for this role can be listed in the Specific Role field. These include Researcher, Clinician, Therapist, Observer, Camera_Operator, and so on.
Partner: This is the role for the person accompanying the Participant to the interview or conversation.
Boy: This is a generic role.
Girl: This is a generic role.
Adult: This is a very generic role for use when little else is known.
Teenager: This is a generic role.
Male: Use this role when all we know is that the participant is an adult male.
Female: Use this role when all we know is that the participant is an adult female.
Visitor: This role assumes that the visitor is coming to a conversation in the home.
Friend: This is a role for a Friend of the target participants.
Playmate: This is a role for a child that the Target_Child plays with.
Caretaker: This person takes care of the child. Other names for the Specific Role field include Housekeeper, Nursemaid, or Babysitter.
Environment: This role is used in the SBCSAE corpus.
Group: This role is used when transcribing simultaneous productions from a whole group.
Unidentified: This is a role for unidentifiable participants.
Uncertain: This role can be used when it is not clear who produced an utterance.
Other: This is a generic role. When it is used, there should be further specification in the Specific Role field. Roles defined by jobs such as Technician, Patron, Policeman, etc. can be listed as Other and the details given in the Specific Role field.
Text: This role is used for written segments of TalkBank.
Media: This role is used for speech from televisions, computers, or talking toys.
PlayRole: This role is used when speakers pretend to be something, such as an animal or another person.
LENA: This role is used in HomeBank LENA recordings. The specific LENA role is then listed in the Specific Role field.
Justice: This role is used in the SCOTUS corpus. It also includes the role of Judge.
Attorney: This is the general role for attorneys, lawyers, prosecutors, etc.
Doctor: This is the general role for doctors.
Nurse: This is the general role for nurses.
Student: Specific forms of this general role include GraduateStudent, Senior, High_Schooler, and so on.
Teacher: This is the general role for Teachers. Specific forms of this general role include Instructor, Advisor, Faculty, Professor, Tutor, or T_A.
Host: Specific forms of this general role include ShowHost, Interviewer, and CallTaker.
Guest: Specific forms of this general role include ShowGuest, Interviewee, and Caller.
Leader: Specific forms of this general role include Group_Leader, Panel_Moderator, Committee_Chair, Facilitator, Tour_Guide, Tour_Leader, Peer_Leader, Chair, or Discussion_Leader.
Member: Specific forms of this general role include Committee_Member, Group_Member, Panelist, and Tour_Participant.
Narrator: This is a role for presentations of stories.
Speaker: Specific forms of this general role include Lecturer, Presenter, Introducer, Welcomer, and Main_Speaker.
Audience: This is the general role for single audience members.
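The three-element layout of the @Participants entry described above (speaker ID, optional name or specific role, standard role) can be sketched as a small parser. The names Participant and parse_participants are hypothetical, and the sketch does not check roles against depfile.cut; it only applies the rule that a two-field item means the name was omitted.

```python
from typing import List, NamedTuple, Optional


class Participant(NamedTuple):
    speaker_id: str
    name: Optional[str]  # name or specific role; None when omitted
    role: str            # standard role


def parse_participants(entry: str) -> List[Participant]:
    """Parse the entry of an @Participants header into Participant records."""
    participants = []
    for chunk in entry.split(","):
        fields = chunk.split()
        if len(fields) == 3:
            participants.append(Participant(fields[0], fields[1], fields[2]))
        elif len(fields) == 2:
            # Only an ID and a role: the name has been omitted.
            participants.append(Participant(fields[0], None, fields[1]))
        else:
            raise ValueError(f"cannot parse participant: {chunk!r}")
    return participants


for p in parse_participants("SAR Sue_Day Target_Child, CAR Carol Mother"):
    print(p)
```

Because fields are split on whitespace, a multi-word name that is not joined with underscores is rejected here, which mirrors the requirement stated above.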
@Options:
This header is not obligatory, but it is frequently needed. When it occurs, it must follow the @Participants line. This header allows the checking programs (CHECK and the XML validator) to suspend certain checking rules for certain file types. The spelling of these options is case-sensitive.

1. heritage: Use of this option tells CHECK and the validator not to look at the content of the main lines at all. This radical blockage of the function of CHECK is only recommended for people working with CA files done in the traditional Jeffersonian format. When this option is used, text may be placed into italics, as in traditional CA.
2. IPA: Use of this option permits the use of IPA notation on the main line.
3. CA: Use of this option suspends the usual requirement for utterance terminators to accommodate Conversation Analysis transcripts.
4. CA-Unicode: This option is needed for CA transcripts using East Asian scripts in order to automatically load Arial Unicode instead of CAFont. Unfortunately, overlap alignment is not accurate for a variable-width font like Arial Unicode.
5. multi: Use of this option tells CHECK and Chatter to expect multiple bullets on a single line. This can be used for data that come from programs like Praat that mark time for each word.
6. bullets: This option turns off the requirement that each time-marking bullet should begin after the previous one.
7. dummy: This option is used in files that point to media that do not yet have any transcription.
@ID:
This header is used to control programs such as STATFREQ, output to Excel, and new programs based on XML. The form of this line is:

@ID:	language|corpus|code|age|sex|group|SES|role|education|custom|

There must be one @ID header for each participant. Often you will not care to encode all of this information. In that case, you can leave some of these fields empty. Here is a typical @ID header.

@ID:	eng|macwhinney|CHI|2;10.10||||Target_Child|||

To facilitate typing of these headers, you can run the CHECK program on a new CHAT file. If CHECK does not see @ID headers, it will use the @Participants line to insert a set of @ID headers to which you can then add further information. Alternatively, you can use the INSERT program to create these fields automatically from the information in the @Participants line. For even more complete control over creation of these @ID headers, you can use the dialog system that comes up when you have an open CHAT file and select "ID Headers" under the Tiers Menu pulldown. Here is a sample version of this dialog box:

[Figure: the ID Headers dialog box]

Here are some further specifications of the codes in the fields for the @ID header.
Language: as in the ISO codes table given above
Corpus: a one-word label for the corpus in lowercase
Code: the three-letter code for the speaker in capitals
Age: the age of the speaker (see below)
Sex: either "male" or "female" in lowercase
Group: any single word label. Common abbreviations include: ASD autism spectrum disorder, DS Downs syndrome, HL hearing limited, LI language impaired, LT late talker, SLI specific language impaired, TD typically developing, RHD right hemisphere damage, AD Alzheimer's dementia, TBI traumatic brain injury, NS native speaker, NNS non-native speaker
Eth, SES: Ethnicity (Asian, Black, Latino, Multiple, Native, Pacific, Unknown, White), SES (WC for working class, UC for upper class, MC for middle class, LI for limited income). Note: if both Ethnicity and SES are given, there is a comma separating them. If Ethnicity is not listed, it is assumed to be White.
Role: the role as given in the @Participants line
Education: educational level of the speaker (or the parent): Elem, HS, UG, Grad, Doc
Custom: any additional information needed for a given project

It is important to use the correct format for the Target_Child's age. This field uses the form years;months.days, as in 2;11.17 for 2 years, 11 months, and 17 days. The fields for the months and days should always have two places. Using this format is important when it comes to ordering data by age in spreadsheet systems such as Excel. This often means that you need to add leading zeroes, as in 2;05.06 and 5;09.01. However, you do not need to add any leading zeroes before the years. If you do not know the child's age in days, you can simply use years and months, as in 6;04. with a period after the months. If you do not know the months, you can use the form 6; with the semicolon after the years. If you only know the child's birthdate and the date of the transcript, you can use the DATES program to compute the child's age.
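The ten pipe-separated @ID fields and the years;months.days age format can be sketched as follows. The field keys and function names are hypothetical labels chosen for this illustration; the age pattern encodes only the three forms mentioned above (2;11.17, 6;04. and 6;).

```python
import re
from typing import Dict, Optional, Tuple

# Field order taken from the @ID template above; the key names are illustrative.
ID_FIELDS = ["language", "corpus", "code", "age", "sex",
             "group", "ses", "role", "education", "custom"]

# years;MM.DD, years;MM. or years; with two-digit months and days.
AGE = re.compile(r"^(\d+);(?:(\d{2})\.(\d{2})?)?$")


def parse_id(entry: str) -> Dict[str, str]:
    """Split an @ID entry into its ten named fields (empty strings allowed)."""
    parts = entry.strip().split("|")
    if parts and parts[-1] == "":
        parts = parts[:-1]  # drop the piece after the trailing pipe
    if len(parts) != len(ID_FIELDS):
        raise ValueError(f"expected {len(ID_FIELDS)} fields, got {len(parts)}")
    return dict(zip(ID_FIELDS, parts))


def parse_age(age: str) -> Tuple[int, Optional[int], Optional[int]]:
    """Return (years, months, days); unknown months or days are None."""
    match = AGE.match(age)
    if not match:
        raise ValueError(f"badly formed age: {age!r}")
    years, months, days = match.groups()
    return (int(years),
            int(months) if months is not None else None,
            int(days) if days is not None else None)


def format_age(years: int, months: int, days: int) -> str:
    """Format an age with the leading zeroes needed for correct sorting."""
    return f"{years};{months:02d}.{days:02d}"


fields = parse_id("eng|macwhinney|CHI|2;10.10||||Target_Child|||")
print(fields["code"], parse_age(fields["age"]))
```

Rejecting single-digit months and days catches entries such as 2;5.6, which would sort incorrectly in a spreadsheet.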
@Media:
This header is used to tell CLAN how to locate and play back media that are linked to transcripts. The first field in this header specifies the name of the media file. Extensions should be omitted. A fundamental principle for file organization in the TalkBank databases is that the name of the media file should be the same as the name of the transcript, ignoring extensions. This is crucial for allowing the TalkBank Browser to locate the media file on the web. A further restriction is that each transcript should link to only one media file and each media file should link to only one transcript. If the media file is abe88.wav, then just enter "abe88". Then declare the format as "sound" or "video". It is also possible to add one of the three terms missing, notrans, or unlinked after the media type. The term missing is used when the media is missing from the collection. The term unlinked is used for transcripts that have not yet been linked to media. The term notrans is used for media that have not yet been transcribed. So the line has this shape:

@Media:	abe88, sound, missing

@End
Like the @Begin header, this header uses no colon and takes no entry. It is placed at the end of the file as the very last line. Adding this header provides a safeguard against the danger of undetected file truncation during copying.
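The shape of the @Media entry described above (file name without extension, media type, optional status term) can be sketched as below. The function names are hypothetical, the accepted values are only those named in the text, and the transcript file name abe88.cha in the usage line is an assumed example.

```python
import os
from typing import Optional, Tuple

MEDIA_TYPES = {"sound", "video"}                    # formats named in the text
MEDIA_STATUS = {"missing", "notrans", "unlinked"}   # optional third field


def parse_media(entry: str) -> Tuple[str, str, Optional[str]]:
    """Parse an @Media entry into (name, media type, status or None)."""
    fields = [field.strip() for field in entry.split(",")]
    if len(fields) not in (2, 3):
        raise ValueError(f"expected 2 or 3 fields: {entry!r}")
    name, media_type = fields[0], fields[1]
    status = fields[2] if len(fields) == 3 else None
    if media_type not in MEDIA_TYPES:
        raise ValueError(f"unknown media type: {media_type!r}")
    if status is not None and status not in MEDIA_STATUS:
        raise ValueError(f"unknown status: {status!r}")
    return name, media_type, status


def media_matches_transcript(media_name: str, transcript_file: str) -> bool:
    """Check that the media name equals the transcript name, ignoring extensions."""
    stem, _extension = os.path.splitext(os.path.basename(transcript_file))
    return stem == media_name


print(parse_media("abe88, sound, missing"))
print(media_matches_transcript("abe88", "abe88.cha"))
```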
7.3 Participant-Specific Headers

The third set of headers provides information specific to each participant. These headers must follow after the @ID headers. Most of the participant-specific information is in the @ID tier. That information can be entered by using the ID headers option in CLAN's Tiers menu. The exceptions are for these tiers:

@Birth of #:
@Birthplace of #:
@L1 of #:

7.4 Constant Headers

Constant headers follow the participant-specific headers. These headers, which are all optional, describe various general facts about the file.
@Location:
This header should include the city, state or province, and country in which the interaction took place. Here is an example of a completed header line:

@Location:	Boston, MA, USA

@Number:
This header indicates the number of participants. Possible entries here include: 1, 2, 3, 4, 5, more, and audience.

@Recording Quality:
Possible entries here are: 1, 2, 3, 4, 5, with 5 being the highest quality.

@Room Layout:
This header outlines room configuration and positioning of furniture. This is especially useful for experimental settings. The entry should be a description of the room and its contents. Here is an example of the completed header line:

@Room Layout:	Kitchen; Table in center of room with window on west wall, door to outside on north wall

@Tape Location:
This header indicates the specific tape ID, side, and footage. This is very important for identifying the spot on the analog tape from which the transcription was made. Here is an example of this header:

@Tape Location:	tape74, side a, 104

@Time Duration:
It is often necessary to indicate the time at which the audiotaping began and the amount of time that passed during the course of the taping, as in these two ways of indicating that taping began at 1 hour 10 minutes and ended at 2 hours 5 minutes:

@Time Duration:	01:10:00-02:05:00
@Time Duration:	01:10-02:05

The information can also be entered as the total amount of time transpired (55 minutes):

@Time Duration:	00:55:00

For most projects what is important is not the absolute time, but the time of individual events relative to each other. This sort of relative timing can be provided by coding on the %tim dependent tier in conjunction with the @Time Start header described next. However, this type of coding is really only needed for older transcripts for which there is no media. @Time Duration information is used in several CLAN commands, such as C-NNLA, C-QPA, EVAL, SCRIPT, SUGAR, TIMEDUR, and LENA2CHAT.
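The three time duration entry forms shown above all describe the same 55 minutes, which the following sketch makes explicit by converting each to seconds. The function names are hypothetical, and the sketch assumes, following the examples, that a two-part stamp such as 01:10 means hours and minutes.

```python
def stamp_to_seconds(stamp: str) -> int:
    """Convert HH:MM:SS or HH:MM (assumed hours:minutes) to seconds."""
    parts = [int(part) for part in stamp.strip().split(":")]
    if len(parts) == 3:
        hours, minutes, seconds = parts
    elif len(parts) == 2:
        hours, minutes = parts
        seconds = 0
    else:
        raise ValueError(f"cannot read time stamp: {stamp!r}")
    return hours * 3600 + minutes * 60 + seconds


def duration_seconds(entry: str) -> int:
    """Elapsed seconds for a start-end range or a single total duration."""
    entry = entry.strip()
    if "-" in entry:
        start, end = entry.split("-", 1)
        return stamp_to_seconds(end) - stamp_to_seconds(start)
    return stamp_to_seconds(entry)


for example in ("01:10:00-02:05:00", "01:10-02:05", "00:55:00"):
    print(example, duration_seconds(example))
```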
@Time Start:
If you are tracking elapsed time on the %tim tier, the @Time Start header can be used to indicate the absolute time at which the timing marks begin. If a new @Time Start header is placed in the middle of the transcript, this "restarts" the clock. This method is really only appropriate for older transcripts for which there is no media. Transcripts linked to media will not need this information. None of the CLAN programs make use of this information, but it is produced by SALT2CHAT.

@Time Start:	12:30 (12 minutes 30 seconds)
@Time Start:	01:30:00 (1 hour 30 minutes)

@Transcriber:
This line identifies the people who transcribed and coded the file. Having this indicated is often helpful later, when questions arise. It also provides a way of acknowledging the people who have taken the time to make the data available for further study.
@Transcription:
The possible entries here are: eye_dialect, partial, full, detailed, coarse, checked

@Types:
This header is used to mark classes of groups, activities, and experimental design for child language corpora. Currently, the values are:

Design:
cross       cross-sectional
long        longitudinal
observ      observational

Activity:
toyplay     playing with toys
narrative   telling stories
meal        talk during mealtime
pictures    describing actions in pictures
book        adult reading to the child
interview   asking questions of child
tests       structured tests
preverbal   adults talking to preverbal child
group       several children talking with each other
classroom   school classroom
reading     child reading
everyday    activities across the day

Group:
TD          typically developing children
biling      typically developing bilingual children
AAE         typically developing speakers of AAE
L2          typically developing L2 learners
SLI         specific language impairment
HL          hearing limited
CI          cochlear implants
PD          phonological disorder
ASD         autism spectrum disorder
LT          late talker
DS          Downs syndrome
MR          mentally retarded
ADHD        attention deficit hyperactivity disorder
CWS         children who stutter
AWS         adults who stutter

These indicators are used by KidEval and TalkBankDB to create comparison groups. In this way, users can focus on a particular type of transcript, such as longitudinal studies of typically-developing children with toyplay. The 0types.txt file in a folder is used to copy information to all transcripts in that folder. In this case, the resulting @Types line is this:

@Types:	long, toyplay, TD

@Videos:
This header is used specifically by the function that allows you to shift between different camera angles on the same interaction, as illustrated and described in the section on "Multiple Video Playback" in the CLAN manual.
@Warning:
This header is used to warn the user about certain defects or peculiarities in the collection and transcription of the data in the file. Some typical warnings are as follows:

1. These data are not useful for the analysis of overlaps, because overlapping was not accurately transcribed.
2. These data contain no information regarding the context. Therefore they will be inappropriate for many types of analysis.
3. Retracings and hesitation phenomena have not been accurately transcribed in these data.
4. These data have been transcribed, but the transcription has not yet been double-checked.
5. This file has not yet passed successfully through CHECK.
7.5 Changeable Headers

Changeable headers can occur either at the beginning of the file along with the constant headers or else in the body of the file. Changeable headers contain information that can change within the file. For example, if the file contains material that was recorded on only one day, the @Date header would occur only once at the beginning of the file. However, if the file contains some material from a later day, the @Date header would be used again later in the file to indicate the next date. These changeable headers appear, then, at the point within the file where the information changes. The list that follows is alphabetical.
@Activities:
This header describes the activities involved in the situation. The entry is a list of component activities in the situation. Suppose the @Situation header reads, "Getting ready to go out." The @Activities header would then list what was involved in this, such as putting on coats, gathering school books, and saying good-bye.

@Bck:
Diary material that was not originally transcribed in the CHAT format often has explanatory or background material placed before a child's utterance. When converting this material to the CHAT format, it is sometimes impossible to decide whether this background material occurs before, during, or after the utterance. In order to avoid having to make these decisions after the fact, one can simply enter it in an @Bck header.

@Bck:	Rachel was fussing and pointing toward the cabinet where the cookies are stored.
*RAC:	cookie [/] cookie .
@Bg and @Bg:
These headers are used to mark the beginning of a "gem" for analysis by GEM. If there is a colon, you must follow the colon with a tab and then one or more code words. Each @Bg header must have a matching @Eg header that indicates the end of the gem section. The alternative to using @Bg and @Eg is to use the @G "lazy gem" form.

Researchers use gem markers for different purposes. In some CHILDES corpora, they are used to mark the dates or numbers of diary entries. In studies of narratives and book reading, they are used to mark page numbers. In tasks with object and picture description, they may indicate the number or name of the picture. In some corpora, they are used just to enter descriptive remarks.

One important and interesting use of gems is to facilitate later retrieval and analysis. For example, some studies with children make use of a fixed set of activities such as MotherPlay, book reading, and story telling. For these gems, it can be useful to compare similar activities across transcripts. To support this, we have entered the possible gems in a corpus that uses gems in this way into the TalkBankDB facility in a pulldown menu. Descriptions of the gems used in a given corpus can be found in the homepage for that corpus.
@Blank
This header is created by the TEXTIN program. It is used to represent the fact that some written text includes a blank line or new paragraph. It should not be used for transcripts of spoken language.
@Comment:
This header can be used as an all-purpose comment line. Any type of comment can be entered on an @Comment line. When the comment refers to a particular utterance, use the %com line. When the comment refers to more general material, use the @Comment header. If the comment is intended to apply to the file as a whole, place the @Comment header along with the constant headers before the first utterance. Instead of trying to make up a new coding tier name such as "@Gestational Age" for a special purpose type of information, it is best to use the @Comment field, as in this example:

@Comment:	Gestational age of MAR is 7 months
@Comment:	Birthweight of MAR is 6 lbs. 4 oz

Another example of a special @Comment field is used in the diary notes of the MacWhinney corpus, where they have this shape:

@Comment:	Diary-Brian – Ross said "I don't need to throw my blocks out the window anymore."

@Date:
This header indicates the date of the interaction. The entry for this header is given in the form day-month-year. The date is abbreviated in the same way as in the @Birth header entry. Here is an example of a completed @Date header line:

@Date:	01-JUL-1965

Because we have some corpora going back over a century, it is important to include the full value for the year. Also, because the days of the month should always have two digits, it is necessary to add a leading "0" for days such as "01".
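The day-month-year form with a two-digit day and a full year can be checked and converted as in the sketch below. The function names are hypothetical. The text shows only JAN and JUL as month abbreviations; the other ten in the table are assumed to be the usual three-letter English forms.

```python
from datetime import date

# JAN and JUL appear in the text; the remaining abbreviations are assumed.
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def parse_chat_date(entry: str) -> date:
    """Parse a DD-MMM-YYYY entry, insisting on a 2-digit day and 4-digit year."""
    day, month, year = entry.strip().split("-")
    if len(day) != 2 or len(year) != 4:
        raise ValueError(f"day needs 2 digits and year 4 digits: {entry!r}")
    if month.upper() not in MONTHS:
        raise ValueError(f"unknown month abbreviation: {month!r}")
    return date(int(year), MONTHS.index(month.upper()) + 1, int(day))


def format_chat_date(value: date) -> str:
    """Format a date in the DD-MMM-YYYY form, adding the leading zero."""
    return f"{value.day:02d}-{MONTHS[value.month - 1]}-{value.year:04d}"


print(parse_chat_date("01-JUL-1965"))
print(format_chat_date(date(1983, 1, 25)))
```

A fixed month table is used instead of locale-dependent date parsing so that the result does not vary with the system language.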
@Eg and @Eg:
These headers are used to mark the end of a "gem" for analysis by the GEM command. If there is a colon, you must follow the colon with a tab and then one or more code words. Each @Eg must have a matching @Bg. If the @Eg: form is used, then the text following it must exactly match the text in the corresponding @Bg: line. You can nest one set of @Bg-@Eg markers inside another, but double embedding is not allowed. You can also begin a new pair before finishing the current one, but again this cannot be done for three beginnings. Please see the entry above for @Bg for further information.
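The pairing rules for @Bg and @Eg can be sketched as a small checker. This is one possible reading of the rules above, not the logic of CHECK or GEM: the function name check_gems is hypothetical, and "no double embedding" and "not for three beginnings" are both interpreted as "at most two gems open at the same time".

```python
from typing import List


def check_gems(lines: List[str]) -> List[str]:
    """Return a list of problems found in the @Bg/@Eg markers of a transcript."""
    errors = []
    open_gems: List[str] = []  # labels of gems that have begun but not ended
    for number, line in enumerate(lines, start=1):
        name, _colon, entry = line.partition(":")
        name, label = name.strip(), entry.strip()
        if name == "@Bg":
            open_gems.append(label)
            if len(open_gems) > 2:
                errors.append(f"line {number}: more than two gems open at once")
        elif name == "@Eg":
            if label in open_gems:
                open_gems.remove(label)  # labels must match exactly
            else:
                errors.append(f"line {number}: @Eg without matching @Bg: {label!r}")
    for label in open_gems:
        errors.append(f"unclosed @Bg: {label!r}")
    return errors


nested = ["@Bg:\tbook", "*CHI:\tdoggie .", "@Bg:\tpage1",
          "@Eg:\tpage1", "@Eg:\tbook"]
print(check_gems(nested))
```

Tracking open labels in a list rather than a strict stack lets the checker accept both nested and overlapping pairs, since the text allows either.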
@G:
This header is used in conjunction with the GEM program, which is described in the CLAN manual. It marks the beginning of "gems" when no nesting or overlapping of gems occurs. Each gem is defined as material that begins with an @G marker and ends with the next @G marker. We refer to these markers as "lazy" gem markers, because they are easier to use than the @Bg: and @Eg: markers. To use this feature, you need to also use the +n switch in GEM. You may nest at most one @Bg-@Eg pair inside a series of @G headers. As with the @Bg and @Eg markers, this code can either be used alone without a colon or else used with a colon followed by a tab and some following code for later specific retrieval. Please see the entry above for @Bg for further information about gems.
@New Episode
This header simply marks the fact that there has been a break in the recording and that a new episode has started. It is a "bare" header that is used without a colon, because it takes no entry. There is no need to mark the end of the episode because the @New Episode header indicates both the end of one episode and the beginning of another.

@Page:
This header is used to indicate the number of the page from which some text is taken. It should not be used for spoken texts.
@Situation:
This changeable header describes the general setting of the interaction. It applies to all the material that follows it until a new @Situation header appears. The entry for this header is a standard description of the situation. Try to use standard situations such as: "breakfast," "outing," "bath," "working," "visiting playmates," "school," or "getting ready to go out." Here is an example of the completed header line:

@Situation:	Tim and Bill are playing with toys in the hallway.

There should be enough situational information given to allow the user to reconstruct the situation as much as possible.

Who is present?
What is the layout of the room or other space?
What is the social role of those present?
Who is usually the caregiver?
What activity is in progress?
Is the activity routinized and, if so, what is the nature of the routine?
Is the routine occurring in its standard time, place, and personnel configuration?
What objects are present that affect or assist the interaction?

It will also be important to include relevant ethnographic information that would make the interaction interpretable to the user of the database. For example, if the text is parent-child interaction before an observer, what is the culture's evaluation of behaviors such as silence, talking a lot, displaying formulaic skills, defending against challenges, and so forth?

8 Words

Words are the basic building blocks for all sentential and discourse structures.
By studying the development of word use, we can learn an enormous amount about the growth of syntax, discourse, morphology, and conceptual structure. However, in order to realize the full potential of computational analysis of word usage, we need to follow certain basic rules. In particular, we need to make sure that we spell words in a consistent manner. If we sometimes use the form doughnut and sometimes use the form donut, we are being inconsistent in our representation of this particular word. If such inconsistencies are repeated throughout the lexicon, computerized analysis will become inaccurate and misleading. One of the major goals of CHAT analysis is to maximize systematicity and minimize inconsistency. In the Introduction, we discussed some of the problems involved in mapping the speech of language learners onto standard adult forms. This chapter spells out some rules and heuristics designed to achieve the goal of consistency for word-level transcription.

One solution to this problem would be to avoid the use of words altogether by transcribing everything in phonetic or phonemic notation. But this solution would make the transcript difficult to read and analyze. A great deal of work in language learning is based on searches for words and combinations of words. If we want to conduct these lexical analyses, we have to try to match up the child's production to actual words. Work in the analysis of syntactic development also requires that the text be analyzed in terms of lexical items. Without a clear representation of lexical items and the ways that they diverge from the adult standard, it would be impossible to conduct lexical and syntactic analyses computationally. Even for those researchers who do not plan to conduct lexical analyses, it is extremely difficult to understand the flow of a transcript if no attempt is made to relate the learner's sounds to items in the adult language.

At the same time, attempts to force adult lexical forms onto learner forms can seriously misrepresent the data. The solution to this problem is to devise ways to indicate the various types of divergences between learner forms and adult standard forms. Note that we use the term "divergences" rather than "error." Although both learners (MacWhinney & Osser, 1977) and adults (Stemberger, 1985) clearly do make errors, most of the divergences between learner forms and adult forms are due to structural aspects of the learner's system.

This chapter discusses the various tools that CHAT provides to mark some of these divergences of child forms from adult standards. The basic types of codes for divergences that we discuss are:

1. special learner-form markers,
2. codes for unidentifiable material,
3. codes for incomplete words,
4. ways of treating formulaic use of words, and
5. conventions for standardized spellings.

For languages such as English, Spanish, and Japanese, we now have complete MOR grammars. The lexicons used by these grammars constitute the definitive current CHAT standard for words. Please take a look at the relevant lexical files, since they illustrate in great detail the overall principles we are describing in this chapter.
8.1 The Main Line

The word forms we will be discussing here are the principal components of the "main line." This line gives the basic transcription of what the speaker said. The structure of main lines in CHAT is fairly simple. Each main tier line begins with an asterisk. After the asterisk, there is a three-letter speaker ID, a colon, and a tab. The transcription of what was said begins in the ninth column, after the tab, because the tab stop in the editor is set for the eighth column. The remainder of the main tier line is composed primarily of a series of words. Words are defined as a series of ASCII characters separated by spaces. In this chapter, we discuss the principles governing the transcription of words.
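The main line layout just described (asterisk, speaker ID, colon, tab, then space-separated words) can be sketched as follows. The function name parse_main_line is hypothetical. Because the text says the speaker ID is only "usually" three letters, the pattern accepts any run of characters before the colon.

```python
import re
from typing import List, Tuple

# Asterisk, speaker ID, colon, tab, then the transcription.
MAIN_LINE = re.compile(r"^\*([^:\s]+):\t(.*)$")


def parse_main_line(line: str) -> Tuple[str, List[str]]:
    """Split a main line into the speaker ID and its space-separated tokens."""
    match = MAIN_LINE.match(line)
    if not match:
        raise ValueError(f"not a main line: {line!r}")
    speaker, text = match.groups()
    return speaker, text.split()


print(parse_main_line("*SAR:\tI got a bingbing@c ."))
```

The token list returned here still contains the utterance terminator and any bracketed codes; separating those from words is outside this sketch.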
In CLAN, all characters that are not punctuation markers are potentially parts of words. The default punctuation set includes the space and these characters:

None of these characters or the space can be used within words. This punctuation set applies to the main lines and all coding lines with the exception of the %pho and %mod lines, which use the system described in the chapter on Dependent Tiers. Because those systems make use of punctuation markers for special characters, only the space can be used as a delimiter on the %pho and %mod lines. As the CLAN manual explains, this default punctuation set can be changed for particular analyses.

Other non-letter characters can be used within words to express special meanings. These include the various marks in the section on CA coding, as well as these:

8.2 Basic Words

Main lines are composed of words and other markers. Words are pronounceable forms, surrounded by spaces. Most words are entered just as they are found in the dictionary. The first word of a sentence is not capitalized, unless it is a proper noun or a word normally capitalized by itself, such as a noun in German or the word "I" in English.
8.3 Special Form Markers

Special form markers can be placed at the end of a word. To do this, the symbol "@" is used in conjunction with one or two additional letters. Here is an example of the use of the @ symbol:

*SAR:	I got a bingbing@c .

Here the child has invented the form bingbing to refer to a toy. The word bingbing is not in the dictionary and must be treated as a special form. To further clarify the use of these @c forms, the transcriber should create a file called "0lexicon.cdc" that provides glosses for such forms.
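Separating a word from its special form marker, as in bingbing@c above, can be sketched in a few lines. The function name split_special_form is hypothetical, and the sketch simply treats everything from the first "@" onward as the marker without checking it against any list of allowed markers.

```python
from typing import Optional, Tuple


def split_special_form(word: str) -> Tuple[str, Optional[str]]:
    """Return (base form, marker); the marker is None for ordinary words."""
    base, at_sign, marker = word.partition("@")
    if not at_sign:
        return word, None
    return base, "@" + marker


for token in ["I", "got", "a", "bingbing@c"]:
    print(split_special_form(token))
```

A lexical search can then count bingbing as a word while still knowing that it was flagged as a child-invented form.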
The@cformillustratedinthisexampleisonlyoneofmanypossiblespecialformmarkersthatcanbedevised.
Thefollowingtablelistssomeofthesemarkersthatwehavefounduseful.
However,thiscategorizationsystemismeantonlytobesuggestive,notex-haustive.
Researchersmaywishtoaddfurtherdistinctionsorignoresomeofthecategorieslisted.
Theparticularchoiceofmarkersandthedecisiontocodeawordwithamarkerformisonethatismadebythetranscriber,notbyCHAT.
ThebasicideaisthatCLANwilltreatPart1:CHAT45wordsmarkedwiththespeciallearner-formmarkersaswordsandnotasfragments.
Inaddition,theMORprogramwillnotattempttoanalyzespecialformsforpartofspeech,asindicatedinthefinalcolumninthistable.

Special Form Markers

Letters  Categories                 Example                Meaning             POS
@a       addition                   xxx@a                  unintelligible      w
@b       babbling                   abame@b                -                   bab
@c       child-invented form        gumma@c                sticky              chi
@d       dialect form               younz@d                you                 dia
@e       echolalia, repetition      want@e more@e          want more           skip
@f       family-specific form       bunko@f                broken              fam
@g       general special form       gongga@g               -                   skip
@i       interjection, interaction  uhhuh@i                -                   co
@k       multiple letters           ka@k                   Japanese "ka"       n:let
@l       letter                     b@l                    letter b            n:let
@n       neologism                  breaked@n              broke               neo
@o       onomatopoeia               woofwoof@o             dog barking         on
@p       phonol. consistent form    aga@p                  -                   phon
@q       metalinguistic use         no if@q-s or but@q-s   when citing words   meta
@s:*     second-language form       istenem@s:hu           Hungarian word      L2
@s$n     second-language noun       perro@s$n              Spanish noun        n|
@si      singing                    lalala@si              singing             sing
@sl      signed language            apple@sl               apple sign          sign
@sas     sign & speech              apple@sas              apple and sign      sas
@t       test word                  wug@t                  test                test
@u       Unibet transcription       binga@u                -                   uni
@wp      word play                  goobarumba@wp          -                   wp
@x       excluded words             stuff@x                excluded            unk
@z:xxx   user-defined code          word@z:rtfd            any user code

We can define these special markers in the following ways:

Addition can be used to mark an unintelligible string as a word for inclusion on the %mor line.
MOR then recognizes xxx@a as w|xxx. It also recognizes xxx@a$n as, for example, n|xxx. Adding this feature will still not allow inclusion of sentences with unintelligible words for MLU and DSS, because the rules for those indices prohibit this. In most cases, researchers prefer to simply mark unintelligible forms as xxx without the additional @a.

Babbling can be used to mark low-level early babbling. These forms have no obvious meaning and are used just to have fun with sound.

Child-invented forms are words created by the child, sometimes from other words without obvious derivational morphology. Sometimes they appear to be sound variants of other words. Sometimes their origin is obscure. However, the child appears to be convinced that they have meaning and adults sometimes come to use these forms themselves.

Dialect form is often an interesting general property of a transcript. However, the coding of phonological dialect variations on the word level should be minimized, because it often makes transcripts more difficult to read and analyze. Instead, general patterns of phonological variation can be noted in the readme file.

Echolalia form can be marked for individual words. If a whole utterance is echoed, then it is better to use the [+ imit] postcode.

Family-specific forms are much like child-invented forms that have been taken over by the whole family. Sometimes the sources of these forms are children, but they can also be older members of the family. Sometimes the forms come from variations of words in another language. An example might be the use of undertoad to refer to some mysterious being in the surf, although the word was simply undertow initially.

General special form marking with @g can be used when all of the above fail. However, its use should generally be avoided. Marking with the @ without a following letter is not accepted by CHECK.

Interjections can be indicated in standard ways, making the use of the @i notation usually not necessary. Instead of transcribing "ahem@i," one can simply transcribe ahem following the conventions listed later.

Letters can either be transcribed with the @l marker or simply as single-character words. If it is necessary to mark a letter name as plural, it is possible to add a suffix, as in m@l-s.

Multiple letters or strings of letters are marked as @k (as in "kana").

Neologisms are meant to refer to morphological coinages. If the novel form is monomorphemic, then it should be characterized as a child-invented form (@c), family-specific form (@f), or a test word (@t). Note that this usage is only really sanctioned for CHILDES corpora. For AphasiaBank corpora, neologisms are considered to be forms that have no real word source, as is typical in jargon aphasia. If you want to indicate the part of speech for a neologism, you can use a coding like dumpf@n$v to indicate that dumpf is intended to be a verb. This can be helpful for %mor coding.

Nonvoiced forms are produced typically by hearing-impaired children or their parents who are mouthing words without making their sounds.

Onomatopoeias include animal sounds and attempts to imitate natural sounds.

Phonological consistent forms (PCFs) are early forms that are phonologically consistent, but whose meaning is unclear to the transcriber. Often these forms are protomorphemes.

Quoting or Metalinguistic reference can be used to either cite or quote single standard words or special child forms.

Second-language forms derive from some language not usually used in the home. These are marked with @s: followed by a code for the second language, as in @s:zh for Mandarin words inside an English sentence.

Part of speech codes. You can also mark the part of speech of a second language word by using the form @s$ as in perro@s$n to indicate that the Spanish word perro (dog) is a noun. You can use the same method without the @s for L1 words. Thus, the form goodbyes$n will be recognized as n|goodbyes. Also, you can use this method with other special form markers. So, bimp@c$adj would indicate that bimp is a child-invented form that is functioning as an adjective.
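
The way a stem, a special form marker, and a part-of-speech code combine in one word can be illustrated with a short script. This is only a minimal sketch, not part of CLAN or MOR: the function name parse_special_form and the regular expression are our own assumptions, and suffixed forms such as m@l-s are deliberately not handled.

```python
import re
from typing import NamedTuple, Optional


class SpecialForm(NamedTuple):
    stem: str
    marker: Optional[str]  # text after "@", e.g. "c", "s", "s:hu" (hypothetical field)
    pos: Optional[str]     # text after "$", e.g. "n", "adj" (hypothetical field)


# stem, then an optional @marker (letters, optionally ":code"), then an optional $pos
_WORD = re.compile(
    r"^(?P<stem>[^@$]+)"
    r"(?:@(?P<marker>[a-z]+(?::[a-z]+)?))?"
    r"(?:\$(?P<pos>[a-z:]+))?$"
)


def parse_special_form(word: str) -> SpecialForm:
    """Split one main-line word into stem, special form marker, and POS code."""
    m = _WORD.match(word)
    if m is None:
        # e.g. a bare "@" with no following letter, which CHECK also rejects
        raise ValueError(f"not a recognizable special form: {word!r}")
    return SpecialForm(m.group("stem"), m.group("marker"), m.group("pos"))


if __name__ == "__main__":
    for w in ["bingbing@c", "perro@s$n", "istenem@s:hu", "bimp@c$adj", "goodbyes$n"]:
        print(w, parse_special_form(w))
```

A script of this kind can be useful for checking which markers a corpus actually uses before running MOR, but the authoritative check remains the CHECK program.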

Sign language use can be indicated by the @sl.

Sign and speech use involves making a sign or informal sign in parallel with saying the word.

Singing can be marked with @si. Sometimes the phrase that is being sung involves nonwords, as in lalaleloo@si. In other cases, it involves words that can be joined by underscores. However, if a larger passage is sung, it is best to transcribe it as speech and just mark it as being sung through a comment line.

Test words are nonce forms generated by the investigators to test the productivity of the child's grammar.

Unibet transcription can be given on the main line by using the @u marker. However, if many such forms are being noted, it may be better to construct a %pho line. With the advent of IPA Unicode, we now prefer to avoid the use of Unibet, relying instead directly on IPA.

Word play in older children produces forms that may sound much like the forms of babbling, but which arise from a slightly different process. It is best to use the @b for forms produced by children younger than 2;0 and @wp for older children.

Excluded forms can be marked with @x.

User-defined special forms can be marked with @z followed by up to five letters of a user-defined code, such as in word@z:rftd. This format should be used carefully, because it will be difficult for the MOR program to evaluate words with these codes unless additional detailed information is added to the sf.cut file.

The @b, @u, and @wp markers allow the transcriber to represent words and babbling words phonologically on the main line and have CLAN treat them as full lexical items. This should only be done when the analysis requires that the phonological string be treated as a word and it is unclear which standard morpheme corresponds to the word. If a phonological string should not be treated as a full word, it should be marked by a beginning &, and the @b or @u endings should not be used. Also, if the transcript includes a complete %pho line for each word and the data are intended for phonological analysis, it is better to use yyy (see the next section) on the main line and then give the phonological form on the %pho line. If you wish to omit coding of an item on the %pho line, you can insert the horizontal ellipsis character … (Unicode character number 2026). This is a single character, not three periods, and it is not the ellipsis character used by MS-Word.

Family-specific forms are special words used only by the family. These are often derived from child forms that are adopted by all family members. They also include certain "caregiverese" forms that are not easily recognized by the majority of adult speakers but which may be common to some areas or some families. Family-specific forms can be used by either adults or children.

The @n marker is intended for morphological neologisms and over-regularizations, whereas the @c marker is intended to mark nonce creation of stems. Of course, this distinction is somewhat arbitrary and incomplete. Whenever a child-invented form is clearly onomatopoeic, use the @o coding instead of the @c coding. A fuller characterization of neologisms can be provided by the error coding system presented in a separate chapter.

8.4 Unidentifiable Material

Sometimes it is difficult to map a sound or group of sounds onto either a conventional word or a non-conventional word. This can occur when the audio signal is so weak or garbled that you cannot even identify the sounds being used. At other times, you can recognize the sounds that the speaker is using, but cannot map the sounds onto words. Sometimes you may choose not to transcribe a passage, because it is irrelevant to the interaction. Sometimes the person makes a noise or performs an action instead of speaking, and sometimes a person breaks off before completing a recognizable word. All of these problems can be dealt with by using certain special symbols for those items that cannot be easily related to words. These symbols are typed in lower case and are preceded and followed by spaces. When standing alone on a text tier, they should be followed by a period, unless it is clear that the utterance was a question or a command.

Unintelligible Speech    xxx

Use the symbol xxx when you cannot hear or understand what the speaker is saying. If you believe you can distinguish the number of unintelligible words, you may use several xxx strings in a row. Here is an example of the use of the xxx symbol:

*SAR: xxx .
*MOT: what ?
*SAR: I want xxx .

Sarah's first utterance is fully unintelligible. Her second utterance includes some unintelligible material along with some intelligible material. The MLU and MLT commands will ignore the xxx symbol when computing mean length of utterance and other statistics. If you want to have several words included, use as many occurrences of xxx as you wish.

Phonological Coding    yyy

Use the symbol yyy when you plan to code all material phonologically on a %pho line. If you are not consistently creating a %pho line in which each word is transcribed in IPA in the order of the main line, you should use the @u or & notations instead. Here is an example of the use of yyy:

*SAR: yyy yyy a ball .
%pho: tagbal

The first two words cannot be matched to particular words, but their phonological form is given on the %pho line.

Untranscribed Material    www

This symbol must be used in conjunction with an %exp tier which is discussed in the chapter on dependent tiers. This symbol is used on the main line to indicate material that a transcriber does not know how to transcribe or does not want to transcribe. For example, it could be that the material is in a language that the transcriber does not know. This symbol can also be used when a speaker says something that has no relevance to the interactions taking place and the experimenter would rather ignore it. For example, www could indicate a long conversation between adults that would be superfluous to transcribe. Here is an example of the use of this symbol:

*MOT: www .
%exp: talks to neighbor on the telephone

Phonological Fragments    &

Disfluencies such as fillers, phonological fragments, and repeated segments are all coded by a preceding &. More specifically, &- may be used for fillers and &+ for fragments (please see the chapter on disfluency coding for the details). Material following the ampersand symbol will be ignored by certain CLAN commands, such as MLU, which computes the mean length of the utterance in a transcript. If you want a command such as FREQ to count all of the instances of phonological fragments, you would have to add a switch such as +s"&*" (or +s"&+*").

8.5 Incomplete and Omitted Words

Words may also be incomplete or even fully omitted. We can judge a word to be incomplete when enough of it is produced for us to be sure what was intended. Judging a word to be omitted is often much more difficult.

Noncompletion of a Word    text(text)text

When a word is incomplete, but the intended meaning seems clear, insert the missing material within parentheses. Do not use this notation for fully omitted words, only for words with partial omissions. This notation can also be used to derive a consistent spelling for commonly shortened words, such as (un)til and (be)cause. CLAN will treat items that are coded in this way as full words. For programs such as FREQ, the parentheses will essentially be ignored and (be)cause will be treated as if it were because. The CLAN programs also provide ways of either including or excluding the material in the parentheses, depending on the goals of the analysis.

*RAL: I been sit(ting) all day .

The inclusion or exclusion of material enclosed in parentheses is well supported by CLAN and this same notation can also be used for other purposes when necessary. For example, studies of fluency may find it convenient to code the number of times that a word is repeated directly on that word, as in this example with three productions of the word dog. Having three productions means that the form was said three times, that is, once and then repeated two more times.

*JEF: that's a dog [x 3] .

By default, the programs will remove the [x 3] form and the sentence will be treated as a three word utterance. This behavior can be modified by using the +r switch.
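
The two treatments of parenthesized material, and the default removal of a repetition count, can be sketched in a few lines. The following is an illustration only, under our own assumptions: the function names full_form, spoken_form, and tokens are hypothetical, the input is the utterance text without speaker code or terminator, and nothing here reproduces the actual CLAN implementation or its +r switch.

```python
import re
from typing import List

# A repetition count such as [x 3]; the space after "x" is treated as optional here.
_REPEAT = re.compile(r"\[x\s*(\d+)\]")


def full_form(word: str) -> str:
    """Include the parenthesized material: (be)cause -> because."""
    return word.replace("(", "").replace(")", "")


def spoken_form(word: str) -> str:
    """Exclude the parenthesized material: (be)cause -> cause."""
    return re.sub(r"\([^()]*\)", "", word)


def tokens(utterance: str, expand: bool = False) -> List[str]:
    """Split an utterance into words.

    With expand=False the [x N] code is simply dropped, which mirrors the
    default described above. With expand=True the preceding word is
    repeated so that it occurs N times in total.
    """
    out: List[str] = []
    for tok in re.findall(r"\[x\s*\d+\]|\S+", utterance):
        m = _REPEAT.fullmatch(tok)
        if m is None:
            out.append(tok)
        elif expand and out:
            out.extend([out[-1]] * (int(m.group(1)) - 1))
    return out


if __name__ == "__main__":
    print(full_form("sit(ting)"), spoken_form("sit(ting)"))
    print(tokens("that's a dog [x 3]"))
    print(tokens("that's a dog [x 3]", expand=True))
```

Keeping the two word forms as separate functions reflects the point made above that the choice between them depends on the goals of the analysis.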

Omitted Word    0word

The coding of word omissions is a difficult and unreliable process. Many researchers will prefer not to even open up this particular can of worms. On the other hand, researchers in language disorders and aphasia often find that the coding of word omissions is crucial to particular theoretical issues. In such cases, it is important that the coding of omitted words be done in as clear a manner as possible. To code an omission, the zero symbol is placed before a word on the text tier. The full omission of a word should always be coded in this way and not through the use of parentheses. If what is important is not the actual word omitted, but the part of speech, then a code for the part of speech can follow the zero. Similarly, the identity of the omitted word is always a guess. The best guess is placed on the main line. Here is an example of its use:

*EVE: I want 0to go .

It is very difficult to know when a word has been omitted. However, the following criteria can be used to help make this decision for English data:

1. 0det: Unless there is a missing plural, a common noun without an article is coded as 0det.
2. 0v: Sentences with no verbs can be coded as having missing verbs. Of course, often the omission of a verb can be viewed as a grammatical use of ellipsis.
3. 0aux: In standard English, sentences like "he running" clearly have a missing auxiliary.
4. 0subj: In English, every finite verb requires a subject.

In English, there are seldom solid grounds for assigning codes like 0adj, 0adv, 0obj, or 0prep. However, these codes are possible. In addition, some researchers think that in some contexts they can know exactly what words are being omitted. For example, they may mark forms such as 0person, 0spot, and so on. Making such markings is possible, although we would rather see codes at the level of 0v or 0det. Items marked as omitted are not included in the MLU count.
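
To make the exclusions described in this chapter concrete, the sketch below filters a main line down to the words that a length count would keep. It is a rough approximation under stated assumptions: the function name countable_words is hypothetical, tokens are assumed to be separated by spaces, and the count is over words, whereas the real MLU command counts morphemes and has many more options.

```python
from typing import List

TERMINATORS = {".", "?", "!"}


def countable_words(main_line: str) -> List[str]:
    """Return the words of one main line after dropping excluded items.

    Dropped, following the text above: the unintelligible symbol xxx,
    anything beginning with & (fillers and fragments), and omitted
    words coded with a leading zero.
    """
    _speaker, _, text = main_line.partition(":")
    kept: List[str] = []
    for tok in text.split():
        if tok in TERMINATORS:
            continue
        if tok == "xxx" or tok.startswith("&") or tok.startswith("0"):
            continue
        kept.append(tok)
    return kept


if __name__ == "__main__":
    for line in ["*EVE: I want 0to go .", "*SAR: I want xxx .", "*CHI: &-um milk ."]:
        print(line, "->", countable_words(line))
```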

8.6 Standardized Spellings

There are a number of common words in the English language that cannot be found in the dictionary or whose lexical status is vague. For example, how should letters be spelled? What about numbers and titles? What is the best spelling: doggy or doggie, yeah or yah, and pst or pss? If we can increase the consistency with which such forms are transcribed, we can improve the quality of automatic lexical analyses. CLAN commands such as FREQ and COMBO provide output based on searches for particular word strings. If a word is spelled in an indeterminate number of variant ways, researchers who attempt to analyze the occurrence of that word will inevitably end up with inaccurate results. For example, if a researcher wants to trace the use of the pronoun you, it might be necessary to search not only for you, ya, and yah, but also for all the assimilations of the pronouns with verbs such as didya/dicha/didcha or couldya/couldcha/coucha. Without a standard set of rules for the transcription of such forms, accurate lexical searches could become impossible. On the other hand, there is no reason to avoid using these forms if a set of standards can be established for their use.

Other programs rely on the use of dictionaries of words. If the spellings of words are indeterminate, the analyses produced will be equally indeterminate. For that reason, it is helpful to specify a set of standard spellings for marginal words. If you have doubts about the spellings of certain words, you can look in the 0allwords.cdc file that is included in the /lex folder of the MOR grammar for each language. The words there are listed in alphabetical order.

8.6.1 Letters

To transcribe letters, use the @l symbol after the letter. For example, the letter "b" would be b@l. Here is an example of the spelling of a letter sequence.

*MOT: could you please spell your name ?
*MAR: it's m@l a@l r@l k@l .

The dictionary says that "abc" is a standard word, so that is accepted without the @l marking. In Japanese, many letters refer to whole syllables or "kana" such as ro or ka. To represent this as well as strings of letters in English, use the @k symbol, as in ka@k or jklmn@k. Using this form, the above example could better be coded as:

*MOT: could you please spell your name ?
*MAR: it's mark@k .

However, in this case, the spelling is counted as one word, not four.

8.6.2 Compounds and Linkages

Languages use a variety of methods for combining words into larger lexical items. One method involves inflectional processes, such as cliticization and affixation, that will be discussed later. Here we consider compounds and linkages. In earlier versions of CLAN, it was necessary to write compounds in the form of bird+house and baby+sitter, but now the plus is no longer necessary. You can just write birdhouse and babysitter and the correct form will be inserted into the %mor line by the MOR program.

A second level of concatenation involves the use of an underscore to indicate the fact that a phrasal combination is not really a compound, but what we call a "linkage". Common examples here include titles of books such as Green_Eggs_and_Ham, appellations such as Little_Bo_Beep or Santa_Claus, lines from songs such as The_Farmer_in_the_Dell, and places such as Hong_Kong_University. For these forms, the underscore is used to emphasize the fact that, although the form is collocational, it does not obey standard rules of compound formation. Because these forms all begin with a capital letter, the morphological analyzer will recognize them as proper nouns.

The underscore is used for three other purposes. First, it can be used for irregular combinations, such as how_about and how_come. Second, it can be used on the %mor line to represent a multiword English gloss for a single stem, as in "lose_flowers" for defleurir. Third, it is used inside acronyms to make it clear that they involve separate letters, as in c_d or v_c_r. When acronyms are themselves proper nouns, they can be written with full capitalization, as in TGV or CIA without any need to add underscores, because words beginning with capital letters are always assumed to be proper nouns.

The third form of concatenation involves the use of hyphens in words such as cul-de-sac or hi-fi. These words are customarily written with hyphens and that is the way they should be transcribed on the main line. For English, these words are listed in files such as n-hyphen.cut. Hyphens should only be used if the words involved are customarily written with hyphens. Unfortunately, the hyphen is also used on the %mor line to indicate suffixation, as in n|dog-PL for dogs. To eliminate this confusion, when MOR runs, it changes the hyphens that would otherwise appear in words on the %mor line to an en-dash (Unicode 0x2013) on the fly. This change is done for both the stem words and the words in English translation between the = signs.

8.6.3 Capitalization and Acronyms

The MOR program depends on capitalization of the first letter to identify a word as a proper noun. Earlier versions of CHAT only allowed capital letters at the beginnings of words. This meant that transcribers had to write acronyms such as FBI as F_B_I. However, that restriction has now been lifted and writing FBI is now correct. Other examples include MIT, CMU, USA, MTV, ET, and IU. More complicated acronyms may require underscores, as in C_three_PO and R_two_D_two. The recommended way of transcribing the common name for television is just tv. This form is not capitalized, since it is not a proper noun. Similarly, we can write cd, vcr, tv, and dvd. The underscore is the best mark for combinations that are not true compounds such as m_and_m-s for the M&M candy.

Acronyms that are not actually spelled out when produced in conversation should be written as words. Thus UNESCO would be written as Unesco. The capitalization of the first letter is used to indicate the fact that it is a proper noun. There must be no periods inside acronyms and titles, because these can be confused with utterance delimiters.

8.6.4 Numbers and Titles

Numbers should be written out in words. For example, the number 256 could be written as "two hundred and fifty six," "two hundred fifty six," "two five six," or "two fifty six," depending on how it was pronounced. It is best to use the form "fifty six" rather than "fifty-six," because the hyphen is used in CHAT to indicate morphemicization. Other strings with numbers are monetary amounts, percentages, times, fractions, logarithms, and so on. All should be written out in words, as in "eight thousand two hundred and twenty dollars" for $8220, "twenty nine point five percent" for 29.5%, "seven fifteen" for 7:15, "ten o'clock a@l m@l" for 10:00 AM, and "four and three fifths."

Titles such as Dr. or Mr. should be written out in their full capitalized form as Doctor or Mister, as in "Doctor Spock" and "Mister Rogers." For "Mrs." use the form "Missus."

8.6.5 Kinship Forms

The following table lists some of the most important kinship address forms in standard American English. The forms with asterisks cannot be found in Webster's Third New International Dictionary.

Kinship Forms

Child      Formal         Child      Formal
Da(da)     Father         Mommy      Mother
Daddy      Father         Nan        Grandmother
Gram(s)    Grandmother    Nana       Grandmother
Grammy     Grandmother    *Nonny     Grandmother
Gramp(s)   Grandfather    Pa         Father
*Grampy    Grandfather    Pap        Father
Grandma    Grandmother    Papa       Father
Grandpa    Grandfather    Pappy      Father
Ma         Mother         Pop        Father
Mama       Mother         Poppa      Father
Momma      Mother         *Poppy     Father
Mom        Mother

8.6.6 Shortenings

One of the biggest problems that the transcriber faces is the tendency of speakers to drop sounds out of words.
For example, a speaker may leave the initial "a" off of "about," saying instead "'bout." In CHAT, this shortened form appears as (a)bout. CLAN can easily ignore the parentheses and treat the word as "about." Alternatively, there is a CLAN option to allow the commands to treat the word as a spelling variant. Many common words have standard shortened forms. Some of the most frequent are given in the table that follows. The basic notational principle illustrated in that table can be extended to other words as needed. All of these words can be found in Webster's Third New International Dictionary.

More extreme types of shortenings include: "(what)s (th)at" which becomes "sat," "y(ou) are" which becomes "yar," and "d(o) you" which becomes "dyou." Representing these forms as shortenings rather than as nonstandard words facilitates standardization and the automatic analysis of transcripts.

Two sets of contractions that cause particular problems for morphological analysis in English are final apostrophe s and apostrophe d, as in John's and you'd. If you transcribe these as John(ha)s and you(woul)d, then the MOR program will work much more efficiently.

Shortenings

Examples of Shortenings
(a)bout       don('t)        (h)is             (re)frigerator
an(d)         (e)nough       (h)isself         (re)member
(a)n(d)       (e)spress(o)   -in(g)            sec(ond)
(a)fraid      (e)spresso     nothin(g)         s(up)pose
(a)gain       (es)presso     (i)n              (th)e
(a)nother     (ex)cept       (in)stead         (th)em
(a)round      (ex)cuse       Jag(uar)          (th)emselves
ave(nue)      (ex)cused      lib(r)ary         (th)ere
(a)way        (e)xcuse       Mass(achusetts)   (th)ese
(be)cause     (e)xcused      micro(phone)      (th)ey
(be)fore      (h)e           (pa)jamas         (to)gether
(be)hind      (h)er          (o)k              (to)mato
b(e)long      (h)ere         o(v)er            (to)morrow
b(e)longs     (h)erself      (po)tato          (to)night
Cad(illac)    (h)im          prob(ab)ly        (un)til
doc(tor)      (h)imself      (re)corder        wan(t)

The marking of shortened forms such as (a)bout in this way greatly facilitates the later analysis of the transcript, while still preserving readability and phonological accuracy. Learning to make effective use of this form of transcription is an important part of mastering use of CHAT. Underuse of this feature is a common error made by beginning users of CHAT.

8.6.7 Assimilations and Cliticizations

Words such as "gonna" for "going to" and "whyntcha" for "why don't you" involve complex sound changes, often with assimilations between auxiliaries and the infinitive or a pronoun. None of these forms can be found in Webster's Third New International Dictionary. However, to facilitate both phonological and grammatical analysis, it is best to transcribe these forms as they are pronounced, which means as cliticizations. In the corpora in TalkBank, we have tried to always use this form. This means that we always have coulda instead of could have and so on. The exceptions to this are for gonna and gotta. Although we recommend using these forms instead of going to and got to, we were not able to use a global replace function for these two forms, because often going to is used in patterns such as going to Chicago and got to can be used in a form like he got to my house early. However, when doing new transcription, it is very helpful to use gonna and gotta when there is a real cliticization.

Cliticizations

Mod~Aux    Standard       Mod~Inf   Standard    V~Inf    Standard
coulda     could have     gotta     got to      wanna    want to
mighta     might have     hadta     had to      needa    need to
musta      must have      hafta     have to     gonna    going to
shoulda    should have    hasta     has to      sposta   supposed to
woulda     would have     oughta    ought to    useta    used to

In addition to these cliticizations, other common assimilations include forms listed in this table.

Assimilations

Assimilation   Standard      Assimilation   Standard
dunno          don't know    kinda          kind of
dyou           do you        sorta          sort of
gimme          give me       whyntcha       why didn't you
lemme          let me        wassup         what's up
lotsa          lots of       whaddya        what did you

Unlike the mod:aux group, further types of assimilations are nearly limitless. Some of the most common assimilations are listed in the v-clit.cut file in MOR. However, it is not possible to list all possible assimilations or to assign them to particular parts of speech. Moreover, these other assimilations need to be treated as two or more morphemes. To do this, you should use the replacement notation, as in

*CHI: lemme [: let me]

If you do this, MOR and the other programs will work on the material in the square brackets, rather than the lemme form.
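
The effect of the replacement notation can be pictured as a simple substitution in which the form before the brackets is swapped for the material inside them. The sketch below is only an illustration of that idea: the function name apply_replacements and the regular expression are our own assumptions, it handles a single word before the brackets, and it is not how MOR is implemented.

```python
import re

# one transcribed form followed by "[: replacement text]"
_REPLACEMENT = re.compile(r"(\S+)\s*\[:\s*([^\]]+?)\s*\]")


def apply_replacements(text: str) -> str:
    """Return the text with each 'form [: replacement]' pair reduced to the replacement."""
    return _REPLACEMENT.sub(lambda m: m.group(2), text)


if __name__ == "__main__":
    print(apply_replacements("lemme [: let me] go"))
    print(apply_replacements("gimme [: give me] that"))
```

A search for the standard words let and me would then find this utterance, while the original lemme form remains available on the main line.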
An even simpler way of representing some of these forms is by noting omitted letters with parentheses as in: "gi(ve) me" for "gimme," "le(t) me" for "lemme," or "d(o) you" for "dyou."

8.6.8 Communicators and Interjections

Communicators such as uh and nope and interjections, such as ugh and gosh, are very frequent.
In earlier versions of CHAT and MOR, we distinguished interjections from communicators. However, now we treat these all as communicators. Because their phonological shape varies so much, these forms often have an unclear lexical status. The co.cut file in the MOR lexicon for English provides the shapes for these words that will be recognized by the MOR grammar for English. For consistency, these forms should be used even when the actual phonological form diverges from the standardizing convention, as long as the variant is perceived as related to the standard. Rather than creating new forms for variations in vowel length, it is better to use forms such as a:h for aah. The English MOR program uses a standard set of forms in the co.cut, co-rhymes.cut, co-under.cut, and co-voc.cut files in the /lex folder that you will need to consult.

Filled pauses are treated in a different way. They are preceded by the ampersand-hyphen mark (&-) which allows them to be ignored as words. Specifically these forms are used to mark the various forms of filled pauses: &-ah, &-eh, &-er, &-ew, &-hm, &-mm, &-uh, &-uhm, and &-um.

8.6.9 Spelling Variants

A number of words have frequent spelling variants. These include altho for although, donut for doughnut, tho for though, thru for through, and abc's for abcs. Transcribers should use the spellings for these words used by the files in the English MOR grammar. In general, it is best to avoid the use of monomorphemic words with apostrophes. For example, it is better to use the form mam than the form ma'am. However, apostrophes must be used in English for multimorphemic contractions such as I'm or don't.

8.6.10 Colloquial Forms

Colloquial and slang forms are often listed in the dictionary. Examples include telly for television and rad for radical. The following table lists some such colloquial forms with their corresponding standard forms. Words that are marked with an asterisk cannot be found in Webster's Third New International Dictionary.

Colloquial Forms

Form          Meaning                  Form              Meaning
Doggone       problematic              okeydokey         all right
fuddy+duddy   old-fashioned person     telly             television
grabby        grasping (adj)           thingumabob       thing
hon           honey (name)             thingumajig       thing
humongous     huge                     tinker+toy        toy
Looka         look                     who(se)jigger     thing
Lookit        look!                    whatchamacallit   thing

8.6.11 Dialectal Variations

Other variant pronunciations, such as dat for that, involve standard dialectal sound substitutions without deletions.
Unfortunately, using these forms can make lexical retrieval very difficult. For example, a researcher interested in the word together will seldom remember to include tagether in the search string. One solution to this problem is to follow each variant form with the standard form, as given below using the [: replacement] notation. Another solution is to create a full phonological transcription of the whole interaction linked to a full sonic CHAT digitized audio record. In transcripts where the speakers have strong dialectal influences, this is probably the best solution. A third solution is to ignore the dialectal variation and simply transcribe the standard form. If this is being done, the practice must be clearly noted in the readme file.
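
The first of these solutions, following each variant with its standard form, can be applied mechanically once a list of variants has been agreed on. The following sketch shows the idea using a handful of variant and standard pairs from the table in this section. The dictionary DIALECT_TO_STANDARD and the function annotate_variants are hypothetical names of our own, and a real transcript would still need to be checked by hand, since a string such as ta or da may also be an ordinary word or fragment.

```python
# A few variant -> standard pairs taken from the Dialectal Variants table.
DIALECT_TO_STANDARD = {
    "dat": "that",
    "dem": "them",
    "dis": "this",
    "git": "get",
    "tagether": "together",
    "wif": "with",
}


def annotate_variants(text: str) -> str:
    """Follow each listed variant with its standard form in [: ...] notation."""
    out = []
    for tok in text.split():
        standard = DIALECT_TO_STANDARD.get(tok)
        out.append(f"{tok} [: {standard}]" if standard else tok)
    return " ".join(out)


if __name__ == "__main__":
    print(annotate_variants("we go tagether wif dem ."))
```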
None of these forms are in Webster's Third New International Dictionary.

Dialectal Variants

Variant       Standard       Variant      Standard
caint         can't          howsabout    how about
da            the            nutin        nothing
dan           than           sumpin       something
dat           that           ta           to
de            the            tagether     together
dese          these          tamorrow     tomorrow
deir          their          weunz        we
deirselves    themselves     whad         what
dem           them           wif          with
demselves     themselves     ya           you
den           then           yall         you all
dere          there          yer          your
dey           they           youse        you all
dis           this           yinz         you all
dose          those          younz        you all
fer           for            ze           the
git           get            zis          this
gon           going          zat          that
hisself       himself

8.6.12 Baby Talk

Baby talk or "caregiverese" forms include onomatopoeic words, such as choochoo, and diminutives, such as froggie or thingie.
In the following table, diminutives are given in final "-ie" except for the six common forms doggy, kitty, piggy, potty, tummy, and dolly. Wherever possible, use the suffix "-ie" for the diminutive and the suffix "-y" for the adjectivalizer. The following table does not include the hundreds of possible diminutives with the "-ie" suffix simply attached to the stem, as in eggie, footie, horsie, and so on. Nor does it attempt to list forms such as poopy, which use the adjectivalizer "-y" attached directly to the stem. Words that are marked with an asterisk cannot be found in Webster's Third New International Dictionary.

Baby Talk

Baby Talk       Standard          Baby Talk          Standard
beddie(bye)     go to sleep       nunu               hurt
blankie         blanket           night(ie)+night    good night
booboo          injury, hurt      owie               hurt
boom            fall              pantie             underpants
byebye          good-bye          pee                urine, urinate
choochoo        train             peekaboo           looking game
cootchykoo      tickle            peepee             urine, urinate
dark+time       night, evening    peeyou             smelly
doggy           dog               poo(p)             defecation, defecate
dolly           doll              poopoo             defecation, defecate
doodoo          feces             potty              toilet
dumdum          stupid            rockabye           sleep
ew              unpleasant        scrunch            crunch
footie+ballie   football          smoosh             smash
gidd(y)up       get moving        (t)eensy(w)eensy   little
goody           delight           (t)eeny(w)eeny     little
guck            unpleasant        teetee             urine, urinate
jammie          pajamas           titty              breast
kiki            cat               tippytoe           on tips of toes
kitty           cat               tummy              stomach, belly
lookee          look yee!         ugh                unpleasant
moo+cow         cow               (wh)oopsadaisy     surprise or mistake

8.6.13 Word separation in Japanese

Many analyses with CLAN rely on words as items.
However, in Japanese script (Kana, Kanji), words are traditionally not divided by spaces. When transcribing Japanese data in Latin script (Romaji) as well as in Japanese script (Kana Kanji), you should add spaces to identify words. The WAKACHI02 system can be downloaded as a part of the complete JPN grammar from https://talkbank.org/morgrams. This web page summarizes the rules for word separation (Wakachigaki). It is crucial to follow these rules in order to get correct results from MOR (automatic morphological analysis) or DSS (Developmental Sentence Score).

8.6.14 Abbreviations in Dutch

Dutch makes extensive use of abbreviations in which vowels are often omitted leaving single consonants, which are merged with nearby words. For consistency of morphological analysis, it is best to transcribe these shortenings using the parenthesis notation, as follows:

Abbreviations in Dutch

Abbreviation   CHAT form    Abbreviation   CHAT form
'k             (i)k         nie            nie(t)
'm             (he)m        es             e(en)s
'r             (e)r         'n             (ee)n
z'n            z(ij)n       's             (i)s
'b             (he)b        't             (he)t
'ns            (ee)ns       wa             wa(t)
'rin           (e)rin       da             da(t)
'raf           (e)raf       'weest         (ge)weest
'ruit          (e)ruit      'rop           (e)rop

Some forms that should probably remain with their standard apostrophes include 's morgens, 's ochtends, 's avonds, 's nachts, and the apostrophe-s plural form.

9 Utterances

The basic units of CHAT transcription are the morpheme, the word, and the utterance. In the previous two chapters we examined principles for transcribing words and morphemes. In this chapter we examine principles for delimiting utterances.

9.1 One Utterance or Many?

Early child language is rich with repetitions. For example, a child may often say the same word or group of words eight times in a row without changes. The CHAT system provides mechanisms for coding these repetitions into single utterances. However, at the earliest stages, it may be misleading to try to compact these multiple attempts into a single line. Consider five alternative ways of transcribing a series of repeated words.

1. Simple transcription of the words as several items in a single utterance:
*CHI: milk milk milk milk .

2. Transcription of the words as items in a single utterance, separated by commas:
*CHI: milk, milk, milk, milk .

3. Transcription as four uses of a single word. This is equivalent to (4) below.
*CHI: milk [x 4] .

4. Treatment of the words as a series of attempts to repeat the single word:
*CHI: milk [/] milk [/] milk [/] milk .

5. Treatment of the words as separate utterances:
*CHI: milk .
*CHI: milk .
*CHI: milk .
*CHI: milk .

These five forms of transcription will lead to markedly different analytic outcomes for programs such as MLU (mean length of utterance). The first two forms will both be counted as having one utterance with four morphemes for an MLU of 4.0. The third and fourth forms will be counted as having one utterance with one morpheme for an MLU of 1.0. The fifth form will be counted as having four utterances each with one morpheme for an MLU of 1.0.
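
The arithmetic behind these three outcomes can be reproduced with a small script. This is a worked illustration under simplifying assumptions, not the CLAN MLU command: the function names morphemes and mlu are hypothetical, each occurrence of milk is taken to be one morpheme, and the [x N] and [/] codes are handled according to the default treatment described above.

```python
import re
from typing import List


def morphemes(utterance: str) -> List[str]:
    """Words counted for one utterance, with repetition codes removed."""
    text = re.sub(r"\[x\s*\d+\]", " ", utterance)   # drop repetition counts such as [x 4]
    text = re.sub(r"\S+\s+\[/\]", " ", text)        # drop each word that is followed by [/]
    stripped = (t.strip(",.?!") for t in text.split())
    return [t for t in stripped if t]               # discard bare punctuation


def mlu(utterances: List[str]) -> float:
    """Mean number of counted words per utterance."""
    return sum(len(morphemes(u)) for u in utterances) / len(utterances)


if __name__ == "__main__":
    forms = {
        1: ["milk milk milk milk ."],
        2: ["milk, milk, milk, milk ."],
        3: ["milk [x 4] ."],
        4: ["milk [/] milk [/] milk [/] milk ."],
        5: ["milk .", "milk .", "milk .", "milk ."],
    }
    for number, utterances in forms.items():
        print(number, mlu(utterances))
```

Forms 1 and 2 give 4 / 1 = 4.0, forms 3 and 4 give 1 / 1 = 1.0, and form 5 gives 4 / 4 = 1.0, so the same recording yields different values depending only on how the transcriber segmented it.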
Ofcourse,notallanalysesdependcruciallyonthecomputationofMLU,butproblemswithdecidinghowtocomputeMLUpointtodeeperissuesintranscriptionandanalysis.
InordertocomputeMLU,onehastodecidewhatisawordandwhatisanutteranceandthesearetwoofthebiggestdecisionsthatonehastomakewhentranscribingandanalyzingchildlanguage.
Inthissense,thecomputationofMLUservesasamethodologicaltripwirefortheconsiderationofthesetwodeeperissues.
Otheranalyses,includinglexical,syntactic,anddiscourseanalysesalsorequirethatthesedecisionsbemadeclearlyandconsistently.
However,becauseofitsconceptualsimplicity,theMLUindexplacestheseproblemsintothesharpestfocus.
Part1:CHAT60Thefirsttwoformsoftranscriptionallmakethebasicassumptionthatthereisasingleutterancewithfourmorphemes.
Giventheabsenceofanyclearsyntacticrelationbetweenthefourwords,itseemsdifficulttodefenduseofthisformoftranscription.
Thethirdandfourthformsoftranscriptiontreatthesuccessiveproductionsoftheword"milk"asrepeatedattemptstoproduceasingleword.
Thisformoftranscriptionmakessenseifthechildwassimplyperseverating.
Ifthethirdformoftranscriptionisused,thecommandswill,bydefault,treattheutteranceashavingonlyonemorpheme.
Forthefourthformoftranscription,CLANprovidestwopossibilities.
Thedefaultistotreatthefourthtypeasavariantofthethirdform.
However,thereisalsoaCLANoptionthatallowstheusertooverridethisdefaultandtreateachwordasaseparatemorpheme.
ThisthenallowstheresearchertocomputetwodifferentMLUvalues.
Theanalysiswithrepetitionsexcludedcouldbeviewedastheonethatemphasizessyntacticstructureandtheonewithrepetitionsincludedcouldbeviewedastheonethatemphasizesproductivity.
Finally,ifthereisevidencethatthewordisnotsimplyarepetition,itwouldseembesttousethefifthformoftranscription.
Thisisparticularlytrueiftheintonationpatternindicatesrepeatedinsistenceonabasicsingle-wordmessage.
The example we have been discussing involves a simple case of word repetition. In other cases, researchers may want to group together non-repeated words for which there is only partial evidence of syntactic or semantic combination. Consider the contrast between these next two examples. In the first example, the presence of the conjunction "and" motivates treatment of the words as a syntactic combination:
*CHI:	red, yellow, blue, and white.
However, without the conjunction or other intonational evidence, the words are best treated as separate utterances:
*CHI:	red.
*CHI:	yellow.
*CHI:	blue.
*CHI:	white.
As the child gets older, the solidification of intonational patterns and syntactic structures will give the transcriber more reason to group words together into utterances and to code retracings and repetitions as parts of larger utterances.
9.2 Satellite Markers
Segmentation into utterances can be facilitated through careful treatment of interactional markers and other "communicators" such as "yes," "sure," "well," and "now." These markers should be grouped together with the utterances to which they are most closely bound in terms of intonation. This grouping can be marked with commas, or more explicitly through the use of prefixed (F2+v to enter) and suffixed (F2+t to enter) interactional markers, as in these examples.
*CHI:	no Mommy no go.
*CHI:	Mommy I want some.
*CHI:	you need it right?
*CHI:	you need it don't you?
The types of elements that occur as initial satellites include vocatives and communicators (well, but, sure, gosh). Elements that occur as final satellites include question markers (okay, see) and sentence final particles, as well as vocatives and communicators. The use of these prefixing and suffixing interactional markers is particularly important for Asian languages that use sentence final particles. These satellite markers should be surrounded by spaces, since they will be treated as separate word forms by MOR and GRASP. Use of these markers helps improve syntactic analysis and provides a more realistic characterization of utterances.
9.3 Discourse Repetition
Earlier, we discussed problems involved in deciding whether a group of words should be viewed as one utterance or as several. This issue moves into the background when the word repetitions are broken up by the conversational interactions or by the child's own actions. Consider this example:
*MOT:	what do you drink for breakfast?
*CHI:	milk.
*MOT:	and what do you drink for lunch?
*CHI:	milk.
*MOT:	how about for dinner?
*CHI:	milk.
*MOT:	and what is your favorite thing to drink at bedtime?
*CHI:	milk.
Or the child may use a single utterance repeatedly, but each time with a slightly different purpose. For example, when putting together a puzzle, the child may pick up a piece and ask:
*CHI:	where does this piece go?
This may happen nine times in succession. In both of these examples, it seems unfair from a discourse point of view to treat each utterance as a mere repetition. Instead, each is functioning independently as a full communication. One may want to mark the fact that the lexical material is repeated, but this should not affect other quantitative measures.
9.4 C-Units, sentences, utterances, and run-ons
There is a tendency in the literature to avoid the use of the term "sentence" to refer to the units of spoken language. Instead, researchers use the terms "utterance" and "c-unit" or conversational unit. The latter is defined as a main clause along with its dependent (subordinate or coordinate) clauses. However, when defined in this way, a c-unit is really not too different from a sentence. The major difference is that a c-unit may be incomplete and may include disfluencies, retraces, etc., which would not be present in written language. In the past, some transcribers have tended to group all of the words in a turn into a single sentence with only one final delimiter. This is a mistake. Utterances can include main clauses with associated dependent clauses, but they should not include multiple main clauses. Sometimes children will string together multiple utterances with "and … and". In such cases, each utterance with a new "and" should be placed on a new tier, as a new utterance. However, clauses that are joined by other conjunctions should be treated as a single utterance.
9.5 Retracing
When a speaker abandons an utterance, sometimes another speaker will take a turn. In that case, the first utterance can be marked with a trailing off terminator, as discussed below. However, if the first speaker continues, then the transcriber has a choice to make. One possibility is to mark the abandoned and retraced material with the [//] symbol along with some scoping markers. However, if the abandonment of the first segment is followed by a significant pause, then it would be better to consider it as a trailing off and then to begin a new utterance with the following material. In either case, it is a good idea to mark the fact that there was a long pause by inserting a pause marker with a time value, such as (3.5) for 3.5 seconds.
9.6 Basic Utterance Terminators
The basic CHAT utterance terminators are the period, the question mark, and the exclamation mark. CHAT requires that there be only one utterance on each main line. In order to mark this, each utterance must end with one of these three utterance terminators. It is possible to use the comma on the main line, but it is not treated as a terminator. However, a single main line utterance may extend for several computer lines, as in this example:
*CHI:	this.
*MOT:	if this is the one you want, you will have to take your spoon
	out of the other one.
The utterance in this main tier extends for two lines in the computer file. When it is necessary to continue an utterance on the main tier onto a second line, the second line must begin with a tab. CLAN is set to expect no more than 2000 characters in each main tier, dependent tier, or header line.
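The continuation rule can be illustrated with a short Python sketch that folds tab-initial lines back into the tier they continue and then checks the 2000-character limit. The function name `join_tiers` and the choice to join segments with a single space are assumptions made for this example; CLAN's own reader may behave differently.

```python
def join_tiers(lines):
    """Merge tab-initial continuation lines into the tier they continue."""
    tiers = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("\t") and tiers:
            # A continuation line: append its text to the previous tier.
            tiers[-1] += " " + line.strip()
        else:
            tiers.append(line)
    return tiers

MAX_TIER_LENGTH = 2000  # the limit stated in the text above

sample = [
    "*CHI:\tthis.",
    "*MOT:\tif this is the one you want, you will have to take your spoon",
    "\tout of the other one.",
]
tiers = join_tiers(sample)
too_long = [t for t in tiers if len(t) > MAX_TIER_LENGTH]
```

Here the three computer lines reduce to two tiers, one per utterance, and neither exceeds the limit.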
Period .
A period marks the end of an unmarked (declarative) utterance. Here are some examples of unmarked utterances:
*SAR:	I got cold.
*SAR:	pickle.
*SAR:	no.
For correct functioning of CLAN, periods should be eliminated from abbreviations. Thus "Mrs." should be written as Mrs and E.T. should become E+T. Only proper nouns and the word "I" and its contractions are capitalized. Words that begin sentences are not capitalized.
Question Mark ?
The question mark indicates the end of a question. A question is an utterance that uses a wh-question word, subject-verb inversion, or a tag question ending. Here is an example of a question:
*FAT:	is that a carrot?
The question mark can also be used after a declarative sentence when it is spoken with the rising intonation of a question.
Exclamation Point !
An exclamation point marks the end of an imperative or emphatic utterance. Here is an example of an exclamation:
*MOT:	sit down!
If this utterance were to be conveyed with final rising contour, it would instead be:
*MOT:	sit down?
9.7 Separators
CHAT allows for the use of several conventional punctuation features that have no formal role in the transcription system. We call these "separators" and distinguish them from terminators, which have a formal role, and the various CA intonation marks.
Comma ,
The comma is used widely throughout CHAT transcripts to represent a combination of features such as pause, syntactic juncture, intonational drop, and others. Although it has no formal definition or systematic characterization, it is fine to use this symbol. The use of comma to mark level intonation in CA is replaced by the use of the mark →.
Semicolon ;
The semicolon is used primarily to mark syntactic structures in corpora such as the SCOTUS oral arguments from the Supreme Court. Most conversational transcripts do not need to use this mark. The use of semicolon to mark a light final drop in CA is replaced by the use of the mark ↘.
Colon :
In order to use the colon as a separator, it must be surrounded by spaces. The colon is also used within words to mark lengthening.
Other
Transcribers should avoid using other separators, because most of them have special meanings in CHAT.
9.8 Tone Direction
Earlier versions of CHAT used a special set of terminating tone units, such as -? and -!. In order to bring CHAT more into accord with standard practice, we have shifted to a reliance on marks such as ↑ for rising and ↓ for falling. All the other CA marks can also be used in CHAT files. However, unlike CA, CHAT requires that every utterance have a final delimiter. This means that CA and CHAT are in agreement in assuming that final question mark includes a rising intonation, final exclamation mark represents emphatic intonation, and that final period represents a final fall. In addition, CHAT assumes that the question mark is used with questions, that the exclamation mark is used with exclamations, and that the period terminates declarative sentences. Sometimes questions do not end in a rising intonation. In that case, the actual intonation used can be marked with the falling mark ↓ after the final word, then followed by the question mark, as in this example:
*MOT:	are you going to store ↓ ?
Final rise-fall contour can be represented with ↑↓ and final fall-rise can be represented with ↓↑.
9.9 Prosody Within Words
CHAT also provides codes for marking lengthening and pausing within words. For marking features such as stressing and pitch rise and fall, transcribers should rely on the CHAT-CA marks indicated above and provided in the chapter on CA coding. In addition to those symbols, the following symbols are also available:
Primary Stress ˈ
The Unicode symbol ˈ (U+02C8) can be used to mark primary stress. It is placed right before the stressed syllable, as in this example:
*MOT:	baby want baˈna:nas
Secondary Stress ˌ
The Unicode symbol ˌ (U+02CC) can be used to mark secondary stress. It is placed right before the stressed syllable, as in this example:
*MOT:	baby want bana:nas
Lengthened Syllable :
A colon within a word indicates the lengthening or drawling of a syllable. This mark should be attached to a vowel or continuant, because it is difficult to drawl an obstruent:
*MOT:	baby want bana:nas
Pause Between Syllables ^
A pause between syllables may be indicated as in this example:
*MOT:	is that a rhi^noceros
There is no special CHAT symbol for a filled pause. Instead, &-ah, &-eh, &-er, &-ew, &-hm, &-mm, &-uh, &-uhm, and &-um are used to mark the various forms of filled pauses.
Blocking ^
Speakers with marked language disfluencies often engage in a form of word attack known as "blocking" (Bernstein Ratner et al., 1996). This form of word attack is marked by a caret or up arrow placed directly before the word.
9.10 Local Events
We tend to think of the basic form of a transcript as involving a series of words, along with occasional commentary about these words. We can think of these words as a chain of events in which our convention of writing from left to right represents the temporal sequence of the events. During this sequence of words, we can also distinguish a variety of local events that do not map onto words. There are five types of these local events: simple events, complex events, pauses, long events, and interposed remarks.
9.10.1 Simple Events
In addition to the formalized exclamations given in the chapter on words, speakers produce a wide variety of sounds such as cries, sneezes, and coughs. These are indicated in CHAT with the prefix &=, in order to produce forms such as &=sneezes and &=yells. In order to retrieve these forms consistently, we have set up the following standardized spellings. Note that verbs are given in the third person present form. Other languages can either use this set or create their own translations of these terms. Perhaps the most common of these is &=laughs, which can be used to represent all types of laughs, chuckles, and giggles.
&=belches	&=hisses	&=grunts	&=whines
&=coughs	&=hums	&=roars	&=whistles
&=cries	&=laughs	&=sneezes	&=whimpers
&=gasps	&=moans	&=sighs	&=yawns
&=groans	&=mumbles	&=sings	&=yells
&=growls	&=pants	&=squeals	&=vocalizes
It is important to remember that these codes must fully characterize complete local events. If your intention is to mark that a stretch of words has been mumbled, then you should use the scoped codes discussed in the next chapter. However, if you only wish to code that some mumbling or singing occurs at a particular point, then you can use this simpler form.
Simple event forms can also be used to mark actions such as running and reading. When these actions are transitive, as in imit: (imitation), point: and move: they can also take an object. For example, a very common vocalizer is &=imit:motor for an imitation of the sound of a motor. The table below illustrates this use of compound simple codes.
&=imit:motor	&=ges:frustration	&=writes:dog	&=points:car
&=imit:plane	&=ges:squeeze	&=reads:sign	&=points:nose
&=imit:lion	&=ges:come	&=walks:door	&=turns:page
&=imit:baby	&=shows:picture	&=runs:door	&=hits:table
&=ges:ignore	&=shows:scab	&=eats:cookie	&=pats:head
&=ges:unsure	&=moves:doll	&=drinks:milk
The object of the &=imit codes indicates the noise source being imitated vocally. The objects of the &=ges codes indicate the meaning of the gestures being used. The objects of activities such as &=walks and &=runs indicate the direction or goal of the walking or running. For actions such as &=slurps and &=eats used by themselves, the code represents the auditory results of the slurping or eating. Finally, you can compose codes using parts of the body as in &=head:yes to indicate nodding "yes" with the head. Some codes of this type include: &=head:yes, &=head:no, &=head:shake, &=hands:no, &=hands:hello, &=eyes:open, &=mouth:open, and &=mouth:close.
This form of coding is compact and can be easily searched. Moreover, it is easy to locate at a point within an ongoing utterance without breaking up the readability of the utterance. Whenever possible, try to use this form of coding as a substitute for writing longer comments on the comment line or inserting complex local events on the main line.
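Because simple event codes follow a fixed shape, they are easy to retrieve with a regular expression. The following Python sketch is one way to do that outside of CLAN; the pattern and the function name `simple_events` are assumptions made for this illustration, and the pattern only covers lowercase labels of the kind listed above.

```python
import re

# "&=" followed by a lowercase label and an optional ":object" part.
SIMPLE_EVENT = re.compile(r"&=([a-z]+)(?::([^\s:]+))?")

def simple_events(main_line):
    """Return (action, object) pairs; object is None for bare codes."""
    return [(m.group(1), m.group(2)) for m in SIMPLE_EVENT.finditer(main_line)]

events = simple_events("*CHI:\t&=laughs look &=points:car &=imit:motor .")
body = simple_events("*CHI:\t&=head:yes .")
```

Each result pairs the action with its object, so that all imitations, gestures, or pointing events in a corpus can be tallied by the first element alone.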
9.10.2 Interposed Word &*
It is sometimes convenient to mark the interposition or insertion of a short comment word in a back channel, such as "yeah" or "mhm", within a longer discourse from the speaker who has the floor without breaking up the utterance of the main speaker. This is marked using &* followed by the speaker's 3-letter ID, a colon, and then the interposed word. Here is an example of how this can be used:
*CHI:	when I was over at my friend's house &*MOT:mhm the dog tried to lick me all over.
9.10.3 Complex Local Events
In addition to the restricted set of simple events discussed above, it is possible to use an open form to simply insert any sort of description of an event on the main line.
Complex Local Event [^ text]
Like the simple local events, these complex local events are assumed to occur exactly at the position marked in the text and not to extend over some other events. If the material is intended as a comment over a longer scope of events, use the form of the scoped comments given in the next section. This form of coding can also be used at the very beginning of utterances to replace the earlier "precodes" that marked things like the specific addressee, events just before the utterance, or the background to the utterance.
9.10.4 Pauses
The third type of local event is the unfilled pause, which takes up a specified duration of time at the point marked by the code. Pauses that are marked only by silence are coded on the main line with the symbol (.). Longer pauses between words can be represented as (..) and a very long pause as (...). This example illustrates these forms:
*SAR:	I don't (..) know.
*SAR:	(...) what do you (...) think?
If you want to be exact, you can code the exact length of the pauses in seconds, as in these examples.
*SAR:	I don't (0.15) know.
*SAR:	(13.4) what do you (2.) think?
If you need to add minutes, place them before a colon. The following example marks a pause of one minute and 5.15 seconds:
*SAR:	I don't (1:05.15) know.
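The pause notations above can be converted to durations with a small Python sketch. The regular expression and the function name `pause_seconds` are assumptions made for this example; symbolic pauses such as (.) carry no time value and are reported as `None`.

```python
import re

# Matches (.), (..), (...) and timed pauses such as (0.15), (2.) or (1:05.15).
PAUSE = re.compile(r"\((\.{1,3}|(?:(\d+):)?(\d+\.?\d*))\)")

def pause_seconds(main_line):
    """Return one entry per pause: seconds as a float, or None if untimed."""
    durations = []
    for m in PAUSE.finditer(main_line):
        if m.group(3) is None:
            durations.append(None)  # symbolic pause: (.), (..), (...)
        else:
            minutes = int(m.group(2) or 0)
            durations.append(60 * minutes + float(m.group(3)))
    return durations

result = pause_seconds("*SAR:\tI don't (1:05.15) know (..) what (2.) .")
```

Parenthesized letters in incomplete words, such as doin(g), are not matched, because the pattern accepts only periods or digits inside the parentheses.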
9.10.5 Long Events
It is possible to mark the beginning and ending of some extralinguistic event with the long feature convention. For this marking, there is a beginning code at the beginning of the event and a termination code for the ending.
Long Vocal Event &{l=* intervening text &}l=*
Here the asterisk marks some description of the long event. For example, a speaker could begin laughing at the point marked by &{l=laughs and then continue until the end marked by &}l=laughs.
Long Nonvocal Event &{n=* intervening text &}n=*
Here the asterisk marks some description of a long nonverbal event. For example, a speaker could begin waving their hands at the point marked by &{n=waving:hands and then continue until the end marked by &}n=waving:hands.
9.11 Special Utterance Terminators
In addition to the three basic utterance terminators, CHAT provides a series of more complex utterance terminators to mark various special functions. These special terminators all begin with the + symbol and end with one of the three basic utterance terminators.
Trailing Off +...
The trailing off or incompletion marker (plus sign followed by three periods) is the terminator for an incomplete, but not interrupted, utterance. Trailing off occurs when speakers shift attention away from what they are saying, sometimes even forgetting what they were going to say. Usually the trailing off is followed by a pause in the conversation. After this lull, the speaker may continue with another utterance or a new speaker may produce the next utterance. Here is an example of an uncompleted utterance:
*SAR:	smells good enough for +...
*SAR:	what is that?
If the speaker does not really get a chance to trail off before being interrupted by another speaker, then use the interruption marker +/. rather than the incompletion symbol. Do not use the incompletion marker to indicate either simple pausing (.), repetition [/], or retracing [//]. Note that utterance fragments coded with +... will be counted as complete utterances for analyses such as MLU, MLT, and CHAINS. If your intention is to avoid treating these fragments as complete utterances, then you should use the symbol [/-] discussed later.
Trailing Off of a Question +..?
If the utterance that is being trailed off has the shape of a question, then this symbol should be used.
Question With Exclamation +!?
When a question is produced with great amazement or puzzlement, it can be coded using this symbol. The utterance is understood to constitute a question syntactically and pragmatically, but an exclamation intonationally.
Interruption +/.
This symbol is used for an utterance that is incomplete because one speaker is interrupted by another speaker. Here is an example of an interruption:
*MOT:	what did you +/.
*SAR:	Mommy.
*MOT:	+, with your spoon.
Some researchers may wish to distinguish between an invited interruption and an uninvited interruption. An invited interruption may occur when one speaker is prompting his addressee to complete the utterance. This should be marked by the ++ symbol for other-completion, which is given later. Uninvited interruptions should be coded with the symbol +/. at the end of the utterance. An advantage of using +/. instead of +... is that programs like MLU are able to piece together the two segments and treat them as a single utterance when a segment with +/. is followed by +, on the next utterance.
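The piecing together of an interrupted segment and its completion can be sketched as follows. This is a hypothetical illustration of the idea, not the logic used inside CLAN's MLU program; the function name `merge_interrupted` and the representation of a transcript as (speaker, text) pairs are assumptions, and chains of more than one interruption are not handled.

```python
def merge_interrupted(utterances):
    """Join a '+/.' segment with the same speaker's later '+,' segment.

    `utterances` is a list of (speaker, text) pairs in transcript order.
    """
    merged = []
    open_index = {}  # speaker -> index in `merged` of an interrupted segment
    for speaker, text in utterances:
        if text.startswith("+,") and speaker in open_index:
            i = open_index.pop(speaker)
            # Drop the "+/." terminator and the "+," linker, then join.
            head = merged[i][1][: -len("+/.")].rstrip()
            merged[i] = (speaker, head + " " + text[2:].strip())
            continue
        if text.endswith("+/."):
            open_index[speaker] = len(merged)
        merged.append((speaker, text))
    return merged

pieced = merge_interrupted([
    ("MOT", "what did you +/."),
    ("SAR", "Mommy."),
    ("MOT", "+, with your spoon."),
])
```

After merging, the mother's two segments count as one utterance, which is the behavior the text describes as the advantage of +/. over +... for length measures.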
Interruption of a Question +/?
If the utterance that is being interrupted has the shape of a question, then this symbol should be used.
Self-Interruption +//.
Some researchers wish to be able to distinguish between incompletions involving a trailing off and incompletions involving an actual self-interruption. When an incompletion is not followed by further material from the same speaker, the +... symbol should always be selected. However, when the speaker breaks off an utterance and starts up another, the +//. symbol can be used, as in this example:
*SAR:	smells good enough for +//.
*SAR:	what is that?
There is no hard and fast way of distinguishing cases of trailing off from self-interruption. For this reason, some researchers prefer to avoid making the distinction. Researchers who wish to avoid making the distinction should use only the +... symbol.
Self-Interrupted Question +//?
If the utterance being self-interrupted is a question, you can use the +//? symbol.
Transcription Break +.
It is often convenient to break utterances at phrasal boundaries in order to mark overlaps. When this is done, the first segment is ended with the +. terminator, as in this example:
*SAR:	smells good enough for me +.
*MOT:	but +.
*SAR:	if I could have some.
*MOT:	why would you want it?
Quotation “ and ”
For marking short quotation stretches inside an utterance, the begin double-quote (“, Unicode 201C) and end double-quote (”, Unicode 201D) symbols can be used. These can be entered in the CLAN editor using F2-' and F2-" respectively.
Quotation Follows +"/.
During story reading and similar activities, a great deal of talk may involve direct quotation. In order to mark off this material as quoted, a special symbol can be used, as in the following example:
*CHI:	and then the little bear said +"/.
*CHI:	+" please give me all of your honey.
*CHI:	+" if you do, I'll carry you on my back.
The use of the +"/. symbol is linked to the use of the +" symbol. Breaking up quoted material in this way allows us to maintain the rule that each separate utterance should be on a separate line. This form of notation is only used when the material being quoted is a complete clause or sentence. It is not needed when a few words are being quoted in noncomplement position. In those cases, use the standard single and double quotation marks described just above.
Quotation Precedes +".
This symbol is used when the material being directly quoted precedes the main clause, as in the following example:
*CHI:	+" please give me all of your honey.
*CHI:	the little bear said +".
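Since every special terminator ends in a basic one, a program that identifies terminators has to test the longest symbols first. The Python sketch below illustrates this for the period-final symbols described in this section; the table, its labels, and the function name `terminator` are assumptions made for the example, and the sketch assumes that no postcodes or time marks follow the terminator. The question-final variants could be added to the table in the same way.

```python
# Longest symbols first, so that "+..." is not mistaken for a plain period.
TERMINATORS = [
    ('+"/.', "quotation follows"),
    ("+//.", "self-interruption"),
    ("+...", "trailing off"),
    ('+".', "quotation precedes"),
    ("+/.", "interruption"),
    ("+.", "transcription break"),
    (".", "period"),
    ("?", "question"),
    ("!", "exclamation"),
]

def terminator(main_line):
    """Return (symbol, label) for the terminator ending a main line, or None."""
    text = main_line.rstrip()
    for symbol, label in TERMINATORS:
        if text.endswith(symbol):
            return symbol, label
    return None
```

A return value of `None` signals a main line with no terminator at all, which CHAT does not allow.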
9.12 Utterance Linkers
There is another set of symbols that can be used to mark other aspects of the ways in which utterances link together into turns and discourse. These symbols are not utterance terminators, but utterance initiators, or rather "linkers." They indicate various ways in which an utterance fits in with an earlier utterance. Each of these symbols begins with the + sign.
Quoted Utterance +"
This symbol is used in conjunction with the +"/. and +". symbols discussed earlier. It is placed at the beginning of an utterance that is being directly quoted.
Quick Uptake +^
Sometimes an utterance of one speaker follows quickly on the heels of the last utterance of the preceding speaker without the customary short pause between utterances. An example of this is:
*MOT:	why did you go?
*SAR:	+^ I really didn't.
Self Completion +,
The symbol +, can be used at the beginning of a main tier line to mark the completion of an utterance after an interruption. In the following example, it marks the completion of an utterance by CHI after interruption by EXP. Note that the incompleted utterance must be terminated with the incompletion marker.
*CHI:	so after the tower +/.
*EXP:	yeah.
*CHI:	+, I go straight ahead.
Other Completion ++
A variant form of the +, symbol is the ++ symbol, which marks "latching" or the completion of another speaker's utterance, as in the following example:
*HEL:	if Bill had known +...
*WIN:	++ he would have come.
10 Scoped Symbols
Up to this point, the symbols we have discussed are inserted at single points in the transcript. They refer to events occurring at particular points during the dialogue. There is another major class of symbols that refers not to particular points in the transcript, but to stretches of speech. These marker symbols are enclosed in square brackets and the material to which they relate can be enclosed in angle brackets. The material in the square brackets functions as a descriptor of the material in angle brackets. If a scoped symbol applies only to the single word preceding it, the angle brackets need not be marked, because CLAN considers that the material in square brackets refers to a single preceding word when there are no angle brackets. There should be no other material entered between the square brackets and the material to which it refers. Depending on the nature of the material in the square brackets, the material in the angle brackets may be automatically excluded from certain types of analysis, such as MLU counts and so forth. Scoped symbols are useful for marking a wide variety of relations, including paralinguistics, explanations, and retracings.
10.1 Audio and Video Time Marks
In order to link segments of the transcript to stretches of digitized audio and video, CHAT uses the following notation:
Time Alignment ·0_1073·
This marker provides the begin and end time in milliseconds for a segment in a digitized video file or audio file. Usually, this information is hidden. However, if you use the escape-A command in the editor, the bullet will expand and you will see the time values. Each set of time alignment information has an implicit scope that includes all of the material to the left up to the next set of bullets. These time marks allow for single utterance playback or continuous playback. If you insert a dash before the time, as in ·-5567_9888· this indicates that continuous playback should not actually wait through long periods of silence between the bullets. By default, these bullets should occur at the end of speaker lines, after the final terminator and after any postcodes. However, if the option "multiple" is selected in the @Options field, then bullets may also occur within utterances.
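Time marks of this shape can be read with a short Python sketch. The bullet delimiter is written here as "·" (U+00B7) because that is how it is printed above; the character actually stored in a file may differ, so `BULLET` is an assumption to adjust, as are the pattern and the function name `time_marks`.

```python
import re

# Delimiter as printed in this manual; change it if your files use another.
BULLET = "\u00b7"
TIME_MARK = re.compile(
    re.escape(BULLET) + r"(-?)(\d+)_(\d+)" + re.escape(BULLET)
)

def time_marks(line):
    """Return (begin_ms, end_ms, skip_silence) for each time bullet in a line."""
    return [
        (int(m.group(2)), int(m.group(3)), m.group(1) == "-")
        for m in TIME_MARK.finditer(line)
    ]

marks = time_marks("*CHI:\tmilk . \u00b70_1073\u00b7")
duration_ms = marks[0][1] - marks[0][0]
```

The third element of each tuple records whether the dash was present, that is, whether continuous playback should skip the silence before the segment.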
Pic Bullet ·%pic:cat.jpg·
This marker is used to insert a bullet that can be clicked to display a picture. This field is also used in the gesture coding system discussed in the CLAN manual. The format of these files is not fixed by CHAT, but many of the same conventions are used. One additional code used there is the @T: header, which marks the place of the insertion of a video picture taken from a movie as a thumbnail representation of what is happening at a particular moment in the interaction.
Text Bullet ·%txt:cat.txt·
This marker is used to insert a bullet that can be clicked to display a text file.
10.2 Paralinguistic and Duration Scoping
Paralinguistic Material [=! text]
Paralinguistic events, such as "coughing," "laughing," or "yelling" can be marked by using square brackets, the =! symbol, a space, and then text describing the event.
*CHI:	that's mine [=! cries].
This means that the child cries while saying the word "mine." If the child cries throughout, the transcription would be:
*CHI:	<that's mine> [=! cries].
In order to indicate crying with no particular vocalization, you should use the &=cries "simple form" notation discussed earlier, as in
*CHI:	&=cries.
This same format of [=! text] can also be used to describe prosodic characteristics such as "glissando" or "shouting" that are best characterized with full English words. Paralinguistic effects such as soft speech, yelling, singing, laughing, crying, whispering, whimpering, and whining can also be noted in this way. For a full set of these terms and details on their usage, see Crystal (1969) or Trager (1958). Here is another example:
*NAO:	watch out [=! laughing].
Stressing [!]
This symbol can be used without accompanying angle brackets to indicate that the preceding word is stressed. The angle brackets can also mark the stressing of a string of words, as in this example:
*MOT:	Billy, would you please [!].
Contrastive Stressing [!!]
This symbol can be used without accompanying angle brackets to indicate that the preceding word is contrastively stressed. If a whole string of words is contrastively stressed, they should be enclosed in angle brackets.
Duration [# time]
This symbol indicates the duration in seconds of the preceding material that has been marked with angle brackets as in:
*MOT:	I could use [# 2.2] for the party.
10.3 Explanations and Alternatives
Explanation [= text]
This symbol is used for brief explanations on the text tier. This symbol is helpful for specifying the deictic identity of objects and people.
*MOT:	don't look in there [= closet]!
Explanations can be more elaborate as in this example:
*ROS:	you don't scare me anymore [= the command "don't scare me anymore!"].
An alternative form for transcribing this is:
*ROS:	you don't scare me anymore.
%exp:	means to issue the imperative "Don't scare me anymore!"
Replacement [: text]
Earlier we discussed the use of a variety of nonstandard forms such as "gonna" and "hafta." In order for MOR to morphemicize such words, the transcriber can use a replacement symbol that allows CLAN to substitute a target language form for the form actually produced. Here is an example:
*BEA:	when ya gonna [: going to] stop doin(g) that?
*CHA:	whyncha [: why don't you] just be quiet!
In this example, "gonna" is followed by its standard form in brackets. The colon that follows the first bracket tells CLAN that the material in brackets should replace the preceding word. The replacing string can include any number of words, but the thing being replaced can only be a single word, not a series of words. There must be a space following the colon, in order to keep this symbol separate from other symbols that use letters after the colon.
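The substitution that the replacement symbol requests can be imitated in a few lines of Python. This sketch is only an illustration of the notation, not of how CLAN or MOR performs the substitution; the pattern and the function name `apply_replacements` are assumptions. The pattern requires the space after the colon, so forms with other characters after the colon are left alone.

```python
import re

# A single word, one space, then "[: replacement text]".
REPLACEMENT = re.compile(r"(\S+) \[: ([^\]]+)\]")

def apply_replacements(main_line):
    """Substitute each replaced word with the target form given after '[: '."""
    return REPLACEMENT.sub(lambda m: m.group(2), main_line)

line = "*BEA:\twhen ya gonna [: going to] stop doin(g) that ?"
standardized = apply_replacements(line)
```

Only the single word before the bracket is replaced, which mirrors the rule that the replaced item must be one word while the replacing string may contain several.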
This example also illustrates two other ways in which CHAT and CLAN deal with nonstandard forms. The lexical item "ya" is treated as a lexical item distinct from "you." However, the semantic equivalence between "ya" and "you" is maintained by the formalization of a list of dialectal spelling variations. The string "doin(g)" is treated by CLAN as if it were "doing." This is done by simply having the programs ignore the parentheses, unless they are given instructions to pay attention to them, as discussed in the CLAN manual. From the viewpoint of CLAN, a form like "doin(g)" is just another incomplete form, such as "bro(ther)."
In order for replacement to function properly, nothing should be placed between the replacing string and the string to be replaced. For example, to mark replacement and error using the [*] code, one should use the form:
goed [: went] [*]
rather than:
goed [*] [: went]
Replacement of Real Word [:: text]
When the error involves the incorrect use of a real word, the double colon form of the replacement string may be used, as in:
piece [:: peach] [*]
For further details on this usage, please see the chapter on Error Coding.
Alternative Transcription [=? text]
Sometimes it is difficult to choose between two possible transcriptions for a word or group of words. In that case an alternative transcription can be indicated in this way:
*CHI:	we want [=? one too].
Comment on Main Line [% text]
Instead of placing comment material on a separate %com line, it is possible to place comments or any type of code directly on the main line using the % symbol in brackets. Here is an example of this usage:
*CHI:	I really wish you wouldn't [% said with strong raising of eyebrows] do that.
You should be careful with using comments on the main line. Overuse of this particular notational form can make a transcript difficult to read and analyze. Because placing a comment directly onto the main line tends to highlight it, this form should be used only for material that is crucial to the understanding of the main line.
Best Guess [?]
Often audiotapes are hard to hear because of interference from room noise, recorder malfunction, vocal qualities, and so forth. Nonetheless, transcribers may think that, through the noise, they can recognize what is being said. There is some residual uncertainty about this "best guess." This symbol marks this in relation to the single preceding word or the previous group of words enclosed in angle brackets.
*SAR:	I want a frog [?]
In this example, the word that is unclear is "frog." In general, when there is a symbol in square brackets that takes scoping and there are no preceding angle brackets, then the single preceding word is the scope.
When more than one word is unclear, you can surround the unclear portion in angle brackets as in the following example:
*SAR:	[?]
10.4 Retracing, Overlap, and Clauses
Overlap Follows [>]
During the course of a conversation, speakers often talk at the same time. Transcribing these interactions can be trying. This and the following two symbols are designed to help sort out this difficult transcription task. The "overlap follows" symbol indicates that the text enclosed in angle brackets is being said at the same time as the following speaker's bracketed speech. They are talking at the same time. This code must be used in combination with the "overlap precedes" symbol, as in this example:
*MOT:	no (.) Sarah (.) you have to [>] !
*SAR:[[>]!
*SAR:[[>1]reallycuteandit[>2]intobed.
*MOT:[[[/]IwantedtoinviteMargie.
If there are pauses and fillers between the initial material and the retracing, they should be placed after the repetition symbol, as in:
*HAR:	it's um (.) it's [/] it's (.) a &-um (.) dog.
When a word or group of words is repeated several times with no fillers, all of the repetitions except for the last are placed into a single group, as in this example:
*HAR:	[/] it's (.) a &-um (.) dog.
By default, all of the CLAN commands except MLU, MLT, and MODREP include repeated material. This default can be changed by using the +r6 switch.
Multiple Repetition [x N]
An alternative way of indicating repetitions of a single word uses this form:
*HAR:	it's [x 4] (.) a &-um (.) dog.
This form indicates the fact that a word has been said four times. If you want to see what this would look like in the expanded notation, you can use this command:
kwal +d99 *.cha
In this case, the result would be:
*HAR:	it's [/] it's [/] it's [/] it's (.) a &-um dog.
If the [x N] form is used, it is not possible to get a count of the repetitions to be added to MLU. However, because this is not usually desirable anyway, there are good reasons to use this more compact form when single words are repeated. For some illustrations of the use of this type of coding for the study of disfluencies such as stuttering, consult Bernstein Ratner, Rooney, and MacWhinney (1996).
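The relation between the compact and the expanded repetition notation can be shown with a brief Python sketch. It is an illustration of the equivalence only, not a reimplementation of the KWAL command; the pattern and the function name `expand_repetitions` are assumptions, and the space before the count is treated as optional.

```python
import re

# A word followed by "[x N]"; the space before N is optional.
MULTIPLE = re.compile(r"(\S+) ?\[x ?(\d+)\]")

def expand_repetitions(main_line):
    """Rewrite 'word [x N]' as N copies of the word joined by '[/]'."""
    def expand(m):
        word, count = m.group(1), int(m.group(2))
        return " [/] ".join([word] * count)
    return MULTIPLE.sub(expand, main_line)

expanded = expand_repetitions("*HAR:\tit's [x 4] (.) a &-um (.) dog.")
```

Everything outside the matched word and its count is left exactly as written, so pauses and fillers elsewhere on the line are preserved.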
Retracing [//]
This symbol is used when a speaker starts to say something, stops, and then repeats the basic phrase, changing the syntax but maintaining the same idea. Usually, the correction moves closer to the standard form, but sometimes it moves away from it. The material being retraced is enclosed in angle brackets. If there are no angle brackets, CLAN assumes that only the preceding word is being retraced. In retracing with correction, it is necessarily true that the material in the angle brackets is different from what follows the retracing symbol. Here is an example of this:
*BET:	uh I thought I wanted to invite Margie.
Retracing with correction can combine with retracing without correction, as in this example:
*CHI:	[//] the [/] the fish are swimming.
Sometimes retracings can become quite complex and lengthy. This is particularly true in speakers with language disorders. It is important not to underestimate the extent to which retracing goes on in such transcripts. By default, all of the CLAN commands except MLU, MLT, and MODREP include retraced material. This default can be changed by using the +r6 switch.
Reformulation [///]
Sometimes retracings involve full and complete reformulations of the message without any specific corrections. Here is an example of this type:
*BET:	[///] uh we all decided to go home for lunch.
When none of the material being corrected is included in the retracing, it is better to use the [///] marker than the [//] marker.
False Start Without Retracing [/-]
In some projects that place special emphasis on counts of particular disfluency types, it may be more convenient to code retracings through a quite different method. For example, the symbols [/] and [//] are used when a false start is followed by a complete repetition or by a partial repetition with correction. If the speaker terminates an incomplete utterance and starts off on a totally new tangent, this can be coded by using the [/-] symbol:
*BET:	[/-] uh when is Margie coming?
If the material is coded in this way, CLAN will count only one utterance. If the coder wishes to treat the fragment as a separate utterance, the +... and +//. symbols that were discussed earlier under special utterance terminators should be used instead. By default, all of the CLAN programs except MLU, MLT, and MODREP include repeated material. This default can be changed by using the +r6 switch.
Unclear Retracing Type [/?]
This symbol is used primarily when reformatting SALT files to CHAT files, using the SALTIN command. SALT does not distinguish between filled pauses such as "uh", repetitions ([/]), and retracings ([//]); all three phenomena and possible others are treated as "mazes." Because of this, SALTIN uses the [/?] symbol to translate SALT mazes into CHAT hesitation markings.
Excluded Material [e]
Certain types of analysis focus on the speaker's ability to produce task-relevant material. For example, in a picture description task, it may be useful to exclude material that is not relevant to the actual description of the picture. To do this, the material to be excluded can be marked with [e], as in this example:
*BET:	[e] the cat is up the tree.
Material marked in this way will automatically be excluded by analysis on the %mor line and from the other programs such as DSS, IPSyn, VOCD, GRASP, etc. that operate on that line.

Clause Delimiter [^c]
If you wish to conduct analyses such as MLU and MLT based on clauses rather than utterances as the basic unit of analysis, you should mark the end of each clause with this symbol. You should not use this code to treat complete sentences as if they were clauses. Instead, each sentence should be transcribed on its own main line. This mark should only be used to demarcate the clauses within complex sentences. It is not necessary to mark the scope of this symbol, since it is assumed to apply to all the material before it up to the beginning of the utterance or up to the preceding [^c] marker.
It is possible to create additional user-defined codes using the format of [^c*], such as [^cerr], which could be defined as a marker of a clause that includes an error, or [^c0s] for a clause with no subject, etc. Then, inside the MLU and MLT programs, you need to add the +c switch to specify exactly which codes of this type should be recognized.
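
The scope rule for [^c] (each marker closes the material since the utterance start or the previous marker) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name split_clauses and the sample utterance are hypothetical, and the sketch does not reproduce how MLU or MLT actually process the +c switch.

```python
import re

# Matches [^c] and user-defined variants such as [^cerr] or [^c0s].
CLAUSE_MARK = re.compile(r"\[\^c[^\]\s]*\]")

def split_clauses(utterance: str) -> list[str]:
    """Split the text of a main line into clauses at each [^c...] marker.

    Assumption: the input is the utterance text only (no speaker code),
    and anything after the last marker that is just a terminator is dropped.
    """
    parts = CLAUSE_MARK.split(utterance)
    # Keep segments that contain more than whitespace or a terminator.
    return [p.strip() for p in parts if p.strip(" .!?")]

if __name__ == "__main__":
    # Hypothetical utterance with one user-defined and one plain clause mark.
    print(split_clauses("he left [^cerr] because it rained [^c] ."))
```

Because the marker is used only as a separator here, the clause count is simply the length of the returned list.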

10.5 Error Marking
Errors are marked by placing the [*] symbol after the error. Usually, the [*] marker occurs right after the error. However, if there is a replacement string, such as [: because], that should come first. In repetitions and retracings with errors in the initial part of the retracing, the [*] symbol is placed before the [/] mark. If the error is in the second part of the retracing, the [*] symbol goes after the [/]. In error coding, the form actually produced is placed on the main line and the target form is given on the %err line. The full system for error coding is presented in a separate chapter.

10.6 Initial and Final Codes
The symbols we have discussed so far in this chapter usually refer to words or groups of words. CHAT also allows for codes that refer to entire utterances. These codes are placed into square brackets either at the beginning of the utterance or after the final utterance delimiter. They always begin with a + sign.

Postcodes [+ text]
Postcodes are symbols placed into square brackets at the end of the utterance. They should include the plus sign and a space after the left bracket. There is no predefined set of postcodes. Instead, postcodes can be designed to fit the needs of your particular project. Unlike scoped codes, postcodes must apply to the whole utterance, as in this example:
*CHI:	not this one. [+ neg] [+ req] [+ inc]
Postcodes are helpful in including or excluding utterances from analyses of turn length or utterance length by MLT and MLU. The postcodes [+ bch] and [+ trn], when combined with the -s and +s+ switches, can be used for this purpose. When the SALTIN command translates codes from SALT format to CHAT format, it treats them as postcodes, because the scope of codes is not usually defined in SALT.

Language Precodes [- text]
Language precodes are used to mark the switch to a different language in multilingual interactions. The text in these codes should come from the three-letter ISO codes used in the @Languages header.

Excluded Utterance [+ bch]
Sometimes we want to have a way of marking utterances that are not really a part of the main interaction, but are in some "back channel." For example, during an interaction that focuses on a child, the mother may make a remark to the investigator. We might want to exclude remarks of this type from analysis by MLT and MLU, as in this interaction:
*CHI:	here one.
*MOT:	no, here.
%sit:	the doorbell rings.
*MOT:	just a moment. [+ bch]
*MOT:	I'll get it. [+ bch]
In order to exclude the utterances marked with [+ bch], the -s"[+ bch]" switch must be used with MLT and MLU.
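
For hand-checking a transcript outside CLAN, the postcode convention can be illustrated with a short Python sketch. It only mimics the idea of excluding [+ bch] utterances; the function names are hypothetical, and the real filtering should be done with the -s switch of MLT and MLU as described above.

```python
import re

# A postcode is a bracketed code whose text starts with a plus sign.
# The space after "+" is made optional here so that older files still match.
POSTCODE = re.compile(r"\[\+\s*([^\]]+)\]")

def postcodes(main_line: str) -> list[str]:
    """Return the postcode labels found on one main line."""
    return POSTCODE.findall(main_line)

def drop_backchannel(main_lines: list[str]) -> list[str]:
    """Keep only the main lines that do not carry the [+ bch] postcode."""
    return [line for line in main_lines if "bch" not in postcodes(line)]

if __name__ == "__main__":
    lines = [
        "*CHI:\there one.",
        "*MOT:\tno, here.",
        "*MOT:\tjust a moment. [+ bch]",
        "*MOT:\tI'll get it. [+ bch]",
    ]
    for kept in drop_backchannel(lines):
        print(kept)
```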

Included Utterance [+ trn]
The [+ trn] postcode can force the MLT command to treat an utterance as a turn when it would normally not be treated as a turn. For example, utterances containing only "0" are usually not treated as turns. However, if one believes that the accompanying nonverbal gesture constitutes a turn, one can note this using [+ trn], as in this example:
*MOT:	where is it?
*CHI:	0. [+ trn]
%act:	points at wall.
Later, when counting utterances with MLT, one can use the +s+"[+ trn]" switch to force counting of actions as turns, as in this command:
mlt +s+"[+ trn]" sample.cha

11 Dependent Tiers
In the previous chapters, we have examined how CHAT can be used to create file headers and to code the actual words of the interaction on the main line. The third major component of a CHAT transcript is the ancillary information given on the dependent tiers. Dependent tiers are lines typed below the main line that contain codes, comments, events, and descriptions of interest to the researcher. It is important to have this material on separate lines, because the extensive use of complex codes in the main line would make it unreadable. There are many codes that refer to the utterance as a whole. Using a separate line to mark these avoids having to indicate their scope or cluttering up the end of an utterance with codes.
It is important to emphasize that no one expects any researcher to code all tiers for all files. CHAT is designed to provide options for coding, not requirements for coding. These options constitute a common set of coding conventions that will allow the investigator to represent those aspects of the data that are most important. It is often possible to transcribe the main line without making much use at all of dependent tiers. However, for some projects, dependent tiers are crucial.
All dependent tiers should begin with the percent symbol (%) and should be in lower-case letters. As in the main line, dependent tiers consist of a tier code and a tier line. The dependent tier code is the percent symbol, followed by a three-letter code ID and a colon. The dependent tier line is the text entered after the colon that describes fully the elements of interest in the main tier. Except for the %mor and %gra tiers, these lines do not require ending punctuation. Here is an example of a main line with two dependent tiers:
*MOT:	well go get it!
%spa:	$IMP $REF $INS
%mor:	ADV|well V|go&PRES V|get&PRES PRO|it !
The first dependent tier indicates certain speech act codes and the second indicates a morphemic analysis with certain part of speech coding. Coding systems have been developed for some dependent tiers. Often, these codes begin with the symbol $. If there is more than one code, they can be put in strings with only spaces separating them, as in:
%spa:	$IMP $REF $INS
Multiple dependent tiers may be added in reference to a single main line, giving you as much richness in descriptive capability as is needed.
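
The tier code and tier line structure described above is regular enough to be read with a small parser. The following Python sketch is illustrative only; the names parse_dependent_tier and dollar_codes are hypothetical, and the pattern assumes exactly the "percent sign, three lower-case letters, colon" form given in the text.

```python
import re

# Tier code: percent sign, three lower-case letters, colon; the rest is the tier line.
TIER = re.compile(r"^%([a-z]{3}):\s*(.*)$")

def parse_dependent_tier(line: str) -> tuple[str, str]:
    """Split a dependent tier into (three-letter code, tier text)."""
    match = TIER.match(line)
    if match is None:
        raise ValueError(f"not a dependent tier: {line!r}")
    code, text = match.groups()
    return code, text.strip()

def dollar_codes(tier_text: str) -> list[str]:
    """Return the space-separated codes that begin with the $ symbol."""
    return [token for token in tier_text.split() if token.startswith("$")]

if __name__ == "__main__":
    code, text = parse_dependent_tier("%spa:\t$IMP $REF $INS")
    print(code, dollar_codes(text))
```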

11.1 Standard Dependent Tiers
When possible, dependent tiers should be selected from the standard list of 3-letter tiers given here. However, if this list is inadequate, users can create extension tiers using three letters preceded by "x", as in %xtob for a tier that marks ToBI prosodic features. Here we list all of the dependent tier types that are used for child language data. It is unlikely that a given corpus would ever be transcribed in all of these ways. The listing that follows is alphabetical.

Action Tier %act:
This tier describes the actions of the speaker or the listener. Here is an example of text accompanied by the speaker's actions:
*ROS:	I do it!
%act:	runs to toy box
The %act tier can also be used in conjunction with the 0 symbol when actions are performed in place of speaking:
*ADA:	0.
%act:	kicks the ball
This could also be coded as:
*ADA:	0 [% kicks the ball].
In this case the 0 on the main line is used to indicate that there is an action but no speech. Or one can use the &= form, as in:
*ADA:	&=kicks:ball.
The choice among these three forms depends on the extent to which the coder wants to keep track of a particular type of dependent tier information.

Addressee Tier %add:
This tier describes who talks to whom. Use the three-letter identifier given in the participants header to identify the addressees.
*MOT:	be quiet.
%add:	ALI, BEA
In this example, Mother is telling Alice and Beatrice to "be quiet."

Alternate Transcription Tier %alt:
This tier is used to provide an alternative possible transcription. If the transcription is intended to provide an alternative for only one word, it may be better to use the main line form of this coding tier in the form [=? text].

CONNL Tier %cnl:
This tier is used for representing morphological categories in CONNL format to allow grammatical relations tagging using a CONNL tagger.

Coding Tier %cod:
This is the general purpose coding tier. It can be used for mixing codes into a single tier for economy or ease of entry. Here is an example.
*MOT:	you want Mommy to do it?
%cod:	$MLU=6 $NMV=2 $RDE $EXP

Cohesion Tier %coh:
This tier is used to code text cohesion devices.

Comment Tier %com:
This is the general purpose comment tier. One of its many uses is to note the occurrence of a particular construction type, as in this example:
*EVE:	that's nasty (.) is it?
%com:	note tag question
Notations on this line should usually be in common English words, rather than codes. If special symbols and codes are included, they should be placed in quotation marks, so that CHECK does not flag them as errors.

Definitions Tier %def:
This tier is needed only for files that are reformatted from the SALT system by the SALTIN command.

English Rendition Tier %eng:
This line provides a fluent, nonmorphemicized English translation for non-English data.
*MAR:	yo no tengo nada.
%eng:	I don't have anything.

Error Coding Tier %err:
This tier codes additional information about errors that cannot be fully expressed on the main line.

Explanation Tier %exp:
This tier is useful for specifying the deictic identity of objects or individuals. Brief explanations can also appear on the main line, enclosed in square brackets and preceded by the = sign and followed by a space.

Facial Gesture Tier %fac:
This tier codes facial actions. Ekman & Friesen (1969, 1978) have developed a complete and explicit system for the coding of facial actions. This system takes about 100 hours to learn to use and provides extremely detailed coding of the motions of particular muscles in terms of facial action units. Kearney and McKenzie (1993) have developed computational tools for the automatic interpretation of emotions using the system of Ekman and Friesen.

Flow Tier %flo:
This tier codes a "flowing" version of the transcript that is as free as possible of transcription conventions and that reflects a minimal number of transcription decisions. Here is an example of a %flo line:
*CHI:	[//] I don't wanna [: want to] look in a [* the] bad room [* bedroom] or Bill's room.
%flo:	I don't I don't wanna look in a bad room or Bill's room.
Most researchers would agree that the %flo line is easier to read than the *CHI line. However, it gains readability by sacrificing precision and utility for computational analyses. The %flo line has no records of retracings; words are simply repeated. There is no regularization to standard morphemes. Standard English orthography is used to give a general impression of the nature of phonological errors. There is no need to enter this line by hand, because the FLO command can enter it automatically.

Gloss Tier %gls:
This tier can be used to provide a "translation" of the child's utterance into the adult language. Unlike the %eng tier, this tier does not have to be in English. It should use an explanation in the target language. This tier differs from the %flo tier in that it is being used not to simplify the form of the utterance but to explain what might otherwise be unclear. Finally, this tier differs from the %exp tier in that it is not used to clarify deictic reference or the general situation, but to provide a target language gloss of immature learner forms.

Gestural-Proxemic Tier %gpx:
This tier codes gestural and proxemic material. Some transcribers find it helpful to distinguish between general activity that can be coded on the %act line and more specifically gestural and proxemic activity, such as nodding or reaching, which can be coded on the %gpx line.

Grammatical Relations Tier %gra:
This tier is used to code dependency structures with tagged grammatical relations (Sagae, Davis, Lavie, MacWhinney, & Wintner, 2007; Sagae, Lavie, & MacWhinney, 2005; Sagae, MacWhinney, & Lavie, 2004).

Grammatical Relations Training %grt:
This tier is used for training of the MEGRASP grammatical relations tagger. It has the same format as the %gra tier.

Intonational Tier %int:
This tier codes intonations, using standard language descriptions.

Model Tier %mod:
This tier is used in conjunction with the %pho tier to code the phonological form of the adult target or model for each of the learner's phonological forms.

Morphological Tier %mor:
This tier codes morphemic segments by type and part of speech. Here is an example of the %mor tier:
*MAR:	I wanted a toy.
%mor:	PRO|I&1S V|want-PAST DET|a&INDEF N|toy .

Orthography Tier %ort:
This tier is used for languages with a non-Roman script. When Roman script is inserted on the main line, this line can be used for the local script. Or it can be used in the other way, with local script on the main line and Roman on the %ort line. There should be a one-to-one correspondence between the items on the two lines.

Paralinguistics Tier %par:
This tier codes paralinguistic behaviors such as coughing and crying.

Phonology Tier %pho:
This dependent tier is used to provide a phonological transcription. When the researcher is attempting to describe phonological errors, the %err line should be used instead. The %pho line is to be used when the entire utterance is being coded in IPA. Here is an example of the %pho tier in use.
*SAR:	I got a boo+boo.
%pho:	aigtbubu
Transcription on the %pho line should be done using the IPA symbols in Unicode. Words on the main tier should align in a one-to-one fashion with forms on the phonological tier. This alignment takes all forms produced into account and does not exclude retraces or non-word forms. On the %pho line it is sometimes important to describe several words as forming a single phonological group in order to describe liaison and other assimilation effects within the group. To mark this, the Unicode characters U+2039 and U+203A, which appear as ‹ and ›, should be entered on both the main and %pho lines using F2+.

Signing Tier %sin:
Parents of deaf children often sign or gesture along with speech, as do the children themselves. To transcribe this, researchers often place the spoken material on the main line and the signed material on the %sin line. Words on the %sin tier can consist of any alphanumeric characters and colons, as in forms such as g:point:toy. Like the %pho and %mor tiers, the words on the %sin tier must be placed into one-to-one correspondence with words on the main tier. To do this, it may be necessary to enter many "0" forms on the %sin tier when a word is not matched by a sign or gesture. At other times, several words on the main tier may align with a single gesture. To mark this grouping, you can group the forms on the main line with two Unicode bracketing symbols. The beginning of the group is marked by Unicode U+23A8 and the close by Unicode U+23AC.

Situation Tier %sit:
This tier describes situational information relevant only to the utterance. There is also an @Situation header. Situational comments that relate more broadly to the file as a whole or to a major section of the file should be placed in an @Situation header.
*EVE:	what that?
*EVE:	woof@o woof@o.
%sit:	dog is barking

Speech Act Tier %spa:
This tier is for speech act coding. Many researchers wish to transcribe their data with reference to speech acts. Speech act codes describe the function of sentences in discourse. Researchers often prefer a particular method of coding speech acts, and many systems for coding speech acts have been developed. A set of speech act codes adapted from a more general system devised by Ninio and Wheeler is provided in the chapter on speech act coding.

Timing Tier %tim:
This tier is used for older data for which there is no possible linking of the transcript to the media. It should not be confused with the millisecond-accurate timing found in the bullets inserted by sonic CHAT. The %tim tier is used just to mark large periods of time during the course of taping. These readings are given relative to the time of the first utterance in the file. The time of that utterance is taken to be time 00:00:00. Its absolute time value can be given by the @Time Start header. Elapsed time from the beginning of the file is given in hours:minutes:seconds. Thus, a %tim entry of 01:20:55 indicates the passage of 1 hour, 20 minutes, and 55 seconds from time zero. If you only want to track time in minutes and seconds, you can use the form minutes:seconds, as in 09:22 for 9 minutes and 22 seconds. None of the CLAN programs use the information encoded in the %tim tier. It is just included for hand analyses.
*MOT:	where are you?
%tim:	00:00:00
...
(40 pages of transcript follow and then)
*EVE:	that one.
%tim:	01:20:55
If there is a break in the interaction, it may be necessary to establish a new time zero. This is done by inserting a new @Time Start header. You can also use this tier to mark the beginning and end of a time period by using a form such as:
*MOT:	where are you?
%tim:	04:20:23-04:21:01

Training Tier %trn:
This is the training tier for the POST tagger. It has the same form as the %mor line.
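
Since the CLAN programs do not use the %tim tier, hand analyses of its values need their own arithmetic. As a minimal sketch, assuming only the hours:minutes:seconds, minutes:seconds, and start-end forms described for the %tim tier above, the conversion to elapsed seconds could look like this in Python (the function names are hypothetical):

```python
from typing import Optional

def tim_to_seconds(value: str) -> int:
    """Convert 'hh:mm:ss' or 'mm:ss' to elapsed seconds from time zero."""
    parts = [int(p) for p in value.strip().split(":")]
    if len(parts) == 3:
        hours, minutes, seconds = parts
    elif len(parts) == 2:
        hours = 0
        minutes, seconds = parts
    else:
        raise ValueError(f"unsupported %tim value: {value!r}")
    return hours * 3600 + minutes * 60 + seconds

def parse_tim(text: str) -> tuple[int, Optional[int]]:
    """Parse a %tim entry; the end time is None unless a 'start-end' range is given."""
    start, sep, end = text.strip().partition("-")
    return tim_to_seconds(start), (tim_to_seconds(end) if sep else None)

if __name__ == "__main__":
    begin, finish = parse_tim("04:20:23-04:21:01")
    print(finish - begin, "seconds in the marked period")
```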

11.2 Synchrony Relations
For dependent tiers whose codes refer to the entire utterance, it is often important to distinguish whether events occur before, during, or after the utterance.

Occurrence Before <bef>
If the comment refers to something that occurred immediately before the utterance in the main line, you may use the symbol <bef>, as in this example:
*MOT:	it is her turn.
%act:	<bef> moves to the door

Occurrence After <aft>
If a comment refers to something that occurred immediately after the utterance, you may use the form <aft>. In this example, Mother opened the door after she spoke:
*MOT:	it is her turn.
*MOT:	go ahead.
%act:	<aft> opens the door
If neither <bef> nor <aft> is coded, it is assumed that the material in the coding tier occurs during the whole utterance or that the exact point of its occurrence during the utterance is not important.
Although CHAT provides transcribers with the option of indicating the point of events using the %com tier and <bef> and <aft> scoping, it may often be best to use the @Comment header tier instead. The advantage of using the @Comment header is that it indicates in a clearer manner the point at which an activity actually occurs. For example, instead of the form:
*MOT:	it is her turn.
%act:	<bef> moves to the door
one could use the form:
@Comment:	Mot moves to the door.
*MOT:	it is her turn.
The third option provided by CHAT is to code comments in square brackets right on the main line, as in this form:
*MOT:	[^ Mot moves to the door] it is her turn.
Of these alternative forms, the second seems to be the best in this case.

Scope on Main Tier $sc=n
When you want a particular dependent tier to refer to a particular word on the main tier, you can use this additional code to mark the scope. For example, here the code marks the fact that the mother's words 4 through 7 are imitated by the child.
*MOT:	want to come sit in my lap?
%act:	$sc=4-7 $IMIT
*CHI:	sit in my lap.
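
To make the word numbering behind $sc=n concrete, here is a small Python sketch that resolves a scope code against a main line. It is an illustration under simplifying assumptions: words are split on whitespace, terminators are dropped, and other CHAT codes on the main line are not handled. The function name scoped_words is hypothetical.

```python
import re

# $sc=4 marks a single word; $sc=4-7 marks an inclusive range (1-based).
SCOPE = re.compile(r"\$sc=(\d+)(?:-(\d+))?")

def scoped_words(main_text: str, tier_text: str) -> list[str]:
    """Return the main-tier words that a $sc= code on a dependent tier refers to."""
    match = SCOPE.search(tier_text)
    if match is None:
        raise ValueError("no $sc= code on this tier")
    first = int(match.group(1))
    last = int(match.group(2) or first)
    # Drop utterance terminators so that they are not counted as words.
    words = [w.strip(".?!") for w in main_text.split()]
    words = [w for w in words if w]
    return words[first - 1:last]

if __name__ == "__main__":
    print(scoped_words("want to come sit in my lap ?", "$sc=4-7 $IMIT"))
```

Applied to the example above, the range 4-7 selects "sit in my lap", which is exactly the child's imitation.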

12 CHAT-CA Transcription
CHAT also allows transcription that is more closely in accord with the requirements of CA (Conversational Analysis) transcription. CA is a system devised by Sacks, Schegloff, and Jefferson (Sacks, Schegloff, & Jefferson, 1974) for the purpose of understanding the construction of conversational turns and sequencing. It is now used by hundreds of researchers internationally to study conversational behavior. Recent applications and formulations of this approach can be found in Ochs, Schegloff, and Thompson (1996), as well as the related "GAT" formulation of Selting (1998). Workers in this tradition find CA notation easier to use than CHAT, because the conventions of this system provide a clearer mapping of features of conversational sequencing. On the other hand, CA transcription has limits in terms of its ability to represent conventional morphemes, orthography, and syntactic patterns. By supplementing CHAT transcription on the word level with additional utterance level codes for CA, the strengths of both systems can be maintained. To achieve this merger, some of the forms of both CHAT and CA must be modified.
To implement CA format, CHAT-CA uses these functions:
1. The fact that a transcript is using CA notation is indicated by inserting the term CA in the @Options header. Older corpora can be maintained in their original non-CHAT format by entering the word "heritage" on the @Options header tier before the @ID tiers.
2. Utterances and inter-TCU pauses are numbered by the automatic line numbering function.
3. Line numbers can be turned on and off for viewing and printing by using CLAN options. Line numbers are not stored by themselves in CHAT, although they are encoded in the XML version of CHAT.
4. After the line number comes an asterisk and then the speaker ID code and a colon and a tab, as in CHAT format.
5. Tabs are not used elsewhere.
6. CA overlaps, as marked with the special symbols ⌈ and ⌉, are aligned automatically by the INDENT program, so hand indentation is not needed.
7. To maintain proper alignment, CLAN uses a special fixed-width Unicode font.
8. CHAT requires obligatory utterance terminators. However, CA uses terminal contours instead, as noted in the table below, and these are optional.
9. Instead of marking comments in double parentheses, CHAT uses the [% com] notation. However, common sounds, gestures, and activities occurring at a point in an utterance are marked using the &=gesture form.
10. CHAT uses the following forms for marking disfluencies, as further discussed in the next chapter.
11. pairs of Unicode 21AB (leftwards arrow with loop) to mark initial segment repetition, as in ↫b-b-b↫boy
12. pairs of Unicode 2260 (not-equal-to sign) to mark blocked segments, as in ru≠b-b-b≠bber
13. the colon for marking drawls or extensions
14. the ^ symbol for marking a break inside a word
15. forms such as &-um for marking filled pauses
16. silent pauses as marked by (.) or (0.6) etc.
17. [/] string for word or phrase repetition
18. string [//] string for retracing
19. +… for trailing off
In addition to these basic utterance-level CA forms, CHAT-CA requires the standard CHAT headers such as these:
@Begin and @End: Using these guarantees that the file is complete.
@Comment: This is a useful general purpose field.
@Bg, @Eg: These mark "gems" for later retrieval.
@Participants: This field identifies the speakers.
%gpx: Dependent tiers such as %gpx and %spa can be added as needed.
Gail Jefferson continually elaborated the coding of CA features through special marks during her career. Her creation of new marks was limited, for many years, by what was available on the typewriter. With the advent of Unicode, we are able to capture all of the marks she had proposed along with others that she occasionally used. The following table summarizes these marks of CHAT-CA.

No.  Character Name            Char  Function                    F1 +         Unicode
1    up-arrow                  ↑     shift to high pitch         up arrow     2191
2    down-arrow                ↓     shift to low pitch          down arrow   2193
3    double arrow tilted up    ⇗     rising to high              1            21D7
4    single arrow tilted up    ↗     rising to mid               2            2197
5    level arrow               →     level                       3            2192
6    single arrow tilted down  ↘     falling to mid              4            2198
7    double arrow down         ⇘     falling to low              5            21D8
8    infinity mark             ∞     unmarked ending             6            221E
9    double wavy equals        ≈     +≈ no break continuation    =            2248
10   triple wavy equals        ≋     +≋ technical continuation   +            224B
11   triple equal              ≡     ≡uptake (internal)          u            2261
12   raised period             ∙     inhalation                  .            2219
13   open bracket top          ⌈     top begin overlap           [            2308
14   close bracket top         ⌉     top end overlap             ]            2309
15   open bracket bottom       ⌊     bottom begin overlap        shift [      230A
16   closed bracket bottom     ⌋     bottom end overlap          shift ]      230B
17   up triangle               ∆     faster                      right arrow  2206
18   down triangle             ∇     slower                      left arrow   2207
19   low asterisk              ⁎     creaky                      *            204E
20   double question mark      ⁇     unsure                      /            2047
21   degree sign               °     °softer°                    zero         00B0
22   fisheye                   ◉     louder                      )            25C9
23   low bar                   ▁     low pitch                   d            2581
24   high bar                  ▔     high pitch                  h            2594
25   smiley                    ☺     smile voice                 l            263A
26   double integral           ∬     whisper                     w            222C
27   upsilon with dialytika    Ϋ     yawn                        y            03AB
28   clockwise integral        ∮     singing                     s            222E
29   section marker            §     §precise§                   p            00A7
30   tilde                     ∾     constriction                n            223E
31   half circle               ↻     pitch reset                 r            21BB
32   capital H with dasia      Ἡ     laugh in a word             c            1F29
33   lower quote               „     tag or final particle       t            201E
34   double dagger             ‡     vocative or summons         v            2021
35   dot                             Arabic dot                  ,            0323
36   raised h                  ʰ     Arabic aspiration           H            02B0
37   macron                    ā     stressed syllable           -            0304
38   glottal                   ʔ     glottal stop                q            0294
39   reverse glottal           ʕ     Hebrew glottal              Q            0295
40   caron                           caron                       ;            030C

The column marked F1 in the previous table gives methods for inserting the various non-ASCII Unicode characters. For example, the smile voice symbol ☺ is inserted by F1 and then the letter l. It must be used both before and after the stretch of material with the smile or laughing voice. After row 32, the items are inserted using F2, instead of F1. For the most recent version of this symbol set, please consult the current list on the web.
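
When CA marks have to be generated or checked outside the CLAN editor, the Unicode column of the table is the part that matters. The Python sketch below copies a few code points from that column and shows how a paired mark, such as smile voice, might be wrapped around a stretch of text. The dictionary and function names are hypothetical, and the selection of keys is only a sample of the table.

```python
import unicodedata

# F1 key -> Unicode code point, copied from a few rows of the table above.
F1_KEYS = {
    "1": 0x21D7,  # rising to high
    "2": 0x2197,  # rising to mid
    "3": 0x2192,  # level
    "4": 0x2198,  # falling to mid
    "5": 0x21D8,  # falling to low
    "6": 0x221E,  # unmarked ending
    "l": 0x263A,  # smile voice
    "w": 0x222C,  # whisper
}

def describe(key: str) -> str:
    """Return the code point, character, and Unicode name for an F1 key."""
    code_point = F1_KEYS[key]
    char = chr(code_point)
    return f"U+{code_point:04X} {char} {unicodedata.name(char)}"

def wrap(text: str, key: str) -> str:
    """Place a paired mark both before and after a stretch of material."""
    char = chr(F1_KEYS[key])
    return f"{char}{text}{char}"

if __name__ == "__main__":
    print(describe("l"))
    print(wrap("hello there", "l"))
```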
Of these various symbols, there are four that must be placed either at the beginning of words or inside words. These include the arrows for pitch rise and fall, the inverted question mark for inhalation, and the ≡ symbol for quick TCU internal uptake. The paired symbols for intonational stretches such as louder, faster, and slower can be placed anywhere, except inside comments. They must be used in pairs to mark the beginning and end of the feature in question.
The triple wavy symbol with a plus (+≋) is used to mark a break in a TCU caused by interruption from another speaker. Use of this symbol can improve readability and overlap alignment. In this case the triple wavy without a plus can be placed at the end of the last word of the first segment and then at the beginning of the continuation, where it is joined with a plus sign and followed by a space, as in +≋. The ≈ symbol is used in a parallel way to mark a TCU continuation that is not forced by an interruption from another speaker. It occurs at the end of the last word of the first segment and in the form +≈ with a following space at the beginning of the following line.
CA transcribers can also use underlining to represent emphasis on a word or a part of a word. However, if text is taken from a CHAT file to Word, the underlining will be lost.
In general, CA marks must occur either inside words or at the beginnings or ends of words. In most cases, they should not occur by themselves surrounded by spaces. The exception to this is the utterance continuator mark +, which should be preceded by the tab mark and followed by a space.
In addition to these features that are basic to CA, our implementation requires transcribers to begin their transcript with an @Begin line and to end it with an @End line. Comments can be added using the @Comment format, and transcribers should use the @Participants header in this form:
@Participants:	geo, mom, tim
This line uses only three-letter codes for participant names. By adding this line, it is possible to have quicker entry of speaker codes inside the editor.

13 Disfluency Transcription
CHAT uses the following forms for marking disfluencies.

Stuttering-like disfluencies (SLDs)        Code               Example                                   Notes
prolongation                               :                  s:paghetti                                Place after prolonged segment
broken word                                ^                  spa^ghetti                                Pause within word
blocking                                   ≠                  ≠butter                                   A block before word onset
repeated segment                           ↫                  ↫r-r-r↫rabbit, like↫ike-ike↫              The ↫ brackets the repetition; hyphens mark iterations
lengthened repeated segment and doubling                      ↫rr-rr↫rabbit                             The doubling of "r" indicates lengthening of the "r" segment
monosyllabic word multiple repetition      [x N]              dog [x 3]                                 Must be [x 3] or greater, must be monosyllabic

Typical Disfluencies (TDs)                 Code               Example                                   Notes
whole word single repetition               [/]                dog [/] dog                               Single repetition, i.e. [x 2]
polysyllabic word multiple repetition      [x N]              butter [x 7]                              Indicates that the word "butter" was produced seven times
phrase repetition                          <> [/]             [/] that is a dog.                        Angle brackets are used to mark repeated material
word revision                              [//]               a dog [//] beast                          Revision counts once
phrase revision                            <> [//]            <what did you> [//] how can you see it?   Revision counts once
phonological fragment                      &+                 &+sn dog                                  Changes from "snake" to "dog"
pause                                      (.) or (..) or (…) (.)                                       Counts the number of short, medium, long pauses
pause duration                             (2.4)              (2.4)                                     Adds up the time values, if marked
filled pause                               &-                 &-um &-you_know                           Fillers with underscore count as one word

The ≠ character to mark blocking is entered by typing F2 and =.
The ↫ character to mark segment repetition is entered by typing F2 and /.
Blocking of filled pauses is indicated in this way: &-≠you_know
These disfluency types can be traced in FREQ and KWAL commands through the search strings given in the files called fluency-sep.cut and fluency-comb.cut in the /lib/fluency folder in CLAN. They can be counted automatically using the FLUCALC program.
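
For a quick sanity check of a hand-coded line, the ASCII disfluency codes listed in this chapter can be tallied with regular expressions. The Python sketch below is not FLUCALC and makes no attempt to match its output; the category labels, the function name count_disfluencies, and the sample line are assumptions made for illustration, and the Unicode-based codes (blocking and segment repetition) are left out.

```python
import re
from collections import Counter

# One pattern per ASCII-based code from the tables in this chapter.
PATTERNS = {
    "filled_pause": re.compile(r"&-\S+"),
    "phonological_fragment": re.compile(r"&\+\S+"),
    "repetition": re.compile(r"\[/\]"),
    "revision": re.compile(r"\[//\]"),
    "multiple_repetition": re.compile(r"\[x \d+\]"),
    "short_pause": re.compile(r"\(\.\)"),
    "timed_pause": re.compile(r"\(\d+(?:\.\d+)?\)"),
}

def count_disfluencies(main_line: str) -> Counter:
    """Count how often each disfluency code occurs on one main line."""
    counts = Counter()
    for label, pattern in PATTERNS.items():
        hits = len(pattern.findall(main_line))
        if hits:
            counts[label] = hits
    return counts

if __name__ == "__main__":
    line = "*PAR:\ta dog [//] beast &-um (.) dog [/] dog &+sn dog (2.4) butter [x 7]."
    for label, n in sorted(count_disfluencies(line).items()):
        print(label, n)
```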

14 Transcribing Aphasic Language
Here are some tips for transcribing typical features of aphasic language. These conventions are all discussed elsewhere in this manual, but are repeated here for the convenience of researchers and clinicians working specifically with speech from persons with aphasia.

Commas can be used as needed to mark phrasal junctions, but they are not used by the programs and have no tight prosodic definition.

Fragments (phonological) are entered with the ampersand-plus symbol attached at the beginning. So, for all incomplete words, use &+ followed by the graphemes that capture the sounds produced.
*PAR:	so now I can &+sp speak a little bit.
*PAR:	and then &+sh &+s &+w we came home.
If you want to mark disfluencies more precisely, you should use the codes in the preceding section on Disfluency coding.

Gestures can be captured in several ways. You can compose codes using parts of the body to indicate head nods and shakes, for example, using the ampersand, the equal sign, the body part, colon, and then the movement or its meaning. You can use up to two colons for each gesture code and you can use more than one word after the colon if you connect the words with an underscore symbol.
*PAR:	&=head:no.
*PAR:	&=hand:hello.
*PAR:	see you later &=ges:wave.
*PAR:	the woman &=ges:fishing fishing pole water &=casts:pole.
You can also use the %fac and %gpx codes for facial or bodily gestures that extend throughout longer periods, including the whole sentence.
*PAR:	she was fish [/] fish.
%gpx:	raising her arm up and down

The various ways of marking Incomplete utterances are described in section 8.11 on special utterance terminators.

Interjections, Exclamations, and Interactional Markers are all called communicators or "co" in the MOR grammar. The complete list of all word forms recognized by MOR is given in the 0allwords.cex file at the top of the ENG-MOR grammar. To see just the list of communicator forms, look in the files in the /lex folder that begin with "co".

Fillers are listed in the file co-fil.txt in that same folder. There are just a few of these. They are all entered in this format: &-uh or &-um.
*INV:	how do you think your language is these days?
*PAR:	well &-uh &-uh pretty good.

If a speaker laughs or sighs, for example, and you want to capture that, you can transcribe it with the ampersand and equal sign.
*PAR:	well &=laughs tell you the truth, I can't say what I said.
You can put the laugh or sigh on its own line if it serves as the speaker's turn.
*PAR:	&=laughs.
A list of these Simple Events appears in the CHAT manual and includes cough, groan, sneeze, etc.

Neologisms can be marked by putting the @n symbol next to the neologism.
*PAR:	oh yes, this is a little sakov@n that's all.

Overlapping speakers can be handled in several ways. The easiest is to use the lazy overlap marking +<.

Repetitions are marked by enclosing the repeated material in angle brackets, followed immediately by the square brackets ([/]) with one slash mark enclosed. If only one word has been repeated once, angle brackets are not needed and CLAN will assume that the one word before the square brackets with the slash was repeated. You do not need to use angle brackets or square brackets with the slash mark when fillers (e.g., uh, um) are repeated.
*PAR:	[/] it was so bad.
*PAR:	and the [/] the window was open.
*PAR:	and she &+s spilled [/] &-uh &-uh the water on the floor.
You can indicate repetitions of a single word or of a phrase by using the square brackets and inserting an x, a space, and the number of times the word was produced.
*PAR:	it's [x 4] &-um a dog.

Revisions are called Retracing With Correction and occur when the speaker changes something (usually the syntax) of an utterance but maintains the same idea. The material being retraced is enclosed in angle brackets, followed immediately by the square brackets with 2 slash marks enclosed. If only one word has been changed, angle brackets are not needed and CLAN will assume that the one word before the square brackets with the slashes was revised. A change, or correction, should be something clearly identifiable that changes the syntax but maintains the same idea of the phrase.
*PAR:	well uh Cinderella is a nice girl.
*PAR:	and then sometimes we [//] I was scared about the traffic.

Self-interruptions occur when a speaker breaks off an utterance and starts up another. These are coded using +//. (or +//? for a question).
*PAR:	well then the [//] &-uh you_know [/] the airplane that [/] that his +//.
*PAR:	no the airplane that &-uh landed +//.
*PAR:	no [/] no that's not right.

Invited interruptions occur when one speaker prompts the other speaker to complete an utterance. These are coded using the +… symbols for trailing off and the ++ symbols for the other speaker's completion. This may be intentional (cuing) or unintentional.
*INV:	how about &+ra +…
*PAR:	++ a radio.
*HEL:	if Bill had known +…
*WIN:	++ he would have come.

Shortenings occur when a speaker drops sounds out of words. For example, a speaker may leave the final "g" off of "running", saying "runnin" instead. In CHAT, this shortened form should appear as runnin(g). Other examples that demonstrate sound omissions are (be)cause, prob(ab)ly, (a)bout, (re)member, (ex)cept.

Assimilations include words such as "gonna" and "kinda". Most of these will be recognized by CLAN, so no replacements (e.g., [: going to]) are needed.

Tables with lists of shortenings and assimilations appear in the CHAT manual in sections 8.6.6 and 8.6.7, respectively. The most updated records, however, are always in the MOR lexicon.

Unintelligible segments of utterances should be transcribed as xxx.

Untranscribed material can be indicated with the letters www. This symbol is used on a main line to indicate material that a transcriber does not want to transcribe because it is not relevant to the interaction of interest. This symbol must be followed by the %exp line, explaining what was transpiring.
*PAR:	www.
%exp:	talking to spouse
*PAR:	www.
%exp:	looking through pictures

Utterance segmentation decisions can be challenging. See the guidelines in the first section of the chapter on Utterances in this manual.

15 Arabic and Hebrew Transcription
In order to transcribe Arabic and Hebrew in Roman characters, we make use of six special characters that can be entered in the CHAT editor in this way:
1. For the superscript h, type F2 and then h.
2. For the subscript dot, type F2 and then comma. This is also used to mark schwa.
3. For the macron on a Hebrew stressed vowel, type F2 and dash (-).
4. For the Hebrew glottal, type F2 and Q.
5. For the basic glottal stop symbol, type F2 and q.
6. For long vowels, type F2 and then : (colon) to insert the triangular Unicode colon 02D0, rather than the standard colon, which is Unicode 003A.
Also, to allow for marking of Hebrew and Arabic prefixes, the # sign is allowed at the end of the prefix, which is then separated from the stem by a space, as in we# tiqfoc. When transcribing geminates, use double consonants or double vowels.
This system is expressed in the following two charts:

Vowels (columns: IPA, Arabic, Name, CHAT)
iyaii,ikasraieya(ba'den)eee-eaalefmaddaemphaticaaa,ɑshortaafatHaaealefmaddanon-emphatic:uwaw,longuuuwaw,shortdameuowaw(bantalon)ooo,shortonotinArabice

Consonants (columns: IPA, Arabic, Name, CHAT)
hamzabbabppttatθthat.jimjhahxxakχqddalddhaldrrarzzenzssinsshinsssadsddaddttatzzaz'aynghayngffafqqafqɡgimgkkafkllamlmmimmnnunnhhahwwawwjyaytsdjvv

16 Specific Applications
The basic CHAT codes can be adapted to work with a variety of more specific applications.
In this chapter, we present four such applications to illustrate the adaptation of the general codes to specific uses. A separate document, available from this server, describes the BTS (Berkeley Transcription System) for sign language. When codes cannot be adapted for specific projects, it may be necessary to modify the underlying XML schema for CHAT. When this becomes necessary, please send email to macw@cmu.edu.
16.
1Code-SwitchingTranscriptioniseasiestwhenspeakersavoidoverlaps,speakinfullutterances,anduseasinglestandardlanguagethroughout.
However,therealworldofconversationalinterac-tionsisseldomsosimpleanduniform.
Oneparticularlychallengingtypeofinteractioninvolvescode-switchingbetweentwooreventhreedifferentlanguages.
Insomecases,itmaybepossibletoidentifyadefaultlanguageandtomarkafewwordsasintrusionsintothedefaultlanguage.
Inothercases,mixingandswitchingaremoreintense.
CHATreliesonasystemofinterlacedmarkingforidentifyingthelanguagesbeingusedincode-switchedinteractions.
1. The languages spoken by the various participants must be noted with the @Languages header tier. See section 7.2 for the relevant ISO-639 codes. The first language on this line is considered to be the default language until a switch is marked.
2. Utterances that represent a switch to the second language are marked with precodes, as in [- eng] for a switch to English. Here is an example:
*MOT: can you see
*CHI: [- spa] no puedo.
3. Individual words that switch away from the default language to the second language are marked with the @s terminator. If the @Languages header has "spa, eng", then the @s marker indicates a switch to English. If the @Languages header has "eng, spa", then the @s indicates a switch to Spanish. If the switch is to a language not included in the @Languages header, then the full form must be used, as in word@s:por for a switch to a Portuguese word.
4. When the default language of the interaction changes, the change can be marked with @New Language.

The @s special form marker code may also be used to explicitly mark the use of a particular language, even if it is not included in the @Languages header. For example, the code schlep@s:yid can be used to mark the inclusion of the Yiddish word "schlep" in any text. The @s code can also be further elaborated to mark code-blended words. The form well@s:eng&cym indicates that the word "well" could be either an English or a Welsh word. The combination of a stem from one language with an inflection from another can be marked using the plus sign, as in swallowni@s:eng+hun for an English stem with a Hungarian infinitival marking. All of these codes can be followed by a code with the $ sign to explicitly mark the parts of speech. Thus, the form recordar@s$inf indicates that this Spanish word is an infinitive. The marking of part of speech with the $ sign can also be used without the @s.
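
The marking rules above can be sketched in a few lines of Python. This is only an illustration, not part of CLAN: the function name tag_utterance and the two regular expressions are assumptions made for this sketch, and it covers only the [- lang] precode and the @s forms described in this section.

```python
import re

# Matches word@s, word@s:eng, word@s:eng&cym, word@s:eng+hun,
# each optionally followed by a $ part-of-speech code such as $inf.
S_MARKER = re.compile(
    r"^(?P<word>.+?)@s(?::(?P<langs>[a-z]{3}(?:[&+][a-z]{3})*))?(?:\$\w+)?$"
)
# Matches an utterance-initial precode such as [- spa].
PRECODE = re.compile(r"^\[-\s*(?P<lang>[a-z]{3})\]\s*")


def tag_utterance(text, languages):
    """Return (word, [language codes]) pairs for one main-line utterance.

    `languages` is the list from the @Languages header; the first entry
    is the default language (hypothetical helper, not a CLAN function).
    """
    base = languages[0]
    precode = PRECODE.match(text)
    if precode:
        # A precode switches the whole utterance to the named language.
        base = precode.group("lang")
        text = text[precode.end():]
    # A bare @s points at the header language that is not the current base.
    other = next((lang for lang in languages if lang != base), None)
    tagged = []
    for token in text.rstrip(" .?!").split():
        marked = S_MARKER.match(token)
        if marked is None:
            tagged.append((token, [base]))
        elif marked.group("langs"):
            # Explicit form: one code, or several joined by & or +.
            tagged.append((marked.group("word"),
                           re.split(r"[&+]", marked.group("langs"))))
        else:
            tagged.append((marked.group("word"), [other]))
    return tagged
```

For instance, tag_utterance("[- spa] no puedo.", ["eng", "spa"]) tags both words as "spa", and tag_utterance("well@s:eng&cym", ["cym", "eng"]) tags "well" with both "eng" and "cym".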

These techniques are all designed to facilitate the retrieval of material in one language separately from the other without having to tag each and every word. However, if one wants to see tags on every word, a transcript created using the above rules can be reformatted using this command, in which the -l switch adds language tags to every word:

kwal +d +t* +t@ +t% -l filename.cha

Relying further on the -l switch, it is possible to locate code-switches on the utterance level in a transcript by using a COMBO command of this type for switches from English to French:

combo +b2 -l +s"\**:^*s:eng^s:fra" *.cha

Problems similar to those involved in code-switching occur in studies of narratives where a speaker may assume a variety of roles or voices. For example, a child may be speaking either as the dragon in a story or as the narrator of the story or as herself. These different roles are most easily coded by marking the six-character main line code with forms such as *CHIDRG, *CHINAR, and *CHISEL for child-as-dragon, child-as-narrator, and child-as-self.

16.2 Elicited Narratives and Picture Descriptions

Often researchers use a set of structured materials to elicit narratives and descriptions. These may be a series of pictures in a story book, a set of photos, a film, or a series of actions involving objects. The transcripts that are collected during this process can be studied most easily by using gem notation. In the simplest form of this system, a set of numbers is used for each picture or page of the book. Here is an example from the beginning of an Italian file from the Bologna frog story corpus:

@G: 1
*AND: questo e' un bimbo poi c'e' il cane e la rana.
*AND: questa e' la casa.
@G: 2
*AND: il bimbo dorme.

The first @G marker indicates the first page of the book with the boy, the dog, and the frog. The second @G marker indicates the second page of the book with the boy sleeping. When using this lazy gem type of marking, it is assumed that the beginning of each new gem is the end of the previous gem. Programs such as GEM and GEMLIST can then be used to facilitate retrieval of information linked to particular pictures or stimuli.
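
The lazy gem rule (each @G closes the previous gem) is easy to picture as a grouping step. The Python sketch below is an illustration only and is not how GEM or GEMLIST are implemented; the function name split_lazy_gems is hypothetical.

```python
def split_lazy_gems(lines):
    """Group main-line utterances under the most recent @G label.

    Each new @G header implicitly ends the previous gem, as in lazy
    gem marking. Dependent tiers and other headers are skipped here.
    """
    gems = {}
    current = None
    for line in lines:
        if line.startswith("@G:"):
            current = line[len("@G:"):].strip()
            gems.setdefault(current, [])
        elif line.startswith("*") and current is not None:
            gems[current].append(line)
    return gems


transcript = [
    "@G: 1",
    "*AND: questo e' un bimbo poi c'e' il cane e la rana.",
    "*AND: questa e' la casa.",
    "@G: 2",
    "*AND: il bimbo dorme.",
]
by_page = split_lazy_gems(transcript)
```

With the frog story excerpt, by_page["1"] holds the two utterances for the first page and by_page["2"] holds the single utterance for the second page.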

16.3 Written Language

CHAT can also be adapted to provide computerized records of written discourse. Typically, researchers are interested in transcribing two types of written discourse: (1) written productions produced by school students, and (2) printed texts such as books and newspapers. This format is particularly useful for coding written productions by school children. In order to use CHAT effectively for this purpose, the following adaptations or extensions can be used.

The basic structure of a CHAT file should be maintained. The @Begin and @End fields should be kept. However, the @Participants line should look like this:

@Participants: TEX Writer's_Name Text

Each written sentence should be transcribed on a separate line with the *TEX: field at the beginning. Additional @Comment and @Situation fields can be added to add descriptive details about the writing assignment and other relevant information.

For research projects that do not demand a high degree of accurate rendition of the actual form of the written words, it is sufficient to transcribe the words on the main line in normalized standard-language orthographic form. However, if the researcher wants to track the development of punctuation and orthography, the normalized main line should be supplemented with a %spe line. Here is an example:

*TEX: Each of us wanted to get going home before the Steeler's game let out .
%spe: etch of /us wanted too git goin home *, be/fore the Stillers game let out 0.

In this example, the student had written "of us" without a space and had incorrectly placed a space in the middle of "before". The slash at the beginning of a word marks an omitted space, and the internal slash marks an extra space. These two marks are used to achieve one-to-one alignment between the main line and the %spe line. This alignment can be used to facilitate the use of MODREP in the analysis of orthographic errors. It will also be used in the future by programs that perform automatic comparisons between the main line and the %spe line to diagnose error types. The only purpose of the %spe line is to code word-level spelling errors, not to code any higher level grammatical errors or word omissions. Also, the words on the main line are all given in their standard target-language orthographic form. For clarity, final punctuation on the main line is preceded by a space. If a punctuation mark is omitted, it is coded with a zero. Forms that appear on the %spe line that have no role in the main line, such as extraneous punctuation, are marked with an asterisk.
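
The one-to-one alignment between the main line and the %spe line can be checked mechanically. The following Python sketch is an assumption-laden illustration, not MODREP: the function name align_spe is hypothetical, tokens are split on whitespace, asterisk-marked forms are dropped before pairing, and the %spe string is passed without its tier-final period.

```python
def align_spe(main, spe):
    """Pair main-line tokens with %spe tokens and return the mismatches.

    Tokens starting with '*' on the %spe line have no counterpart on
    the main line, so they are removed before pairing.
    """
    main_tokens = main.split()
    spe_tokens = [tok for tok in spe.split() if not tok.startswith("*")]
    if len(main_tokens) != len(spe_tokens):
        raise ValueError(
            f"lines do not align: {len(main_tokens)} main tokens, "
            f"{len(spe_tokens)} %spe tokens"
        )
    # Keep only the positions where the written form differs from the target.
    return [(m, s) for m, s in zip(main_tokens, spe_tokens) if m != s]


main_line = ("Each of us wanted to get going home before "
             "the Steeler's game let out .")
spe_line = ("etch of /us wanted too git goin home *, be/fore "
            "the Stillers game let out 0")
spelling_errors = align_spe(main_line, spe_line)
```

Here spelling_errors contains pairs such as ("before", "be/fore") and (".", "0"), which is the kind of input a later error-typing step could work from.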

These conventions focus on the writing of individual words. However, it may also be necessary to note larger features of composition. When the student crosses off a series of words and rewrites them, you can use the standard CHAT conventions for retracing with scoping marked by angle brackets and the [//] symbol. If you want to mark page breaks, you can use a header such as @Stim: Page 3. If you wish to mark a shift in ink, or orthographic style, you can use a general @Comment field.

16.4 Sign and Speech

CHAT can also be used to analyze interactions that combine signed and spoken language. For example, these may occur in the input of hearing parents to deaf children or with hearing children interacting with deaf parents. This system of transcription uses the %sin line to represent signs and gestures. Gestures are tagged with g: (e.g., "g:cat"), which means a gesture for "cat." This code can be further elaborated in a form like "g:cat:dpoint" to indicate that the gesture was a deictic point (or g:cat:icon for an iconic gesture, g:cat:dreq for a deictic request, etc.). For sign, a form like "s:cat" indicates that the child signed "cat." There should be a one-to-one correspondence between the main line and the %sin line. This can be done by using 0 to mark cases where only one form is used:

*CHI: 0.
%sin: g:baby:dpoint
The child gestured without any speech.

*CHI: baby.
%sin: g:baby:dpoint
The child pointed at the baby at the same time she said baby.

*CHI: baby 0.
%sin: 0 g:baby:dpoint
The child said baby and then pointed at the baby.

*CHI: baby 0.
%sin: g:baby:dpoint s:baby
The child said baby while pointing at the baby and then signed baby.
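
The pairing of main-line words with %sin forms can be sketched as follows. This Python fragment is an illustration under stated assumptions, not a CLAN routine: the helper names parse_sin_token and pair_speech_and_sign are hypothetical, and only the g:, s:, and 0 forms shown above are handled.

```python
def parse_sin_token(token):
    """Split one %sin form such as g:baby:dpoint into its parts."""
    if token == "0":
        return None  # nothing was gestured or signed in this slot
    kind, _, rest = token.partition(":")
    gloss, _, subtype = rest.partition(":")
    return {
        "kind": {"g": "gesture", "s": "sign"}.get(kind, kind),
        "gloss": gloss,
        "type": subtype or None,
    }


def pair_speech_and_sign(main, sin):
    """Pair each main-line slot with its %sin slot (0 means empty)."""
    words = main.rstrip(" .?!").split()
    forms = sin.split()
    if len(words) != len(forms):
        raise ValueError("main line and %sin line are not one-to-one")
    return [(None if word == "0" else word, parse_sin_token(form))
            for word, form in zip(words, forms)]
```

For the third example above, pair_speech_and_sign("baby 0.", "0 g:baby:dpoint") yields a spoken "baby" with no gesture, followed by a deictic point with no speech.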

17 Speech Act Codes

One way of coding speech acts is to separate the component of illocutionary force from those aspects that deal with interchange types. One can also distinguish a set of codes that relate to the modality or means of expression. Codes of these three types can be placed together on the %spa tier. One form of coding precedes each code type with an identifier, such as "x" for interchange type and "i" for illocutionary type. Here is an example of the combined use of these various codes:

*MOT: are you okay
%spa: $x:dhs $i:yq

Alternatively, one can combine the codes in a hierarchical system, so that the previous example would have only the code $dhs:yq. Choice of different forms for codes depends on the goals of the analysis, the structure of the coding system, and the way the codes interface with CLAN.
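
The two notations carry the same information, which the short Python sketch below makes explicit. It is an illustration only: the function name parse_spa and the output keys "interchange" and "illocution" are assumptions of this sketch, and it handles just the two forms shown above.

```python
def parse_spa(tier):
    """Read a %spa tier in either the prefixed or the hierarchical form."""
    result = {}
    for code in tier.split():
        parts = code.lstrip("$").split(":")
        if len(parts) != 2:
            raise ValueError(f"unexpected speech act code: {code}")
        head, tail = parts
        if head == "x":        # $x:dhs  -> interchange type
            result["interchange"] = tail.upper()
        elif head == "i":      # $i:yq   -> illocutionary force
            result["illocution"] = tail.upper()
        else:                  # $dhs:yq -> both, hierarchical form
            result["interchange"] = head.upper()
            result["illocution"] = tail.upper()
    return result
```

Both parse_spa("$x:dhs $i:yq") and parse_spa("$dhs:yq") return the interchange type DHS together with the illocutionary force YQ.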

Users will often need to construct their own coding schemes. However, one scheme that has received extensive attention is one proposed by Ninio & Wheeler (1986). Ninio, Snow, Pan, & Rollins (1994) provided a simplified version of this system called INCA-A, or Inventory of Communicative Acts - Abridged. The next two sections give the categories of interchange types and illocutionary forces in the proposed INCA-A system.

17.1 Interchange Types

Interchange Type Codes
Code  Function: Explanation
CMO  comforting: to comfort and express sympathy for misfortune
DCA  discussing clarification of action: to discuss clarification of hearer's nonverbal communicative acts
DCC  discussing clarification of communication: to discuss clarification of hearer's ambiguous verbal communication or a confirmation of the speaker's understanding of it
DFW  discussing the fantasy world: to hold a conversation within fantasy play
DHA  directing hearer's attention: to achieve joint focus of attention by directing hearer's attention to objects, persons, and events
DHS  discussing hearer's sentiments: to hold a conversation about hearer's nonobservable thoughts and feelings
DJF  discussing a joint focus of attention: to hold a conversation about something that both participants are attending to, e.g., objects, persons, ongoing actions of hearer and speaker, ongoing events
DNP  discussing the nonpresent: to hold a conversation about topics that are not observable in the environment, e.g., past and future events and actions, distant objects and persons, abstract matters (excluding inner states)
DRE  discussing a recent event: to hold a conversation about immediately past actions and events
DRP  discussing the related-to-present: to discuss nonobservable attributes of objects or persons present in the environment or to discuss past or future events related to those referents
DSS  discussing speaker's sentiments: to hold a conversation about speaker's nonobservable thoughts and feelings
MRK  marking: to express socially expected sentiments on specific occasions such as thanking, apologizing, or to mark some event
NCS  negotiate copresence and separation: to manage the transition
NFA  negotiating an activity in the future: to negotiate actions and activities in the far future
NIA  negotiating the immediate activity: to negotiate the initiation, continuation, ending and stopping of activities and acts; to direct hearer's and speaker's acts; to allocate roles, moves, and turns in joint activities
NIN  noninteractive speech: to engage in private speech or produce utterances not addressed to present hearer
NMA  negotiate mutual attention: to establish mutual attentiveness and proximity or withdrawal
PRO  performing verbal moves: to perform moves in a game or other activity by uttering the appropriate verbal forms
PSS  negotiating possession of objects: to discuss who is the possessor of an object
SAT  showing attentiveness: to demonstrate that speaker is paying attention to the hearer
TXT  reading written text: to read or recite written text aloud
OOO  unintelligible: to mark unintelligible utterances
YYY  uninterpretable: to mark uninterpretable utterances

17.2 Illocutionary Force Codes

Directives
AC  Answer calls; show attentiveness to communications.
AD  Agree to carry out an act requested or proposed by other.
AL  Agree to do something for the last time.
CL  Call attention to hearer by name or by substitute exclamations.
CS  Counter-suggestion; an indirect refusal.
DR  Dare or challenge hearer to perform an action.
GI  Give in; accept other's insistence or refusal.
GR  Give reason; justify a request for an action, refusal, or prohibition.
RD  Refuse to carry out an act requested or proposed by other.
RP  Request, propose, or suggest an action for hearer, or for hearer and speaker.
RQ  Yes/no question or suggestion about hearer's wishes and intentions.
SS  Signal to start performing an act, such as running or rolling a ball.
WD  Warn of danger.

Speech Elicitations
CX  Complete text, if so demanded.
EA  Elicit onomatopoeic or animal sounds.
EI  Elicit imitation of word or sentence by modelling or by explicit command.
EC  Elicit completion of word or sentence.
EX  Elicit completion of rote-learned text.
RT  Repeat or imitate other's utterance.
SC  Complete statement or other utterance in compliance with request.

Commitments
FP  Ask for permission to carry out act.
PA  Permit hearer to perform act.
PD  Promise.
PF  Prohibit/forbid/protest hearer's performance of an act.
SI  State intent to carry out act by speaker.
TD  Threaten to do.

Declarations
DC  Create a new state of affairs by declaration.
DP  Declare make-believe reality.
ND  Disagree with a declaration.
YD  Agree to a declaration.

Markings
CM  Commiserate, express sympathy for hearer's distress.
EM  Exclaim in distress, pain.
EN  Express positive emotion.
ES  Express surprise.
MK  Mark occurrence of event (thank, greet, apologize, congratulate, etc.).
TO  Mark transfer of object to hearer.
XA  Exhibit attentiveness to hearer.

Statements
AP  Agree with proposition or proposal expressed by previous speaker.
CN  Count.
DW  Disagree with proposition expressed by previous speaker.
ST  Make a declarative statement.
WS  Express a wish.

Questions
AQ  Aggravated question, expression of disapproval by restating a question.
AA  Answer in the affirmative to yes/no question.
AN  Answer in the negative to yes/no question.
EQ  Eliciting question (e.g., hmm).
NA  Intentionally nonsatisfying answer to question.
QA  Answer a question with a wh-question.
QN  Ask a product-question (wh-question).
RA  Refuse to answer.
SA  Answer a wh-question with a statement.
TA  Answer a limited-alternative question.
TQ  Ask a limited-alternative yes/no question.
YQ  Ask a yes/no question.
YA  Answer a question with a yes/no question.

Performances
PR  Perform verbal move in game.
TX  Read or recite written text aloud.

Evaluations
AB  Approve of appropriate behavior.
CR  Criticize or point out error in nonverbal act.
DS  Disapprove, scold, protest disruptive behavior.
ED  Exclaim in disapproval.
ET  Express enthusiasm for hearer's performance.
PM  Praise for motor acts, i.e., for nonverbal behavior.

Demands for clarification
RR  Request to repeat utterance.

Text editing
CT  Correct, provide correct verbal form in place of erroneous one.

Vocalizations
YY  Make a word-like utterance without clear function.
OO  Unintelligible vocalization.

Certain other speech act codes that have been widely used in child language research can be encountered in the CHILDES database. These general codes should not be combined with the more detailed INCA-A codes. They include ELAB (Elaboration), EVAL (Evaluation), IMIT (Imitation), NR (No Response), Q (Question), REP (Repetition), N (Negation), and YN (Yes/No Question).

18 Error Coding

18.1 Word level error codes

Errors at the word level are marked by placing the [*] symbol after the erroneous word. If there is a replacement string, such as [: because], that should come before the error code. When an error occurs in the initial part of a retracing, the [*] symbol is placed after the error, but before the [/] mark.

18.1.1 Phonological errors [* p]

p:w  word, as in boater for butter
p:n  non-word, as in buther for butter
p:m  metathesis, as in stisserz for sisters

To be considered a phonological error, the error must meet these criteria:
1. For one-syllable words, consisting of an onset (initial phoneme or phonemes) plus vowel nucleus plus coda (final phoneme or phonemes), the error must match on 2 out of 3 of those elements (e.g., onset plus vowel nucleus OR vowel nucleus plus coda OR onset plus coda). The part of the syllable that is in error may be a substitution, addition, or omission. For one-syllable words with no onset (e.g., eat) or no coda (e.g., pay), the absence of the onset or coda in the error would also count as a match.
2. For multi-syllabic words, the error must have complete syllable matches on all but one syllable, and the syllable with the error must meet the one-syllable word match criteria stated above.
Note: If using other criteria for phonological error coding (e.g., overlap of >50% of phonemes between error production and target word), some of the n:k and s:ur errors may qualify.
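
The two criteria can be written down as a small check. The Python sketch below is only one possible reading of them, with several assumptions: the function names are hypothetical, each syllable is supplied by the coder as an (onset, nucleus, coda) tuple with "" for an absent part, the phoneme spellings are arbitrary ASCII stand-ins, and an error with a different number of syllables than the target is treated as not qualifying.

```python
def syllable_matches(target, error):
    """Criterion 1: at least 2 of onset, nucleus, coda must be identical.

    An absent onset or coda is written as "", so two absent parts
    compare equal and count as a match.
    """
    return sum(t == e for t, e in zip(target, error)) >= 2


def is_phonological_error(target_syllables, error_syllables):
    """Criterion 2: all syllables but one match completely, and the
    remaining syllable satisfies the one-syllable criterion."""
    if len(target_syllables) != len(error_syllables):
        return False  # assumption: syllable counts must agree
    differing = [(t, e) for t, e in zip(target_syllables, error_syllables)
                 if t != e]
    if len(differing) != 1:
        return False  # no error at all, or more than one syllable in error
    return syllable_matches(*differing[0])
```

For example, "tall" for "call" differs only in the onset, so is_phonological_error([("k", "aw", "l")], [("t", "aw", "l")]) is True, while a production that shares only the coda with a one-syllable target does not qualify.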

18.1.2 Semantic errors [* s]

s:r  related word, target known, as in mother for father
s:ur  unrelated word, target known, as in comb for umbrella
s:uk  word, unknown target, as in "I go wolf"
s:per  perseveration, as in "he kicked the ball through the ball"

For errors with related words for known targets, one can add these additional distinctions:
s:r:prep  wrong preposition, as in on for in or off for out
s:r:seg  word that is a partial segment of the target, as in fire for fireman
s:r:der  derivational error using a real word, as in assess for assessment or humbleness for humility

Errors involving grammatical categories, such as number, case, definiteness, or gender are coded as [* s:r:gc]. These can be further coded using the relevant part of speech, such as "art" for article or "pro" for pronoun, and "der" for derivation, as in these examples:
s:r:gc:art  definite for indefinite, indefinite for definite, definite for zero
s:r:gc:pro  his for her, your for yours, my for mine

18.1.3 Neologisms [* n]

n:k  neologism, known target, does not meet phonological error criteria
n:uk  neologism, unknown target
n:k:s  neologism, known target, stereotypy (recurring non-word)
n:uk:s  neologism, unknown target, stereotypy (recurring non-word)
n:k:der  neologism, known target, as in integrativity for integration, or foundament for foundation

18.1.4 Morphological errors [* m:a]

For the coding of morphological errors, these nine abbreviations can be used:
-ing  progressive
-3s  3rd person singular
-ed  past
-en  perfective
-s  noun plural
-'s  possessive
-s'  possessive plural
-er  comparative
-est  superlative

Missing regular forms, in which the base lemma appears with no suffix, are coded with m:0, as in
m:0ing  missing progressive suffix
m:03s*  missing 3rd person singular suffix
m:0ed  missing regular past suffix
m:0s*  missing regular plural suffix
m:0's  missing possessive suffix
m:0s'  missing possessive plural suffix
However, the two codes marked above with * should usually be coded instead as agreement errors, as noted below.

Substitutions of the base form for irregulars, with omission of the expected marking, are coded with m:base:*, as in
m:base:s  child for children, ox for oxen
m:base:ed  come for came, bring for brought
m:base:en  take for taken or freeze for frozen
m:base:er  badder for worse
m:base:est  baddest for worst

Substitutions of an irregular for the base form are coded with m:irr:* in this way:
m:irr:s  children for child
m:irr:ed  found for find
m:irr:en  taken for take

Substitutions between past and perfective irregulars are coded with m:sub:* in this way:
m:sub:ed  frozen for froze, seen for saw
m:sub:en  froze for frozen, saw for seen

Overregularizations of irregulars are coded in this way:
m:=ed  overregularized -ed past, as in seed for saw
m:=en  overregularized -en perfective, as in taked for taken
m:=s  overregularized -s plural, as in childs for children

Superfluous markings of regulars are coded in this way:
m:+ing  superfluous progressive, as in running for run
m:+3s*  superfluous 3rd person singular -s suffix, as in goes for go
m:+ed  superfluous regular past, as in walked for walk
m:+en  superfluous perfective, as in taken for take
m:+s*  superfluous plural, as in gowns for gown
m:+'s  superfluous possessive or plural possessive, as in John's for John.
However, the two codes marked above with * should usually be coded as agreement errors, as noted below.

Double markings of regulars and irregulars are coded in this way:
m:++ing  runninging
m:++3s  wantses
m:++ed  talkeded
m:++en  changeded
m:++s  sevenses
m:++'s  boys's's

Double markings of irregulars are marked using the above codes plus a final :i
m:++ed:i  tooked
m:++en:i  brokened, takenen
m:++s:i  feets

Agreement errors for irregulars are marked in this way:
m:vsg:a  verb 3rd singular for unmarked: has for have, is for are, was for were
m:vun:a  verb unmarked for 3rd singular: have for has, are for is, were for was

When agreement errors involve regulars, the :a should be added to the basic code, as in
m:03s:a  he want for he wants
m:+3s:a  we wants for we want
m:0s:a  noun singular for plural, as in two dog for two dogs
m:+s:a  noun plural for singular, as in this dogs for this dog

Allomorphy errors in the stem or base are coded in this way:
m:allo  knifes for knives, an for a

18.1.5 Dysfluencies [* d]

d:sw  dysfluency within word, as in insuhside for inside

18.1.6 Missing Words

Missing words or parts of speech are coded with an initial "0", as in 0det, 0aux, 0does, etc. For agrammatic and jargon aphasic speech, it is often best to avoid trying to guess at what is missing and to just mark incomplete utterances with the postcode [+ gram] (explained below).

18.1.7 General Considerations

If the error is a non-word and the target is known, the target is marked with the square brackets and a single colon. Doing this allows the MOR program to use the real word target for parsing in its analysis. When the error is a real word, even though it is the wrong real word, the target can be marked with a single colon or a double colon. The double colon allows the MOR program to use the actual word produced rather than the target in its analysis, but also allows programs such as FREQ to use either form when needed.

*PAR: they cut the &+l lock off the door and tall [: call] [* p:w] the paramedics.
OR
*PAR: they cut the &+l lock off the door and tall [:: call] [* p:w] the paramedics.

Multiple codes may be used if an error is, for example, both a semantic and phonemic paraphasia, as in this example:

*PAR: it was singing [: ringing] [* s:r] [* p:w] in my ears.

Also, if the error is repeated, you can add "-rep" to the error code; if the error is revised (to another error or the correct word), add "-ret" (retraced) to the error code, as in this example:

*PAR: it's a little dog [: cat] [* s:r-ret] [//] cat.
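
Because the replacement string and the error codes follow a fixed order after the erroneous word, they can be pulled out of a main line with a regular expression. The Python sketch below is an illustration, not the parser used by MOR or FREQ: the names ERROR_RE, CODE_RE, and word_errors are hypothetical, and the pattern assumes the word, optional [: target] or [:: target], then one or more [* code] items.

```python
import re

# A word, an optional [: target] or [:: target], then one or more [* code].
ERROR_RE = re.compile(
    r"(?P<word>[^\s\[\]]+)\s*"
    r"(?:\[(?P<colons>::?)\s*(?P<target>[^\]]+?)\s*\]\s*)?"
    r"(?P<codes>(?:\[\*[^\]]*\]\s*)+)"
)
CODE_RE = re.compile(r"\[\*\s*([^\]]*?)\s*\]")


def word_errors(main_line):
    """List the word-level errors marked on one main line."""
    errors = []
    for match in ERROR_RE.finditer(main_line):
        errors.append({
            "produced": match.group("word"),
            "target": match.group("target"),
            # Single colon: the target replaces the word for MOR;
            # double colon (or no target): the produced word is kept.
            "mor_uses_target": match.group("colons") == ":",
            "codes": CODE_RE.findall(match.group("codes")),
        })
    return errors
```

Applied to the "singing [: ringing]" example, word_errors returns one entry whose codes are "s:r" and "p:w"; applied to the "[:: call]" variant, the entry has mor_uses_target set to False.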

18.2 Utterance level error coding (post-codes)

In addition to providing methods for coding errors at the word level, CHAT includes a series of codes for errors involving larger segments of an utterance or the whole utterance. Multiple codes can be used for a given utterance. The codes are:

[+ gram]  grammatical error
[+ per]  perseveration
[+ jar]  jargon
[+ cir]  circumlocution
[+ es]  empty speech

Grammatical error, [+ gram], includes agrammatic and paragrammatic utterances:
telegraphic speech: speech in which content words (mainly nouns, verbs, and adjectives) are relatively preserved but many function words (articles, prepositions, conjunctions) are missing (adapted from Brookshire, 1997)
utterances with frank grammatical errors (without requiring that each utterance be a complete sentence with a subject and predicate)
utterances with errors in word order, syntactic structure, or grammatical morphology (Butterworth and Howard, 1987)
utterance level grammatical errors as opposed to word level agreement errors or missing parts of speech

*PAR: one two bread. [+ gram]
*PAR: whatever I'm think up. [+ gram]
*PAR: is getting want to be wasn't. [+ gram] [+ jar]
*PAR: when everything that we going out now. [+ gram] [+ jar]

Jargon, [+ jar]: mostly fluent and prosodically correct but largely meaningless speech, containing paraphasias, neologisms, or unintelligible strings; resembles English syntax and inflection (adapted from Kertesz, 2007)

*PAR: go and &+ha hack [* s:uk] the gets [* s:uk] be able gable [* s:uk] get &+su sm@u [: x@n] [* n:uk]. [+ jar]
*PAR: get this care [* s:uk-ret] [//] kf@u [: x@n] [* n:uk] to eat here. [+ jar]
*PAR: and xxx. [+ jar]

Empty speech, [+ es]: speech that is syntactically correct but conveys little or no overall meaning, often a result of substituting general words (e.g., thing, stuff) for more specific words (Brookshire, 1997). Differentiating among "empty speech", "jargon", and "grammatical error" codes may be challenging. In truth, all these sentences may be meaningless in the conversational context. Briefly, empty speech utterances should contain general, vague, unspecific referents; jargon utterances should contain paraphasias and/or neologisms; and paragrammatic utterances (in the grammatical error category) should have inappropriate juxtapositions of grammatical elements.

*PAR: we got little things over here. [+ es]
*PAR: there was nothing in that one there. [+ es]

Perseveration, [+ per]: repetition of an utterance when it is no longer appropriate (Brookshire, 1997)

Circumlocution, [+ cir]: talking around words/concepts

*PAR: and through the help of [//] the lady that is helping Cinderella &-um she has [//] the prince check the &+s &-uh shoe. [+ cir] [+ gram]

References

Allen, G. D. (1988). The PHONASCII system. Journal of the International Phonetic Association, 18, 9-25.
Ament, W. (1899). Die Entwicklung von Sprechen und Denken beim Kinde. Leipzig: Ernst Wunderlich.
Augustine, S. (1952). The Confessions, original 397 A.D. (Vol. 18). Chicago: Encyclopedia Britannica.
Bates, E., & MacWhinney, B. (1982). Functionalist approaches to grammar. In E. Wanner & L. Gleitman (Eds.), Language acquisition: The state of the art (pp. 173-218). New York, NY: Cambridge University Press.
Bernstein Ratner, N., Rooney, B., & MacWhinney, B. (1996). Analysis of stuttering using CHILDES and CLAN. Clinical Linguistics and Phonetics, 10(3), 169-188.
Bloom, P. (2000). How children learn the meanings of words. Cambridge, MA: MIT Press.
Brown, R. (1973). A first language: The early stages. Cambridge, MA: Harvard.
Chafe, W. (Ed.) (1980). The Pear stories: Cognitive, cultural, and linguistic aspects of narrative production. Norwood, NJ: Ablex.
Clark, E. (1987). The Principle of Contrast: A constraint on language acquisition. In B. MacWhinney (Ed.), Mechanisms of Language Acquisition (pp. 1-34). Hillsdale, NJ: Lawrence Erlbaum Associates.
Crystal, D. (1969). Prosodic systems and intonation in English. Cambridge: Cambridge University Press.
Crystal, D. (1979). Prosodic development. In P. Fletcher & M. Garman (Eds.), Language acquisition: Studies in first language development. New York, NY: Cambridge University Press.
Darwin, C. (1877). A biographical sketch of an infant. Mind, 2, 292-294.
Edwards, J. (1992). Computer methods in child language research: four principles for the use of archived data. Journal of Child Language, 19, 435-458.
Ekman, P., & Friesen, W. (1969). The repertoire of nonverbal behavior: Categories, origins, usage, and coding. Semiotica, 1, 47-98.
Ekman, P., & Friesen, W. (1978). Facial action coding system: Investigator's guide. Palo Alto, CA: Consulting Psychologists Press.
Fletcher, P. (1985). A child's learning of English. Oxford: Blackwell.
Goldman-Eisler, F. (1968). Psycholinguistics: Experiments in spontaneous speech. New York, NY: Academic Press.
Gvozdev, A. N. (1949). Formirovaniye u rebenka grammaticheskogo stroya. Moscow: Akademija Pedagogika Nauk RSFSR.
Halliday, M. (1966). Notes on transitivity and theme in English: Part 1. Journal of Linguistics, 2, 37-71.
Halliday, M. (1967). Notes on transitivity and theme in English: Part 2. Journal of Linguistics, 3, 177-274.
Halliday, M. (1968). Notes on transitivity and theme in English: Part 3. Journal of Linguistics, 4, 153-308.
Jefferson, G. (1984). Transcript notation. In J. Atkinson & J. Heritage (Eds.), Structures of social interaction: Studies in conversation analysis (pp. 134-162). Cambridge: Cambridge University Press.
Kearney, G., & McKenzie, S. (1993). Machine interpretation of emotion: Design of memory-based expert system for interpreting facial expressions in terms of signaled emotions. Cognitive Science, 17, 589-622.
Kenyeres, E. (1926). A gyermek első szavai es a szófajók föllépése. Budapest: Kisdednevelés.
Kenyeres, E. (1938). Comment une petite hongroise de sept ans apprend le français. Archives de Psychologie, 26, 521-566.
Leopold, W. (1939). Speech development of a bilingual child: a linguist's record: Vol. 1. Vocabulary growth in the first two years (Vol. 1). Evanston, IL: Northwestern University Press.
Leopold, W. (1947). Speech development of a bilingual child: a linguist's record: Vol. 2. Sound-learning in the first two years. Evanston, IL: Northwestern University Press.
Leopold, W. (1949a). Speech development of a bilingual child: a linguist's record: Vol. 3. Grammar and general problems in the first two years. Evanston, IL: Northwestern University Press.
Leopold, W. (1949b). Speech development of a bilingual child: a linguist's record: Vol. 4. Diary from age 2. Evanston, IL: Northwestern University Press.
LIPPS. (2000). The LIDES manual: A document for preparing and analysing language interaction data. International Journal of Bilingualism, 4, 1-64.
Low, A. A. (1931). A case of agrammatism in the English language. Archives of Neurology and Psychiatry, 25, 556-597.
MacWhinney, B. (1989). Competition and lexical categorization. In R. Corrigan, F. Eckman, & M. Noonan (Eds.), Linguistic categorization (pp. 195-242). Philadelphia, PA: Benjamins.
MacWhinney, B., & Osser, H. (1977). Verbal planning functions in children's speech. Child Development, 48, 978-985.
Malvern, D., Richards, B., Chipere, N., & Durán, P. (2004). Lexical diversity and language development. New York, NY: Palgrave Macmillan.
Miller, J., & Chapman, R. (1983). SALT: Systematic Analysis of Language Transcripts, User's Manual. Madison, WI: University of Wisconsin Press.
Moerk, E. (1983). The mother of Eve as a first language teacher. Norwood, NJ: Ablex.
Ninio, A., Snow, C. E., Pan, B., & Rollins, P. (1994). Classifying communicative acts in children's interactions. Journal of Communication Disorders, 27, 157-188.
Ninio, A., & Wheeler, P. (1986). A manual for classifying verbal communicative acts in mother-infant interaction. Transcript Analysis, 3, 1-83.
Ochs, E. (1979). Transcription as theory. In E. Ochs & B. Schieffelin (Eds.), Developmental pragmatics (pp. 43-72). New York, NY: Academic.
Ochs, E. A., Schegloff, M., & Thompson, S. A. (1996). Interaction and grammar. Cambridge: Cambridge University Press.
Parisse, C., & Le Normand, M.-T. (2000). Automatic disambiguation of the morphosyntax in spoken language corpora. Behavior Research Methods, Instruments, and Computers, 32, 468-481.
Parrish, M. (1996). Alan Lomax: Documenting folk music of the world. Sing Out!: The Folk Song Magazine, 40, 30-39.
Pick, A. (1913). Die agrammatischer Sprachstörungen. Berlin: Springer-Verlag.
Preyer, W. (1882). Die Seele des Kindes. Leipzig: Grieben's.
Sacks, H., Schegloff, E., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking for conversation. Language, 50, 696-735.
Sagae, K., Davis, E., Lavie, A., MacWhinney, B., & Wintner, S. (2007). High-accuracy annotation and parsing of CHILDES transcripts. In Proceedings of the 45th Meeting of the Association for Computational Linguistics (pp. 1044-1050). Prague: ACL.
Sagae, K., Lavie, A., & MacWhinney, B. (2005). Automatic measurement of syntactic development in child language. In Proceedings of the 43rd Meeting of the Association for Computational Linguistics (pp. 197-204). Ann Arbor, MI: ACL.
Sagae, K., MacWhinney, B., & Lavie, A. (2004). Adding syntactic annotations to transcripts of parent-child dialogs. In LREC 2004 (pp. 1815-1818). Lisbon: LREC.
Selting, M., et al. (1998). Gesprächsanalytisches Transkriptionssystem (GAT). Linguistische Berichte, 173, 91-122.
Slobin, D. (1977). Language change in childhood and in history. In J. Macnamara (Ed.), Language learning and thought (pp. 185-214). New York, NY: Academic Press.
Sokolov, J. L., & Snow, C. (Eds.). (1994). Handbook of Research in Language Development using CHILDES. Hillsdale, NJ: Erlbaum.
Stemberger, J. (1985). The lexicon in a model of language production. New York, NY: Garland.
Stern, C., & Stern, W. (1907). Die Kindersprache. Leipzig: Barth.
Trager, G. (1958). Paralanguage: A first approximation. Studies in Linguistics, 13, 1-12.
Wernicke, C. (1874). Die Aphasische Symptomenkomplex. Breslau: Cohn & Weigart.
