After completion of fieldwork, a subset of specimens from the transect surveys was chosen for DNA barcoding to confirm or amend field identifications. This subset included (i) at least one specimen from each field-ID (except obvious species such as Mastigias papua) and (ii) several specimens representing the range of phenotypic variation within field-IDs that showed considerable variation or were challenging to distinguish (e.g. small sponge specimens of similar color and texture). Additionally, specimens from a previously collected voucher collection (indicated by the prefix “V_” in the sequence ID) were barcoded and identified by taxonomic experts. Specimens from population genetic collections (indicated by the prefix “PG_” in the sequence ID) were also barcoded.
DNA was purified using a modified phenol-chloroform CTAB extraction protocol (1) or a procedure using AcroPrep PALL 5053 glass fiber plates (2, 3). We amplified the cytochrome c oxidase subunit I (COI) barcode locus using 0.5 µL of purified DNA in a 25-µL polymerase chain reaction (PCR) with 0.05 µL AmpliTaq (Applied Biosystems, Foster City, California, USA), 2.5 µL of 10x buffer (Applied Biosystems), 0.63 µL of 20 µM primers (Operon Biotechnologies Inc., Huntsville, Alabama, USA), 2.5 µL of 25 mM MgCl2 (Applied Biosystems), 0.5 µL of 10 mg/mL bovine serum albumin (BSA), and 0.5 µL of 10 mM dNTPs. Several primer sets were used (Table 1). Amplicons were sequenced at the University of California Berkeley DNA Sequencing Facility (Berkeley, California, USA). Base calls in electropherograms were visually checked and manually corrected for errors, and forward and reverse reads were assembled in Sequencher 4.8 (GeneCodes, Ann Arbor, Michigan, USA).
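The reagent volumes listed for the 25-µL PCR imply final in-reaction concentrations, which can be worked out as stock concentration × volume added / reaction volume. The short Python sketch below does that arithmetic. It rests on assumptions not stated in the text: that 0.63 µL is the volume of each primer, and that the MgCl2 figure counts only the separately added MgCl2 (any magnesium already in the 10x buffer is ignored). The dictionary keys and function name are illustrative only.

```python
REACTION_VOLUME_UL = 25.0  # total PCR volume given in the text

# name -> (volume added in µL, stock concentration, unit), as listed in the text.
# "primer (each)" assumes the 0.63 µL applies to each primer separately.
reagents = {
    "buffer": (2.5, 10.0, "x"),
    "primer (each)": (0.63, 20.0, "µM"),
    "MgCl2": (2.5, 25.0, "mM"),
    "BSA": (0.5, 10.0, "mg/mL"),
    "dNTPs": (0.5, 10.0, "mM"),
}


def final_concentration(volume_ul, stock, reaction_volume_ul=REACTION_VOLUME_UL):
    """Dilution arithmetic: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / reaction_volume_ul


# Final concentration of every reagent in the 25-µL reaction.
final = {
    name: (final_concentration(volume, stock), unit)
    for name, (volume, stock, unit) in reagents.items()
}

for name, (conc, unit) in final.items():
    print(f"{name}: {conc:.3g} {unit}")
```

Under these assumptions the reaction contains 1x buffer, roughly 0.5 µM of each primer, 2.5 mM added MgCl2, 0.2 mg/mL BSA, and 0.2 mM dNTPs.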
We used the Basic Local Alignment Search Tool (BLASTn) to determine the higher-level taxonomic assignment of each sequence, which we used to process batches of similar sequences: ascidians, bivalves, bryozoans, cnidarians, crustaceans, echinoderms, gastropods, polychaetes, and poriferans. Sequences organized by these broad groups were then aligned using MUSCLE v3.8.425 (4). For each group, alignments were manually adjusted and trimmed to the same length in Mesquite v3.5 (5) to balance the number of individuals retained against sequence length. The resulting alignment lengths were: ascidians 395 bp, bivalves 567 bp, bryozoans 622 bp, cnidarians 612 bp, crustaceans 299 bp, echinoderms 357 bp, gastropods 562 bp, polychaetes 509 bp, and poriferans 688 bp. Sequences were translated to amino acid sequences to confirm an open reading frame. Short sequences were excluded from further analysis, but percent pairwise identity with the closest match was recorded for each based on the shortest sequence. Pairwise sequence distances were calculated using dist.dna with Kimura’s 2-parameter model of evolution (6) in the ape package v4.1 (7) in R (8). Operational taxonomic units (OTUs), defined as clusters of sequences at 97% similarity, were identified for each taxonomic group using tclust in the spider package v1.5.0 (9) in R (8), except for poriferans, which were clustered at 99% sequence similarity given their slow sequence evolution (10).
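To make the distance and clustering steps concrete, the following is a minimal pure-Python sketch of Kimura’s 2-parameter (K2P) distance, d = -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions, followed by a simple threshold clustering. It is not the ape or spider implementation used in the analysis. It assumes that 97% similarity corresponds to a distance threshold of 0.03, that sites with gaps or ambiguity codes are skipped pair by pair, and that clusters are formed by single linkage (any two sequences closer than the threshold are joined); these choices may differ from what dist.dna and tclust do. The function names and toy sequences are hypothetical.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: skip any site where either sequence has a gap or ambiguity code.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturated).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def cluster_by_threshold(seqs, threshold=0.03):
    """Single-linkage clusters: join sequences whose K2P distance is below threshold."""
    names = list(seqs)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if k2p_distance(seqs[a], seqs[b]) < threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values())


# Toy 100-bp alignment (hypothetical data, not from the study).
base = "ACGT" * 25
toy = {
    "s1": base,
    "s2": "G" + base[1:],               # one transition relative to s1
    "s3": "CCGT" * 10 + base[40:],      # ten transversions relative to s1
}
print(cluster_by_threshold(toy, threshold=0.03))
```

In this toy example s1 and s2 are about 0.010 apart and fall into one cluster, while s3 is about 0.108 from both and forms its own. Under the same assumption, the 99% criterion used for poriferans would correspond to `threshold=0.01`.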
1. Dawson MN, Raskoff KA, Jacobs DK (1998) Field preservation of marine invertebrate tissue for DNA analyses. Mol Mar Biol Biotechnol 7(2):145–152.
2. Ivanova NV, deWaard JR, Hebert PDN (2006) An inexpensive, automation-friendly protocol for recovering high-quality DNA. Mol Ecol Notes 6(4):998–1002.
3. Schiebelhut LM, Abboud SS, Gómez Daglio LE, Swift HF, Dawson MN (2017) A comparison of DNA extraction methods for high-throughput DNA analyses. Mol Ecol Resour 17(4):721–729.
4. Edgar RC (2004) MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32(5):1792–1797.
5. Maddison WP, Maddison DR (2018) Mesquite: a modular system for evolutionary analysis.
6. Kimura M (1980) A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol 16(2):111–120.
7. Paradis E, Claude J, Strimmer K (2004) APE: Analyses of phylogenetics and evolution in R language. Bioinformatics 20(2):289–290.
8. R Core Team (2018) R: A language and environment for statistical computing (R Foundation for Statistical Computing, Vienna, Austria).
9. Brown SDJ, et al. (2012) Spider: An R package for the analysis of species identity and evolution, with particular reference to DNA barcoding. Mol Ecol Resour 12(3):562–565.
10. Huang D, Meier R, Todd PA, Chou LM (2008) Slow mitochondrial COI sequence evolution at the base of the metazoan tree and its implications for DNA barcoding. J Mol Evol 66(2):167–174.
For Table 1 (primers and thermocycle conditions used for PCR of macroinvertebrates, by taxonomic group), see Supplemental Documents, below.
For the sequence alignment files (.fas) mentioned in the methods above, see the Supplemental Files section below.