Dataset: IODP360 - iTAG and metatranscriptome data
Deployment: IODP-360

View Data: For data, See Dataset Metadata Page: https://osprey.bco-dmo.org/dataset/813173

Principal Investigator:

Virginia P. Edgcomb (Woods Hole Oceanographic Institution, WHOI)

Contact:

Virginia P. Edgcomb (Woods Hole Oceanographic Institution, WHOI)

BCO-DMO Data Manager:

Karen Soenen (Woods Hole Oceanographic Institution, WHOI BCO-DMO)

Project:

Collaborative Research: Delineating The Microbial Diversity and Cross-domain Interactions in The Uncharted Subseafloor Lower Crust Using Meta-omics and Culturing Approaches (Subseafloor Lower Crust Microbiology)

Version:

Deployment Synonyms:

JOIDES Resolution, International Ocean Discovery Program (IODP) Expedition 360 (X360)

Description

Supplementary Table 4C: Metatranscriptome data summary for the cellular activities presented, with statistics on sequencing and on removal of potential contaminant sequences. The table reports the reads retained through bioinformatic processing of the iTAG data (11 samples plus control samples) and of the metatranscriptome data. Samples were taken on board the R/V JOIDES Resolution between November 30, 2015 and January 30, 2016.

Methods & Sampling

Dataset acquisition description

Rock material was crushed while still frozen in a Progressive Exploration Jaw Crusher (Model 150) whose surfaces were sterilized with 70% ethanol and RNase AWAY (Thermo Fisher Scientific, USA) inside a laminar flow hood. Powdered rock material was returned to the -80°C freezer until extraction.

DNA was extracted from 20, 30, or 40 grams of powdered rock material, depending on the quantity of rock available. A DNeasy PowerMax Soil Kit (Qiagen, USA) was used following the manufacturer’s protocol, modified to include three freeze/thaw treatments prior to the addition of Soil Kit solution C1. Each treatment consisted of 1 minute in liquid nitrogen followed by 5 minutes at 65°C. DNA extracts were concentrated by isopropanol precipitation overnight at 4°C.

The low biomass in our samples required whole genome amplification (WGA) prior to PCR amplification of marker genes. Genomic DNA was amplified by Multiple Displacement Amplification (MDA) using the REPLI-g Single Cell Kit (Qiagen) as directed. MDA bias was minimized by splitting each WGA sample into triplicate 16 μL reactions after 1 hr of amplification and then resuming amplification for the manufacturer-specified 7 hrs (8 hrs total).

As negative controls, DNA was also recovered from samples of drilling mud and drilling fluid (surface water collected during the coring process) and from two “kit control” extractions to which no sample was added. The kit controls account for any contaminants originating from either the DNeasy PowerMax Soil Kit or the REPLI-g Single Cell Kit.

Bacterial SSU rRNA gene fragments were PCR amplified from MDA samples and sequenced at the Georgia Genomics and Bioinformatics Core (Univ. of Georgia). The primers used were Bac515-Y and Bac926R. Dual-indexed libraries were prepared with (HT) iTruS (Kapa Biosystems) chemistry, and sequencing was performed on an Illumina MiSeq 2 x 300 bp system with all samples combined equally on a single flow cell.

Raw sequence reads were processed through Trim Galore [http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/], FLASH [ccb.jhu.edu/software/FLASH/], and FASTX Toolkit [http://hannonlab.cshl.edu/fastx_toolkit/] for trimming and removal of low quality/short reads.

Quality filtering required a minimum average quality of 25 and rejected paired reads shorter than 250 nucleotides.
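The two thresholds above can be illustrated with a minimal Python sketch. This is not the Trim Galore/FLASH/FASTX pipeline itself, only an illustration of the stated criteria applied to merged reads. The function names and the Phred+33 quality encoding are assumptions made for this example and are not stated in the source.

```python
from statistics import mean

MIN_MEAN_QUALITY = 25   # minimum average Phred score, from the text
MIN_LENGTH = 250        # minimum read length in nucleotides, from the text
PHRED_OFFSET = 33       # assumption: Phred+33 (Illumina 1.8+) quality encoding


def phred_scores(quality_string, offset=PHRED_OFFSET):
    """Decode a FASTQ quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in quality_string]


def passes_filter(sequence, quality_string):
    """Return True if a read meets both the length and quality thresholds."""
    if len(sequence) < MIN_LENGTH:
        return False
    return mean(phred_scores(quality_string)) >= MIN_MEAN_QUALITY


def filter_reads(records):
    """Yield (name, sequence, quality) tuples that pass the filter.

    `records` is any iterable of (name, sequence, quality) tuples
    (a hypothetical in-memory representation of FASTQ entries).
    """
    for name, sequence, quality in records:
        if passes_filter(sequence, quality):
            yield name, sequence, quality
```

The length check runs first so that very short or empty reads are rejected before any quality average is computed.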

Operational Taxonomic Unit (OTU) clusters were constructed at 99% similarity with the script pick_otus.py within the Quantitative Insights Into Microbial Ecology (QIIME) v.1.9.1 software and ‘uclust’. Any OTU that matched an OTU in one of our control samples (drilling fluids, drilling mud, extraction and WGA controls) was removed (using filter_otus_from_otu_table.py), along with any sequences of land plants and human pathogens that may have survived the control filtering due to clustering at 99% (filter_taxa_from_otu_table.py). As an additional quality control measure, genera that are commonly identified as PCR contaminants were removed.

Unclassified OTUs were queried using BLAST against the GenBank nr database, and further information about these OTUs is provided in the Supplementary Discussion text under the section “Taxonomic diversity information from iTAGs.” OTUs that could not be assigned to Bacteria or Archaea were removed from further analysis.

For downstream analyses, any OTU representing no more than 0.01% of all sequences was removed, as such OTUs are unlikely to contribute significantly to in situ communities. The OTU data table was transformed to a presence/absence table, and the Jaccard method was used to generate a distance matrix using the dist.binary() function in the R package ade4.
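The filtering and distance steps described above can be sketched in plain Python. This is an illustration only, not the QIIME or ade4 code used for the dataset. The table layout (OTU id mapped to per-sample read counts), the function names, and the choice to compute relative abundance after control OTUs are removed are all assumptions made for this example.

```python
MIN_RELATIVE_ABUNDANCE = 0.0001  # 0.01% of all sequences, from the text


def remove_control_otus(table, control_samples):
    """Drop every OTU that has at least one read in any control sample.

    `table` maps OTU id -> {sample name: read count} (hypothetical layout).
    """
    return {
        otu: counts
        for otu, counts in table.items()
        if all(counts.get(s, 0) == 0 for s in control_samples)
    }


def remove_rare_otus(table, threshold=MIN_RELATIVE_ABUNDANCE):
    """Keep only OTUs whose share of all reads exceeds the threshold."""
    total = sum(sum(counts.values()) for counts in table.values())
    if total == 0:
        return {}
    return {
        otu: counts
        for otu, counts in table.items()
        if sum(counts.values()) / total > threshold
    }


def presence_sets(table, samples):
    """Convert counts to presence/absence: sample -> set of OTUs present."""
    return {
        s: {otu for otu, counts in table.items() if counts.get(s, 0) > 0}
        for s in samples
    }


def jaccard_distance(a, b):
    """Jaccard dissimilarity 1 - |A & B| / |A | B| between two OTU sets."""
    union = a | b
    if not union:
        return 0.0  # two empty samples are treated as identical
    return 1.0 - len(a & b) / len(union)
```

One caveat when comparing numbers: the ade4 documentation describes its binary distances as d = sqrt(1 - s), so values returned by dist.binary() may be the square root of the plain Jaccard dissimilarity computed here. Verify this against the ade4 manual before comparing results.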

Data Processing Description

Dataset Processing Description

BCO-DMO processing notes:

Reformatted table structure
Added columns Latitude, Longitude and Depth
Adjusted column header names to comply with database requirements

Funding

Award Number	Funding Source
OCE-1658031	NSF Division of Ocean Sciences

Instruments

Automated DNA Sequencer

Supplied Name: Illumina MiSeq 2 x 300 bp platform

Supplied Description:

DNA sequencing performed using the Illumina MiSeq 2 x 300 bp platform (Univ. of Georgia)

Instrument Type

Generic Name: Automated DNA Sequencer

Acronym: Automated Sequencer

Generic Description:

General term for a laboratory instrument used for deciphering the order of bases in a strand of DNA. Sanger sequencers detect fluorescence from different dyes that are used to identify the A, C, G, and T extension reactions. Contemporary methods such as pyrosequencing are based on detecting the activity of DNA polymerase (a DNA synthesizing enzyme) with another chemiluminescent enzyme. Essentially, the method allows sequencing of a single strand of DNA by synthesizing the complementary strand along it, one base at a time, and detecting which base was actually added at each step.

Parameters

Supplied Name	Supplied description	Supplied Units	Standard Name
Sample_ID	Sample ID	unitless	sample
Latitude	Latitude of sample, south is negative	decimal degrees	no_bcodmo_term
Longitude	Longitude of sample, west is negative	decimal degrees	lon
Depth	Depth - meters below seafloor (mbsf)	meters (m)	depth
iTAG_Raw	iTAG data - Raw reads	number of reads	no_bcodmo_term
iTAG_Paired_QC	iTAG data - paired reads after QC	number of reads	no_bcodmo_term
iTAG_Paired_Contmnt_Rem	iTAG data - Paired reads surviving removal of potential contaminants matching sequences in control samples or known contaminants.	number of reads	no_bcodmo_term
iTAG_OTU	iTAG data - Number of OTUs at 99% identity	number of OTUs	no_bcodmo_term
Metatr_Raw	Metatranscriptome data - Raw reads from sequencing	number of reads	no_bcodmo_term
Metatr_Paired_QC	Metatranscriptome data - Paired reads after QC	number of reads	no_bcodmo_term
Metatr_Paired_Contmnt_Rem	Metatranscriptome data - Paired reads surviving removal of potential contaminants matching sequences in control samples or known contaminants.	number of reads	no_bcodmo_term
Metatr_Reads_Remaining	Metatranscriptome data - Percent of original paired reads remaining	percentage (%)	no_bcodmo_term
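As a small worked example, the Metatr_Reads_Remaining column can be read as a ratio of two read counts. The table does not state which column serves as the denominator; the sketch below assumes it is the paired reads after QC (Metatr_Paired_QC), and the function name is hypothetical.

```python
def percent_reads_remaining(paired_after_qc, paired_after_contaminant_removal):
    """Percent of paired reads that survive contaminant removal.

    Assumption: the denominator is the paired-read count after QC
    (Metatr_Paired_QC); the source table does not specify this.
    """
    if paired_after_qc <= 0:
        raise ValueError("paired_after_qc must be positive")
    return 100.0 * paired_after_contaminant_removal / paired_after_qc
```

For instance, with illustrative (not real) counts of 200 paired reads after QC and 50 reads surviving contaminant removal, the function returns 25.0.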
