Methods are reported in Cohen et al. 2023 (bioRxiv preprint, doi: 10.1101/2023.11.20.567900) and are summarized below.
* This section describes how this and related datasets were generated (see "Related Datasets" section).
One half of the 142 mm filters (0.2-51 μm) collected by Clio were processed for metaproteomics. Proteins were extracted in a 1% SDS-based detergent in 50 mM HEPES at pH 8.5, reduced with dithiothreitol, alkylated with iodoacetamide, and purified using a polyacrylamide electrophoresis tube gel method. Protein quantification was performed using a BSA assay. Trypsin was added to the protein-bead mixture in a 1:20 trypsin:protein ratio. Peptides were purified using C18 tips and diluted to a concentration of 0.1 μg μL−1.
Approximately 2-5 µg of purified peptides were injected onto a Dionex UltiMate 3000 RSLCnano LC system with an additional RSLCnano pump, run in online 2D active modulation mode and interfaced with a Thermo Fusion mass spectrometer. The mass spectrometer acquired MS1 scans from 380 to 1,580 m/z at 240K resolution in the Orbitrap. MS2 scans were collected in data-dependent mode in the ion trap, with a cycle time of 2 seconds between scans and acquisition of charge states 2 to 10. MS2 scans had a 1.6 m/z isolation window, a 50 ms maximum injection time, and a 5 s dynamic exclusion time.
Note: This dataset contains two different missing-data identifiers, "NA" and "-". When a protein had a partial match to the functional annotation database, the annotation fields without a match were denoted with "-". When a protein had no matches at all, its annotation columns were empty after the data frames were merged, and those empty columns were denoted with "NA".
Example lines in opp_TOTAL_spectralcounts.csv:
"6","megahit_HN001_k141_101642.p1","-","-","-","SBP_bac_1,SBP_bac_8"...
vs
"4","megahit_HN001_k141_100671.p1",NA,NA,NA,NA,NA,NA,"X1_30_0.2"...
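Because the two identifiers mean different things, it can help to keep them apart when reading the file rather than collapsing both into a single missing value. The sketch below is one possible way to do that with Python's standard csv module. The column names (row, orf_id, annot_a, annot_b, annot_c) and the shortened sample rows are assumptions made for illustration; they are not the actual header or full rows of opp_TOTAL_spectralcounts.csv.

```python
import csv
import io

# Hypothetical miniature of the file: the column names are assumptions, and the
# rows are shortened stand-ins modeled on the two example lines above.
SAMPLE = '''"row","orf_id","annot_a","annot_b","annot_c"
"6","megahit_HN001_k141_101642.p1","-","-","SBP_bac_1,SBP_bac_8"
"4","megahit_HN001_k141_100671.p1",NA,NA,NA
'''

PARTIAL = "-"      # the protein matched the annotation database, but not in this field
UNMATCHED = "NA"   # no annotation match at all; the value was introduced by the merge


def classify_rows(handle, id_column, annotation_columns):
    """Label each row by the missing-data identifier found in its annotation fields."""
    labels = {}
    for row in csv.DictReader(handle):
        values = [row[column] for column in annotation_columns]
        if all(value == UNMATCHED for value in values):
            labels[row[id_column]] = "no_annotation_match"
        elif any(value == PARTIAL for value in values):
            labels[row[id_column]] = "partial_annotation_match"
        else:
            labels[row[id_column]] = "fully_annotated"
    return labels


labels = classify_rows(
    io.StringIO(SAMPLE), "orf_id", ["annot_a", "annot_b", "annot_c"]
)
for orf_id, label in labels.items():
    print(orf_id, label)
```

To apply the same idea to the real file, the in-memory sample would be replaced by an opened file handle and the column names by those in the file's actual header. If a data-frame library is used instead, its default missing-value handling may convert "NA" to a null while leaving "-" as text, so it is worth checking how each identifier was parsed.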