Tag Archives: Eastern oyster

Data Management – SRA Submission LSU C.virginica Oil Spill MBD BS-seq Data

Submitted the Crassostrea virginica (Eastern oyster) MBD BS-seq data we received on 20150413 to NCBI Sequence Read Archive.

Data was uploaded via the web browser interface, as the FTP method was not functioning properly.

SRA deets are below (assigned FASTQ files to new BioProject and created new BioSamples).

SRA Study: SRP139854
BioProject: PRJNA449904

BioSamples Table

Sample  Treatment       BioSample
HB2     oil 25,000ppm   SAMN08919868
HB16    oil 25,000ppm   SAMN08919921
HB30    oil 25,000ppm   SAMN08919953
NB3     unexposed       SAMN08919461
NB6     unexposed       SAMN08919577
NB11    unexposed       SAMN08919772

TrimGalore/FastQC/MultiQC – Trim 10bp 5′/3′ ends C.virginica MBD BS-seq FASTQ data

Steven found out that the Bismark documentation (Bismark is the bisulfite aligner we use in our BS-seq pipeline) suggests trimming 10bp from both the 5′ and 3′ ends. Since alignment with Bismark is the next step in our pipeline, we figured we should probably just follow their recommendations!
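For reference, here is a rough sketch of what a 10bp trim at both ends could look like with TrimGalore. It is not the job script used here: the file names and output directory are hypothetical placeholders, and it assumes paired-end input and TrimGalore's `--clip_R1`/`--clip_R2` (5′) and `--three_prime_clip_R1`/`--three_prime_clip_R2` (3′) options.

```shell
#!/bin/bash
# Sketch only: file names and output directory are hypothetical placeholders,
# not the paths used in the actual job.
r1="sample_R1.fastq.gz"
r2="sample_R2.fastq.gz"
out_dir="trimgalore_10bp_5-3_prime"

# Build the command as a single string so it can be inspected before running.
trim_cmd="trim_galore --paired --fastqc \
--clip_R1 10 --clip_R2 10 \
--three_prime_clip_R1 10 --three_prime_clip_R2 10 \
--output_dir ${out_dir} ${r1} ${r2}"

echo "${trim_cmd}"

# Only run if TrimGalore is installed and the input files exist.
if command -v trim_galore >/dev/null 2>&1 && [ -f "${r1}" ] && [ -f "${r2}" ]; then
  ${trim_cmd}
fi
```

Printing the command first makes it easy to check the clip values before committing to a long run.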

TrimGalore job script:

Standard error was redirected on the command line to this file:

MD5 checksums were generated on the resulting trimmed FASTQ files:

All data was copied to my folder on Owl.

Checksums for FASTQ files were verified post-data transfer (data not shown).
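Since the verification output isn't shown, here is a minimal sketch of the generate, copy, verify pattern, assuming GNU `md5sum` is available. The temporary directories and the tiny stand-in FASTQ file are made up for illustration; they are not the real data or paths.

```shell
#!/bin/bash
# Sketch only: directories and file names are hypothetical stand-ins.
src_dir=$(mktemp -d)
dest_dir=$(mktemp -d)

# Stand-in for a trimmed FASTQ file.
printf '@read1\nACGT\n+\nIIII\n' > "${src_dir}/example_R1_val_1.fq"

# Generate checksums inside the data folder, using relative file names so the
# checksum file stays valid after the folder is copied elsewhere.
(cd "${src_dir}" && md5sum *.fq > checksums.md5)

# Copy the data, then verify the copies at the destination.
cp "${src_dir}"/* "${dest_dir}/"
(cd "${dest_dir}" && md5sum -c checksums.md5)
verify_status=$?
echo "md5sum -c exit status: ${verify_status}"
```

A non-zero exit status from `md5sum -c` means at least one file did not match its recorded checksum.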

Results:

Output folder:

FastQC output folder:

MultiQC output folder:

MultiQC HTML report:

Hey! Look at that! Everything is much better! Thanks for the excellent documentation and suggestions, Bismark!


TrimGalore/FastQC/MultiQC – 2bp 3′ end Read 1s Trim C.virginica MBD BS-seq FASTQ data

Earlier today, I ran TrimGalore/FastQC/MultiQC on the Crassostrea virginica MBD BS-seq data from ZymoResearch and hard trimmed the first 14bp from each read. Things looked better at the 5′ end, but the 3′ end of each of the READ1 seqs showed a wonky 2bp blip, so I decided to trim that off.

I ran TrimGalore (using the built-in FastQC option), with a hard trim of the last 2bp of each first read set that had previously had the 14bp hard trim and followed up with MultiQC for a summary of the FastQC reports.

TrimGalore job script:

Standard error was redirected on the command line to this file:

MD5 checksums were generated on the resulting trimmed FASTQ files:

All data was copied to my folder on Owl.

Checksums for FASTQ files were verified post-data transfer (data not shown).

Results:

Output folder:

FastQC output folder:

MultiQC output folder:

MultiQC HTML report:

Well, this is a bit strange: the 2bp trimming on the read 1s looks fine, but now the read 2s are weird in the same region!

Regardless, while this was running, Steven found out that the Bismark documentation (Bismark is the bisulfite aligner we use in our BS-seq pipeline) suggests trimming 10bp from both the 5′ and 3′ ends. So, maybe this was all moot. I’ll go ahead and re-run this following the Bismark recommendations.


TrimGalore/FastQC/MultiQC – 14bp Trim C.virginica MBD BS-seq FASTQ data

Yesterday, I ran TrimGalore/FastQC/MultiQC on the Crassostrea virginica MBD BS-seq data from ZymoResearch with the default settings (i.e. “auto-trim”). There was still some variability in the first ~15bp of the reads and Steven wanted to see how a hard trim would change things.

I ran TrimGalore (using the built-in FastQC option), with a hard trim of the first 14bp of each read and followed up with MultiQC for a summary of the FastQC reports.
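To make clear what a hard trim does to each record, here is a small illustration that uses awk instead of TrimGalore. This is not how the job was run; it only shows that removing the first 14 bases means dropping the first 14 characters from both the sequence line and the quality line. The 20bp read is made up.

```shell
#!/bin/bash
# Illustration only (not TrimGalore): drop the first N bases from each read.
# The FASTQ record below is a made-up 20 bp example.
clip=14
fastq_record='@example_read
ACGTACGTACGTACGGGTTT
+
ABCDEFGHIJKLMNIIIIII'

# Lines 2 and 4 of every 4-line record hold the sequence and its qualities;
# both must be trimmed by the same amount to stay in sync.
trimmed=$(printf '%s\n' "${fastq_record}" |
  awk -v n="${clip}" 'NR % 4 == 2 || NR % 4 == 0 { print substr($0, n + 1); next } { print }')

printf '%s\n' "${trimmed}"
```

The header and `+` lines pass through unchanged, so the record stays a valid four-line FASTQ entry.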

TrimGalore job script:

Standard error was redirected on the command line to this file:

MD5 checksums were generated on the resulting trimmed FASTQ files:

All data was copied to my folder on Owl.

Checksums for FASTQ files were verified post-data transfer (data not shown).

Results:

Output folder:

FastQC output folder:

MultiQC output folder:

MultiQC HTML report:

OK, this trimming definitely took care of the variability seen in the first ~15bp of all the reads.

However, I noticed that the last 2bp of each of the Read 1 seqs all have some wonky stuff going on. I’m guessing I should probably trim that stuff off, too…


TrimGalore/FastQC/MultiQC – Auto-trim C.virginica MBD BS-seq FASTQ data

Yesterday, I ran FastQC/MultiQC on the Crassostrea virginica MBD BS-seq data from ZymoResearch. Steven wanted to trim it and see how things turned out.

I ran TrimGalore (using the built-in FastQC option) and followed up with MultiQC for a summary of the FastQC reports.

TrimGalore job script:

Standard error was redirected on the command line to this file:

MD5 checksums were generated on the resulting trimmed FASTQ files:

All data was copied to my folder on Owl.

Checksums for FASTQ files were verified post-data transfer.

Results:

Output folder:

FastQC output folder:

MultiQC output folder:

MultiQC HTML report:

Overall, the auto-trim didn’t alter things too much. Specifically, Steven is concerned about the variability in the first 15bp (seen in the Per Base Sequence Content section of the MultiQC output). It was reduced, but not greatly. Will perform an independent run of TrimGalore and employ a hard trim of the first 14bp of each read and see how that looks.


FastQC/MultiQC – C. virginica MBD BS-seq Data

Per Steven’s GitHub Issues request, I ran FastQC on the Eastern oyster MBD bisulfite sequencing data we recently got back from ZymoResearch.

Ran FastQC locally with the following script: 20180409_fastqc_Cvirginica_MBD.sh


#!/bin/bash
# Run FastQC on all paired-end C. virginica MBD BS-seq FASTQ files.
/home/sam/software/FastQC/fastqc \
--threads 18 \
--outdir /home/sam/20180409_fastqc_Cvirginica_MBD \
/mnt/owl/nightingales/C_virginica/zr2096_10_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_10_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_1_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_1_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_2_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_2_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_3_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_3_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_4_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_4_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_5_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_5_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_6_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_6_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_7_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_7_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_8_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_8_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_9_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_9_s1_R2.fastq.gz

MultiQC was then run on the FastQC output files.

All files were moved to Owl after the jobs completed.

Results:

FastQC Output folder: 20180409_fastqc_Cvirginica_MBD/

MultiQC Output folder: 20180409_fastqc_Cvirginica_MBD/multiqc_data/

MultiQC report (HTML): 20180409_fastqc_Cvirginica_MBD/multiqc_data/multiqc_report.html

Everything looks good to me.

Steven’s interested in seeing what the trimmed output would look like (and, how it would impact mapping efficiencies). Will initiate trimming.

See the GitHub issue linked above for the full discussion.


Data Received – Crassostrea virginica MBD BS-seq from ZymoResearch

Received the sequencing data from ZymoResearch for the Crassostrea virginica gonad MBD DNA that was sent to them on 20180207 for bisulfite conversion, library construction, and sequencing.

Gzipped FASTQ files were downloaded and logged as follows:

  1. Downloaded files to Owl/nightingales/C_virginica
  2. Verified MD5 checksums
  3. Appended MD5 checksums to the checksums.md5 file
  4. Updated the readme.md file
  5. Updated the nightingales Google Sheet

Here’s the list of files received:

zr2096_10_s1_R1.fastq.gz
zr2096_10_s1_R2.fastq.gz
zr2096_1_s1_R1.fastq.gz
zr2096_1_s1_R2.fastq.gz
zr2096_2_s1_R1.fastq.gz
zr2096_2_s1_R2.fastq.gz
zr2096_3_s1_R1.fastq.gz
zr2096_3_s1_R2.fastq.gz
zr2096_4_s1_R1.fastq.gz
zr2096_4_s1_R2.fastq.gz
zr2096_5_s1_R1.fastq.gz
zr2096_5_s1_R2.fastq.gz
zr2096_6_s1_R1.fastq.gz
zr2096_6_s1_R2.fastq.gz
zr2096_7_s1_R1.fastq.gz
zr2096_7_s1_R2.fastq.gz
zr2096_8_s1_R1.fastq.gz
zr2096_8_s1_R2.fastq.gz
zr2096_9_s1_R1.fastq.gz
zr2096_9_s1_R2.fastq.gz

Here’s the sample processing history:


DNA Quantification – MspI-digested Crassostrea virginica gDNA

Quantified the two MspI-digested DNA samples for the Qiagen project from earlier today with the Qubit 3.0 (ThermoFisher).

Used the Qubit dsDNA Broad Range (BR) Kit (ThermoFisher).

Used 1μL of DNA from each sample (including undigested gDNA from the initial isolation on 20171211).

Results:

Quantification (Google Sheet): 20180111_qubit_DNA_MspI_virginica

Yields are good and are sufficient for submission to Qiagen:

MspI_virginica_01 – 53.4ng/μL (1335ng; 89% recovery after phenol/chloroform/EtOH precip)
MspI_virginica_02 – 31.0ng/μL (775ng; ~52% recovery after phenol/chloroform/EtOH precip)
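The totals and recoveries above are consistent with a 25μL sample volume and 1500ng of input DNA per digest. Neither value is stated in this post, so treat both as assumptions inferred from the reported numbers. Under those assumptions, the arithmetic can be sketched as:

```shell
#!/bin/bash
# Assumptions (inferred from the reported numbers, not stated in the post):
#   volume_ul = 25    sample volume in microliters
#   input_ng  = 1500  DNA going into each MspI digest, in nanograms
volume_ul=25
input_ng=1500

# Print total yield (ng) and percent recovery for a Qubit concentration in ng/uL.
yield_and_recovery() {
  awk -v conc="$1" -v vol="${volume_ul}" -v input="${input_ng}" \
    'BEGIN { total = conc * vol; printf "%.0f ng, %.1f%% recovery\n", total, 100 * total / input }'
}

yield_and_recovery 53.4
yield_and_recovery 31.0
```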
