# Project Progress – Olympia Oyster Genome Assemblies by Sean Bennett

Here’s a brief overview of what Sean has done on the Oly genome assembly front.

Metassembler

• Assemble his BGI assembly and Platanus assembly? Confusing terms here; not sure what he means.
• Failed due to 32-bit vs. 64-bit installation of MUMmer. He didn’t have the chance to re-compile MUMmer as 64-bit. However, a recent MUMmer announcement suggests that MUMmer can now handle genomes of unlimited size.
• I believe he was planning on using (or was using?) GARM, which relies upon MUMmer and may also bundle its own version of MUMmer (possibly an outdated version that led to Sean’s error message?).
• Notebook entry

Canu

Redundans

Platanus

# Data Management – Illumina Geoduck HiSeq & MiSeq Data

The HDD we received from Illumina last week only had data (i.e. fastq files) from the NovaSeq runs they performed; there was nothing from either the MiSeq or the HiSeq runs.

We contacted them about the missing data. They confirmed it was missing and uploaded the remaining data to BaseSpace.

Files will be temporarily stored in these locations:

/volume1/web/nightingales/Geoduck_MiSeq/170317_M03814_0172_000000000-B2K79/Data/GeoDuckRNAMiSeq-35978947

/volume1/web/nightingales/Geoduck_HiSeq/170228_ST-K00104_0382_BHHGTLBBXX/Data/Ironman-35682656

/volume1/web/nightingales/Geoduck_HiSeq/170228_ST-K00104_0381_AHHHWNBBXX/Data/Ironman-35682656

# Data Received – Geoduck Genome Sequencing by Illumina

We previously sent some geoduck samples to Illumina, as part of a pilot project for them to test out a new sequencing platform. The data has finally arrived!

It was sent on a 4TB Seagate external hard drive.

Due to weird connection issues we’ve recently encountered with our server, Owl (Synology DS1812+), I connected the HDD directly to Owl via USB (instead of connecting to a computer and transferring). I transferred the data using the Synology web interface to avoid any computer/NAS connection issues that might interrupt the transfer.

We have a meeting with the Illumina people tomorrow afternoon to review the data they’ve provided (looks like it’s going to take a while, though). Once that meeting takes place, we’ll figure out how to document this project in our data management plan.

# DNA Isolation – Geoduck gDNA for Illumina-initiated Sequencing Project

We were previously approached by Cindy Lawley (Illumina Market Development) about participating in an Illumina product development project. They wanted to have some geoduck tissue and DNA on hand in case Illumina green-lighted the use of geoduck for testing the new sequencing platform on non-model organisms. Well, guess what: Illumina has given the green light for sequencing our geoduck! However, they need at least 4μg of gDNA, so I’m isolating more.

Isolated DNA from ctenidia tissue from the same Panopea generosa individual used for the BGI sequencing efforts. Tissue was collected by Brent & Steven on 20150811.

Used the E.Z.N.A. Mollusc Kit (Omega) to isolate DNA from five separate ~60mg pieces of ctenidia tissue according to the manufacturer’s protocol, with the following changes:

• Samples were homogenized with plastic, disposable pestle in 350μL of ML1 Buffer
• Incubated homogenate at 60°C for 1hr
• No optional steps were used
• Performed three rounds of 24:1 chloroform:IAA treatment
• Eluted each in 50μL of Elution Buffer and pooled into a single sample

Quantified the DNA using the Qubit dsDNA BR Kit (Invitrogen). Used 1μL of DNA sample.

Concentration = 162ng/μL (Quant data is here [Google Sheet]: 20170105_gDNA_geoduck_qubit_quant)

Yield is great (total = ~32μg).

Evaluated gDNA quality (i.e. integrity) by running 162ng (1μL) of sample on 0.8% agarose, low-TAE gel stained with ethidium bromide.

Used 5μL of O’GeneRuler DNA Ladder Mix (ThermoFisher).

Results:

DNA looks good: bright high molecular weight band, minimal smearing, and minimal RNA carryover (seen as more intense “smear” at ~500bp).

Will send off 10μg (they only requested 4μg) so that they have extra to work with in case they come across any issues.
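As a quick check on the amounts above, here is a minimal Python sketch of the volume arithmetic. The concentration, the reported total, and the 4μg and 10μg figures come from this entry; the function name is hypothetical, and the "implied volume" is only back-calculated from the reported ~32μg total, not a measured value.

```python
# Values taken from this notebook entry.
CONCENTRATION_NG_PER_UL = 162.0   # Qubit dsDNA BR reading
REPORTED_TOTAL_UG = 32.0          # approximate total yield stated above
REQUESTED_UG = 4.0                # minimum requested by Illumina
SHIPPING_UG = 10.0                # amount planned for shipment


def volume_for_mass(mass_ug, conc_ng_per_ul):
    """Return the volume (in uL) that contains mass_ug at the given concentration."""
    return mass_ug * 1000.0 / conc_ng_per_ul


# Volume needed to ship 10 ug, and the minimum volume for the requested 4 ug.
ship_volume_ul = volume_for_mass(SHIPPING_UG, CONCENTRATION_NG_PER_UL)
minimum_volume_ul = volume_for_mass(REQUESTED_UG, CONCENTRATION_NG_PER_UL)

# Assumption: back-calculated pooled volume implied by the reported total yield.
implied_pool_volume_ul = volume_for_mass(REPORTED_TOTAL_UG, CONCENTRATION_NG_PER_UL)

print(f"Volume to ship {SHIPPING_UG} ug: {ship_volume_ul:.1f} uL")
print(f"Volume for requested {REQUESTED_UG} ug: {minimum_volume_ul:.1f} uL")
print(f"Pooled volume implied by ~{REPORTED_TOTAL_UG} ug: {implied_pool_volume_ul:.1f} uL")
```

At 162ng/μL, 10μg corresponds to roughly 62μL of the pooled sample, which leaves most of the prep in hand.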

# Data Received – Geoduck RRBS Sequencing Data

Hollie Putnam prepared some reduced representation bisulfite Illumina libraries and had them sequenced by Genewiz.

IMPORTANT: MD5 checksums have not yet been provided by Genewiz! We cannot verify the integrity of these data files at this time! Checksums have been requested. Will create new notebook entry (and add link to said entry) once the checksums have been received and we can compare them.

UPDATE 20161230 – Have received and verified checksums.

Jupyter notebook: 20161229_docker_genewiz_geoduck_RRBS_data.ipynb
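For reference, a checksum comparison of this kind can be sketched in Python as below. This is not the content of the Jupyter notebook above; it is an illustration that assumes the sequencing facility supplies checksums in the common `md5sum` layout (digest, whitespace, filename). The function names and the example filenames in the usage note are hypothetical.

```python
def parse_checksum_lines(text):
    """Parse md5sum-style lines ('<digest>  <filename>') into {filename: digest}.

    Assumption: one entry per line; a leading '*' on the filename (binary-mode
    marker used by md5sum) is dropped.
    """
    expected = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        digest, name = line.split(maxsplit=1)
        expected[name.strip().lstrip("*")] = digest.lower()
    return expected


def compare_checksums(expected, observed):
    """Return the sorted filenames whose observed digest is missing or different."""
    return sorted(
        name for name, digest in expected.items() if observed.get(name) != digest
    )
```

Usage would look like `compare_checksums(parse_checksum_lines(provided_text), locally_computed)`, where `locally_computed` maps each filename to the digest generated on our server; an empty list means every file matched.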

# RNAseq Data Receipt – Geoduck Gonad RNA 100bp PE Illumina

Received notification that sequencing of the samples sent on 20150601 for RNAseq was completed.

Downloaded the following files from the GENEWIZ servers using FileZilla FTP and stored them on our server (owl/web/nightingales/P_generosa):

Geo_Pool_F_GGCTAC_L006_R1_001.fastq.gz
Geo_Pool_F_GGCTAC_L006_R2_001.fastq.gz
Geo_Pool_M_CTTGTA_L006_R1_001.fastq.gz
Geo_Pool_M_CTTGTA_L006_R2_001.fastq.gz

Generated md5 checksums for each file:

$ for i in *; do md5 "$i" >> checksums.md5; done
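The same step can also be sketched in Python, which avoids depending on whether the system provides `md5` or `md5sum`. This is an illustrative alternative, not the command that was actually run; the function names, the `*.fastq.gz` pattern, and the output line layout are assumptions.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksums(directory, pattern="*.fastq.gz", out_name="checksums.md5"):
    """Write one '<digest>  <filename>' line per matching file and return the lines.

    Restricting the glob to the data files keeps the checksum file itself
    out of the list if the function is run a second time.
    """
    directory = Path(directory)
    lines = [f"{md5_of_file(p)}  {p.name}" for p in sorted(directory.glob(pattern))]
    (directory / out_name).write_text("\n".join(lines) + "\n")
    return lines
```

Calling `write_checksums(".")` from the directory holding the four fastq.gz files would produce a `checksums.md5` file with one line per file.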

# Sample Submission – Geoduck Gonad for RNA-seq

Prepared two pools of geoduck RNA for RNA-seq (Illumina HiSeq2500, 100bp, PE) with GENEWIZ, Inc.

I pooled a set of female and a set of male RNAs that had been selected by Steven based on the Bioanalyzer results from Friday.

The female RNA pool used 210ng of each sample, with the exception of sample #08, which used 630ng. This was because there weren’t any other female samples to use from this developmental time point, whereas the two other developmental time points each had three samples contributing to the pool. So, three times the quantity of the other individual samples was used to help equalize the time point contribution to the pooled sample. Additionally, 630ng used the entirety of sample #08.

The male RNA pool used 315ng of each sample. This number differs from the 210ng used for the female RNAs so that the two pools would end up with the same total quantity of RNA. However, now that I’ve typed this, this doesn’t matter since the libraries will be equalized before being run on the Illumina HiSeq2500. Oh well. As long as each sample in each pool contributed to the total amount of RNA, then it’s all good.
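The pooling arithmetic described above can be laid out as a short Python sketch. The per-sample quantities and the female sample counts come from this entry; the number of male samples is not stated here, so the value computed below is only what the numbers would imply if the two pools were matched in total quantity.

```python
# Quantities stated in this entry.
FEMALE_STANDARD_NG = 210          # per female sample, except sample #08
FEMALE_SAMPLE_08_NG = 630         # the only sample from its time point
MALE_PER_SAMPLE_NG = 315          # per male sample
SAMPLES_PER_FULL_TIME_POINT = 3   # samples in each of the other two time points
FULL_TIME_POINTS = 2

# Each fully sampled time point contributes 3 x 210 ng, which is why
# sample #08 was added at three times the standard quantity.
per_time_point_ng = SAMPLES_PER_FULL_TIME_POINT * FEMALE_STANDARD_NG

female_total_ng = FULL_TIME_POINTS * per_time_point_ng + FEMALE_SAMPLE_08_NG

# Assumption: the male pool matches the female total, as the entry suggests.
# The resulting sample count is inferred, not recorded in the notebook.
implied_male_samples = female_total_ng / MALE_PER_SAMPLE_NG

print(f"Per time point (female): {per_time_point_ng} ng")
print(f"Female pool total: {female_total_ng} ng")
print(f"Male samples implied by equal totals: {implied_male_samples:g}")
```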

The two pools were shipped O/N on dry ice.

• Geo_pool_M
• Geo_pool_F