Tag Archives: BLAST

BLAST – C.gigas Larvae OA Illumina Data Against GenBank nt DB

In an attempt to figure out what’s going on with the Illumina data we recently received for these samples, I BLASTed the 400ppm data set that had previously been de-novo assembled by Steven: EmmaBS400.fa.

Jupyter (IPython) Notebook : 20150501_Cgigas_larvae_OA_BLASTn_nt.ipynb

Notebook Viewer : 20150501_Cgigas_larvae_OA_BLASTn_nt


BLASTn Output File: 20150501_nt_blastn.tab

BLAST e-vals <= 0.001: 20150501_Cgigas_larvae_OA_blastn_evals_0.001.txt

Unique BLAST Species: 20150501_Cgigas_larvae_OA_unique_blastn_evals.txt


Firstly, since this library was bisulfite converted, we know that matching won’t be as robust as we’d normally see.

However, the BLAST matches for this are terrible.

Only 0.65% of the BLAST matches (e-value <= 0.001) are to Crassostrea gigas. Yep, you read that correctly: 0.65%.

That’s nearly 40-fold fewer matches than the top species: Dictyostelium discoideum (a slime mold).

It’s 30-fold fewer than the next species: Danio rerio (zebrafish).

Then it’s followed up by human and mouse.

I think I will need to contact the Univ. of Oregon sequencing facility to see what their thoughts on this data are, because it’s not even remotely close to what we should be seeing, even with the bisulfite conversion…
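For anyone wanting to reproduce this kind of species breakdown, here’s a minimal Python sketch of the tally. It is not the code from the notebook. It assumes a tab-delimited BLAST table with the e-value in column 11 (the standard tabular layout) and a species name appended as column 13; that column layout, the function names, and the toy rows in the usage lines are all assumptions for illustration.

```python
from collections import Counter


def species_fractions(lines, evalue_col=10, species_col=12, max_evalue=1e-3):
    """Return {species: fraction of hits} for hits with e-value <= max_evalue.

    Column indexes are 0-based and are assumptions about the file layout.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) <= max(evalue_col, species_col):
            continue  # skip rows that are too short to hold both columns
        try:
            evalue = float(fields[evalue_col])
        except ValueError:
            continue  # skip rows with an unparseable e-value
        if evalue <= max_evalue:
            counts[fields[species_col]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {species: n / total for species, n in counts.items()}


def fold_below_top(fractions, species):
    """How many times smaller a species' share is than the top species' share."""
    return max(fractions.values()) / fractions[species]


def _toy_row(evalue, species):
    """Build one hypothetical 13-column row (made-up values, not real hits)."""
    std = ["q1", "s1", "90.0", "100", "5", "0", "1", "100", "1", "100", evalue, "150"]
    return "\t".join(std + [species])


toy_rows = (
    [_toy_row("1e-20", "Dictyostelium discoideum")] * 4
    + [_toy_row("1e-5", "Crassostrea gigas")]
    + [_toy_row("0.5", "Homo sapiens")]  # fails the e-value filter
)
toy_fractions = species_fractions(toy_rows)
```

With the real file, the rows would come from something like open("20150501_nt_blastn.tab"), provided that file actually has the assumed columns. As a sanity check on the numbers above: a species that is nearly 40-fold more common than 0.65% would account for roughly a quarter of the filtered matches.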

BLASTN – C.gigas OA Larvae to C.gigas Ensembl 1.24 BLAST DB

In an attempt to figure out what’s going on with the Illumina data we recently received for these samples, I BLASTed the 400ppm data set that had previously been de-novo assembled by Steven: EmmaBS400.fa.

I also created a nucleotide BLAST database (DB) from the Crassostrea_gigas.GCA_000297895.1.24.fa file.
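As a rough illustration of those two steps (build the DB, then BLAST the contigs against it), here’s a sketch that assembles the BLAST+ command lines in Python. The exact options used are in the notebook and are not reproduced here; the DB name, output file name, e-value cutoff, and helper function names below are all assumptions.

```python
import shutil
import subprocess


def makeblastdb_cmd(fasta, db_name):
    """Command to build a nucleotide BLAST DB from a FASTA file."""
    return ["makeblastdb", "-in", fasta, "-dbtype", "nucl", "-out", db_name]


def blastn_cmd(query, db_name, out_path, max_evalue=0.001):
    """Command to run blastn with tabular output (-outfmt 6)."""
    return [
        "blastn",
        "-query", query,
        "-db", db_name,
        "-outfmt", "6",
        "-evalue", str(max_evalue),
        "-out", out_path,
    ]


def run(cmd):
    """Run a command, but only if the program is actually on the PATH."""
    if shutil.which(cmd[0]) is None:
        raise RuntimeError(f"{cmd[0]} not found; is BLAST+ installed?")
    subprocess.run(cmd, check=True)


# Hypothetical names for the DB and the output file (assumptions):
db_step = makeblastdb_cmd("Crassostrea_gigas.GCA_000297895.1.24.fa", "Cgigas_ensembl_1.24")
blast_step = blastn_cmd("EmmaBS400.fa", "Cgigas_ensembl_1.24", "EmmaBS400_blastn.tab")
```

Nothing is executed until run(db_step) and run(blast_step) are called, so the command lists can be printed and checked first.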

Jupyter (IPython) Notebook: 20150429_Gigas_larvae_OA_BLASTn.ipynb

Notebook Viewer: 20150429_Gigas_larvae_OA_BLASTn



The results are not great.

All query contigs successfully BLAST to sequences in the C.gigas Ensembl BLAST DB. However, only 33 of the sequences (out of ~37,000) have an e-value of 0.0. The next best e-value for any matches is 0.001. For the uninitiated, that value is not very good, especially when you’re BLASTing against the same exact species DB.
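To make the “how many contigs have an e-value of 0.0” count concrete, here’s a minimal sketch that keeps the best (lowest) e-value per query contig from a tabular BLAST file and counts the perfect ones. It assumes the standard 12-column tabular layout (query ID in column 1, e-value in column 11); the function names and the toy rows are made up for illustration and are not the notebook’s code.

```python
def best_evalue_per_query(lines, query_col=0, evalue_col=10):
    """Return {query ID: lowest e-value seen for that query}.

    Column indexes are 0-based and assume the standard tabular layout.
    """
    best = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(query_col, evalue_col):
            continue  # skip blank or truncated rows
        try:
            evalue = float(fields[evalue_col])
        except ValueError:
            continue  # skip rows with an unparseable e-value
        query = fields[query_col]
        if query not in best or evalue < best[query]:
            best[query] = evalue
    return best


def count_zero_evalues(best):
    """Count queries whose best hit has an e-value of exactly 0.0."""
    return sum(1 for evalue in best.values() if evalue == 0.0)


# Hypothetical rows (made-up values, not real hits):
toy_rows = [
    "\t".join(["contig1", "s1", "100", "500", "0", "0", "1", "500", "1", "500", "0.0", "900"]),
    "\t".join(["contig1", "s2", "85", "60", "9", "0", "1", "60", "1", "60", "1e-5", "60"]),
    "\t".join(["contig2", "s3", "80", "40", "8", "0", "1", "40", "1", "40", "0.001", "40"]),
]
toy_best = best_evalue_per_query(toy_rows)
```

For scale, 33 contigs out of ~37,000 works out to roughly 0.09% of the assembly.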

Next, I will BLASTn the C.gigas contigs against the entire GenBank nt (all nucleotides) DB to see what the taxonomy breakdown of these sequences looks like.

Sequencing – PGS Hi 4 (PGS2/COX2)

Sent plasmid prep to ASU (5uL of plasmid + 1uL of 10uM M13F/R). SJW01 = M13F, SJW02 = M13R.


Sequencing looks great! Definitely have a portion of the second isoform of COX/PGS!! Here’s the result of the consensus BLASTed in GenBank>Nucleotide (others)>blastn:

Top hit in the db is COX1/PGS1, and, clearly, there are differences between the two sequences confirming that we have the second isoform (COX2/PGS2). Will design more RACE primers in hopes of obtaining the full-length cDNA sequence.

qPCR – Emma’s New 3KDSqPCR Primers

Due to previous contamination issues with Emma’s primers, Emma asked me to order new primers, reconstitute them and run a qPCR for her to see if we could eliminate her contamination issues with this primer set. cDNA template was supplied by Emma (from 2/2/11) and was from a C.gigas 3hr Vibrio vulnificus challenge. Samples were run in duplicate, as requested. Master mix calcs are here. Plate layout, cycling params, etc. can be found in the qPCR Report (see Results). Primer set used was:

Cg_3KDSqPCR_F/R (SR IDs: 1186, 1187)


qPCR Data File (BioRad CFX96)

qPCR Report (PDF)

The negative controls (NTC) are negative, meaning they do not cross the threshold set by the BioRad software. However, there is clearly amplification in the NTCs; it just comes up late enough that the curves do not cross the threshold and, thus, do not generate a Cq value. Additionally, the melt curve reveals peaks in the NTCs that are at the same melting temperature as the product produced in the cDNA qPCR reactions. This would potentially imply some sort of contamination, as Emma has experienced.
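To illustrate why a late-rising curve gets no Cq, here’s a simplified sketch of a threshold-crossing calculation. This is NOT how the BioRad software determines Cq; it’s just a linear interpolation between the two cycles that straddle the threshold, and the function name and the toy fluorescence values are assumptions.

```python
def cq_from_curve(fluorescence, threshold):
    """Return the fractional cycle where the curve first crosses the threshold.

    fluorescence[0] is the reading for cycle 1. Returns None when the curve
    never reaches the threshold (i.e. the well is called negative).
    """
    if not fluorescence:
        return None
    if fluorescence[0] >= threshold:
        return 1.0  # already at or above threshold in the first cycle
    for i in range(1, len(fluorescence)):
        prev, curr = fluorescence[i - 1], fluorescence[i]
        if prev < threshold <= curr:
            # prev is cycle i, curr is cycle i + 1; interpolate between them
            return i + (threshold - prev) / (curr - prev)
    return None


# Hypothetical curves (made-up numbers):
sample_curve = [0, 0, 10, 20]  # rises early and crosses a threshold of 15
ntc_curve = [0, 0, 1, 2]       # rises late and never reaches 15
sample_cq = cq_from_curve(sample_curve, threshold=15)
ntc_cq = cq_from_curve(ntc_curve, threshold=15)
```

In the toy example the NTC curve does rise, but it stays under the threshold for the whole run, so it has no Cq even though amplification is visible.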

Honestly, I do not think contamination is the problem. I believe that the “contamination” being seen in the NTCs is actually primer dimer. Increasing the annealing temperature (I’m not sure if Emma tried this during her troubleshooting) could potentially alleviate this issue. However, I’m not sure she’s amplifying the target that she wants to. Based on my analysis, I think she needs to re-design primers for her 3KDS target. My analysis, and how I came to this conclusion, is below.

It seems unlikely that two independent people (and multiple primer stock replacements!) would have contamination, so I looked into things a bit further.

I BLASTed the primer sets (NCBI, blastn, est_others db, C.gigas only) and the BLAST results reveal the primers matching with a C.gigas EST sequence that would produce a band of only 63bp. Here’s a screen capture of the BLAST results:

This result does NOT agree with what is entered in our Primer Database. As entered in our sheet, the expected PCR product would be ~102bp. However, if the BLAST results are correct and the product is only 63bp, it would be difficult to distinguish between primer dimers and PCR product in a melt curve analysis.
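The expected product size can be double-checked directly from a template sequence. Here’s a minimal sketch that finds the forward primer and the reverse complement of the reverse primer in a template and reports the amplicon length. It only handles exact, non-overlapping matches, and the primer and template sequences in the usage lines are made up; they are NOT the Cg_3KDSqPCR primers or the C.gigas EST.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse):
    """Length of the product from the forward primer through the reverse primer.

    Looks for an exact match of the forward primer, then an exact match of the
    reverse primer's reverse complement downstream of it. Returns None if
    either primer site is missing.
    """
    template = template.upper()
    forward = forward.upper()
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse.upper())
    site_start = template.find(reverse_site, start + len(forward))
    if site_start == -1:
        return None
    return site_start + len(reverse_site) - start


# Hypothetical primers and template (made-up sequences):
toy_forward = "ATGCCGTA"
toy_reverse = "GGATCCTT"
toy_template = "TTTT" + toy_forward + "C" * 10 + reverse_complement(toy_reverse) + "GGGG"
toy_size = amplicon_length(toy_template, toy_forward, toy_reverse)
```

Running the real primers against the EST sequence from the BLAST hit in the same way would show whether the product is 63bp or ~102bp.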

Emma previously ran a conventional PCR with these primers and ran the products on a gel (see below). At the time, the bands in the negative controls were thought to be contamination, but in retrospect (knowing the results of the qPCRs and the BLAST results) it seems likely that what she was seeing in the negative controls was actually primer dimer, which was the same size as her PCR product (which she thought should be larger). Additionally, the gel was difficult to interpret because no ladder was run. A ladder might have revealed that her PCR product was half the size that she was expecting: