Progress on generating bedgraphs from our Olympia oyster transcriptome continues.
Transcriptome assembly with Trinity completed 20180919.
Next up, align transcriptome to Olympia oyster genome.
Alignment and creation of BAM files were done using Bowtie2 on our HPC Mox node.
SBATCH script file:
Alignment was done using the following version of the Olympia oyster genome assembly:
Sorted BAM file:
Sorted & indexed BAM file (for IGV):
Will get the sorted BAM file converted to a bedgraph for use in IGV.
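For reference, here's a minimal sketch of what a bedgraph actually encodes: runs of identical per-base coverage collapsed into 0-based, half-open intervals. This is an illustration only, not the conversion used for this data. The function name `depth_to_bedgraph` and the toy depth list are hypothetical, and zero-coverage runs are skipped here by assumption. In practice a dedicated tool (e.g. bedtools genomecov) would compute coverage directly from the sorted BAM.

```python
from itertools import groupby


def depth_to_bedgraph(chrom, depths):
    """Collapse a list of per-base depths into bedGraph lines.

    Coordinates are 0-based and half-open (start inclusive, end exclusive).
    Runs with zero coverage are omitted (an assumption of this sketch).
    """
    lines = []
    pos = 0
    for depth, run in groupby(depths):
        length = sum(1 for _ in run)  # number of consecutive bases at this depth
        if depth > 0:
            lines.append(f"{chrom}\t{pos}\t{pos + length}\t{depth}")
        pos += length
    return lines


# Toy example with a hypothetical contig name and made-up depths.
for line in depth_to_bedgraph("contig1", [0, 0, 3, 3, 3, 1, 0, 2]):
    print(line)
```

Each output line is chromosome, start, end, and depth, separated by tabs, which is the layout IGV expects for a bedgraph track.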
Per Steven’s request, mapped our Olympia oyster 2bRAD data.
This was run on our Mox computing node.
Slurm script: 20180515_oly_2bRAD_bowtie2_mapping.sh
The script is far too long to paste here, due to the sheer number of input files. However, here’s a snippet to show the command and options that were used:
See the linked Slurm script above for the entire thing.
SAM file (104GB)
20180515_oly_2bRAD_bowtie2_mapping$ cat slurm-180337.out
729797535 reads; of these:
729797535 (100.00%) were unpaired; of these:
273989476 (37.54%) aligned 0 times
310581308 (42.56%) aligned exactly 1 time
145226751 (19.90%) aligned >1 times
62.46% overall alignment rate
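As a sanity check on the numbers above, the overall alignment rate is just the uniquely aligned plus multiply aligned reads divided by the total. Here's a small sketch that parses the unpaired summary and recomputes it. The function name `parse_unpaired_summary` is hypothetical, and the regular expressions assume the unpaired summary layout shown above (a paired-end summary has more lines and would need more patterns).

```python
import re

# Summary text copied from the Slurm output above.
SUMMARY = """\
729797535 reads; of these:
  729797535 (100.00%) were unpaired; of these:
    273989476 (37.54%) aligned 0 times
    310581308 (42.56%) aligned exactly 1 time
    145226751 (19.90%) aligned >1 times
62.46% overall alignment rate
"""


def parse_unpaired_summary(text):
    """Return read counts from a Bowtie2 unpaired alignment summary."""
    patterns = {
        "total": r"(\d+) reads; of these:",
        "unaligned": r"(\d+) \([\d.]+%\) aligned 0 times",
        "unique": r"(\d+) \([\d.]+%\) aligned exactly 1 time",
        "multi": r"(\d+) \([\d.]+%\) aligned >1 times",
    }
    return {key: int(re.search(pat, text).group(1)) for key, pat in patterns.items()}


counts = parse_unpaired_summary(SUMMARY)
aligned = counts["unique"] + counts["multi"]
rate = 100 * aligned / counts["total"]
print(f"{aligned} of {counts['total']} reads aligned ({rate:.2f}%)")
```

The recomputed rate rounds to the 62.46% reported by Bowtie2, and the three categories sum to the total read count.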
We have an upcoming meeting with Illumina to discuss how the geoduck genome project is coming along and to decide how we want to proceed.
So, we wanted to get a quick idea of how good our geoduck assemblies are by performing some quick alignments using Bowtie2.
Used the following assemblies as references:
The analysis is documented in a Jupyter Notebook.
Jupyter Notebook (GitHub):
NOTE: Due to the large amount of stdout from the first genome index command, the notebook does not render well on GitHub. I recommend downloading the notebook and opening it in a locally installed version of Jupyter.
Here’s a brief overview of the process:
- Generate Bowtie2 indexes for each of the genome assemblies.
- Map 1,000,000 reads from the following Illumina NovaSeq FastQ files:
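The two steps above could be scripted roughly as follows. This is a sketch and not the notebook's actual commands (see the notebook for those). The FastA and FastQ file names are hypothetical placeholders, the thread count is arbitrary, and the reads are treated as unpaired (`-U`) by assumption. Bowtie2's `-u` option stops after the given number of reads, which is one way to limit mapping to the first 1,000,000.

```python
def bowtie2_build_cmd(assembly_fasta, index_base):
    """Command to build a Bowtie2 index from a genome assembly FastA."""
    return ["bowtie2-build", assembly_fasta, index_base]


def bowtie2_map_cmd(index_base, fastq, sam_out, n_reads=1_000_000, threads=4):
    """Command to map only the first n_reads reads of a FastQ file."""
    return [
        "bowtie2",
        "-x", index_base,    # index basename produced by bowtie2-build
        "-U", fastq,         # unpaired reads (an assumption of this sketch)
        "-u", str(n_reads),  # stop after this many reads
        "-p", str(threads),  # number of alignment threads
        "-S", sam_out,       # SAM output file
    ]


# Assembly names are from this post; the file names are hypothetical.
for name in ["sn_ph_01", "sparse_03", "pga_02"]:
    build = bowtie2_build_cmd(f"{name}.fasta", name)
    mapping = bowtie2_map_cmd(name, "reads.fastq.gz", f"{name}.sam")
    print(" ".join(build))
    print(" ".join(mapping))
    # To actually run: subprocess.run(build, check=True), then the same for mapping.
```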
Bowtie2 Genome Indexes:
Bowtie2 sn_ph_01 alignment folder:
Bowtie2 sparse_03 alignment folder:
Bowtie2 pga_02 alignment folder:
MAPPING SUMMARY TABLE
All mapping data was pulled from the respective *.err file in the Bowtie2 alignment folders.
||Alignment Rate (%)
||Hi-C (Phase Genomics)
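Pulling the alignment rates out of the *.err files by hand gets tedious, so here's a sketch of how it could be automated. It assumes each alignment folder holds one *.err file containing Bowtie2's usual "overall alignment rate" line, and it keys the results by folder name. The function name `alignment_rates` and the top-level directory in the usage comment are hypothetical.

```python
import re
from pathlib import Path

# Matches the final line of a Bowtie2 summary, e.g. "62.46% overall alignment rate".
RATE_RE = re.compile(r"^([\d.]+)% overall alignment rate", re.MULTILINE)


def alignment_rates(root):
    """Map each alignment folder name to its overall alignment rate (%)."""
    rates = {}
    for err_file in sorted(Path(root).rglob("*.err")):
        match = RATE_RE.search(err_file.read_text())
        if match:  # skip *.err files that hold no Bowtie2 summary
            rates[err_file.parent.name] = float(match.group(1))
    return rates


# Usage (hypothetical directory holding the three alignment folders):
# for folder, rate in alignment_rates("bowtie2_alignments").items():
#     print(f"{folder}\t{rate}")
```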
Mapping efficiency is similar for all assemblies. After speaking with Steven, we’ve decided we’ll begin exploring genome annotation pipelines.