Goals – October 2017

I guess one of my primary goals is to make sure I actually write my monthly goals each month.

Is it bad that I’m writing goals about writing goals? Or, is it meta?

Regardless, I’m actually going to put a lot down on paper, as much has happened since my last set of goals was posted.

We had a “hack week” back in August. For us, “hacking” means organizing and updating lab documentation.

We took on the following tasks:

  • “Decommission” the LabDocs GitHub repo. This had been the canonical location for all of our online lab resources and had served as the starting point. However, it was not organized particularly well for what we were using it for, and was out of date in a number of places. Additionally, this is a personal GitHub repository of Steven’s and it didn’t make logical sense to use it as a dedicated lab repo.

As part of the decommission, we migrated all of the open issues (we used this great little web-based tool: Issue Mover for GitHub) to our organization’s GitHub repository: Roberts Lab @ SAFS.

  • A massive reorganization, updating, and cleansing of files. We now have separate repositories for our onboarding practices (including an official Lab Code of Conduct), laboratory resources, and code (https://github.com/RobertsLab/code). Wiki pages have been created for each of these repos, and readme files have been created/updated to improve instructions on how to locate needed information. Overall, we feel this makes it simpler for lab members to find the information they need.

Here’s a graphic of the amount of love that went into the old LabDocs repo since its inception (337,000 additions to files!):

Anyway, on to the current stuff.

Primary goal will be to perform a comparison of Olympia oyster genome assemblies.

Next will be to continue generating a joint assembly of Illumina and PacBio sequencing data for the Olympia oyster genome. This will take over from where Sean Bennett left off.

After that, writing my November 2017 goals…

DNA Isolation – Ava Withering Syndrome Transmission Study Tissues

Isolated DNA from 144 red abalone digestive gland tissue samples.

Tissue was weighed, minced with a razor blade, and transferred to a 2mL snap cap tube containing 1mL of InhibitEX Buffer.

DNA was extracted using the QIAamp Fast DNA Stool Mini Kit (Qiagen) following the manufacturer’s protocol with the following options:

  • Minced tissue was incubated at 70°C O/N
  • Followed “human DNA analysis” protocol (to maximize sample recovery)
  • Eluted DNA with 100μL Buffer ATE

Sample information is in this spreadsheet (Google Sheet): ava_abalone_master_extraction_list

DNA Isolation – Ava Withering Syndrome Transmission Study Tissues

Isolated DNA from 58 red abalone digestive gland tissue samples.

Tissue was weighed, minced with a razor blade, and transferred to a 2mL snap cap tube containing 1mL of InhibitEX Buffer.

DNA was extracted using the QIAamp Fast DNA Stool Mini Kit (Qiagen) following the manufacturer’s protocol with the following options:

  • Minced tissue was incubated at 70°C O/N
  • Followed “human DNA analysis” protocol (to maximize sample recovery)
  • Eluted DNA with 100μL Buffer ATE

Sample information is in this spreadsheet (Google Sheet): ava_abalone_master_extraction_list

DNA Isolation – Ava Withering Syndrome Transmission Study Tissues

Isolated DNA from 26 red abalone digestive gland tissue samples.

Tissue was weighed, minced with a razor blade, and transferred to a 2mL snap cap tube containing 1mL of InhibitEX Buffer.

DNA was extracted using the QIAamp Fast DNA Stool Mini Kit (Qiagen) following the manufacturer’s protocol with the following options:

  • Minced tissue was incubated at 70°C O/N
  • Followed “human DNA analysis” protocol (to maximize sample recovery)
  • Eluted DNA with 100μL Buffer ATE

Sample information is in this spreadsheet (Google Sheet): ava_abalone_master_extraction_list

Genome Assembly – Olympia oyster PacBio minimap/miniasm/racon

In this GitHub Issue, Steven had suggested I try out the minimap/miniasm/racon pipeline for assembling our Olympia oyster PacBio data.

I followed the pipeline described by this paper: http://matzlab.weebly.com/uploads/7/6/2/2/76229469/racon.pdf.

This notebook entry just contains the racon execution. This produced this assembly:

http://owl.fish.washington.edu/Athaliana/201709_oly_pacbio_assembly_minimap_asm_racon/20170918_oly_pacbio_racon1_consensus.fasta

All intermediate files generated from this pipeline are here:

http://owl.fish.washington.edu/Athaliana/201709_oly_pacbio_assembly_minimap_asm_racon/

I’ll put together a TL;DR post that provides an overview of the pipeline and an assessment of the final assembly.
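One contiguity metric commonly used in an assessment like this is N50. Here is a minimal Python sketch, assuming the assembly is a plain FASTA file. The function names are illustrative and are not taken from the notebook, and the toy sequences are made up:

```python
def contig_lengths(fasta_lines):
    """Return the length of each sequence in an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length of the contig at which the running total, taken from the
    longest contig downwards, first reaches half of the assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


# Toy example (made-up sequences, not the oyster assembly).
toy = [">a", "ACGTACGTAC", ">b", "ACGTAC", ">c", "ACGT"]
toy_lengths = contig_lengths(toy)
print(toy_lengths, n50(toy_lengths))
```

For a real file, the lines could come from `open("assembly.fasta")`, where the file name is a placeholder.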

Previously ran minimap and then miniasm.

Jupyter Notebook (GitHub): 20170918_docker_pacbio_oly_racon0.5.0.ipynb

Genome Assembly – Olympia oyster PacBio minimap/miniasm/racon

In this GitHub Issue, Steven had suggested I try out the minimap/miniasm/racon pipeline for assembling our Olympia oyster PacBio data.

I followed the pipeline described by this paper: http://matzlab.weebly.com/uploads/7/6/2/2/76229469/racon.pdf.

Previously, ran the first part of the pipeline: minimap

This notebook entry just contains the miniasm execution. Will follow with racon.
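For context on the hand-off between these two steps: miniasm writes its assembly as a GFA graph, while racon expects the sequences to be polished in FASTA/FASTQ format, so the segment records have to be pulled out of the GFA first. The sketch below shows one way to do that in Python. It is not necessarily how the notebook does it, and the toy GFA records are made up:

```python
def gfa_to_fasta(gfa_lines):
    """Yield FASTA lines for every segment (S) record in a GFA file."""
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        # GFA segment records look like: S <name> <sequence> [optional tags]
        if len(fields) >= 3 and fields[0] == "S":
            yield ">" + fields[1]
            yield fields[2]


# Toy GFA content (made up): header, two segments, one link.
toy_gfa = [
    "H\tVN:Z:1.0\n",
    "S\tutg000001l\tACGTACGT\tLN:i:8\n",
    "L\tutg000001l\t+\tutg000002l\t-\t3M\n",
    "S\tutg000002l\tTTGACA\tLN:i:6\n",
]
fasta_lines = list(gfa_to_fasta(toy_gfa))
print("\n".join(fasta_lines))
```

Only the S lines carry sequence; header (H) and link (L) lines are skipped.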

Jupyter Notebook (GitHub): 20170918_docker_pacbio_oly_miniasm0.2.ipynb

Samples Received – Pinto Abalone DNased RNA from UC-Irvine

Received DNased pinto abalone RNA from Alyssa Braciszewski at UC-Irvine. These are a subset of the samples I sent her back in February.

Here’s the samples list provided by Alyssa (Google Sheet): shipment to UW of RNA samples.xlsx

The samples need to be confirmed to be free of residual RLO gDNA via qPCR. If they are clean, I will proceed to making cDNA using the provided reagents.

Reagents were stored in door of -20C in FSH 240.

Samples were stored in the provided box in the “new” -80C in FSH 235.

Genome Assembly – Olympia oyster PacBio minimap/miniasm/racon

In this GitHub Issue, Steven had suggested I try out the minimap/miniasm/racon pipeline for assembling our Olympia oyster PacBio data.

I followed the pipeline described by this paper: http://matzlab.weebly.com/uploads/7/6/2/2/76229469/racon.pdf.

This notebook entry just contains the initial minimap execution. Followed up with miniasm and then racon.

Jupyter Notebook (GitHub): 20170907_docker_pacbio_oly_minimap2.ipynb

Data Aggregation – Ava’s Complete Sample List

I received Ava’s master sheet of all the samples she collected for this project. I needed to aggregate a full list of the samples I’ve previously extracted DNA from, so that I can compare to her master sample list and generate a list of the remaining samples that I need to extract DNA from.

Here are the files I needed to work with (Google Sheets):

The files required multiple formatting steps in order to produce accession numbers that were formatted in the same fashion across all three sheets. This was needed in order to be able to successfully merge all of the sheets into a single sheet containing all of the data, which will make it easy to sort, and generate a list of samples that need to be extracted.
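To illustrate the kind of formatting step involved, here is a minimal Python sketch that rewrites accession numbers into one canonical form. The actual accession formats are not shown in this post, so the pattern used here (three numbers, written as `YY:case-sample`) and the function name are purely hypothetical stand-ins:

```python
import re


def normalize_accession(raw):
    """Normalize a hypothetical accession like ' 8:13-5 ' to '08:13-05'.

    The three-number layout is an assumption made for illustration only.
    """
    parts = re.findall(r"\d+", raw)
    if len(parts) != 3:
        raise ValueError(f"unexpected accession format: {raw!r}")
    year, case, sample = (int(p) for p in parts)
    # Zero-pad year and sample so the same ID always produces the same string.
    return f"{year:02d}:{case}-{sample:02d}"


# Made-up variants of the same ID, as they might appear in different sheets.
examples = [" 8:13-5 ", "08:13-05", "08-13-5"]
normalized = [normalize_accession(x) for x in examples]
print(normalized)
```

Once every sheet passes through the same function, identical samples end up with identical strings, which is what a merge on the accession number needs.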

Text file manipulations were performed in a Jupyter notebook, which is linked below. All files were downloaded from Google Sheets as tab-delimited files prior to working on them.

Jupyter Notebook file: 20170831_ava_ab_samples_aggregation.ipynb
Jupyter Notebook on NBviewer: 20170831_ava_ab_samples_aggregation.ipynb

Now that we have the tables formatted, we can use the accession number as a common field by which to combine the two tables. This will allow easy sorting and identification of the remaining samples that I need to extract. I’ll do this by using SQLite3.

Use SQLite3 (in Linux Ubuntu):

Change to directory containing files:

cd ~/Dropbox/"Sam Friedman Lab"/tmp

Start SQLite3:

sqlite3

Set field separator as tab-delimited:

.separator "\t"

Create tables by importing the files and providing a name for each corresponding table:

.import ava_master_ab_list_formatted.tsv master_list
.import Ava_WS_Transmission_DNA_Extractions_all.tsv extracted_list

Set output display mode to tabs:

.mode tabs

Set output display to include column headers:

.headers on

Set the output to write to a file instead of the screen:

.output 20170905_master_extraction_list.tsv

SELECT statement to combine the two tables:

SELECT * FROM (SELECT * FROM master_list UNION ALL SELECT * FROM extracted_list) s GROUP BY accession_number ORDER BY accession_number;

The SELECT statement above works in the following fashion:

The sub-query (contained in the parentheses) stacks all of the rows from both tables into an intermediate table (that’s the s after the sub-query). The initial SELECT * FROM then selects all of the columns in that intermediate table. The GROUP BY clause collapses any rows with identical values in the accession_number column into a single row, and the ORDER BY clause sorts the result by accession_number.
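The same query can be tried on toy data with Python's built-in sqlite3 module. This is only a sketch: the table names match the ones above, but the `extracted` column and the accession values are made up for illustration. One thing worth knowing is that when SQLite groups rows under SELECT *, the non-grouped columns of each output row come from an arbitrarily chosen row in that group, so only the accession_number column is checked here:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical two-column tables standing in for the real sheets.
    CREATE TABLE master_list (accession_number TEXT, extracted TEXT);
    CREATE TABLE extracted_list (accession_number TEXT, extracted TEXT);
    INSERT INTO master_list VALUES ('A1', ''), ('A2', ''), ('A3', '');
    INSERT INTO extracted_list VALUES ('A2', 'yes');
""")

query = """
    SELECT * FROM (
        SELECT * FROM master_list
        UNION ALL
        SELECT * FROM extracted_list
    ) s
    GROUP BY accession_number
    ORDER BY accession_number;
"""
rows = conn.execute(query).fetchall()

# 'A2' appears in both tables but is collapsed to one row by GROUP BY.
accessions = [row[0] for row in rows]
print(accessions)
```

UNION ALL requires both tables to have the same number of columns, which is why the sheets need matching layouts before import.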

After that’s finished, we want to reset the output to the screen so we don’t overwrite our file:

.output stdout

The output file is here (Google Sheet): ava_abalone_master_extraction_list
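If only the list of samples still awaiting extraction is wanted, an anti-join is an alternative to sorting the merged sheet by hand. The sketch below uses Python's built-in sqlite3 module with made-up accession values. It is not what was run above, and it assumes both tables have an accession_number column:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Toy stand-ins for the imported tables (values are made up).
    CREATE TABLE master_list (accession_number TEXT);
    CREATE TABLE extracted_list (accession_number TEXT);
    INSERT INTO master_list VALUES ('A1'), ('A2'), ('A3');
    INSERT INTO extracted_list VALUES ('A2');
""")

# Keep master rows that have no matching row in extracted_list.
remaining_query = """
    SELECT m.accession_number
    FROM master_list AS m
    LEFT JOIN extracted_list AS e
        ON m.accession_number = e.accession_number
    WHERE e.accession_number IS NULL
    ORDER BY m.accession_number;
"""
remaining = [row[0] for row in conn.execute(remaining_query)]
print(remaining)
```

The same SELECT statement could be pasted into the sqlite3 shell session described above, since it only relies on the two imported table names.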