# Data Aggregation – Black Abalone DNA for RLO, RLOv, & Prophage Portal qPCRs

I need to identify samples from the 1st and 2nd black abalone experiments in order to run qPCRs on them, using the primers mentioned in the title of this post. However, I only want to use samples that are RLO+. The existing qPCR data is all over the place (in multiple spreadsheets and split across multiple tabs within those spreadsheets) and is a serious pain in the neck to track down.

Here are the three different spreadsheets that I’ve found that have existing withering syndrome RLO (RLO) qPCR data:

I took the time to aggregate all of this data into a somewhat messy spreadsheet that contains the “raw” qPCR data from all of the black abalone qPCR data. I also calculated the mean copy number for all of the replicates:

Since that spreadsheet isn’t pleasant to look at, I’ve slimmed it down with just the mean qPCR data:

Additionally, I’ve also added the mean RLO qPCR data to Lisa’s Black Abalone Expt 1 spreadsheet. This spreadsheet is currently the most comprehensive aggregation of black abalone data, since it also contains histology scoring of various tissues, as well as qPCR data for a slew of black abalone genes:

Google Sheet: Black Abalone: Expt 1 – WS & Phage

This will greatly simplify locating qPCR data for any black abalone samples in the future!

I will select samples from this list for qPCR.

For posterity, here’s how I slimmed down the messy “raw” spreadsheet using SQLite.

Created a sqlite database using GitBash for Windows:
Change to directory where file is located:

$cd Downloads Start sqlite: $sqlite3

Tell sqlite that the field separator will be commas (i.e. CSV file):

sqlite>.separator ","

Import the CSV file and provide a name for the resulting database:

sqlite>.import 20160420_BlackAb_RLO_qPCR_compilation_raw.csv BlackAbRLOqPCR

Set output display mode to column for easier reading:

sqlite>.mode column

Set output display to include column headers:

sqlite>.headers on

Change output from screen to designated .csv file:

sqlite> .output 20160420_BlackAb_RLO_qPCR_compilation_means.csv 

Change mode to comma separated:

sqlite> .mode csv

SELECT statement:

sqlite> SELECT Sample, qPCR_date, mean_copies, data_source FROM BlackAbRLOqPCR WHERE mean_copies>=0;

The SELECT statement above selects the columns (Sample, qPCR_date, etc..) from the database (i.e. table) we created earlier and only picks the rows where the value in the “mean_copies” column is greater than or equal to 0. This ensures that only the rows with values are selected and gets rid off all the “mess” that we don’t want in the final spreadsheet.

Change output back to screen (so we don’t continue to write to the csv file we made a few steps ago):

sqlite> .output stdout