Data Management – Script for Compiling qPCR Data

Over the last few weeks, I’ve wrestled with tracking down data (primarily qPCR data) from the litany of projects the Friedman Lab has had over the last decade or so. I’ve also noticed that it’s increasingly difficult for me to track down my own data from individual projects where data is not generated continuously over time, but in chunks. Tracking down 6 different qPCR runs that were conducted over the course of a year is tedious.

In order to save myself, and others who might need/want to review my data, a lot of time in the future, I decided to write a script that will allow me to compile all of my qPCR data into a single, massive CSV file. Accessing this CSV file via spreadsheet (Excel/Calc/Sheets) or database (SQL) means I’ll always be able to quickly search for all the related data I need, since it will reside in a single file!

The script is written in bash, called qpcr_aggregation.sh, and is currently hosted here: https://github.com/kubu4/Scripts/blob/master/bash/qpcr_aggregation.sh

Basically, the script does the following:

  • Replace the spaces in the BioRad filenames with underscores (used to simplify parsing of the filename for use in downstream steps in the script)
  • Replace the header row to accommodate two new fields: qPCR_filename and qPCR_date
  • Add qPCR_filename and qPCR_date to each file.
  • Concatenate all the files into a single “master” CSV file.

Now, the easy part – exporting the data from hundreds of qPCR files…

Sample ID – Black Abalone DNA for RLOv qPCRs

Carolyn & Stan Langevin have agreed that the DNA helicase qPCR should be tested on 10 black abalone DNA extractions that fall into multiple levels of the Friedman Lab’s withering syndrome histology scoring.

Downloaded the (Google Sheet) Black Abalone: Expt 1 – WS & Phage as a CSV file. After downloading, I renamed the file (Black_Abalone.csv) to facilitate easier usage in the following steps.

Created a sqlite database using GitBash for Windows:
Change to directory where file is located:

$cd Downloads

Start sqlite:

$sqlite3

Tell sqlite that the field separator will be commas (i.e. CSV file):

sqlite>.separator ","

Import the CSV file and provide a name for the resulting database:

sqlite>.import Black_Abalone.csv BlackAbs

Set output display mode to column for easier reading:

sqlite>.mode column

Set output display to include column headers:

sqlite>.headers on

 

To select all the samples that have scores of 0 in both PE and DG RLO fields (screen cap does not show entire output list):

 

To select all the samples that have scores of 1 in both PE and DG RLO fields:

 

To select all the samples that have scores of 2 in both PE and DG RLO fields:

 

Here are the full set of results in a table

RLO/RLOv 0 RLO/RLOv 1 RLO/RLOv 2
06:5-03 06:5-35A 06:5-31
06:5-04 06:50-08 06:5-32B
06:5-08 06:50-10 06:6-46
06:5-09 06:6-32 06:6-49
06:5-10 06:6-39 08:3-05
06:5-11 06:6-42 08:3-07
06:5-14 06:6-44 08:3-15
06:5-16 06:6-52 08:3-16
06:5-18 06:6-54
06:5-20 07:12-18
06:5-21 08:3-08
06:5-22 08:3-10
06:5-24
06:5-30
06:50-04
06:50-05
06:50-11
06:50-12
06:50-13
06:50-15
06:50-16
06:6-01
06:6-02
06:6-03
06:6-05
06:6-08
06:6-11
06:6-12
06:6-13
06:6-15
06:6-16
06:6-17
06:6-18
06:6-20
06:6-21
06:6-22
06:6-23
06:6-24
06:6-25
06:6-26
06:6-27
06:6-28
07:12-01
07:12-02
07:12-03
07:12-04
07:12-05
07:12-06
07:12-07
07:12-09
07:12-10
07:12-13
07:12-19
08:3-01
08:3-02
08:3-03
08:3-04
08:3-13
08:4-01
08:4-02
08:4-03
08:4-04
08:4-05
08:4-06
08:4-07
08:4-08
08:4-09
08:4-10
08:4-11
08:4-12
08:4-13
08:4-14
08:4-15
08:4-16
08:4-17
08:4-18
08:4-19
08:4-20
08:4-21
08:4-22
08:4-23
08:4-24
08:4-25
08:5-06

Will select just 10 of those in the RLO/RLOv 0 column for use in qPCR.

I was able to track down the boxes where are these DNAs were stored (see images below).

Boxes that were not labeled with accession numbers of the samples contained therein are now labeled.

Boxes that contained samples that belonged in other boxers were transferred to the appropriate box.

All boxes were located, and returned, to the big -20C in 240 on Lisa’s shelf.