Data Management – O.lurida 2bRAD Dec2015 Undetermined FASTQ files

Katherine Silliman astutely noticed that the FASTQ files I had moved to our high-throughput sequencing archive on our server, Owl, included only two FASTQ files labeled as “undetermined”. Based on the number of lanes we had sequenced, we should have had many more. It turns out that the undetermined FASTQ files in the different sub-folders of the Genewiz project data were not uniquely named. Thus, when I moved them over (via a bash script), each undetermined file overwrote the previous one of the same name, until we were left with only two FASTQ files labeled as undetermined.

So, I re-downloaded the entire project folder from the Genewiz servers, renamed the FASTQ files labeled as undetermined so that each had a unique name, and then copied them over to the archive on Owl.
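As a rough illustration of the renaming idea (this is not the code from the notebook), the sketch below walks a project folder, finds every file whose name contains "undetermined", and copies it to a destination folder under a name built from its sub-folder path. The function names, the glob pattern, and the underscore-joined naming scheme are all assumptions made for the example; the real Genewiz folder layout and file names may differ.

```python
import shutil
from pathlib import Path


def unique_name(fastq: Path, root: Path) -> str:
    """Flatten the path relative to root into one file name.

    Hypothetical scheme: lane1/Undetermined.fastq.gz -> lane1_Undetermined.fastq.gz
    """
    return "_".join(fastq.relative_to(root).parts)


def copy_undetermined(root: Path, dest: Path,
                      pattern: str = "*[Uu]ndetermined*.fastq.gz") -> list:
    """Copy every matching FASTQ under root into dest with a unique name."""
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for fastq in sorted(root.rglob(pattern)):
        target = dest / unique_name(fastq, root)
        # Refuse to overwrite: silent overwriting is what lost files before.
        if target.exists():
            raise FileExistsError(f"{target} already exists")
        shutil.copy2(fastq, target)
        copied.append(target)
    return copied
```

Raising an error when the target already exists is a deliberate choice here: a name collision then stops the copy instead of quietly replacing an earlier file.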

I also zipped up the raw data project from Genewiz and moved it to the same archive location and updated the checksum.md5 and files.
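For the checksum update, a minimal sketch of one way to append MD5 sums for newly added files to a checksum file might look like the following. It assumes the checksum file uses the common md5sum layout (hash, two spaces, file name); the helper names are hypothetical and the actual commands used are in the notebook.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def append_checksums(files, checksum_file: Path) -> None:
    """Append one 'hash  filename' line per file (assumed md5sum-style layout)."""
    with checksum_file.open("a") as out:
        for path in files:
            out.write(f"{md5_of(path)}  {path.name}\n")
```

Opening the checksum file in append mode leaves the existing entries untouched, so only the new files need to be hashed.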

Details can be found in the Jupyter (iPython) notebook below.

Jupyter Notebook file: 20160308_find_rename_2bRAD_undetermined_fastqs.ipynb

Notebook Viewer: 20160308_find_rename_2bRAD_undetermined_fastqs.ipynb


