Tag Archives: ostrich

FASTQC – Oly BGI GBS Raw Illumina Data Demultiplexed

Last week, I ran the two raw FASTQ files through FastQC. As expected, FastQC detected “errors”. These errors are due to the presence of adapter sequences, barcodes, and the use of a restriction enzyme (ApeKI) in library preparation. In summary, it’s not surprising that FastQC was not please with the data because it’s expecting a “standard” library prep that’s already been trimmed and demultiplexed.

However, just for comparison, I ran the demultiplexed files through FastQC. The Jupyter notebook is linked (GitHub) and embedded below. I recommend viewing the Jupyter notebook on GitHub for easier viewing.


Pretty much the same, but with slight improvements due to removal of adapter and barcode sequences. The restriction site still leads to FastQC to report errors, which is expected.

Links to all of the FastQC output files are linked at the bottom of the notebook.

Jupyter notebook (GitHub): 20170306_docker_fastqc_demultiplexed_bgi_oly_gbs.ipynb


Data Management – Integrity Check of Final BGI Olympia Oyster & Geoduck Data

After completing the downloads of these files from BGI, I needed to verify that the downloaded copies matched the originals. Below is a Jupyter Notebook detailing how I verified file integrity via MD5 checksums. It also highlights the importance of doing this check when working with large sequencing files (or, just large files in general), as a few of them had mis-matching MD5 checksums!

Although the notebook is embedded below, it might be easier viewing via the notebook link (hosted on GitHub).

At the end of the day, I had to re-download some files, but all the MD5 checksums match and these data are ready for analysis:

Final Ostrea lurida genome files

Final Panopea generosa genome files

Jupyter Notebook: 20161214_docker_BGI_data_integrity_check.ipynb


Computer Management – Additional Configurations for Reformatted Xserves

Sean got the remaining Xserves configured to run independently from the master node of the cluster they belonged to and installed OS X 10.11 (El Capitan).

The new computer names are Ostrich (formerly node004) and Emu (formerly node002).


He enabled remote screen sharing and remote access for them.

Sean also installed a working hard drive on Roadrunner and got that back up and running.

I went through this morning and configured the computers with some other changes (some for my user account, others for the entire computer):

  • Renamed computers to reflect just the corresponding bird name (hostnames had been labeled as “bird name’s Xserve”)

  • Created srlab user accounts

  • Changed srlab user accounts to Standard instead of Administrative

  • Created steven user account

  • Turned on Firewalls

  • Granted remote login access to all users (instead of just Administrators)

  • Installed Docker Toolbox

  • Changed power settings to start automatically after power failure

  • Added computer name to login screen via Terminal:

sudo defaults write /Library/Preferences/com.apple.loginwindow LoginwindowText "TEXT GOES HERE"
  • Changed computer HostName via Terminal so that Terminal displays computer name:
sudo scutil --set HostName "TEXT GOES HERE"
  • Installed Mac Homebrew (I don’t know if installation of Homebrew is “global” – i.e. installs for all users)

  • Used Mac Homebrew to install wget

  • Used Mac Homebrew to install tmux