Tag Archives: emu

Software Installation – RepeatMasker v4.0.7 on Emu/Roadrunner

Steven asked that I re-run some Olympia oyster transposable elements analysis using RepeatMasker and a newer version of our Olympia oyster genome assembly.

Installed the software on both of the Apple Xserves (Emu and Roadrunner) running Ubuntu 16.04.

Followed the instructions outlined here:

Starting with the prerequisites:

1. Download and install RMBlast

- NCBI Blast 2.6.0 source

- isb 2.6.0 patch

Unfortunately, the make command continually failed:

cd /home/shared/ncbi-blast-2.6.0+-src/c++

While trying to troubleshoot this issue, continued with the other prerequisites:

2. Downloaded Tandem Repeat Finder v.4.09

- Saved file (```trf409.linux64```) to ```/home/shared/bin```. NOTE: ```/home/shared/bin``` is part of the system PATH. See the ```/etc/environment``` file.

- Changed permissions to be executable: ```sudo chmod 775 trf409.linux64```

3. Downloaded RepBase RepeatMasker Edition 20170127 (NOTE: This requires registration in order to obtain a username/password to download the file).

Installed RepeatMasker:

4. Downloaded RepeatMasker 4.0.7

- Saved to ```/home/shared/RepeatMasker-4.0.7```

5. Installed RepBase RepeatMasker Edition 20170127 in /home/shared/RepeatMasker-4.0.7/Libraries

Currently re-building RMBlast and it takes forever… Will report back when I have it running.


Software Install – MSMTP For Email Notices of Bash Job Completion on Emu (Ubuntu)

After I finally resolved the installation of PB Jelly on Emu (running Ubuntu 16.04), I’ve had a PB Jelly assembly running for the past two weeks! I’ve gotten tired of checking on its status (i.e. is it still running?) every day, so I dove in and figured out how to set up Emu to email me when the job is complete!

To get this going, I mainly followed this msmtp ArchWiki guide, but here are the specifics of how I set it up.

Step 1. Installed a mail server:

sudo apt-get install sendmail

Step 2. Installed msmtp:

sudo apt-get install msmtp

Step 3. Created the following file in my home directory (/home/sam/): ~/.msmtprc

The original contents of the file for testing were:

       # Example for a user configuration file ~/.msmtprc
       # This file focuses on TLS and authentication. Features not used here include
       # logging, timeouts, SOCKS proxies, TLS parameters, Delivery Status Notification
       # (DSN) settings, and more.

       # Set default values for all following accounts.

       # Use the mail submission port 587 instead of the SMTP port 25.
       port 587

       # Always use STARTTLS.
       tls on
       tls_starttls on
       tls_certcheck off
       # A freemail service
       account uw

       # Host name of the SMTP server
       host smtp.washington.edu

       # Envelope-from address
       from emu@uw.edu

       # Authentication. The password is given using one of five methods, see below.
       auth on
       user samwhite

       # Password method 3: Store the password directly in this file. Usually it is not
       # a good idea to store passwords in plain text files. If you do it anyway, at
       # least make sure that this file can only be read by yourself.
       password myuwpassword

       account default : uw

This configuration allows emails to be sent via the Univ. of Washington email servers. Yes, at this point my UW password was saved in this file in plain text, but I address that issue below.

Step 4. Changed permissions on ~/.msmtprc to be readable/writable only by me (important, particularly if you’ve stored your password in this file!):

chmod 600 ~/.msmtprc

Step 5. Told the mail program to use msmtp as its sendmail program by appending a set command to the /etc/mail.rc file:

echo "set sendmail=/usr/bin/msmtp" | sudo tee -a /etc/mail.rc

This command pipes the output of echo to tee, which is run via sudo, and uses tee -a to append to our desired file (/etc/mail.rc).
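The reason for tee here is that a plain `sudo echo ... >> /etc/mail.rc` would fail: the `>>` redirection is performed by your own (unprivileged) shell before sudo ever runs. Below is a minimal sketch of the same append pattern that also avoids adding the line twice if the step gets re-run. The function name `append_once` is my own, not part of msmtp or mail.

```shell
#!/usr/bin/env bash
# Hypothetical helper (append_once is an assumed name): append a line to a file
# with tee -a, but only if that exact line is not already in the file.
append_once() {
    local line="$1" file="$2"
    # grep flags: -q quiet, -x match the whole line, -F treat the pattern as a fixed string.
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        # tee -a appends its stdin to the file; its copy to stdout is discarded.
        echo "$line" | tee -a "$file" > /dev/null
    fi
}

# Example on a scratch file (no sudo needed):
# append_once "set sendmail=/usr/bin/msmtp" /tmp/mail.rc.test
```

For a root-owned file like /etc/mail.rc, the tee call inside the function would need to be run via sudo, as in the command above.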

Step 6. Send a test email:

echo "Job complete!" | msmtp myuwemail@uw.edu

That will send an email with no subject and the body of the email will contain “Job complete!”.

That’s the basic set up for this.

To use it in your workflow, you’d append that command to the end of any Bash command, or put it in a separate Jupyter notebook cell that is queued to run after a previous cell completes its job.


echo "This counts as a command"; echo "Job complete!" | msmtp myuwemail@uw.edu

This will run the first echo command. When that finishes, the email command will run. You can get fancy and send different emails depending on how the running program exits (i.e. fails or succeeds), but I’m not going to get into that.
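For anyone who does want the fancier version, here is a minimal sketch, not something I've set up on Emu. It runs a command and then pipes a different message to the mailer depending on the exit status. The names `run_and_notify`, `MAILER`, and `NOTIFY_TO` are my own assumptions, not msmtp settings.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command, then email a message based on its exit status.
# MAILER and NOTIFY_TO are assumed variable names; override them as needed.
MAILER="${MAILER:-msmtp}"
NOTIFY_TO="${NOTIFY_TO:-myuwemail@uw.edu}"

run_and_notify() {
    local status=0
    # Run the command exactly as given and capture a non-zero exit status.
    "$@" || status=$?
    if [ "$status" -eq 0 ]; then
        echo "Job complete: $*" | "$MAILER" "$NOTIFY_TO"
    else
        echo "Job FAILED (exit status $status): $*" | "$MAILER" "$NOTIFY_TO"
    fi
    # Pass the command's own exit status back to the caller.
    return "$status"
}

# Example: run_and_notify sleep 5
```

Keep in mind that this only helps when the program actually reports failure through its exit status.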

Anyway, not bad! However, we want to make this a bit nicer and more secure.

Improve security:

Step 1. Generate a GPG Key:

Follow the instructions under the Creating an Encryption Key section at this link.


Technically, this does not follow proper security protocols, but it is better than having a plain text password, and setting it up this way is the only way the mail program will send without prompting the user for a password (which kills the automation we’re trying to achieve).

Step 2. Create an encrypted password file:

gpg --encrypt -o ~/.msmtp-password.gpg -r youremailaddress -

After entering that, type your UW email password (NOTE: You will not receive a new prompt, so just type it in), and then press Enter. Then, press Ctrl-d.

Step 3. Add the following line to your ~/.msmtprc file:

passwordeval    "gpg --quiet --for-your-eyes-only --no-tty --decrypt ~/.msmtp-password.gpg"

Here’s what the file looks like now:

       # Example for a user configuration file ~/.msmtprc
       # This file focuses on TLS and authentication. Features not used here include
       # logging, timeouts, SOCKS proxies, TLS parameters, Delivery Status Notification
       # (DSN) settings, and more.

       # Set default values for all following accounts.

       # Use the mail submission port 587 instead of the SMTP port 25.
       port 587

       # Always use STARTTLS.
       tls on
       tls_starttls on
       tls_certcheck off

       # Email account nickname
       account uw

       # Host name of the SMTP server
       host smtp.washington.edu

       # Envelope-from address
       from emu@uw.edu

       # Authentication. The password is given using one of five methods, see below.
       auth on
       user samwhite

       # Password method 2: Store the password in an encrypted file, and tell msmtp
       # which command to use to decrypt it. This is usually used with GnuPG, as in
       # this example. Usually gpg-agent will ask once for the decryption password.
       passwordeval    "gpg --quiet --for-your-eyes-only --no-tty --decrypt ~/.msmtp-password.gpg"

       account default : uw

Step 4. Change permissions on ~/.msmtp-password.gpg so it’s only readable/writable by you:

chmod 600 ~/.msmtp-password.gpg

Step 5. Send a test email like before:

echo "Job complete!" | msmtp myuwemail@uw.edu

That’s it for security.

Add a subject to the emails:

Step 1. Create ~/.default_subject.mail and add the following lines to the file (substitute your own email address):

To: myuwemail@uw.edu
From: [EMU]

Feel free to change the Subject and/or From info to whatever you’d like.

Step 2. Send message using ~/.default_subject.mail:

cat ~/.default_subject.mail | msmtp myuwemail@uw.edu

To use this in your workflow, you’ll do just like before (but using the command immediately above) and append to the end of any Bash command.
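If you'd rather not keep a fixed header file, the same kind of message can be built on the fly. This is a minimal sketch, assuming the usual layout of header lines, then one empty line, then the body. The function name `compose_mail` and the example subject text are my own inventions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a simple mail message (headers, blank line, body)
# that can be piped into msmtp.
compose_mail() {
    local to="$1" subject="$2" body="$3"
    # The empty line between the headers and the body is required.
    printf 'To: %s\nSubject: %s\n\n%s\n' "$to" "$subject" "$body"
}

# Example (subject text is just an illustration):
# compose_mail "myuwemail@uw.edu" "[EMU] Job complete" "The job has finished." | msmtp myuwemail@uw.edu
```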

Make it short & sweet

Appending those lines is going to be difficult to remember, is annoying to type out, and displays your email address (particularly if using a publicly hosted Jupyter notebook like most of us in lab do). Here’s a nice way to remedy that.

Step 1. Add email address as variable in ~/.bashrc:

Add the following lines to the end of your ~/.bashrc file:

# Email address
export EMAIL=myuwemail@uw.edu

Your email address is now saved in the variable $EMAIL. You will need to use the following command to load that information:

source ~/.bashrc

Verify that it worked:

echo "$EMAIL"

That should spit out your email address and is ready to be used!

Step 2. Add alias for full mail command to ~/.bash_aliases file:

echo "alias emailme='cat ~/.default_subject.mail | msmtp \"\$EMAIL\"'" >> ~/.bash_aliases

Load the new alias into your current shell:

source ~/.bash_aliases

So, from now on, all you have to do is append the command emailme to the end of any Bash commands and you’ll get email when the job is finished!!! You can edit Steps 1 & 2 to use a variable other than “EMAIL” and an alias other than “emailme” – use whatever you’d like.


Troubleshooting – PB Jelly Install on Emu Continued

The last “fix” didn’t fix everything.

This time, I received an error message that was related to blasr. Some internet searching revealed that I needed to have various library files saved to a variable named: $LD_LIBRARY_PATH

To fix this, I added the following line to the /etc/bash.bashrc file:

export "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/home/shared/lib"

The line uses bash parameter expansion (the ${VAR:+...} form) to insert the existing value plus a ":" separator only if $LD_LIBRARY_PATH is already set and non-empty. This prevents $LD_LIBRARY_PATH from having a leading ":".
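Here's a small sketch of how that expansion behaves in the two cases (empty versus already set). The function name `append_dir` is hypothetical and only there to make the behavior easy to try out.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the result of appending a directory to a
# colon-separated list, using the same ${VAR:+...} expansion.
append_dir() {
    local current="$1" dir="$2"
    # ${current:+${current}:} expands to "<current>:" only when current is non-empty.
    printf '%s\n' "${current:+${current}:}${dir}"
}

append_dir "" "/home/shared/lib"                 # prints /home/shared/lib
append_dir "/usr/local/lib" "/home/shared/lib"   # prints /usr/local/lib:/home/shared/lib
```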

As usual, the solution to that problem was found courtesy of StackExchange (#162891).

Also, by putting this line in the /etc/bash.bashrc file, it makes the variable available for all users.

Below are some screen caps to document the process:

Realization that PB Jelly still wasn't going to work:

Identify location of file listed in error message:

Add command to /etc/bash.bashrc to set $LD_LIBRARY_PATH:


Verify blasr can run:


Troubleshooting – PB Jelly Install on Emu

I previously installed and ran PB Jelly. Despite no error messages being output, I noticed something odd during my quick post-assembly stats check: The PB Jelly numbers were identical to the input reference file. This seemed very strange and made me decide to look a bit deeper in the PB Jelly output files.
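A check like this is easy to script. The sketch below is not the stats tool I actually used; it's a minimal stand-in that counts sequences and total bases in a FASTA file so two assemblies can be compared. The function name `fasta_stats` and the file names in the example are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<number of sequences> <total bases>" for a FASTA file.
fasta_stats() {
    awk '
        /^>/ { n++; next }                               # header line: count a sequence
        { gsub(/[ \t\r]/, ""); bases += length($0) }     # sequence line: add its length
        END { printf "%d %d\n", n, bases }
    ' "$1"
}

# Example (file names are placeholders):
# if [ "$(fasta_stats reference.fasta)" = "$(fasta_stats jelly.out.fasta)" ]; then
#     echo "Output stats match the input reference; check the logs." >&2
# fi
```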

As it turns out, PB Jelly did not complete successfully! Here’s a look at one of the output files (notice the error messages!):

Looking around the internet seemed to suggest that the issue could be that the blasr program wasn’t in my system PATH (blasr is located in: /home/shared/bin). So, I updated that, since /home/shared/bin wasn’t in the system PATH!:

After doing this, I noticed that the PATH assignment in the /etc/environment file is incorrect – it has the $PATH variable prepended to the list. This results in the system PATH being added to itself over and over again each time the file is sourced, producing a ridiculously long list (like in the screen cap directly above this text). So, I removed that portion and re-sourced the /etc/environment file to tidy things up.
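For a shell session that has already picked up a bloated PATH, the duplicates can also be stripped out directly. This is a minimal sketch of one way to do that, keeping the first occurrence of each directory; the function name `dedupe_path` is my own.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove duplicate (and empty) entries from a
# colon-separated list, keeping the first occurrence of each entry.
dedupe_path() {
    local input="$1" entry result=""
    local -a entries
    # Split the list on ":" into an array without triggering filename globbing.
    IFS=':' read -r -a entries <<< "$input"
    for entry in "${entries[@]}"; do
        [ -n "$entry" ] || continue
        case ":$result:" in
            *":$entry:"*) ;;                              # already present, skip it
            *) result="${result:+$result:}$entry" ;;      # first time seen, append it
        esac
    done
    printf '%s\n' "$result"
}

# Example: PATH="$(dedupe_path "$PATH")"
```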

Fingers crossed this will resolve the issue…


Software Installation – PB Jelly Suite and Blasr on Emu

I followed along with what Sean previously did when installing on Emu, but it appears he didn’t install it in the shared location to make it accessible to all users. So, I’m installing it in the /home/shared/ directory.

First, I need to install legacy blasr from PacBio:

Installed in /home/shared:

cd /home/shared
git clone https://github.com/PacificBiosciences/pitchfork.git
cd pitchfork
git checkout legacy_blasr
make init PREFIX=/home/shared
make blasr  PREFIX=/home/shared

Ran into this error:

make[1]: Leaving directory '/home/shared/pitchfork/ports/thirdparty/zlib'
make -C ports/thirdparty/hdf5 do-install
make[1]: Entering directory '/home/shared/pitchfork/ports/thirdparty/hdf5'
/home/shared/pitchfork/bin/pitchfork fetch --url https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.8.16/src/hdf5-1.8.16.tar.gz
fetching https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.8.16/src/hdf5-1.8.16.tar.gz
tar zxf hdf5-1.8.16.tar.gz -C /home/shared/pitchfork/workspace

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
Makefile:23: recipe for target '/home/shared/pitchfork/workspace/hdf5-1.8.16' failed
make[1]: *** [/home/shared/pitchfork/workspace/hdf5-1.8.16] Error 2
make[1]: Leaving directory '/home/shared/pitchfork/ports/thirdparty/hdf5'
Makefile:211: recipe for target 'hdf5' failed
make: *** [hdf5] Error 2

Luckily, I came across this GitHub Issue that addresses this exact problem.

I found the functional URL and downloaded the hdf5-1.8.16.tar.gz file to pitchfork/ports/thirdparty/hdf5. Re-ran make blasr PREFIX=/home/shared and things proceeded without issue. As Sean noted, this part takes a long time.
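The "not in gzip format" error means the file that got fetched wasn't actually a gzip archive (presumably whatever the dead URL returned instead). A quick way to check a download before handing it to tar is sketched below; the function name `is_gzip` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file is a valid gzip stream.
is_gzip() {
    # gzip -t tests the compressed data without extracting anything.
    # Reading from stdin means the file name and suffix do not matter.
    gzip -t < "$1" 2>/dev/null
}

# Example:
# if is_gzip hdf5-1.8.16.tar.gz; then
#     echo "Archive looks OK"
# else
#     echo "Download is not a valid gzip file" >&2
# fi
```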

Load the setup-env.sh file (this is located here: /home/shared/pitchfork/setup-env.sh):

source setup-env.sh

Blasr install is complete!

Then, install networkx v1.1, per the PB Jelly documentation:

python -m pip install networkx==1.1

On to PB Jelly!

Edited the setup.sh file and entered in the path to the PB Jelly install on Emu (/home/shared/PBSuite_15.8.24/):


#If you use a virtual env - source it here
#source /hgsc_software/PBSuite/pbsuiteVirtualEnv/bin/activate

#This is the path where you've install the suite.
export SWEETPATH=/home/shared/PBSuite_15.8.24/
#for python modules 
#for executables 

Test it out with the test data:

  1. Edit the following file to reflect the paths on Emu to find this test data: /home/shared/PBSuite_15.8.24/docs/jellyExample/Protocol.xml

    <blasr>-minMatch 8 -minPctIdentity 70 -bestn 1 -nCandidates 20 -maxScore -500 -nproc 4 -noSplitSubreads</blasr>
    <input baseDir="/home/shared/PBSuite_15.8.24/docs/jellyExample/data/reads/">

I went through all the stages of the test data and got through it successfully. Seems ready to roll!
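I ran the stages by hand, but they could also be run in sequence with a small loop. This is a sketch under assumptions: it assumes the stage names from the PB Jelly documentation (setup, mapping, support, extraction, assembly, output) and that each is invoked as `Jelly.py <stage> Protocol.xml`. The names `run_jelly_stages` and `JELLY_CMD` are my own.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run the PB Jelly stages in order, stopping at the
# first stage that returns a non-zero exit status.
# JELLY_CMD is an assumed variable name; it defaults to Jelly.py.
JELLY_CMD="${JELLY_CMD:-Jelly.py}"

run_jelly_stages() {
    local protocol="$1" stage
    for stage in setup mapping support extraction assembly output; do
        echo "Running stage: $stage"
        if ! "$JELLY_CMD" "$stage" "$protocol"; then
            echo "Stage '$stage' failed; stopping." >&2
            return 1
        fi
    done
}

# Example:
# run_jelly_stages /home/shared/PBSuite_15.8.24/docs/jellyExample/Protocol.xml
```

An exit status of zero doesn't guarantee that a stage really worked, so it's still worth looking through the output files for error messages after each run.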


Computer Management – Additional Configurations for Reformatted Xserves

Sean got the remaining Xserves configured to run independently from the master node of the cluster they belonged to and installed OS X 10.11 (El Capitan).

The new computer names are Ostrich (formerly node004) and Emu (formerly node002).


He enabled remote screen sharing and remote access for them.

Sean also installed a working hard drive on Roadrunner and got that back up and running.

I went through this morning and configured the computers with some other changes (some for my user account, others for the entire computer):

  • Renamed computers to reflect just the corresponding bird name (hostnames had been labeled as “bird name’s Xserve”)

  • Created srlab user accounts

  • Changed srlab user accounts to Standard instead of Administrative

  • Created steven user account

  • Turned on Firewalls

  • Granted remote login access to all users (instead of just Administrators)

  • Installed Docker Toolbox

  • Changed power settings to start automatically after power failure

  • Added computer name to login screen via Terminal:

sudo defaults write /Library/Preferences/com.apple.loginwindow LoginwindowText "TEXT GOES HERE"

  • Changed computer HostName via Terminal so that Terminal displays computer name:

sudo scutil --set HostName "TEXT GOES HERE"

  • Installed Mac Homebrew (I don’t know if installation of Homebrew is “global” – i.e. installs for all users)

  • Used Mac Homebrew to install wget

  • Used Mac Homebrew to install tmux