
Getting back into gear, I am helping Andrew ID some targets from a salmonid transcriptome. With said transcriptome, I am taking the BLAST output and getting some protein names, sans SQLshare.

The tl;dr can be seen here, but if you have the time I will point out the key code aspects and leave you with a tabular file.

First we had the good ol' tr.
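The exact tr call is not shown in this post, so here is a minimal sketch of one plausible use: swapping the pipes in a Swiss-Prot style subject ID for tabs, which puts the accession in its own column. The file names, the temp directory, and the made-up row are assumptions for illustration only.

```shell
# Hypothetical scratch directory and a made-up BLAST tabular row.
workdir=$(mktemp -d)
printf 'contig_1\tsp|ACC001|NAME1_FAKE\t1e-20\n' > "$workdir/blastx_sprot.tab"

# Replace every pipe with a tab so "sp", the accession, and the entry name
# become separate columns; the accession ends up in column 3.
tr '|' '\t' < "$workdir/blastx_sprot.tab" > "$workdir/blastx_sprot.split"

# Show the accession column.
cut -f 3 "$workdir/blastx_sprot.split"
```

If the real BLAST output looks like this, column 3 is the one to match against the UniProt accession later on.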


http://www.uniprot.org/uniprot/?query=reviewed%3ayes&force=yes&format=tab&columns=id,entry%20name,go-id,interactor,database(GO),go,reviewed,interpro,pathway,protein%20names,genes,tools,organism,length



Before joining I needed to sort.
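The sort step matters because join expects both inputs to be ordered on the field being matched. Here is a rough sketch on made-up rows, assuming the accession sits in column 3 of the BLAST file and column 1 of the UniProt file; the unsorted input names and the temp directory are hypothetical.

```shell
workdir=$(mktemp -d)
tab=$(printf '\t')   # portable stand-in for $'\t'

# Made-up, deliberately unsorted inputs.
printf 'contig_2\tsp\tACC002\tNAME2_FAKE\ncontig_1\tsp\tACC001\tNAME1_FAKE\n' > "$workdir/blastx_sprot.tab"
printf 'ACC002\tprotein two\nACC001\tprotein one\n' > "$workdir/uniprot-reviewed.tab"

# Sort each file on the column join will match on.
# LC_ALL=C keeps the ordering byte-wise, which is what join should see too.
LC_ALL=C sort -t "$tab" -k 3,3 "$workdir/blastx_sprot.tab" > "$workdir/blastx_sprot.sort"
LC_ALL=C sort -t "$tab" -k 1,1 "$workdir/uniprot-reviewed.tab" > "$workdir/uniprot-reviewed.sort"

head -n 1 "$workdir/blastx_sprot.sort"
```

Using the same locale for sort and join avoids the "input is not in sorted order" complaints that can show up when the two disagree about collation.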

And with the join I needed a few parameters: a tab delimiter (-t), column 3 of the BLAST file (-1 3), and column 1 of the UniProt file (-2 1).

!join -t $'\t' -1 3 -2 1 blastx_sprot.sort /Users/sr320/git-repos/nb-2016/uniprot-reviewed.sort
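To see what those flags do, here is a small sketch on made-up, already-sorted rows. The temp directory and the data are hypothetical, and redirecting the result into blastx-join-uniprot-info.tab is an assumption based on the file opened in the next step.

```shell
workdir=$(mktemp -d)
tab=$(printf '\t')   # portable stand-in for $'\t'

# Made-up inputs, sorted on the join field.
printf 'contig_1\tsp\tACC001\tNAME1_FAKE\ncontig_2\tsp\tACC002\tNAME2_FAKE\n' > "$workdir/blastx_sprot.sort"
printf 'ACC001\tprotein one\nACC002\tprotein two\n' > "$workdir/uniprot-reviewed.sort"

# Match column 3 of the first file against column 1 of the second.
join -t "$tab" -1 3 -2 1 \
  "$workdir/blastx_sprot.sort" \
  "$workdir/uniprot-reviewed.sort" \
  > "$workdir/blastx-join-uniprot-info.tab"

cat "$workdir/blastx-join-uniprot-info.tab"
```

Note that join prints the matched field first, followed by the remaining columns of the first file and then those of the second, so the accession leads each output row.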

And because we need to get to Excel:
!open blastx-join-uniprot-info.tab -a "Microsoft Excel"

Voilà, a tab file is created that can be examined further.