
Annotation 101

DIRECT LINK to nbviewer

http://nbviewer.ipython.org/github/sr320/eimd/blob/master/ipynbs/c_Annotation.ipynb


 

Annotation (Blast)

In [1]:
#Contigs from Assembly
!head /Volumes/web/cnidarian/SeaStar_transc_v2.fa
>3291_5903_10007_H94MGADXX_V_CF71_ATCACG_R1_(paired)_trimmed_(paired)_contig_1
CAAATATATGAACGGTTGATTGTCAACGATTAGTACATGTTTTCATTGTTCCCCACGCCC
GCCCCCCCCCACTCAAACATTTAAAGTGTGAAATATTATTTATCCACAAATTTCCTTAAA
CCTGCAAACTTGTCTGCTGTCTCTTATTGGAAGTTATGAAAAAGAACAACGGGTTTTCTT
TAAAGGGTCTGCGTGCGATTTTCAACCTTTTGAGTAATAGCAGTTATTTTGATAACCGAT
TTTTTTCAAAGCTCAACAGCTTTTTAAAATAAGGAATCCTATAATGGCCAAACGAATACT
ATAAAAATAAGGGTTCTCTTAATTGTATAAAACGTATAATTTTATCAATTTTGGGACCGT
GTAATTTTTTAAAGACCACAAGAATGTTACATACAACAAATAGACGAAACTCGTAGCTTT
GGAAACTACGTCATGGGCGTTTGGTCAAAAGCTGGAGAGAAAGAGAGGTGGGGTGCCAGA
CTTAAGTAGTCACGTGATCTGACCAACGCACATCGGAAGCTCGATCGGATGAAATCTTCT


In [2]:
# Shorten the long contig names so the query IDs are easier to read
!sed 's/3291_5903_10007_H94MGADXX_V_CF71_ATCACG_R1_(paired)_trimmed_(paired)/Phel_clc/g' < /Volumes/web/cnidarian/SeaStar_transc_v2.fa > /Users/sr320/Dropbox/Steven/eimd/data/Phel_transcriptome_clc.fa
In [3]:
!head /Users/sr320/Dropbox/Steven/eimd/data/Phel_transcriptome_clc.fa
>Phel_clc_contig_1
CAAATATATGAACGGTTGATTGTCAACGATTAGTACATGTTTTCATTGTTCCCCACGCCC
GCCCCCCCCCACTCAAACATTTAAAGTGTGAAATATTATTTATCCACAAATTTCCTTAAA
CCTGCAAACTTGTCTGCTGTCTCTTATTGGAAGTTATGAAAAAGAACAACGGGTTTTCTT
TAAAGGGTCTGCGTGCGATTTTCAACCTTTTGAGTAATAGCAGTTATTTTGATAACCGAT
TTTTTTCAAAGCTCAACAGCTTTTTAAAATAAGGAATCCTATAATGGCCAAACGAATACT
ATAAAAATAAGGGTTCTCTTAATTGTATAAAACGTATAATTTTATCAATTTTGGGACCGT
GTAATTTTTTAAAGACCACAAGAATGTTACATACAACAAATAGACGAAACTCGTAGCTTT
GGAAACTACGTCATGGGCGTTTGGTCAAAAGCTGGAGAGAAAGAGAGGTGGGGTGCCAGA
CTTAAGTAGTCACGTGATCTGACCAACGCACATCGGAAGCTCGATCGGATGAAATCTTCT
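
The renaming step can also be done in Python when sed is not available. The sketch below is only an illustration, not part of the original notebook: the function name `rename_headers` is made up, and unlike the sed command (which substitutes on every line) it only touches header lines that start with ">".

```python
import io

# Prefix copied from the sed command; "Phel_clc" is the same replacement label.
OLD_PREFIX = "3291_5903_10007_H94MGADXX_V_CF71_ATCACG_R1_(paired)_trimmed_(paired)"
NEW_PREFIX = "Phel_clc"


def rename_headers(src, dst, old=OLD_PREFIX, new=NEW_PREFIX):
    """Copy FASTA text from src to dst, replacing `old` with `new` on header lines.

    Returns the number of header lines that were changed.
    """
    renamed = 0
    for line in src:
        if line.startswith(">") and old in line:
            line = line.replace(old, new)
            renamed += 1
        dst.write(line)
    return renamed


# Small in-memory demonstration using the first header shown above.
demo_in = io.StringIO(">" + OLD_PREFIX + "_contig_1\nCAAATATATGAACGG\n")
demo_out = io.StringIO()
n_renamed = rename_headers(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

For real files, pass two handles opened with `open(...)` in place of the `io.StringIO` objects.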


In [4]:
# Count the sequences (FASTA header lines) in the renamed file
!fgrep -c ">" /Users/sr320/Dropbox/Steven/eimd/data/Phel_transcriptome_clc.fa
30578
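
The same count can be reproduced without fgrep. This is a minimal sketch, assuming a well-formed FASTA file in which ">" only appears at the start of header lines; `count_fasta_records` is a hypothetical helper name, and it counts lines that start with ">" rather than lines that contain it anywhere.

```python
import io


def count_fasta_records(handle):
    """Return the number of lines in `handle` that start with '>'."""
    return sum(1 for line in handle if line.startswith(">"))


# In-memory example with two records; use open(path) for a real file.
demo = io.StringIO(">Phel_clc_contig_1\nCAAATATATG\n>Phel_clc_contig_2\nGCCCCCCCCC\n")
n_records = count_fasta_records(demo)
print(n_records)
```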


In []:
# IMPORTANT: file locations will differ on your computer, so adjust the paths below
!blastx \
-query /Users/sr320/Dropbox/Steven/eimd/data/Phel_transcriptome_clc.fa \
-db /Users/Shared/data/blast/db/uniprot_sprot \
-out /Users/sr320/Dropbox/Steven/eimd/data/Phel_clc_blastx_uniprot_sprot_1.tab \
-outfmt 6 \
-num_threads 8 \
-max_hsps 1 \
-max_target_seqs 1 \
-evalue 1E-20
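
Because the paths change from machine to machine, it can help to keep them in variables and assemble the command from those. The sketch below only builds and prints the command; it does not run BLAST. The relative paths and the helper name `build_blastx_command` are placeholders for illustration, while the flags are the ones used in the cell above.

```python
import shlex

# Hypothetical locations: replace these with the paths on your own machine.
query = "data/Phel_transcriptome_clc.fa"
db = "blast/db/uniprot_sprot"
out = "data/Phel_clc_blastx_uniprot_sprot_1.tab"


def build_blastx_command(query, db, out, threads=8, evalue="1E-20"):
    """Return the blastx call from the cell above as an argument list."""
    return [
        "blastx",
        "-query", query,
        "-db", db,
        "-out", out,
        "-outfmt", "6",
        "-num_threads", str(threads),
        "-max_hsps", "1",
        "-max_target_seqs", "1",
        "-evalue", evalue,
    ]


cmd = build_blastx_command(query, db, out)
print(shlex.join(cmd))  # shlex.join needs Python 3.8 or later

# To actually run the search (requires BLAST+ on the PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```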
In [11]:
!head /Users/sr320/Dropbox/Steven/eimd/data/Phel_clc_blastx_uniprot_sprot_1.tab
Phel_clc_contig_4	sp|P25001|COX1_PISOC	88.39	517	48	1	7061	5547	1	517	0.0	  749
Phel_clc_contig_7	sp|Q33818|CYB_ASTPE	79.94	329	66	0	993	7	51	379	9e-168	  479
Phel_clc_contig_8	sp|P68037|UB2L3_MOUSE	76.97	152	35	0	4862	4407	1	152	7e-62	  214
Phel_clc_contig_9	sp|Q0MVN8|QOR_PIG	45.61	239	129	1	796	80	90	327	5e-63	  210
Phel_clc_contig_17	sp|Q6DGL8|RT15_DANRE	35.00	180	107	3	1177	638	61	230	5e-22	99.8
Phel_clc_contig_18	sp|P96202|PPSC_MYCTU	30.81	714	438	15	5407	3386	1414	2111	6e-76	  286
Phel_clc_contig_20	sp|P46058|EDSP_CYNPY	31.03	348	218	8	1731	703	4	334	5e-38	  148
Phel_clc_contig_22	sp|Q96LI5|CNO6L_HUMAN	60.78	357	128	5	1887	832	186	535	2e-149	  450
Phel_clc_contig_24	sp|P63245|GBLP_RAT	80.77	312	60	0	1032	97	1	312	0.0	  540
Phel_clc_contig_25	sp|Q9QYP1|LRP4_RAT	32.74	623	393	7	1816	5	393	1008	1e-99	  339
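
The twelve columns of `-outfmt 6` are, in order: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore. A small parser makes the table easier to work with. This is a sketch rather than part of the original notebook; `read_blast6` is a hypothetical helper, and the sample row is the contig_7 hit from the output above.

```python
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
INT_COLUMNS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")
FLOAT_COLUMNS = ("pident", "evalue", "bitscore")


def read_blast6(handle):
    """Parse tab-separated BLAST hits into a list of dicts with typed values."""
    hits = []
    for line in handle:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # BLAST pads the bit score with spaces, so strip every field.
        fields = [field.strip() for field in line.split("\t")]
        hit = dict(zip(BLAST6_COLUMNS, fields))
        for key in INT_COLUMNS:
            hit[key] = int(hit[key])
        for key in FLOAT_COLUMNS:
            hit[key] = float(hit[key])
        hits.append(hit)
    return hits


sample = "Phel_clc_contig_7\tsp|Q33818|CYB_ASTPE\t79.94\t329\t66\t0\t993\t7\t51\t379\t9e-168\t  479\n"
hits = read_blast6(io.StringIO(sample))
print(hits[0]["sseqid"], hits[0]["evalue"], hits[0]["bitscore"])
```

In blastx output, a qstart larger than qend (993 and 7 here) means the hit lies on the reverse strand of the contig.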


In [12]:
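# Replace the pipes in the subject ID (sp|accession|entry name) with tabs so each part becomes its own column
!tr '|' "\t" < /Users/sr320/Dropbox/Steven/eimd/data/Phel_clc_blastx_uniprot_sprot_1.tab > /Users/sr320/Dropbox/Steven/eimd/data/Phel_clc_blastx_uniprot_sprot_sqlready.tab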
In [13]:
!head /Users/sr320/Dropbox/Steven/eimd/data/Phel_clc_blastx_uniprot_sprot_sqlready.tab
Phel_clc_contig_4	sp	P25001	COX1_PISOC	88.39	517	48	1	7061	5547	1	517	0.0	  749
Phel_clc_contig_7	sp	Q33818	CYB_ASTPE	79.94	329	66	0	993	7	51	379	9e-168	  479
Phel_clc_contig_8	sp	P68037	UB2L3_MOUSE	76.97	152	35	0	4862	4407	1	152	7e-62	  214
Phel_clc_contig_9	sp	Q0MVN8	QOR_PIG	45.61	239	129	1	796	80	90	327	5e-63	  210
Phel_clc_contig_17	sp	Q6DGL8	RT15_DANRE	35.00	180	107	3	1177	638	61	230	5e-22	99.8
Phel_clc_contig_18	sp	P96202	PPSC_MYCTU	30.81	714	438	15	5407	3386	1414	2111	6e-76	  286
Phel_clc_contig_20	sp	P46058	EDSP_CYNPY	31.03	348	218	8	1731	703	4	334	5e-38	  148
Phel_clc_contig_22	sp	Q96LI5	CNO6L_HUMAN	60.78	357	128	5	1887	832	186	535	2e-149	  450
Phel_clc_contig_24	sp	P63245	GBLP_RAT	80.77	312	60	0	1032	97	1	312	0.0	  540
Phel_clc_contig_25	sp	Q9QYP1	LRP4_RAT	32.74	623	393	7	1816	5	393	1008	1e-99	  339
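
The tr command replaces every pipe in the file, which is fine here because pipes only occur in the subject ID. A more targeted variant, sketched below with a hypothetical helper `split_sseqid`, splits only the second column, turning the 12-column table into 14 columns.

```python
def split_sseqid(line):
    """Split the 'sp|accession|entry' field (column 2) into separate tab columns."""
    fields = line.rstrip("\n").split("\t")
    # Replace the single sseqid field with its pipe-separated parts.
    fields[1:2] = fields[1].split("|")
    return "\t".join(fields)


row = "Phel_clc_contig_4\tsp|P25001|COX1_PISOC\t88.39\t517\t48\t1\t7061\t5547\t1\t517\t0.0\t  749\n"
sql_ready = split_sseqid(row)
print(sql_ready)
```

With the accession (for example P25001) in its own column, the table can be joined against other tables keyed on the UniProt accession.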

