Tag Archives: transposable element

Re-Reproducing differential methylation analyses, again

Having just given a talk on reproducibility, I am in the midst of responding to reviewer comments about what we did (12 months ago!), and boy can I say every minute of putting this notebook together was worth it. I even found where we ran the entire notebook, so all result files are easily accessible. Beyond praising Claire, I will document my follow-up analysis here.

Essentially they want more quantitative information on differential methylation beyond ..


Makes sense.

Here is what was originally done.


For example, the file named linexon contained the single line "16 exon_intersect_DML_lin_u.txt". The 4 files were concatenated to produce lintable ….


and a little awk

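# lintable{,} expands to "lintable lintable" in bash: pass 1 sums column 1, pass 2 appends that total to every line
awk 'FNR==NR{sum+=$1;next}; {print $0,sum}' lintable{,} > lin_total
# reorder to: file name, count, total, percent of total
awk '{print $2, $1, $3, (($1/$3)*100)}' lin_total > lineage_DMLs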

to create lineage_DMLs
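For anyone re-running this, here is a minimal self-contained sketch of the same two-pass idea, collapsed into one awk call. It assumes the family-specific counts from the table in this post are the lineage counts; every file name except exon_intersect_DML_lin_u.txt is a made-up placeholder, and the temporary directory is only there so nothing real gets overwritten.

```shell
#!/usr/bin/env bash
# Sketch only: percent of DMLs per feature via a two-pass awk over one file.
# Counts are the family-specific column of the table in this post (assumed to
# be the lineage counts). File names other than exon_intersect_DML_lin_u.txt
# are hypothetical placeholders.
set -euo pipefail

workdir=$(mktemp -d)

cat > "$workdir/lintable" <<'EOF'
17 te_placeholder.txt
2 promoter_placeholder.txt
16 exon_intersect_DML_lin_u.txt
25 intron_placeholder.txt
EOF

# The same file is given twice: pass 1 (FNR==NR) sums column 1,
# pass 2 prints name, count, total, and percent of total.
awk 'FNR==NR {sum += $1; next}
     {printf "%s %d %d %.1f\n", $2, $1, sum, ($1 / sum) * 100}' \
    "$workdir/lintable" "$workdir/lintable" > "$workdir/lineage_DMLs"

cat "$workdir/lineage_DMLs"
```

Using printf instead of print keeps the percent column at a fixed precision, which is easier to paste into a table.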


Analogously here are the developmental_DMLs….


And we certainly need to know how many all_CGs we have…



Feature                 Family specific DMLs    Developmental specific DMLs
Transposable Element    17                      16
Promoter Region         2                       3
Exon                    16                      12
Intron                  25                      46

I know we did this before, but I believe the reviewers want a breakdown, or list, of which specific transposable elements are involved. It is a long shot whether I can find this…
Two minutes later: https://github.com/sr320/ipython_nb/blob/master/BiGo_larvae_manuscript4.ipynb.

To be sure the files are accurate, I will run intersectBed again. Based on recollection, there is likely no difference in proportion relative to all TEs. This brings up an important point about how to record “negative” data that does not go into a paper.
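A per-element breakdown of the kind the reviewers seem to want could be tallied from an intersect output roughly like the sketch below. Everything in it is an assumption: the rows are invented toy data, the file names are placeholders, and the 7-column layout is what I would expect from `intersectBed -wa -wb` with a 3-column DML file as A and a 4-column TE file as B. The intersect step itself is not run here.

```shell
#!/usr/bin/env bash
# Sketch only: count DMLs per TE name from an intersect-style table.
# Toy rows below are invented; columns are assumed to be
# DML chrom, start, end, then TE chrom, start, end, name.
set -euo pipefail

workdir=$(mktemp -d)

# printf reuses the format string for each group of seven arguments.
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  scaffoldA 100 101 scaffoldA 50 400 TE_family_1 \
  scaffoldA 250 251 scaffoldA 50 400 TE_family_1 \
  scaffoldB 900 901 scaffoldB 880 1200 TE_family_2 \
  > "$workdir/DML_TE_overlap.tsv"

tab=$(printf '\t')

# Tally rows by TE name (column 7), then sort by count, largest first.
awk -F'\t' '{n[$7]++} END {for (k in n) print k "\t" n[k]}' \
    "$workdir/DML_TE_overlap.tsv" \
  | sort -t "$tab" -k2,2nr -k1,1 > "$workdir/DML_per_TE.tsv"

cat "$workdir/DML_per_TE.tsv"
```

The explicit sort is needed because awk does not guarantee any order when looping over an array.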


There is something about TEs

For purposes of proposals and reports, I have gone back to look at a small project done in collaboration with scientists at IFREMER, looking at the effect of pesticide exposure on oyster larvae methylation.

The control library had limited yield, so the number of loci with data from both the treated and the control library was restricted. However, using a liberal 3x coverage threshold for both, we found a total of 823 DMRs (544 hypermethylated and 279 hypomethylated).


Intriguingly, when one accounts for all CGs in the genome, these DMRs are predominantly in transposable elements.
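"Accounting for all CGs" amounts to comparing the fraction of DMRs that fall in TEs with the fraction of all CGs that fall in TEs. A rough sketch of that arithmetic follows. Only the 823 total comes from this post; the other three numbers are loud placeholders, not results, and the function name te_enrichment is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: fold enrichment of DMRs in TEs relative to all CGs.
# te_enrichment is a hypothetical helper, not part of any tool.
set -euo pipefail

# Args: DMRs in TEs, total DMRs, CGs in TEs, total CGs.
te_enrichment() {
    awk -v a="$1" -v b="$2" -v c="$3" -v d="$4" \
        'BEGIN {printf "%.2f\n", (a / b) / (c / d)}'
}

dmr_total=823      # from the post
dmr_in_te=400      # PLACEHOLDER, not a real count
cg_in_te=200000    # PLACEHOLDER, not a real count
cg_total=1000000   # PLACEHOLDER, not a real count

echo "fold enrichment: $(te_enrichment "$dmr_in_te" "$dmr_total" "$cg_in_te" "$cg_total")"
```

A value above 1 would mean DMRs sit in TEs more often than CGs do overall; whether that is significant would still need a proper test.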


Reminiscent of …


The analysis is not pretty, but here is what I have to offer.