So, as an exercise, I followed through with deduplication and sorting of the BAM files.
Then, I ran a quick analysis using methylKit in R. The analysis simply copies what Steven did with another data set. I haven’t examined it very thoroughly, so I’m not well-versed in what it’s doing or why.
Jupyter Notebook (GitHub):
RStudio Project (download the folder, load the project in RStudio, and then run the script in the scripts subdirectory to run the analysis):
Next step: take the full data sets through this whole pipeline.
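
For reference, here is a minimal sketch of what the deduplication and sorting steps mentioned above could look like for a single BAM file. This is not the notebook's actual code. It assumes Bismark's `deduplicate_bismark` and `samtools sort` are the tools involved, that the data are paired-end, that the input BAM sits in the current working directory, and that the deduplicated output is named `<name>.deduplicated.bam`. The function name `dedup_and_sort_commands` and the file name `sample.bam` are hypothetical. The function only builds the command lines; it does not run them.

```python
import shlex
from pathlib import Path


def dedup_and_sort_commands(bam_name, paired=True):
    """Build (but do not run) dedup and sort commands for one BAM file.

    Assumptions (not taken from the original notebook):
      - deduplication is done with Bismark's deduplicate_bismark
      - sorting is done with samtools sort
      - the BAM file is in the current working directory
      - the deduplicated file is named <name>.deduplicated.bam
    """
    stem = Path(bam_name).stem
    dedup_bam = f"{stem}.deduplicated.bam"
    sorted_bam = f"{stem}.deduplicated.sorted.bam"

    # Deduplicate first: Bismark expects its own unsorted alignment output here.
    layout_flag = "--paired" if paired else "--single"
    dedup_cmd = ["deduplicate_bismark", layout_flag, "--bam", bam_name]

    # Then coordinate-sort the deduplicated file for downstream tools.
    sort_cmd = ["samtools", "sort", "-o", sorted_bam, dedup_bam]

    return dedup_cmd, sort_cmd


if __name__ == "__main__":
    # Hypothetical file name, for illustration only.
    for cmd in dedup_and_sort_commands("sample.bam"):
        print(shlex.join(cmd))
```

Returning the commands as argument lists keeps the sketch easy to inspect before anything is run; each list could be passed to `subprocess.run(cmd, check=True)` once the tool names and flags have been checked against the installed versions.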