To be honest I’m not entirely sure what happened today.

We began our day in the computer lab looking at our transcriptome using Weighted Correlation Network Analysis (WGCNA), an R package that looks at patterns of co-expression to see possible groupings of similarly behaving genes.

The R script we used looks as follows:

source("http://bioconductor.org/biocLite.R") biocLite("impute") install.packages("WGCNA") library(WGCNA); # The following setting is important, do not omit. options(stringsAsFactors = FALSE); #Read in the female liver data set SSData = read.csv("Phel_rnaseq_normalized_expression.csv"); # Take a quick look at what is in the data set: dim(SSData); names(SSData); datExpr0 = as.data.frame(t(SSData 1 )); names(datExpr0) = SSData$Contig; rownames(datExpr0) = names(SSData) 1 ; datExpr0 datExpr=datExpr0 gsg = goodSamplesGenes(datExpr0, verbose = 3); gsg$allOK collectGarbage(); powers = c(c(1:10), seq(from = 12, to=20, by=2)) # Call the network topology analysis function sft = pickSoftThreshold(datExpr, powerVector = powers, verbose = 5) # Plot the results: sizeGrWindow(9, 5) par(mfrow = c(1,2)); cex1 = 0.9; # Scale-free topology fit index as a function of the soft-thresholding power plot(sft$fitIndices[,1], -sign(sft$fitIndices[,3])*sft$fitIndices[,2], xlab="Soft Threshold (power)",ylab="Scale Free Topology Model Fit,signed R^2",type="n", main = paste("Scale independence")); text(sft$fitIndices[,1], -sign(sft$fitIndices[,3])*sft$fitIndices[,2], labels=powers,cex=cex1,col="red"); # this line corresponds to using an R^2 cut-off of h abline(h=0.90,col="red") # Mean connectivity as a function of the soft-thresholding power plot(sft$fitIndices[,1], sft$fitIndices[,5], xlab="Soft Threshold (power)",ylab="Mean Connectivity", type="n", main = paste("Mean connectivity")) text(sft$fitIndices[,1], sft$fitIndices[,5], labels=powers, cex=cex1,col="red") net = blockwiseModules(datExpr, power = 10, TOMType = "unsigned", minModuleSize = 9, reassignThreshold = 0, mergeCutHeight = 0.25, numericLabels = TRUE, pamRespectsDendro = FALSE, saveTOMs = TRUE, saveTOMFileBase = "femaleMouseTOM", verbose = 3) sizeGrWindow(12, 9) # Convert labels to colors for plotting mergedColors = labels2colors(net$colors) # Plot the dendrogram and the module colors underneath plotDendroAndColors(net$dendrograms[[1]], mergedColors[net$blockGenes[[1]]], "Module colors", dendroLabels = FALSE, hang = 0.03, addGuide = TRUE, guideHang = 0.05)

This gives us a dendrogram of our transcriptome, grouping our genes into modules.

Try as I might, today’s attempts to do more with IPython were a bit pathetic…