Due to some sort of data mishandling, morphometric data taken in 2014 for thousands (seriously, THOUSANDS) of Pacific oysters was found to be incorrect. Unfortunately, there’s not enough of a “paper trail” to backtrack and see what went wrong, or where, in order to fix the issues. Essentially, they all had to be re-measured!
The one good thing is that all of the oysters were photographed at the time of sampling (along with a ruler), which allows us to go back and measure them.
I re-measured them all using the free imaging software ImageJ.
Oyster measurements taken were length and width. For length, the oyster was measured from the hinge to the leading edge of the shell, attempting to measure as close to the theoretical center line of the oyster as possible, while also capturing the two points furthest from each other. Width was measured at the apparent widest part of the oyster, keeping that line as close to perpendicular to the length measurement line as possible.
Each image with the measurement lines was saved as a .tif file and the filename appended with “measured”. Additionally, each image produced a corresponding Excel file named CgOA_measurements_bag_info.xls, where “bag_info” contains information regarding the oysters in that set.
Images were measured by setting the pixel scale using a 100mm (10cm) span of the ruler in each image. Images were greatly enlarged when setting the scale to improve scale accuracy. Some images did not contain a ruler. For those, the scale was set using the length of a weigh boat: 89mm (8.9cm). Weigh boat size was taken from manufacturer specs: VWR Cat#89106-768 (8.9cm x 8.9cm x 2.5cm). Files corresponding to these sets of measurements have “no_ruler” appended to the filename. The sample sets that were measured in this fashion were oyster bags:
The measured images and the individual Excel files were uploaded to the following Dropbox location: Dropbox/Friedman Lab/Carolyn Lab/Manuscripts/2016/Cg OA selection/Data/Sam DATA.
Data from the individual files was aggregated in the following spreadsheet in Dropbox: Dropbox/Friedman Lab/Carolyn Lab/Manuscripts/2016/Cg OA selection/Data/Sam DATA/files to merge/Cg OA selection 9mo sampling All 3 sites_survival data_ FOR SAM to add L and W data.xlsx
Data is still missing (i.e. no labelled image file was present) for the following oysters:
- 458 21-37
- 486 1-20
- 556 20-45
- 588 1-19
Here’s a quick summary of the amount of data I gathered. In a separate post, I’ll provide details of how I used ImageJ not only to measure the samples, but also to create a more reproducible record of the data acquisition process, so that we can follow the “paper trail”: who acquired the data, how they acquired it, and how others can easily review it. This way, once all this data is transferred to some master spreadsheet, it will still be granularly accessible for any future troubleshooting that might be needed.
So, what did this work produce and how did I determine this information?
Using Bash (i.e. command line in Terminal):
Count the number of image files analyzed (i.e. saved) by ImageJ:
ls -1 *.tif | wc -l
163
Count the number of spreadsheet files produced by ImageJ:
ls -1 CgOA*.xls | wc -l
164
Well, there’s an odd discrepancy. These should be the same number. If anything were off, I’d expect more images than Excel files; that would indicate I went through the measuring process but neglected to save the data. Instead, the counts suggest there’s an “extra” Excel file. It’s possible that I accidentally saved one image to a different location. Will look into this…
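One low-effort check while looking into this is to list how many data rows each Excel file holds. The sketch below does not match images to spreadsheets (that would depend on a naming convention not spelled out here); it only flags files that look unusual, such as a file with zero rows or with an odd row count, given that each oyster should contribute one length and one width. The function name `rows_per_file` is made up for illustration, and it assumes each file is plain text with a single header line.

```shell
# Sketch only: rows_per_file is a hypothetical helper, not part of ImageJ.
# Assumes each results file is plain text with exactly one header line.
rows_per_file() {
    # usage: rows_per_file FILE...
    for f in "$@"; do
        # Count data rows, skipping the header line (NR > 1).
        n=$(awk 'NR > 1' "$f" | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$n" "$f"
    done | sort -n   # smallest counts first, so empty files float to the top
}
```

Run as `rows_per_file CgOA*.xls`; any file showing 0 rows, or an odd number of rows, would be a candidate to inspect first.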
Count the number of measurements taken. This will be a two-step process.
First, aggregate all the data from the individual data files into a single file:
for i in CgOA*.xls; do awk 'NR>1' "$i" >> all_measures.csv; done
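The code above uses a for loop to look at each Excel file (files beginning with “CgOA” and ending with “.xls”). Each file ($i) is passed to the program awk, which prints the contents of the file, excluding the header line (NR>1). The shell’s >> redirection then appends that output to a new file (all_measures.csv). Because >> appends, all_measures.csv should be deleted before re-running the loop; otherwise every row ends up in the file twice.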
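In the spirit of keeping a better “paper trail”, a variation on this merge could record which file each row came from. This is a rough sketch, not the method used for the counts in this post: the function name `merge_with_source` and the output filename in the usage note are made up, and it assumes the ImageJ files are tab-delimited text with one header line each.

```shell
# Sketch only: merge_with_source is a hypothetical helper.
# Assumes tab-delimited text input with one header line per file.
merge_with_source() {
    # usage: merge_with_source OUTPUT_FILE INPUT_FILE...
    out=$1
    shift
    # FNR restarts at 1 for every input file, so FNR > 1 skips each header.
    # FILENAME is prepended as a first column so every row stays traceable.
    # A single ">" overwrites the output, so re-running does not duplicate rows.
    awk 'FNR > 1 { print FILENAME "\t" $0 }' "$@" > "$out"
}
```

Run as `merge_with_source all_measures_with_source.tsv CgOA*.xls`. The line count of the result should equal that of all_measures.csv, since only a column is added.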
Next, count the number of lines (i.e. measurements) in the all_measures.csv file:
wc -l < all_measures.csv
5251
Whoa! That’s pretty remarkable. Over 5000 individual measurements were recorded (length and width for each oyster). That means there were over 2600 oysters!!!
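The oyster estimate is just the line count divided by two. A small sketch of that arithmetic, with a hypothetical function name (`count_oysters`), is below. It assumes one measurement per line and two measurements per oyster. For 5251 lines it gives 2625 complete pairs with one line left over, so there may be a stray or missing measurement somewhere; the cause is not known from the count alone.

```shell
# Sketch only: count_oysters is a hypothetical helper.
# Assumes one measurement per line and two measurements (length, width) per oyster.
count_oysters() {
    # usage: count_oysters MERGED_FILE
    total=$(wc -l < "$1" | tr -d ' ')
    echo "measurements: $total"
    echo "complete length/width pairs: $(( total / 2 ))"
    # A remainder of 1 means some oyster has only one of its two measurements.
    echo "unpaired measurements: $(( total % 2 ))"
}
```

Run as `count_oysters all_measures.csv`.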
Hopefully we won’t have to re-measure these guys a third time!