# Distribution records

Only the records for 190 species got processed last night. With another 300+ species to go, performance needs to improve. The bottleneck is looking up the environmental data, so I tried to speed up that part of the process.

Old version:

    # Look up the environmental values for every record, append them and save the result
    add_environment <- function(row, data, layers) {
      environment <- extract(layers, data[,xycols])
      data <- cbind(data, environment)
      write.csv2(data, paste0("3_environment/", row$ScientificName, " ", row$AphiaID, ".csv"), row.names = FALSE)
      data
    }

New version:

    # Look up each distinct raster cell only once, then expand back to one row per record
    add_environment <- function(row, data, layers) {
      cells <- cellFromXY(layers, data[,lonlat])
      unique_cells <- unique(cells)
      matched_cells <- match(cells, unique_cells)
      environment <- extract(layers, unique_cells)
      # drop = FALSE keeps a matrix even when only one row or column is selected
      data <- cbind(data, environment[matched_cells, , drop = FALSE])
      write.csv2(data, paste0("3_environment/", row$ScientificName, " ", row$AphiaID, ".csv"), row.names = FALSE)
      data
    }


But it didn’t improve performance and might even have slowed everything down. It’s hard to test, though, because there are huge caching effects.
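To check the deduplication idea separately from the raster package and the disk cache, here is a minimal base R sketch. `slow_lookup` is a hypothetical stand-in for `extract` (it is not the real function), the cell numbers are simulated, and `median_elapsed` is a helper name I made up. It only illustrates that `unique` plus `match` reproduces the direct lookup, and one way repeated timings could be compared.

```r
# Hypothetical stand-in for extract(): returns one row of two fake
# environmental values per requested cell number.
slow_lookup <- function(cells) {
  cbind(temperature = cells * 0.1, salinity = cells %% 40)
}

set.seed(42)
# Simulated cell numbers with many duplicates, as happens when
# several records fall in the same raster cell.
cells <- sample(1:500, 10000, replace = TRUE)

# Direct approach: look up every record.
direct <- slow_lookup(cells)

# Deduplicated approach: look up each distinct cell once, then expand.
unique_cells <- unique(cells)
matched_cells <- match(cells, unique_cells)
deduplicated <- slow_lookup(unique_cells)[matched_cells, , drop = FALSE]

# Both approaches should give identical results.
stopifnot(identical(direct, deduplicated))

# Repeat each timing and keep the median, so that a single run that
# happens to hit (or miss) a cache does not dominate the comparison.
median_elapsed <- function(f, times = 5) {
  median(replicate(times, system.time(f())[["elapsed"]]))
}
t_direct <- median_elapsed(function() slow_lookup(cells))
t_dedup <- median_elapsed(function() {
  u <- unique(cells)
  slow_lookup(u)[match(cells, u), , drop = FALSE]
})
```

With a real raster lookup the gain would depend on how many records share a cell. If most records fall in distinct cells, the extra `unique` and `match` calls are pure overhead, which could be one reason the new version is not faster.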

I’ve finished writing the species info files with traits from WoRMS. I’m not sure this information will be useful or relevant to include in the modeling, but we’ll see.

# Shiny UI

I’ve also been working on the MarineSPEED UI. There were some bugs, which are now gone, and I’ve added a Leaflet map with species points. The map is rather slow with large numbers of records (45000 records), so I’ve filtered out records to keep fewer than 2000 points per species. That still gives a good idea of where the points are without slowing down the map.
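One simple way to cap the number of points per species is random subsampling. The sketch below assumes that approach, which is not necessarily the filter used in the app. `thin_records` and the column names are hypothetical, and the records are simulated.

```r
# Hypothetical helper: keep at most max_points randomly chosen records.
thin_records <- function(records, max_points = 2000) {
  if (nrow(records) <= max_points) {
    return(records)
  }
  records[sample(nrow(records), max_points), , drop = FALSE]
}

set.seed(1)
# Simulated occurrence records for one species (not real data).
records <- data.frame(
  longitude = runif(45000, -180, 180),
  latitude = runif(45000, -90, 90)
)
thinned <- thin_records(records)
```

Random subsampling keeps dense areas dense on the map. A spatially stratified filter, such as one record per grid cell, would instead show the extent of the range more evenly.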

Useful leaflet help of the day: http://stackoverflow.com/questions/32107667/clearshapes-not-working-leaflet-for-r/32118413

# Next week

• Create a poster for the VLIZ Marine Scientist conference
• MarineSPEED viewer: