r - Manipulating all split data sets
I'm drawing a blank. I have 51 sets of split data from a data frame that I had, and I want to take the mean of the height of each set.
    print(dataset)
    $`1`
      id species plant height
    1  1       a     1   42.7
    2  2       a     1   32.5

    $`2`
      id species plant height
    3  3       a     2   43.5
    4  4       a     2   54.3
    5  5       a     2   45.7

    ...

    $`51`
         id species plant height
    134 134       a    51   52.5
    135 135       a    51   61.2

I know how to run each one individually, but with 51 split sections that would take me ages.
I thought that

    mean(dataset[,4])

might work, but it says I have the wrong number of dimensions. I can see why that is incorrect, but I am no closer to figuring out how to average all of the heights.
The dataset is a list. We can use lapply/sapply/vapply etc. to loop through the list elements and get the mean of the 'height' column. Using vapply, we can specify the class and length of the output (numeric(1)). This is useful for debugging.
    vapply(dataset, function(x) mean(x[,4], na.rm=TRUE), numeric(1))
    #       1        2       51
    #37.60000 47.83333 56.85000

Another option (if the data.frames in the list have the same column names and number of columns) is to use rbindlist from data.table with the option idcol=TRUE to generate a single data.table. The '.id' column shows the names of the list elements. We group by '.id' and get the mean of the 'height'.
    library(data.table)
    rbindlist(dataset, idcol=TRUE)[, list(mean=mean(height, na.rm=TRUE)), by = .id]
    #   .id     mean
    #1:   1 37.60000
    #2:   2 47.83333
    #3:  51 56.85000

A similar option to the above is unnest from library(tidyr), which returns a single dataset with a '.id' column. Grouped by '.id', we summarise to get the mean of 'height'.
    library(tidyr)
    library(dplyr)
    unnest(dataset, .id) %>%
        group_by(.id) %>%
        summarise(mean = mean(height, na.rm=TRUE))
    #  .id     mean
    #1   1 37.60000
    #2   2 47.83333
    #3  51 56.85000

The syntax for plyr is
    library(plyr)
    # Combine the list into one data frame, then summarise per '.id'
    df1 <- unnest(dataset, .id)
    ddply(df1, .(.id), summarise, mean=mean(height, na.rm=TRUE))
    #  .id     mean
    #1   1 37.60000
    #2   2 47.83333
    #3  51 56.85000

Data
    dataset <- structure(list(
      `1` = structure(list(id = 1:2, species = c("a", "a"),
          plant = c(1L, 1L), height = c(42.7, 32.5)),
          .Names = c("id", "species", "plant", "height"),
          class = "data.frame", row.names = c(NA, -2L)),
      `2` = structure(list(id = 3:5, species = c("a", "a", "a"),
          plant = c(2L, 2L, 2L), height = c(43.5, 54.3, 45.7)),
          .Names = c("id", "species", "plant", "height"),
          class = "data.frame", row.names = c(NA, -3L)),
      `51` = structure(list(id = 134:135, species = c("a", "a"),
          plant = c(51L, 51L), height = c(52.5, 61.2)),
          .Names = c("id", "species", "plant", "height"),
          class = "data.frame", row.names = c(NA, -2L))),
      .Names = c("1", "2", "51"))
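As a side note on why mean(dataset[,4]) fails: the result of split() is a plain list, which has no rows or columns, so two-index subsetting is not defined for it. The base R sketch below illustrates this on a small stand-in list; the names dataset_demo, attempt and means_by_set are made up for the illustration and are not from the question. It also selects the column by name rather than by position, which may be a little more robust if the column order ever changes.

```r
# Hypothetical stand-in for the split data: a named list of data frames
dataset_demo <- list(
  `1`  = data.frame(id = 1:2, species = "a", plant = 1L,
                    height = c(42.7, 32.5)),
  `2`  = data.frame(id = 3:5, species = "a", plant = 2L,
                    height = c(43.5, 54.3, 45.7)),
  `51` = data.frame(id = 134:135, species = "a", plant = 51L,
                    height = c(52.5, 61.2))
)

# A list has no dim attribute, so [row, column] indexing cannot work on it
is.list(dataset_demo)
dim(dataset_demo)

# Capture the error message instead of stopping the script
attempt <- tryCatch(dataset_demo[, 4],
                    error = function(e) conditionMessage(e))
print(attempt)

# Each element is a data frame, so column access works per element
mean(dataset_demo[["1"]]$height)

# Loop over the elements, picking the column by name
means_by_set <- vapply(dataset_demo,
                       function(x) mean(x$height, na.rm = TRUE),
                       numeric(1))
print(means_by_set)
```

The same pattern should apply unchanged to all 51 elements, since vapply visits every element of the list regardless of how many there are.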