R - apply adf.test by group
I have a data.frame bbm with the variables ticker, variable, and value. I want to apply the augmented Dickey-Fuller test via the adf.test function, grouped by ticker and variable. R should add a new column to the initial data.frame with the corresponding p-values.
I tried:
x <- with(bbm, tapply(value, list(ticker, variable), adf.test$p.value))
cbind(bbm, x)
This yields the error: in adf.test$p.value : object of type 'closure' is not subsettable.
Then I tried:
x <- with(bbm, tapply(value, list(ticker, variable), as.list(adf.test)$p.value))
cbind(bbm, x)
This yields a result, but the new column is not what I want. Even when I change p.value in the code to method, it still yields an odd number.
Then I tried using ddply:
bbm <- ddply(bbm, .(ticker, variable), mutate, df = adf.test(value)$p.value)
which yields the error: wrong embedding dimension.
How can I solve this? Any suggestions?
Here's a sample of the df:
   ticker          variable                    value
1  1002z av equity bs_customer_deposits        29898.0
2  1002z av equity bs_customer_deposits        31302.0
3  1002z av equity bs_customer_deposits        29127.0
4  1002z av equity bs_customer_deposits        24056.0
5  1002z av equity bs_customer_deposits        22080.0
6  1002z av equity bs_customer_deposits        22585.0
7  1002z av equity bs_customer_deposits        22674.0
8  1002z av equity bs_customer_deposits        21733.0
9  1002z av equity bs_customer_deposits        22016.0
10 1002z av equity bs_customer_deposits        21999.0
11 1002z av equity bs_customer_deposits        22013.0
12 1002z av equity bs_customer_deposits        21135.0
13 1002z av equity bs_tot_loan                 28476.0
14 1002z av equity bs_tot_loan                 29446.0
15 1002z av equity bs_tot_loan                 29273.0
16 1002z av equity bs_tot_loan                 27579.0
17 1002z av equity bs_tot_loan                 20769.0
18 1002z av equity bs_tot_loan                 21370.0
19 1002z av equity bs_tot_loan                 22306.0
20 1002z av equity bs_tot_loan                 21013.0
21 1002z av equity bs_tot_loan                 21810.0
22 1002z av equity bs_tier1_cap_ratio              6.5
23 1002z av equity bs_tier1_cap_ratio              6.2
24 1002z av equity bs_tier1_cap_ratio              7.9
25 1002z av equity bs_tier1_cap_ratio              9.2
26 1002z av equity bs_tier1_cap_ratio              8.5
27 1002z av equity bs_tier1_cap_ratio              6.6
28 1002z av equity bs_tier1_cap_ratio              9.6
29 1002z av equity bs_tot_cap_to_risk_base_cap    11.5
30 1002z av equity bs_tot_cap_to_risk_base_cap    10.9

> dput(head(select(bbm, ticker, variable, value), 30))
structure(list(ticker = c("1002z av equity", "1002z av equity",
"1002z av equity", "1002z av equity", "1002z av equity", "1002z av equity",
"1002z av equity", "1002z av equity", "1002z av equity", "1002z av equity",
"1002z av equity", "1002z av equity", "1002z av equity", "1002z av equity",
"1002z av equity", "1002z av equity", "1002z av equity", "1002z av equity",
"1002z av equity", "1002z av equity", "1002z av equity", "1002z av equity",
"1002z av equity", "1002z av equity", "1002z av equity", "1002z av equity",
"1002z av equity", "1002z av equity", "1002z av equity", "1002z av equity"
), variable = structure(c(4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L),
.Label = c("px_last", "pe_ratio", "vol_mean", "bs_customer_deposits",
"bs_tot_loan", "*", "rn366", "bs_tier1_cap_ratio", "bs_tot_cap_to_risk_base_cap",
"return_com_eqy", "bs_lev_ratio_to_tang_cap", "npls_to_total_loans"),
class = "factor"), value = c(29898, 31302, 29127, 24056, 22080, 22585,
22674, 21733, 22016, 21999, 22013, 21135, 28476, 29446, 29273, 27579,
20769, 21370, 22306, 21013, 21810, 6.5, 6.2, 7.9, 9.2, 8.5, 6.6, 9.6,
11.5, 10.9)), .Names = c("ticker", "variable", "value"),
row.names = c(NA, 30L), class = "data.frame")
Oh, and using the analogous dplyr function yields the same error as ddply.
It seems the problem might be that some groups are too small for adf.test to handle. One option to deal with this is to create a custom function that catches the error (with tryCatch) and then pass that function via an lapply() call, like so:
# Return the adf.test result, or NULL if the test raises an error
testx <- function(x) {
  return(tryCatch(adf.test(x), error = function(e) NULL))
}
g <- lapply(split(bbm, bbm$variable), function(x) testx(x$value))
str(g)
# List of 12
#  $ px_last                    : NULL
#  $ pe_ratio                   : NULL
#  $ vol_mean                   : NULL
#  $ bs_customer_deposits       :List of 6
#   ..$ statistic  : Named num -4.86
#   .. ..- attr(*, "names")= chr "Dickey-Fuller"
#   ..$ parameter  : Named num 2
#   .. ..- attr(*, "names")= chr "Lag order"
#   ..$ alternative: chr "stationary"
#   ..$ p.value    : num 0.01
#   ..$ method     : chr "Augmented Dickey-Fuller Test"
#   ..$ data.name  : chr "x"
#   ..- attr(*, "class")= chr "htest"
#  $ bs_tot_loan                :List of 6
#   ..$ statistic  : Named num -0.784
#   .. ..- attr(*, "names")= chr "Dickey-Fuller"
#   ..$ parameter  : Named num 2
#   .. ..- attr(*, "names")= chr "Lag order"
#   ..$ alternative: chr "stationary"
#   ..$ p.value    : num 0.951
#   ..$ method     : chr "Augmented Dickey-Fuller Test"
#   ..$ data.name  : chr "x"
#   ..- attr(*, "class")= chr "htest"
#  $ *                          : NULL
#  $ rn366                      : NULL
#  $ bs_tier1_cap_ratio         :List of 6
#   ..$ statistic  : Named num -4.33
#   .. ..- attr(*, "names")= chr "Dickey-Fuller"
#   ..$ parameter  : Named num 1
#   .. ..- attr(*, "names")= chr "Lag order"
#   ..$ alternative: chr "stationary"
#   ..$ p.value    : num 0.0118
#   ..$ method     : chr "Augmented Dickey-Fuller Test"
#   ..$ data.name  : chr "x"
#   ..- attr(*, "class")= chr "htest"
#  $ bs_tot_cap_to_risk_base_cap: NULL
#  $ return_com_eqy             : NULL
#  $ bs_lev_ratio_to_tang_cap   : NULL
#  $ npls_to_total_loans        : NULL
This creates a list object g of length 12 (one element per factor level). For groups where the adf.test call is valid, the element is populated with the relevant test characteristics; for the rest, NULL is returned.
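The "wrong embedding dimension" message itself appears to come from base R's embed(), which adf.test seems to call on the differenced series to build its lag matrix (that is my reading of the tseries source, so it is worth checking against your installed version). A minimal sketch, using base R only, that reproduces the message on the two-observation group from the sample and counts observations per group; toy is a hypothetical stand-in for bbm:

```r
# Two observations, as in the bs_tot_cap_to_risk_base_cap group of the sample
short <- c(11.5, 10.9)

# diff() leaves a single value, so a 2-column lag matrix cannot be built
msg <- tryCatch(
  {
    embed(diff(short), 2)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Hypothetical stand-in for bbm: one group with 12 rows, one with only 2
toy <- data.frame(
  variable = c(rep("a", 12), rep("b", 2)),
  value    = c(1:12, 11.5, 10.9)
)

# Count observations per group to spot the ones that are too short
sizes <- table(toy$variable)
print(sizes)
```

On the real data, table(bbm$ticker, bbm$variable) gives the same kind of overview for both grouping variables.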
If the parameter of interest is the p.value per group, the previous lapply call can be wrapped in sapply() to get the following object:
h <- sapply(lapply(split(bbm, bbm$variable), function(x) testx(x$value)),
            function(x) x$p.value)
str(h)
# List of 12
#  $ px_last                    : NULL
#  $ pe_ratio                   : NULL
#  $ vol_mean                   : NULL
#  $ bs_customer_deposits       : num 0.01
#  $ bs_tot_loan                : num 0.951
#  $ *                          : NULL
#  $ rn366                      : NULL
#  $ bs_tier1_cap_ratio         : num 0.0118
#  $ bs_tot_cap_to_risk_base_cap: NULL
#  $ return_com_eqy             : NULL
#  $ bs_lev_ratio_to_tang_cap   : NULL
#  $ npls_to_total_loans        : NULL
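Because some elements are NULL, sapply() cannot simplify the result and returns a list. If a plain numeric vector is more convenient, one possible variation is to map the failed groups to NA with vapply(). The sketch below is self-contained: g_toy is a hypothetical stand-in for the list of test results, reusing two p-values from the output above.

```r
# Hypothetical stand-in for the list of results: two htest-like entries
# and one NULL for a group where the test failed
g_toy <- list(
  bs_customer_deposits = list(p.value = 0.01),
  bs_tot_loan          = list(p.value = 0.951),
  px_last              = NULL
)

# vapply() enforces one numeric value per group; NULL becomes NA
p_values <- vapply(
  g_toy,
  function(res) if (is.null(res)) NA_real_ else res$p.value,
  numeric(1)
)
print(p_values)
```

The result is a named numeric vector, which is easier to merge back into a data.frame than a list containing NULLs.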
As per the comments, if grouping by both ticker and variable is needed, this yields the desired results:
g <- lapply(split(bbm, list(bbm$variable, bbm$ticker)), function(x) testx(x$value))
# To remove the NULL elements if they are not needed:
g[g != "NULL"]
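The question also asked for the p-values as a new column in the original data.frame. One way this might be done, sketched here under assumptions, is to let ave() apply an error-tolerant helper within each group. It assumes adf.test comes from the tseries package and that the package is installed; safe_adf_p, toy, and the adf_p column are hypothetical names introduced for illustration.

```r
library(tseries)  # assumed source of adf.test

# Hypothetical helper: ADF p-value for one group, NA if the test cannot be run
safe_adf_p <- function(v) {
  tryCatch(adf.test(v)$p.value, error = function(e) NA_real_)
}

# Toy stand-in for bbm: one group long enough for the test, one too short
set.seed(1)
toy <- data.frame(
  ticker   = "t1",
  variable = c(rep("long", 30), rep("short", 2)),
  value    = c(cumsum(rnorm(30)), 11.5, 10.9)
)

# ave() runs the helper within each ticker/variable group and recycles
# the single p-value across all rows of that group
toy$adf_p <- ave(toy$value, toy$ticker, toy$variable, FUN = safe_adf_p)
head(toy)
```

On the real data the equivalent call would be bbm$adf_p <- ave(bbm$value, bbm$ticker, bbm$variable, FUN = safe_adf_p); groups that are too short for the test end up with NA rather than stopping the whole computation.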