r - Using apply functions instead of for loops
I have a data set that consists of 59 columns. Columns 4 to 59 contain a mixture of email addresses and nonsense. I want to create a vector (which will go into a data frame) that picks out the unique email addresses from columns 4:59. The function below works for one column, email0. The columns are sequential: email0 to email55.
udf.unique.emails <- function(strcol, data) {
  # 'data' is kept from the original signature but is not used in the body
  vector <- character(0)
  # check each item in the column for an email address
  for (i in seq_along(strcol)) {
    if (grepl("@", strcol[i])) {
      vector <- unique(c(vector, strcol[i]))
    }
  }
  return(vector)
}

test <- udf.unique.emails(foo.data$email0, foo.data)
I wish to implement this on columns 4:59 to produce a single column. Can anyone point me in the right direction using the apply family?
Thank you for your time.
#######update#####
Due to the sensitivity of the data in question I can't give much detail. Below is mock data called foo.data; it is the data, and the column, fed to the function.
For email0, foo@fpo.com is returned by the function.
The end result should be a single column of unique emails from the other email columns below.
$ email0 (chr) "foo@fpo.com", "recieved report", "daily", "query", "weekly", "products", "products2", "results", "products...
$ email1 (chr) "foo2@fpo2.com", "", "nonsense", "", "", "garbage", "", "", "trace stack", "", "", "", "", "", "", "js@fpo.com", "", "",...
$ email2 (chr) "john.smith@fpo.com", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "john.smith.weston@fpo.com"
You can try:
data[which(matrix(grepl("@", as.matrix(data)), ncol = ncol(data)), arr.ind = TRUE)]
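It finds the indices where there is an "@" and returns the values at those indices. Here data is assumed to hold only the email columns (for example foo.data[, 4:59]); wrap the result in unique() to drop duplicate addresses.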
This is similar to this post.
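Since the question asks specifically about the apply family, here is a minimal sketch of that route. The mock data frame, the column positions (2:4 here rather than 4:59), and the names email.cols, per.column and unique.emails are assumptions made for illustration, not part of the original data.

```r
# Hypothetical mock data mirroring the question's layout (values are assumptions).
foo.data <- data.frame(
  id     = 1:4,
  email0 = c("foo@fpo.com", "received report", "daily", "query"),
  email1 = c("foo2@fpo2.com", "", "nonsense", "js@fpo.com"),
  email2 = c("john.smith@fpo.com", "", "", "foo@fpo.com"),
  stringsAsFactors = FALSE
)

# Positions of the email columns; in the question this would be 4:59.
email.cols <- 2:4

# lapply() visits each column and keeps only the entries containing "@".
per.column <- lapply(foo.data[email.cols], function(col) col[grepl("@", col)])

# Flatten the list and drop duplicates to get one vector of addresses.
unique.emails <- unique(unlist(per.column, use.names = FALSE))

# Put the vector into a single-column data frame, as the question asks.
result <- data.frame(email = unique.emails, stringsAsFactors = FALSE)
```

Because the columns can contain different numbers of addresses, lapply() followed by unlist() is used here instead of sapply(), which could otherwise try to simplify the result into a matrix.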