Tuesday, June 5, 2012

for loop to sapply loop


http://thr3ads.net/r-help/2012/01/1830182-converting-a-for-loop-into-a-foreach-loop

You may not need foreach if you use the power of R and vectorize your
operations.  Instead of going through the loops and extracting subsets
at each iteration, use 'split' to either split the dataframe into the
subsets, or faster, create a set of indices to access the subsets of
data:

Here is one way of doing it: create a list of indices to split up the
data instead of using a 'for' loop and extracting a dataframe at
each iteration (very time-consuming).
> n <- 10000
> x <- data.frame(ID = sample(1:10, n, TRUE)
+     , day = sample(1:7, n, TRUE)
+     , hour1 = sample(0:23, n, TRUE)
+     , value = runif(n)
+     )
> # create list of indices to split the data
> idh <- split(seq(nrow(x)), list(x$ID, x$day, x$hour1), drop = TRUE)
> str(idh)  # sample of the indices
List of 1675
 $ 1.1.0  : int [1:5] 226 795 869 6617 9496
 $ 2.1.0  : int [1:11] 479 3483 3702 4660 4876 5373 5479 5960 6580 6956 ...
 $ 3.1.0  : int [1:9] 383 668 2437 3877 5290 5835 7003 7896 8905
 $ 4.1.0  : int [1:3] 1493 3502 9635
 $ 5.1.0  : int [1:2] 2480 6237
 $ 6.1.0  : int [1:5] 2061 4898 5288 9439 9692
> # now take the means of each set of indices
> imeans <- sapply(idh, function(i) mean(x$value[i]))
> head(imeans, 20)
    1.1.0     2.1.0     3.1.0     4.1.0     5.1.0     6.1.0     7.1.0     8.1.0     9.1.0    10.1.0     1.2.0
0.6231298 0.2556291 0.4942764 0.5416091 0.9509064 0.4968711 0.4037645 0.4107976 0.4189220 0.5922433 0.5581944
    2.2.0     3.2.0     4.2.0     5.2.0     6.2.0     7.2.0     8.2.0     9.2.0    10.2.0
0.6275555 0.6411061 0.4885817 0.5413741 0.4134971 0.4838082 0.5207435 0.4018991 0.5338913
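
For comparison, here is a rough sketch of the kind of 'for' loop that the index approach replaces, checked against the 'split'/'sapply' result.  The loop itself is an assumption about what the original poster's code looked like; the names 'keys', 'loop_means' and 'sub' are made up for illustration, and 'set.seed' is added only so the run is repeatable.

```r
set.seed(1)
n <- 10000
x <- data.frame(ID = sample(1:10, n, TRUE),
                day = sample(1:7, n, TRUE),
                hour1 = sample(0:23, n, TRUE),
                value = runif(n))

# Loop version (assumed shape): extract a sub-dataframe per combination
keys <- unique(x[, c("ID", "day", "hour1")])
loop_means <- numeric(nrow(keys))
for (k in seq_len(nrow(keys))) {
    sub <- x[x$ID == keys$ID[k] &
             x$day == keys$day[k] &
             x$hour1 == keys$hour1[k], ]
    loop_means[k] <- mean(sub$value)
}
names(loop_means) <- paste(keys$ID, keys$day, keys$hour1, sep = ".")

# Index version: split the row numbers once, then index into the column
idh <- split(seq_len(nrow(x)), list(x$ID, x$day, x$hour1), drop = TRUE)
imeans <- sapply(idh, function(i) mean(x$value[i]))

# Compare the two after putting the loop results in the same order
same <- all.equal(loop_means[names(imeans)], imeans)
print(same)
```

The loop rescans all rows of 'x' for every combination, while 'split' groups the row numbers in a single pass; wrapping each version in 'system.time' is one way to see the difference on your own data.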
#####
If 'idh' is the set of indices resulting from the 'split', then your
function would probably have to look like this:

myfun <- function(x){
    # 'x' is one set of row indices taken from 'idh'
    H.scv <- Hscv(data[x, ], pilot = "unconstr")
    KDE <- kde(data[x, ], H = H.scv, approx.cont = TRUE)
}

This is based on your original example, where 'data' was the object
name of your data.  You want to use the indices to access the relevant
rows in your data.  I am not sure what your function is supposed to
return; right now it returns the value of the last statement, which is
the result of the call to 'kde'.  If you want both results returned,
then the last statement should possibly be:

list(H.scv, KDE)

which will return the values in a list that you can then extract from.
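
As a minimal sketch of that pattern, the example below applies a function over the index sets with 'lapply' and then pulls the pieces back out of the returned lists.  To stay runnable with base R only, it swaps in 'bw.nrd0' and 'density' as one-dimensional stand-ins for 'Hscv' and 'kde'; the data frame and the names 'bw', 'dens', 'res' and 'bws' are hypothetical, not taken from the original question.

```r
set.seed(42)
# Hypothetical stand-in for the poster's 'data'
data <- data.frame(ID = sample(1:3, 300, TRUE), value = rnorm(300))

# One set of row indices per ID
idh <- split(seq_len(nrow(data)), data$ID, drop = TRUE)

# Stand-ins (assumption): bw.nrd0() for Hscv(), density() for kde()
myfun <- function(i) {
    bw <- bw.nrd0(data$value[i])
    dens <- density(data$value[i], bw = bw)
    list(bw = bw, dens = dens)   # return both results, by name
}

# lapply keeps each group's result as its own list element
res <- lapply(idh, myfun)

# Extract the pieces afterwards
bws <- sapply(res, function(r) r$bw)   # one bandwidth per ID
print(bws)
print(class(res[["1"]]$dens))          # density estimate for ID 1
```

Naming the list elements is optional, but 'r$bw' tends to be easier to read later than 'r[[1]]'.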
