R - Using eval in data.table
I'm trying to understand the behaviour of eval when a data.table is the evaluation "frame".
With the following data.table:
library(data.table)
set.seed(1)
foo = data.table(var1=sample(1:3, 1000, replace=TRUE), var2=rnorm(1000), var3=sample(letters[1:5], 1000, replace=TRUE))
I'm trying to replicate the instruction
foo[var1==1 , sum(var2) , by=var3]
using a function built on eval:
eval1 = function(s) eval( parse(text=s) ,envir=sys.parent() )
As you can see, tests 1 and 3 work, but I don't understand what the "correct" envir to set in eval is for test 2:
var_i="var1"
var_j="var2"
var_by="var3"
# test 1 works
foo[eval1(var_i)==1 , sum(var2) , by=var3 ]
# test 2 doesn't work
foo[var1==1 , sum(eval1(var_j)) , by=var3]
# test 3 works
foo[var1==1 , sum(var2) , by=eval1(var_by)]
The j-expression looks up its variables in the environment of .SD, which stands for Subset of Data. .SD is itself a data.table that holds the columns for that group.
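To make the role of .SD concrete, here is a minimal sketch on a tiny made-up table. The table `dt` and its columns `g`, `x` and `y` are illustrative assumptions, not part of the question. It reports, for each group, what .SD is and what it contains.

```r
library(data.table)

# Tiny illustrative table; all names here are made up for this sketch.
dt <- data.table(g = c("a", "a", "b"), x = 1:3, y = c(10, 20, 30))

# For each group, describe .SD: its class, its columns and its row count.
# The grouping column g is not part of .SD.
res <- dt[, .(is_dt = is.data.table(.SD),
              cols  = paste(names(.SD), collapse = ","),
              rows  = nrow(.SD)),
          by = g]
print(res)
```

Each group gets its own .SD holding only that group's rows, which is why a column name used in j has to be resolved there rather than in the calling frame.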
When you do:
foo[var1 == 1, sum(eval(parse(text=var_j))), by=var3]
directly, the j-expression gets internally optimised/replaced to sum(var2). But sum(eval1(var_j)) doesn't get optimised, and stays as it is.
Then, when it gets evaluated for each group, it has to find var2, which doesn't exist in the parent.frame() from which the function is called, but in .SD. As an example, let's do this:
eval1 <- function(s) eval(parse(text=s), envir=parent.frame())
foo[var1 == 1, { var2 = 1L; eval1(var_j) }, by=var3]
#    var3 V1
# 1:    e  1
# 2:    c  1
# 3:    a  1
# 4:    b  1
# 5:    d  1
It finds var2 in its parent frame. That is, you have to point eval to the right environment to evaluate in, through an additional argument whose value is .SD.
eval1 <- function(s, env) eval(parse(text=s), envir = env, enclos = parent.frame())
foo[var1 == 1, sum(eval1(var_j, .SD)), by=var3]
#    var3         V1
# 1:    e  11.178035
# 2:    c -12.236446
# 3:    a  -8.984715
# 4:    b  -2.739386
# 5:    d  -1.159506
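The lookup order behind the two-argument version can be sketched in base R alone, without data.table. In this sketch a plain list stands in for one group's .SD; the names `eval_caller`, `eval_in`, `group` and `scale` are hypothetical and exist only for illustration.

```r
# One-argument variant: evaluates the text in the caller's frame only.
eval_caller <- function(s) eval(parse(text = s), envir = parent.frame())

# Two-argument variant: looks in `env` first, then falls back to the caller.
eval_in <- function(s, env) {
  eval(parse(text = s), envir = env, enclos = parent.frame())
}

group <- list(var2 = c(1.5, 2.5, 3.0))  # stand-in for one group's .SD
var_j <- "var2"
scale <- 10                             # exists only in the calling frame

# var2 is not a variable in the calling frame, so this lookup fails.
found <- tryCatch(eval_caller(var_j), error = function(e) NULL)

# var2 is found inside `group`; scale is found through enclos.
total  <- sum(eval_in(var_j, group))
scaled <- sum(eval_in("var2 * scale", group))
```

Passing the data as envir and the caller as enclos keeps both kinds of names reachable: columns come from the data, everything else from where the helper was called.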