r - Keeping zero count combinations when aggregating with data.table -

August 15, 2010

suppose i've got following data.table :

dt <- data.table(id = c(rep(1, 5), rep(2, 4)),                  sex = c(rep("h", 5), rep("f", 4)),                   fruit = c("apple", "tomato", "apple", "apple", "orange", "apple", "apple", "tomato", "tomato"),                  key = "id")     id sex  fruit 1:  1   h  apple 2:  1   h tomato 3:  1   h  apple 4:  1   h  apple 5:  1   h orange 6:  2   f  apple 7:  2   f  apple 8:  2   f tomato 9:  2   f tomato

each row represents fact (identified it's id , sex) ate fruit. want count number of times each fruit has been eaten sex. can :

dt[ , .n, = c("fruit", "sex")]

which gives:

    fruit sex n 1:  apple   h 3 2: tomato   h 1 3: orange   h 1 4:  apple   f 2 5: tomato   f 2

the problem is, doing way i'm losing count of orange sex == "f", because count 0. there way aggregation without loosing combinations of 0 counts?

to clear, desired result following:

   fruit sex n 1:  apple   h 3 2: tomato   h 1 3: orange   h 1 4:  apple   f 2 5: tomato   f 2 6: orange   f 0

thanks lot !

seems straightforward approach explicitly supply category combos in data.table passed i=, setting by=.eachi iterate on them:

setkey(dt,sex,fruit)[cj(unique(sex), unique(fruit)), .n, by=.eachi] #    sex  fruit n # 1:   f  apple 2 # 2:   f orange 0 # 3:   f tomato 2 # 4:   h  apple 3 # 5:   h orange 1 # 6:   h tomato 1

Search This Blog

DIs

r - Keeping zero count combinations when aggregating with data.table -

Comments

Post a Comment

Popular posts from this blog

css - Text drops down with smaller window -

php - cannot display multiple markers in google maps v3 from traceroute result -

php - Boolean search on database with 5 million rows, very slow -