r - Keeping zero count combinations when aggregating with data.table -
suppose i've got following data.table
:
dt <- data.table(id = c(rep(1, 5), rep(2, 4)), sex = c(rep("h", 5), rep("f", 4)), fruit = c("apple", "tomato", "apple", "apple", "orange", "apple", "apple", "tomato", "tomato"), key = "id") id sex fruit 1: 1 h apple 2: 1 h tomato 3: 1 h apple 4: 1 h apple 5: 1 h orange 6: 2 f apple 7: 2 f apple 8: 2 f tomato 9: 2 f tomato
each row represents fact (identified it's id
, sex
) ate fruit
. want count number of times each fruit
has been eaten sex
. can :
dt[ , .n, = c("fruit", "sex")]
which gives:
fruit sex n 1: apple h 3 2: tomato h 1 3: orange h 1 4: apple f 2 5: tomato f 2
the problem is, doing way i'm losing count of orange
sex == "f"
, because count 0. there way aggregation without loosing combinations of 0 counts?
to clear, desired result following:
fruit sex n 1: apple h 3 2: tomato h 1 3: orange h 1 4: apple f 2 5: tomato f 2 6: orange f 0
thanks lot !
seems straightforward approach explicitly supply category combos in data.table passed i=
, setting by=.eachi
iterate on them:
setkey(dt,sex,fruit)[cj(unique(sex), unique(fruit)), .n, by=.eachi] # sex fruit n # 1: f apple 2 # 2: f orange 0 # 3: f tomato 2 # 4: h apple 3 # 5: h orange 1 # 6: h tomato 1
Comments
Post a Comment