r - converting a row of data.frame to column names using data.table -


I have a large data frame of 5 million rows and 3 columns. I want to transform it into a matrix that has user_id as rows, id as columns, and cnt as the value. I have done this with melt and cast, or with

xtabs(cnt ~ user_id + id, data = foo) 

However, the object created is too large, and I get the following error: 'dim' specifies too large an array

    user_id id cnt
1 1.813e+14 21   1
2 1.559e+14 28   1
6 1.592e+14 71   2

I'm trying to use data.table, which seems to handle large data better than data.frame, but I can't figure out how to use data.table to create the contingency table I want.
Does anyone have an idea how to get this working? I'm also thinking of creating an empty matrix of the appropriate dimensions and filling in the appropriate indexes.
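The fill-by-index idea can be sketched in base R as below. This is only an illustration on the three sample rows shown above, and it assumes each (user_id, id) pair occurs at most once (repeated pairs would overwrite rather than sum). The names `rows`, `cols`, `m` and `idx` are made up for the example. A dense matrix like this has the same dimensions as the xtabs result, so on the full data it is likely to hit the same size limit.

```r
# Sample rows copied from the question; foo is the data frame name used there.
foo <- data.frame(
  user_id = c(1.813e14, 1.559e14, 1.592e14),
  id      = c(21, 28, 71),
  cnt     = c(1, 1, 2)
)

# Distinct row and column keys, in sorted order.
rows <- sort(unique(foo$user_id))
cols <- sort(unique(foo$id))

# Empty (all-zero) matrix of the appropriate dimensions.
m <- matrix(0, nrow = length(rows), ncol = length(cols),
            dimnames = list(user_id = as.character(rows),
                            id = as.character(cols)))

# A two-column index matrix addresses exactly one cell per input row.
idx <- cbind(match(foo$user_id, rows), match(foo$id, cols))
m[idx] <- foo$cnt
```

Indexing with a two-column matrix fills all cells in one vectorised assignment, so no loop over the rows is needed.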

Try this using the built-in data.frame CO2:

> xtabs(uptake ~ Treatment + Type, CO2)
            Type
Treatment    Quebec Mississippi
  nonchilled  742.0       545.0
  chilled     666.8       332.1

Or using tapply:

> with(CO2, tapply(uptake, list(Treatment, Type), sum))
           Quebec Mississippi
nonchilled  742.0       545.0
chilled     666.8       332.1

And compare that with data.table:

> library(data.table)
>
> DT <- data.table(CO2)
> DT[, as.list(tapply(uptake, Type, sum)), by = Treatment]
    Treatment Quebec Mississippi
1: nonchilled  742.0       545.0
2:    chilled  666.8       332.1

Cautionary note: if the same levels of Type do not appear in every Treatment group, this will not be sufficient. In that case it is necessary to convert Type to a factor in the data table (as it already is in CO2).
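A minimal sketch of that situation, assuming the data.table package is installed. To simulate it, Type is first turned into a character column and one Treatment/Type combination is dropped; `DT2` and `res` are names made up for this example. Converting Type back to a factor makes tapply return one entry per level in every group, with NA where a level has no rows.

```r
library(data.table)

DT2 <- data.table(CO2)
DT2[, Type := as.character(Type)]  # simulate a non-factor column

# Drop one combination so the groups no longer share the same set of types.
DT2 <- DT2[!(Treatment == "chilled" & Type == "Mississippi")]

# Convert to a factor so every group reports every level.
DT2[, Type := factor(Type)]

res <- DT2[, as.list(tapply(uptake, Type, sum)), by = Treatment]
print(res)
```

Without the factor conversion, the chilled group would yield a result for only one type, so the per-group results would no longer line up as columns.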

Added:

It's possible to get rid of tapply and have a pure data.table approach like this:

> DT[, setNames(as.list(.SD[, list(uptake = sum(uptake)), by = Type][, uptake]),
+   levels(Type)), by = Treatment]
    Treatment Quebec Mississippi
1: nonchilled  742.0       545.0
2:    chilled  666.8       332.1

The cautionary note above applies here too.

