python - Calculating Covariance in Pandas Time Series
Apologies in advance if this is documented somewhere; I failed to find it.
Let's say I have a time series data frame that looks like this:
week_end_date               title_short     sales
2012-02-25 00:00:00.000000  "bob" (ebk)     1
2012-03-31 00:00:00.000000  "bob" (ebk)     1
2012-03-03 00:00:00.000000  "sally" (ebk)   1
2012-03-10 00:00:00.000000  "sally" (ebk)   1
2012-03-17 00:00:00.000000  "sally" (ebk)   1
2012-04-07 00:00:00.000000  "sally" (ebk)   1
I want to calculate the covariance in sales in order to find which users tend to move together. I know pandas has a covariance feature: http://pandas.pydata.org/pandas-docs/stable/computation.html#covariance, but I'm not sure how to reshape the data for this kind of purpose.
Am I correct in thinking that the users need to be set as the column index, with each series being a vector across the time series? I have no idea how to do that.
You are looking for pandas pivot. First do:
df.pivot(index='week_end_date', columns='title_short', values='sales')
and you should get bob and sally as columns, indexed by week_end_date. You can then do a normal covariance or correlation analysis on the 2 columns.
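As a minimal sketch of the whole flow, the snippet below rebuilds the sample rows from the question, pivots them, and calls cov() and corr() on the result. It assumes that a week with no row for a title means zero sales, which the question does not state; the names BOB, SALLY, wide, cov and corr are only illustrative.

```python
import pandas as pd

BOB = '"bob" (ebk)'
SALLY = '"sally" (ebk)'

# Sample rows copied from the question's data frame.
df = pd.DataFrame({
    "week_end_date": pd.to_datetime([
        "2012-02-25", "2012-03-31", "2012-03-03",
        "2012-03-10", "2012-03-17", "2012-04-07",
    ]),
    "title_short": [BOB, BOB, SALLY, SALLY, SALLY, SALLY],
    "sales": [1, 1, 1, 1, 1, 1],
})

# One row per week, one column per title.
wide = df.pivot(index="week_end_date", columns="title_short", values="sales")

# ASSUMPTION: a missing (week, title) pair means no sales that week.
# Without this step the two columns never overlap in this sample,
# so there would be no shared observations to compare.
wide = wide.fillna(0)

cov = wide.cov()    # pairwise covariance matrix between titles
corr = wide.corr()  # pairwise correlation matrix between titles

print(cov)
print(corr)
```

Note that pivot raises an error if the same week and title appear more than once; in that case the rows would need to be aggregated first, for example with pivot_table and aggfunc="sum".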