distribution - entropy estimation using histogram of normal data vs direct formula (MATLAB)


Let's assume I have drawn n = 10000 samples from a standard normal distribution.

Now I want to calculate the entropy, using histograms to estimate the probabilities.

1) Calculate the probabilities (for example using MATLAB):

[p,x] = hist(samples,binnumbers);   % bin counts and bin centers
area = (x(2)-x(1))*sum(p);          % total area under the raw histogram
p = p/area;                         % normalize counts to a density estimate

(binnumbers is determined according to some rule.)
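For concreteness, here is a minimal sketch of step 1; using Sturges' rule for binnumbers is only an assumption for illustration, any reasonable bin rule works the same way:

n = 10000;
samples = randn(n,1);                % n draws from N(0,1)
binnumbers = ceil(log2(n)) + 1;      % Sturges' rule (assumed here)
[p,x] = hist(samples,binnumbers);    % counts and bin centers
dx = x(2) - x(1);                    % bin width
p = p/(sum(p)*dx);                   % normalize so that sum(p)*dx == 1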

2) Estimate the entropy:

h = -sum(p.*log2(p)) 

which gives 58.6488.

Now, when I use the direct formula to calculate the entropy of normal data,

h = 0.5*log2(2*pi*exp(1)) = 2.0471 
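For reference, that closed form is just the differential entropy of a normal distribution evaluated at sigma = 1; a short derivation sketch in LaTeX:

h = -\int_{-\infty}^{\infty} \phi(x)\,\log_2 \phi(x)\,dx = \frac{1}{2}\log_2\!\big(2\pi e\,\sigma^2\big) \approx 2.0471 \ \text{bits for } \sigma = 1.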

What am I doing wrong when using histograms + the entropy formula? Thanks for your help!!

You are missing the dp term in the sum:

dp = (x(2)-x(1));                 % bin width
area = sum(p)*dp;                 % sanity check: equals 1 after the normalization in step 1
h = -sum( (p*dp) .* log2(p) );    % approximates -integral of p(x)*log2(p(x)) dx

This should bring you close enough...
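Putting it all together, a minimal end-to-end sketch (n = 10000 samples and 50 bins are assumed here purely for illustration); empty bins are skipped to avoid log2(0):

n = 10000;
samples = randn(n,1);
[p,x] = hist(samples,50);              % 50 bins (assumed for illustration)
dp = x(2) - x(1);                      % bin width
p = p/(sum(p)*dp);                     % normalize to a density
nz = p > 0;                            % skip empty bins
h = -sum( p(nz).*log2(p(nz)) ) * dp    % ~2.0 bits, close to 0.5*log2(2*pi*e)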

PS:
Be careful when taking log2(p): you might have empty bins (p = 0), which produce -Inf and NaN terms. You might find nansum useful.
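For example, assuming nansum is available (it ships with the Statistics Toolbox), the NaN terms produced by empty bins can simply be dropped:

terms = (p*dp).*log2(p);   % empty bins give 0*(-Inf) = NaN
h = -nansum(terms);        % ignore the NaN contributions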

