Two reasons for that
- To reduce the gap between the sample variance and the population variance
( empirical reason ) - the "1/n" version is the maximum likelihood estimate of the population variance (under a normal model); however, it is mathematically biased
- the "1/n" sample variance is, on average, smaller than the population variance
→ the population variance is underestimated - the "1/(n-1)" convention makes the estimate slightly larger to close this gap ( provides an unbiased estimate )
- why not n-2 ?
- it is tied to the degrees of freedom: the sample mean is estimated from the same data, which uses up one degree of freedom and leaves n-1
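
The empirical reason can be checked with a small simulation. The sketch below is only an illustration under assumed settings: the function name `average_variance_estimates`, the normal population with standard deviation 2 (true variance 4), the sample size of 5, and the seed are all choices made here, not part of the original notes. It draws many small samples and averages both the "1/n" and the "1/(n-1)" estimates.

```python
import random


def average_variance_estimates(n=5, trials=20000, sigma=2.0, seed=0):
    """Average the 1/n and 1/(n-1) variance estimates over many samples.

    All defaults are illustrative assumptions: a normal population with
    mean 0 and standard deviation `sigma`, so the true variance is sigma**2.
    """
    rng = random.Random(seed)
    biased_total = 0.0
    unbiased_total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        # Sum of squared deviations from the SAMPLE mean, not the true mean.
        ss = sum((x - mean) ** 2 for x in sample)
        biased_total += ss / n          # "1/n" convention
        unbiased_total += ss / (n - 1)  # "1/(n-1)" convention
    return biased_total / trials, unbiased_total / trials


biased, unbiased = average_variance_estimates()
print(f"true variance: 4.0, 1/n average: {biased:.3f}, 1/(n-1) average: {unbiased:.3f}")
```

With these settings the "1/n" average should settle near (n-1)/n times the true variance, that is about 3.2, while the "1/(n-1)" average should settle near 4.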
- To match the expectation of the sample variance with the population variance
( mathematical reason ) - let,
$n$ : sample size
$\bar{X}$ : sample mean
$s^2$ : sample variance
$\mu$ : population mean
$\sigma^2$ : population variance
- then, show that the following is true: $E[s^2] = \sigma^2$
- first,
- as here,
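
One standard way to verify that the expectation of the sample variance equals the population variance is sketched below. The notation is an assumption made for this sketch: $X_1, \dots, X_n$ are independent draws with mean $\mu$ and variance $\sigma^2$, $\bar{X}$ is their sample mean, and $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$. It may not follow the same steps as the notes above.

```latex
\begin{align*}
\sum_{i=1}^{n} (X_i - \bar{X})^2
  &= \sum_{i=1}^{n} \bigl[(X_i - \mu) - (\bar{X} - \mu)\bigr]^2 \\
  &= \sum_{i=1}^{n} (X_i - \mu)^2 - n(\bar{X} - \mu)^2 \\[4pt]
E\!\left[\sum_{i=1}^{n} (X_i - \mu)^2\right] &= n\sigma^2 \\
E\!\left[n(\bar{X} - \mu)^2\right] &= n\,\mathrm{Var}(\bar{X})
  = n \cdot \frac{\sigma^2}{n} = \sigma^2 \\[4pt]
E\!\left[\sum_{i=1}^{n} (X_i - \bar{X})^2\right] &= n\sigma^2 - \sigma^2 = (n-1)\sigma^2 \\
E[s^2] &= \frac{1}{n-1}\,(n-1)\sigma^2 = \sigma^2
\end{align*}
```

The second line uses $\sum_{i=1}^{n}(X_i - \mu) = n(\bar{X} - \mu)$, so the cross term contributes $-2n(\bar{X} - \mu)^2$. Dividing by $n$ instead of $n-1$ would give $\frac{n-1}{n}\sigma^2$, which is the downward bias described in the empirical reason.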