Dependence of Variance on Population Size
Understanding the Dependence of Variance on Population Size
The framework in this section is useful for approaching the problem of how variance depends on population size. We fix a reference population size and compare it with a different population size.
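A power law of variance with population size implies a linear relation between log variance (equivalently, twice the log standard deviation) and log population size. The slope of this line is the power-law exponent, which determines the ratio between the variances at two population sizes. If we consider two populations, where the second is a multiple of the first, then the difference in log population is the log of that multiple, and the difference in log variance between the two populations is the slope times that log multiple. These relationships apply not only to the total population but also to its parts.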
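The log-log relation described here can be checked numerically. The following is a minimal sketch on synthetic, noise-free data, not the procedure used in this work; the function name `fit_loglog_slope`, the reference values `n0` and `var0`, and the exponent `-0.5` are illustrative assumptions.

```python
import numpy as np

def fit_loglog_slope(populations, variances):
    """Least-squares slope and intercept of log(variance) against log(population)."""
    slope, intercept = np.polyfit(np.log(populations), np.log(variances), deg=1)
    return slope, intercept

# Hypothetical reference population, reference variance and exponent.
n0, var0, exponent = 1_000.0, 4.0, -0.5

populations = n0 * np.array([1.0, 2.0, 4.0, 8.0, 16.0])
variances = var0 * (populations / n0) ** exponent  # exact power law, no noise

slope, _ = fit_loglog_slope(populations, variances)

# For a population k times larger, the log-variance shift is slope * log(k),
# so the variance ratio is k ** slope.
k = 4.0
predicted_ratio = k ** slope
observed_ratio = variances[2] / variances[0]  # populations[2] is 4 * populations[0]
print(slope, predicted_ratio, observed_ratio)
```

On real data the points will scatter around the line, so the fitted slope is only an estimate of the exponent.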
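To link changes in total population to changes in the parts' populations, consider the following: if we sample agents from a population, then on average the population of each part scales in proportion to the sample size, so the expected population of a part at the new total is its expected population at the reference total multiplied by the ratio of the two totals. In logarithmic scale, this means that a given shift in the log of the total population produces the same shift in the log population of every part.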
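As a small illustration of this sampling argument, the sketch below assumes agents are sampled uniformly at random, so that each part keeps its share of the total on average. The shares, the totals `n_ref` and `n_new`, and the helper `expected_part_populations` are hypothetical and chosen only for the example.

```python
import numpy as np

# Hypothetical shares of the total population held by each part (they sum to 1).
shares = np.array([0.5, 0.3, 0.15, 0.05])

def expected_part_populations(total, shares):
    """Expected population of each part when `total` agents are sampled uniformly."""
    return total * shares

n_ref, n_new = 10_000, 2_500  # hypothetical reference and sampled totals

# Shift in log total population.
log_shift_total = np.log(n_new) - np.log(n_ref)

# Shift in log expected population, computed separately for every part.
log_shift_parts = (np.log(expected_part_populations(n_new, shares))
                   - np.log(expected_part_populations(n_ref, shares)))

print(log_shift_total, log_shift_parts)
```

Every entry of `log_shift_parts` equals `log_shift_total`, which is the statement that all parts move by the same amount on the log-population axis.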
Empirically, the dependence of a part's log variance with changes in the part's log population can be approximated qualitatively by a line of slope :
The accuracy of this model can be tested a posteriori. When changing for , the levels of change as:
If all parts present a common exponent, then when replacing this value in the expression of the idiosyncratic term of aggregate variance, the dependence with comes out as a common factor:
Thus, the relation shown by the parts is itself valid for the aggregate:
This leads to the equation:
This equation indicates that if we plot the idiosyncratic term of aggregate variance as a function of sampled population size on a log-log scale, it will show the same slope as the parts.
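The factoring argument can be illustrated numerically. The sketch below assumes that the idiosyncratic term of aggregate variance is the sum over parts of squared weight times part variance, and that the weights do not change with sampling; both are assumptions made for this example, as are all numerical values and the name `idiosyncratic_term`.

```python
import numpy as np

def idiosyncratic_term(weights, part_variances):
    """Assumed form of the idiosyncratic term: sum of squared weights times part variances."""
    return np.sum(weights ** 2 * part_variances)

weights = np.array([0.4, 0.3, 0.2, 0.1])        # hypothetical part weights
ref_variances = np.array([2.0, 1.5, 3.0, 0.8])  # hypothetical part variances at the reference size
exponent = -0.7                                 # hypothetical exponent shared by all parts

# Ratio of sampled population size to the reference population size.
scales = np.array([1.0, 0.5, 0.25, 0.125])

# Each part's variance follows the same power law, so the common factor
# scale ** exponent can be pulled out of the sum.
aggregate = np.array([
    idiosyncratic_term(weights, ref_variances * s ** exponent) for s in scales
])

slope, _ = np.polyfit(np.log(scales), np.log(aggregate), deg=1)
print(slope)
```

The fitted slope of the aggregate term matches the exponent assigned to the parts. If the parts had different exponents, the factor would no longer be common and the aggregate would not follow a single power law exactly.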
In the special case where all parts have the same variance , the idiosyncratic part of aggregate variance fulfills . Therefore, . In our case, , so . This determines the variance drop when comparing parts to aggregate.
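For the equal-variance case, a rough sketch of the variance drop is given below. It again assumes that the idiosyncratic term is the sum of squared weights times part variances, so that with a common part variance the ratio of aggregate to part variance reduces to the sum of squared weights. The weights and the helper `variance_drop_factor` are hypothetical and do not reproduce the values used in this section.

```python
import numpy as np

def variance_drop_factor(weights):
    """Sum of squared weights: aggregate idiosyncratic variance divided by the
    common part variance, under the assumed form of the idiosyncratic term."""
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights ** 2))

common_variance = 2.0  # hypothetical variance shared by all parts

# Ten equally weighted parts versus ten unevenly weighted parts (both sum to 1).
equal = np.full(10, 0.1)
uneven = np.array([0.55, 0.2, 0.1, 0.05, 0.04, 0.03, 0.01, 0.01, 0.005, 0.005])

for name, w in [("equal", equal), ("uneven", uneven)]:
    factor = variance_drop_factor(w)
    print(name, factor, common_variance * factor)
```

Under this assumption, equal weights give the largest drop (a factor of one over the number of parts), while concentrated weights reduce the benefit of aggregation.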
We have expressed the idiosyncratic part of aggregate variance both as a function of total population and as a function of parts' population . They both should show a common . Empirically, the observed slope of variance decay with population size is for exports data, and for imports data. These values were computed from parts' variances (blue lines) and can be extended to describe aggregate idiosyncratic variance (yellow lines).
So far, we can measure the rate of decay of aggregate variance with population size. We know that the rate of decay of the parts is related to the rate of decay of the aggregate. However, we have yet to understand why this slope takes its particular value. To explore this, we need to examine what occurs within the parts themselves, which is the focus of the following section.