Observations in the LQ Problem

Let us now extend the reasoning to a description of observations in terms of the pair of variables $x_1 = \log(s_{cp})$ and $x_2 = \log\left( \frac{S_c S_p}{S_W} \right)$ that span the 2D space for the LQ problem. Points in these coordinates are denoted as $\textbf{x} = (x_2, x_1)$. The two-dimensional distribution of growth rates of these coordinates is denoted as $G_{0} (\Delta x_2, \Delta x_1)$, and the region where $x_1 > x_2$ is the two-dimensional plane region $R$ where $\log(s_{cp}) > \log\left( \frac{S_c S_p}{S_W} \right) \Rightarrow s_{cp} > \frac{S_c S_p}{S_W}$, i.e., LQ > 1. The growth cases resulting in the condition LQ > 1 are, in calculus notation:

$$pLQ(\textbf{x}_0) = \iint\limits_{R} G_{0} (\Delta x_2, \Delta x_1) \ ( x_1 > x_2) \ dR$$

In practice, this integral is approximated numerically. A practical way to do so is to store the 2D distribution of growth away from the point $\textbf{x}_0$ in an array G and the condition $x_1 > x_2$ in another array C of the same shape. In this notation, the numerical integration is simply:

$$pLQ(\textbf{x}_0) = \text{(G * C).sum() / G.sum()}$$
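As a minimal sketch of this array-based integration, the Python snippet below builds G and C on a grid of jumps and evaluates the ratio. The starting point, the grid limits, and the Gaussian shape used for G are illustrative assumptions, not values from the text; in practice G would hold the empirically estimated growth distribution.

```python
import numpy as np

def plq_from_grid(G, C):
    """Fraction of the growth mass in G that lands where condition C holds."""
    G = np.asarray(G, dtype=float)
    C = np.asarray(C, dtype=bool)
    total = G.sum()
    if total == 0:
        raise ValueError("G has no mass to integrate")
    return (G * C).sum() / total

# Hypothetical starting point: x1 = log(s_cp), x2 = log(S_c S_p / S_W).
# Here x1 < x2, so the entity starts with LQ < 1.
x1_0, x2_0 = 2.0, 2.3

# Grid of candidate jumps; axis 0 indexes the x2 jump, axis 1 the x1 jump.
d = np.linspace(-3.0, 3.0, 301)
D2, D1 = np.meshgrid(d, d, indexing="ij")

# Placeholder growth distribution: independent Gaussians (an assumption).
s1, s2 = 0.5, 0.2
G = np.exp(-0.5 * (D1 / s1) ** 2 - 0.5 * (D2 / s2) ** 2)

# Condition array: after the jump, x1 > x2 (i.e., LQ > 1).
C = (x1_0 + D1) > (x2_0 + D2)

plq = plq_from_grid(G, C)
print(f"pLQ estimate: {plq:.3f}")
```

Because C is a boolean mask, the product G * C zeroes out the mass that ends outside the region R, and dividing by G.sum() normalizes the result without requiring G to be a proper density.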

Before continuing with the strategy for estimation of two-dimensional growth probabilities G, let me emphasize that here we are propagating the volatility of observations and totals into volatility of LQ values.

Volatility and Growth Distributions

First, note that the growth rates of $\log(s_{cp})$ are essentially proportional to the standard deviation $\sigma_{x_1} = \text{std}(\log(s_{cp}))$. This $\sigma_{x_1}$ is larger for smaller entities, but we need to propagate this $\sigma_{x_1}$ to the volatility of LQ values to conclude that smaller entities have higher $\log(LQ)$ jumps. Working with two-dimensional growth distributions as previously described addresses this generally. However, one can also take the variance of the definition $\log(LQ) := \log(s_{cp}) - (\log(S_c) + \log(S_p) - \log(S_W))$:

$$\text{var}(\Delta \log(LQ)) = \text{var} \left( \Delta \log(s_{cp}) - \left[ \Delta \log(S_c) + \Delta \log(S_p) - \Delta \log(S_W) \right] \right) = \text{var}(\Delta x_1 - \Delta x_2)$$

In empirical settings, jumps in these two terms are largely independent, meaning $\text{cov}(\Delta x_1, \Delta x_2) \approx 0$, so:

$$\text{var}(\Delta \log(LQ)) = \text{var}(\Delta x_1 - \Delta x_2) \approx \text{var}(\Delta x_1) + \text{var}(\Delta x_2)$$

$$\sigma_{\log(LQ)} = \text{std}(\Delta x_1 - \Delta x_2) \approx \sqrt{ \sigma^2_{x_1} + \sigma^2_{x_2} }$$
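The additivity of variances can be checked with a short simulation. The sketch below draws independent jumps with hypothetical widths (0.5 and 0.2, chosen only for illustration, and Gaussian only for convenience) and compares the variance of their difference with the general identity var(a - b) = var(a) + var(b) - 2 cov(a, b), which reduces to the sum of variances when the covariance vanishes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical jump widths for x1 and x2 (assumed values).
sigma_x1, sigma_x2 = 0.5, 0.2
dx1 = rng.normal(0.0, sigma_x1, n)
dx2 = rng.normal(0.0, sigma_x2, n)   # drawn independently of dx1

# Jumps in log(LQ) are the difference of the two jumps.
dlog_lq = dx1 - dx2

var_direct = dlog_lq.var()
cov_12 = np.cov(dx1, dx2, ddof=0)[0, 1]
var_identity = dx1.var() + dx2.var() - 2.0 * cov_12

# With independent jumps the covariance term is close to zero, so the
# standard deviation approaches sqrt(sigma_x1^2 + sigma_x2^2).
sigma_approx = np.sqrt(sigma_x1 ** 2 + sigma_x2 ** 2)

print(f"sample covariance:          {cov_12:.5f}")
print(f"std of the difference:      {np.sqrt(var_direct):.4f}")
print(f"sqrt of summed variances:   {sigma_approx:.4f}")
```

If the empirical covariance were not negligible, the same identity would still hold, with the cross term retained.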

Volatility Decay with Size

It turns out that $\text{var}(\Delta x_1)$ and $\text{var}(\Delta x_2)$ are functions of $x_1$ and $x_2$, respectively. These volatilities are plotted in Figure 1. Larger nominal values fluctuate less in relative terms than smaller values, consistent with findings from Stanley et al. and others studying volatility decay with size. This dependence means that $\text{var}(\Delta \log(LQ))$ is a function of $x_1, x_2$, i.e., the sizes.

Figure 1. Volatility versus size. Larger observations fluctuate less and are less likely to traverse a given gap in LQ levels. The standard deviations in these plots set the widths of the ellipse axes in the subsequent figures.
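A size-dependent volatility of this kind can be estimated by grouping observations into size bins and taking the standard deviation of the jumps in each bin. The sketch below is one possible approach rather than the procedure behind the figure: the function name volatility_by_size, the quantile binning, and the synthetic data with exponentially decaying volatility are all assumptions made for illustration.

```python
import numpy as np

def volatility_by_size(x, dx, n_bins=10):
    """Standard deviation of jumps dx within quantile bins of size x."""
    x = np.asarray(x, dtype=float)
    dx = np.asarray(dx, dtype=float)
    # Quantile edges give bins with roughly equal numbers of observations.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    # Bin index of each observation; the maximum is clipped into the last bin.
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    centers = np.array([x[idx == b].mean() for b in range(n_bins)])
    sigmas = np.array([dx[idx == b].std() for b in range(n_bins)])
    return centers, sigmas

# Synthetic data: log-sizes and jumps whose width shrinks with size.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, 50_000)
true_sigma = 0.8 * np.exp(-0.15 * x)   # hypothetical decay law
dx = rng.normal(0.0, true_sigma)

centers, sigmas = volatility_by_size(x, dx)
for c, s in zip(centers, sigmas):
    print(f"mean log-size {c:5.2f}   jump std {s:.3f}")
```

Applied separately to the jumps in each coordinate, such a routine would give the two size-dependent widths that enter the variance of the jumps in log(LQ).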

Ellipse Representation of Volatility

This expression for the volatility of $\Delta \log(LQ)$ is linked to the equation of an ellipse. Using the notation $\sigma^2_{\log(LQ)} = \sigma^2_{x_1} + \sigma^2_{x_2}$ for the variances of the jumps ($\Delta$), recall that the equation of an ellipse centered at the origin is $k^2 = (x/a)^2 + (y/b)^2$, where $a$ and $b$ scale the semi-axes along $x$ and $y$ (of lengths $ka$ and $kb$) and $k$ is a measure of ellipse size. Associating the ellipse axes with the widths of the jumps in $x_1$ and $x_2$ determines the magnitude of the jumps $\Delta \log(LQ) = \Delta x_1 - \Delta x_2$. These are the jumps in $\log(LQ)$, in the sense that $\log(LQ)_{t+1} = \log(LQ)_t + \Delta \log(LQ)$. See Figure 2 for a schematic diagram.

Figure 2. Scheme of jumps in observations after a time step, using the size factor $y_2 = (x_1 + x_2)/2$ and $\log(LQ)$ as axes (left) and the observed and expected sizes $x_1, x_2$ as axes (right). The green regions are those where $LQ > 1$. The initial point $\textbf{x}_0$ is shown with a white star. Around it, three purported jumps after a time step are drawn. The ellipses indicate how far jumps are expected to stretch in the directions $x_1$ and $x_2$, and thus in any other direction given by a combination of those.
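One way to make the link between the ellipse and the jumps in log(LQ) concrete is to trace the ellipse whose semi-axes are k times the jump widths and look at the largest difference of the two jumps along it. That largest value equals k times the square root of the summed variances. The sketch below checks this numerically; the widths and the value of k are hypothetical, and reading the ellipse as a k-sigma contour is an interpretation, not something the text states.

```python
import numpy as np

# Hypothetical jump widths and ellipse size (assumed values).
sigma_x1, sigma_x2 = 0.5, 0.2
k = 2.0

# Parametrize the ellipse (dx1 / sigma_x1)^2 + (dx2 / sigma_x2)^2 = k^2.
theta = np.linspace(0.0, 2.0 * np.pi, 3601)
dx1 = k * sigma_x1 * np.cos(theta)
dx2 = k * sigma_x2 * np.sin(theta)

# Jump in log(LQ) for every point on the ellipse.
dlog_lq = dx1 - dx2

largest_jump = dlog_lq.max()
expected = k * np.hypot(sigma_x1, sigma_x2)   # k * sqrt(s1^2 + s2^2)

print(f"largest jump on the ellipse: {largest_jump:.4f}")
print(f"k * sigma_log(LQ):           {expected:.4f}")
```

In this reading, the reach of the ellipse along the log(LQ) direction is governed by the same combined width that appears in the variance of the jumps.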

Finally, even after characterizing the widths of the jumps in $x_1$, $x_2$, and thus $\log(LQ)$, there remains the step of translating them into the chance of surpassing the $LQ = 1$ threshold (pLQ). For that, let us return to our core goal of explaining pLQ by means of growth distributions. We need to count all cases where jumps have led to $\log(LQ) > 0$, i.e., $LQ > 1$.
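For intuition about this counting step, consider a simplified special case in which the jumps in log(LQ) are Gaussian with zero mean and the two coordinates jump independently. These are assumptions made only for this sketch; the text works with empirical growth distributions, which need not be Gaussian. Under them, the chance of ending above the threshold has a closed form in terms of the normal cumulative distribution. The function name plq_gaussian and all numeric values are hypothetical.

```python
import math

def plq_gaussian(log_lq0, sigma_x1, sigma_x2):
    """P(log(LQ) > 0 after one step), assuming zero-mean Gaussian jumps."""
    # Combined width of the jump in log(LQ) for independent coordinates.
    sigma = math.hypot(sigma_x1, sigma_x2)
    # The jump must exceed -log_lq0, so the probability is Phi(log_lq0 / sigma).
    z = log_lq0 / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# An entity starting slightly below the threshold (log(LQ) = -0.3).
p_volatile = plq_gaussian(-0.3, sigma_x1=0.5, sigma_x2=0.2)   # wide jumps
p_stable = plq_gaussian(-0.3, sigma_x1=0.1, sigma_x2=0.1)     # narrow jumps

print(f"wide jumps:   pLQ = {p_volatile:.3f}")
print(f"narrow jumps: pLQ = {p_stable:.3f}")
```

The comparison illustrates the point made about size: for the same gap to the threshold, an entity with narrower jumps is less likely to cross it in one step.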