If we take two stocks, and each day record whether each of the stocks
went up or down, we have a process whose state each day is one of the
four values
0 = -- (both went down)
1 = +- (first went up, second went down)
2 = -+ (first went down, second went up)
3 = ++ (both went up)
Thus the state space of the process is I = { 0,1,2,3 }. We take a long run
of N+1 days of data, and then estimate the transition matrix of the I-valued
process by
p_{ij} = n_{ij}/( n_{i0} + n_{i1} + n_{i2} + n_{i3} ),
where n_{ij} is the number of times we got a day in state i immediately
followed by a day in state j.
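
As a minimal sketch of this estimate, the following Python uses only the standard library. The function names and the toy up/down data are made up for illustration, and returning None for a state that is never visited is an assumption, since the text does not say how to treat an empty row.

```python
def encode_state(up1, up2):
    """Return 0 (--), 1 (+-), 2 (-+) or 3 (++) for one day's pair of moves."""
    return int(bool(up1)) + 2 * int(bool(up2))


def transition_counts(states, n_states=4):
    """counts[i][j] = number of days in state i immediately followed by state j."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(states[:-1], states[1:]):
        counts[i][j] += 1
    return counts


def estimate_transition_matrix(counts):
    """Divide each row of counts by its row total.

    Assumption: a state with no observed transitions out of it has no
    estimate, so its row is returned as None.
    """
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([n / total for n in row] if total > 0 else None)
    return matrix


# Toy data (invented for illustration): True means the stock went up that day.
ups_first = [True, False, True, True, False]
ups_second = [False, False, True, False, True]

states = [encode_state(a, b) for a, b in zip(ups_first, ups_second)]
P = estimate_transition_matrix(transition_counts(states))
print(states)  # [1, 0, 3, 1, 2]
print(P[1])    # [0.5, 0.0, 0.5, 0.0]
```

With N+1 days of data there are N transitions, so the counts n_{ij} sum to N.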
Now if the returns were IID, the rows of the matrix P would all be the
same, and a contingency table test shows that this is overwhelmingly rejected
for most pairs of stocks, so it seems that the state today can be informative
about the state tomorrow.
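
The text does not say which contingency table test was used. One common choice is the Pearson chi-square test applied to the table of counts n_{ij}, and the sketch below computes that statistic under this assumption. The function name and the example table are hypothetical. With all four states visited there are (4-1)(4-1) = 9 degrees of freedom, for which the 5% critical value is about 16.92.

```python
def chi_square_statistic(counts):
    """Pearson chi-square statistic and degrees of freedom for a count table.

    Under the hypothesis that tomorrow's state is independent of today's,
    the expected count in cell (i, j) is row_total_i * col_total_j / total.
    """
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(counts):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:  # cells in an empty row or column are skipped
                stat += (observed - expected) ** 2 / expected
    rows_used = sum(1 for r in row_totals if r > 0)
    cols_used = sum(1 for c in col_totals if c > 0)
    return stat, (rows_used - 1) * (cols_used - 1)


# Invented example: a chain that always repeats yesterday's state.
sticky = [[40, 0, 0, 0],
          [0, 40, 0, 0],
          [0, 0, 40, 0],
          [0, 0, 0, 40]]
stat, dof = chi_square_statistic(sticky)
print(stat, dof)  # 480.0 9
```

A statistic far above the critical value, as in this extreme example, rejects the hypothesis that all rows of P are the same.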
One way to try to exploit this is to look at each pair of stocks in turn on
each day, and base a portfolio demand on its state. If we are in state i,
then we will hold
sign( p_{i1} + p_{i3} - p_{i0} - p_{i2} )
of the first stock, and
sign( p_{i2} + p_{i3} - p_{i0} - p_{i1} )
of the second (so hold 1 if an up move is more likely than a down move,
-1 if a down move is more likely, and nothing if the two are equally likely).
Summing these demands over all pairs gives the aggregate demand, and that's
one way to select a portfolio.
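
The demand rule and the summation over pairs can be sketched as follows. The tickers "A", "B", "C", the transition matrices, and the current states are all invented for illustration, and the layout of pair_models (a dict from a pair of names to that pair's estimated matrix and today's state) is an assumption, not something the text prescribes.

```python
from collections import defaultdict


def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def pair_demands(P, i):
    """Demands for (first, second) stock of a pair currently in state i."""
    row = P[i]
    first = sign(row[1] + row[3] - row[0] - row[2])   # first stock is up in states 1, 3
    second = sign(row[2] + row[3] - row[0] - row[1])  # second stock is up in states 2, 3
    return first, second


def aggregate_demand(pair_models):
    """Sum the pairwise demands per stock.

    pair_models maps (stock_a, stock_b) -> (P, i), where P is that pair's
    estimated transition matrix and i is its state today (assumed layout).
    """
    demand = defaultdict(int)
    for (a, b), (P, i) in pair_models.items():
        d_a, d_b = pair_demands(P, i)
        demand[a] += d_a
        demand[b] += d_b
    return dict(demand)


# Invented rows: probabilities of moving to states 0, 1, 2, 3.
up = [0.1, 0.2, 0.3, 0.4]     # both stocks more likely to rise
mixed = [0.1, 0.5, 0.1, 0.3]  # first likely to rise, second likely to fall
down = [0.4, 0.3, 0.2, 0.1]   # both stocks more likely to fall

pair_models = {
    ("A", "B"): ([up] * 4, 0),
    ("A", "C"): ([mixed] * 4, 2),
    ("B", "C"): ([down] * 4, 3),
}
portfolio = aggregate_demand(pair_models)
print(portfolio)  # {'A': 2, 'B': 0, 'C': -2}
```

With K stocks there are K(K-1)/2 pairs, so each stock's aggregate demand lies between -(K-1) and K-1.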