12/04/2012

Portfolio Optimization with Simple Heuristic

Lately on the Heuristic course we have competition of Portfolio. It simulates the real market but largely simplified into a game that we can play with. The description is as here. And the whole gaming process can be found on this website.

A detailed description is given as below:



Initialization

We call the ith gamble g[i]. Each gamble will have three different returns (gihi, gimed, gilow), mapping to three initial probabilities(gihiprob, gimedprob, gilowprob), where the gimedprob is at least 0.4.

Initially, the average expected return per gamble is 2 to 1. Thus the expectation is E=gihi*P(gihi)+gimed*P(gimed)+gilow*P(gilow)=2. Different gambles may give different returns. Low returns are normally less than 1 (so constitute a loss). During each round, we permute all the gambles, and get a new sequence of gambles g'1, g'2, ..., g'n.

Classes

Now we assume each gamble belongs to a class (which is also true in reality), ranging from 1 to 15. Among these classes some are favorable classes, some are neutral, and some are unfavorable. The favorable classes bias the gihiprob higher while the unfavorable bias the gilowprob higher. Number of gambles in each classes might not be equal. We have the following rules to change the probabilities of the gambles in the classes:

for i = 1, N do
  if gi is favorable
    P(gihi) += P(gilow) / 2
    P(gilow) /= 2
  else if gi is unfavorable
    P(gilow) += P(gihi) / 2
    P(gihi) /= 2
end

Link

Certain gamble have links to each other. Link also makes a series of gambles change their probabilities of returns during each round.  The link if different from class, and it gives a correlation between two gambles. For example, the stock price of Microsoft and the price of Nokia will influence each other (I consider these two are not necessarily in the same class). The way of calculating the link's influence to a gamble's probability is as follows:


permute gamble sequence: g'=perm(g)
for i = 1, N do
  Hi = all gambles before g'i that have links with g'i and have high return
  Mi = same but have medium return
  Li = same but have low return
  if Hi > Mi + Li
    P(gihi) +=P(gilow) / 2
    P(gilow) /= 2
  else if Li > Mi + Hi
    P(gilow) += P(gihi) / 2
    P(gihi) /= 2
  Play gi
end

Thus, the full sequence of operation is:

  1. Reset each probability to initial 
  2. Make adjustments based on favorable/unfavorable classes 
  3. Start computing gambles, making adjustments based on links as you go 

Task

Suppose we can bet (invest) on approximately N gambles (stocks). We are given the capital of 1 (the total money or budget you initially have is 1). Our task is to allocate the 1 unit among chosen number of gambles.

Game

1. In the first game we play 5 round.
a. Assign outcomes to each gamble according to the probability above.
b. Calculate each person's wealth based on their allocation and the outcomes (having started each round at 1).
If return of the person >=2, then that person gets a point for that round. The player will be told what the final return outcome of each gamble was to help the player infer which class is favorable and perhaps change allocations for the next round.

The player is not told which order the gambles were played in however. The favorable and unfavorable classes will remain the same during the five rounds.

2. In the second game we play 200 round.

The favorable and unfavorable classes may change up to 10 times (without warning) during those rounds. Again, the player will be told what the final return outcome of each gamble was to help the player infer which class(es) is(are) favorable and perhaps change allocations for the next round. Players need not be fully invested during all rounds. Inspired by Jake Loveless's experience at a finance company where consistent winnings are favored over erratic profits and losses, we will have two criteria for winning: total amount of money at the end, and 10 percentile daily return (that is sort all daily returns from least (most negative) to greatest and take the return that is at the 10th percentile in that list). In the original game, we used Sharpe Ratio as the criteria, which is computed as

sum(return) / sqrt(sum((return - avg_return)^2))
Here the return is the real value of profit we make each round (though the real sharpe ratio is given as a ratio).

Tackling the first problem

For the first game, our intuitive goal is to find some combination of gambles, that give the largest probability of return that is larger than 2.

For example, if we bet two gambles g0, g1, and we allocate x to g0, then the probability that return >=2 is (here i and j has 3 values: 0:high, 1: medium, 2: low):
for (int i = 0; i < 3; ++ i) {
  for (int j = 0; j < 3; ++ j) {
    if (g0.getReturn(i) * x + g1.getReturn(j) * (1-x) >= 2) {
      prob += getProb(g0, i) * getProb(g1, j);
    }
}

This method is guaranteed to have the best probability in the set of gambles with fixed amount of gambles, by traversing all the possible combinations of the selected gambles with different ratio assigned to each gamble. However, in the scenario of two gambles, if we explore the ratio (x) by a step size of 0.01 each time, we will have 100 * N * N computation. If we want to select many gambles, the time complexity is factorial.

But do we really need so many gambles to decide which will give the best return? Actually for the first game we don't need to complicate the problem. If we can find the best allocation in one or two gambles that get us a point, we don't really care about whether the return is 2 or 100. If we name the previous function computeProbability, we have the following algorithm:

double best_prob = 0;
int[] best_comb = new int[2];
double best_x = 0;
for (double x = 0; x <= 1; x += 0.01) {
  for (int i = 0; i < gambles.size() - 1; ++i) {
    for (int j = i + 1; j < gambles.size(); ++j) {
      double prob = computeProbability(i, j, x);
      if (prob > best_prob) {
        best_prob = prob;
        best_comb[0] = i;
        best_comb[1] = j;
        best_x = x;
      }
    }
  }
}
We usually get 5 points in 5 rounds, that means the easy algorithm works quite well to meet the requirement.

Solving the second problem


Here I will firstly explain the original solution to solve the problem to get the best sharpe ratio.

This problem gives more complicated scenario: besides links making probabilities of gambles change each round, the classes might also change.

But do we really want to consider all these complex changes? Lets look at the Sharpe Ratio again:
sum(return) / sqrt(sum((return - avg_return)^2))
It is the measurement of return per risk. The assumption is that if the returns of the gambles of your portfolio stay fixed, but they vary tremendously, the sharpe ratio will be small. This is because the denominator, which is the variance of the returns, will be large. Then you might risk in earning a lot of money for a round, than lose a lot in the next. Also, if you keep the denominator fixed, and make the sum of return as large as possible, the sharpe ratio will be large.

Based on the formular, to get the best sharpe ratio, we need to make the denominator as small as possible. If we can make the return of each round as close as we can, say, almost 0, we will get the denominator almost 0. It doesn't matter if the return is large or small, say 2 or 100, because the nominator and denominator are both multiplied by a constant.

So we bet on some gambles, say 5 gambles, which produce the smallest variance of the high, medium and low returns. If there is one stock whose variance is less than 0.0001, we only bet on that gamble.

PriorityQueue q
for (int gi = 0; gi < n; ++gi) {
  var = computeVariance(gi)
  if gi.avg < 1.2 continue // not consider gi whose average is less than 1.2
  q.add(var)
  if q.size() > 5 // maximum gambles we bet on is 5
    q.poll()
}

smallest = smallest variance in q
if smallest < 0.0001
  only bet on the gamble
else
  bet on all gambles in q

Actually this is really impractical though we gives a very large sharpe ratio, average 10000 during experiments. Because we are not making any money when betting on the stablest gambles.

So now the goal is to make as much money as possible.
Here is the other strategy we used during the competition, which makes billions of money in the scenario.

Now we do need to consider the class problem inorder to get a really big return in each round. To make the strategy as simple as we can, we only bet on one gamble with the highest return (You could make it onto several gambles but the idea here is quite simple). Because the class will change, we need a function to guess the favorable and unfavorable classes. Here I will introduce the depth we trace in history. We only consider a certain time limit of data, say only the previous round, to obtain information of guessing the class of the gamble. Adapting Machine Learning techniques here might contribute more but then it's more than heuristic and not what I am going to cover.

For each current round we update return probability based on the previous round and linking information. And then we look at the previous rounds, if number of high returns (Hi) is more than number of low returns (Li) in the depth we trace, we consider it to be among favorable classes, and vice versa.
int[] probAdaption // save the number of probabilities being high or low

for (int i = 0; i < n; ++i) {
  if (gi.return
    if (probAdaption[i] >= traceDepth) continue;
    ++probAdaption[i];
    } else if (gi.return == Gamble.LOW) {
      if (probAdaption[i] <= -traceDepth) continue;
      --probAdaption[i];
  }
}
Because we are not sure of the adaptation with this strategy, so we don't actually make P(gihi) +=P(gilow) / 2, P(gilow) /= 2, we give an experimental coefficient to replace the 0.5 coefficient. 

int adaption = probAdaption[id];
while (adaption > 1) {
  adaption = 0;
  double d = lProb * adaptionCoefficient;
  lProb -= d;
  hProb += d;
}
while (adaption < -1) {
  adaption = 0;
  double d = hProb * adaptionCoefficient;
  hProb -= d;
  lProb += d;
}
The last thing is to pick the largest expected return among n gambles. That's it. Our strategy doing portfolio. This strategy returns near 5 - 10 times of the previous round in average.