Sample Weights
In financial time series, the samples in the training set do not contain equal amounts of information. Ideally, the model would focus on significant events. For example, samples that are followed by a period in which a large absolute return can be realised are more interesting for the model than samples followed by small returns. In addition, it makes intuitive sense that recent information is more valuable than dated information in financial markets. It is therefore desirable for the model to place more emphasis on recent information than on dated information.
These two concepts can be formalised by calculating sample weights for each sample. These weights are then used by the model to place more emphasis on samples with a high weight.
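As a rough illustration of what "placing more emphasis" can mean, the sketch below computes a weighted mean squared error, in which an error on a high-weight sample costs more than the same error on a low-weight sample. The function name `weighted_mse` and the toy numbers are assumptions made for this example; the text does not prescribe a particular loss function or model.

```python
def weighted_mse(y_true, y_pred, weights):
    """Weighted mean squared error (illustrative loss, not prescribed by the text)."""
    if not (len(y_true) == len(y_pred) == len(weights)):
        raise ValueError("all inputs must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Each squared error is scaled by its sample weight before averaging.
    weighted_errors = (
        w * (yt - yp) ** 2 for yt, yp, w in zip(y_true, y_pred, weights)
    )
    return sum(weighted_errors) / total_weight


# The same prediction error is penalised more when it falls on the
# sample that carries the larger weight.
loss_error_on_heavy = weighted_mse([1.0, 0.0], [0.0, 0.0], [0.9, 0.1])
loss_error_on_light = weighted_mse([1.0, 0.0], [0.0, 0.0], [0.1, 0.9])
```

Many learning libraries accept per-sample weights directly, so in practice the weights are usually passed to the model rather than applied by hand.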
Return Attribution
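To calculate the sample weight based on the sample's return, we transform the prices to log prices, so that the sum of the log-price differences (the log returns) over a period is approximately equal to the return over that period. The weight for sample $i$ with a lifespan between $t_{i,0}$ and $t_{i,1}$ can be calculated as follows:

$$\tilde{w}_i = \left| \sum_{t=t_{i,0}}^{t_{i,1}} \frac{r_{t-1,t}}{c_t} \right|$$

where $r_{t-1,t}$ is the log return between $t-1$ and $t$, and $c_t$ is the number of concurrent events, i.e. the number of samples that (partially) overlap in the period $[t-1, t]$. The weights are then normalised such that they sum to one:

$$w_i = \frac{\tilde{w}_i}{\sum_{j=1}^{I} \tilde{w}_j}$$

where $I$ is the total number of samples.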
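The return attribution can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `return_attribution_weights`, the representation of a lifespan as an inclusive `(start, end)` pair of bar indices, and the toy prices are all hypothetical. The concurrency at bar `t` is taken to be the number of samples whose lifespan contains `t`.

```python
import math


def return_attribution_weights(prices, spans):
    """Return-attribution sample weights, normalised to sum to one.

    prices: positive prices, one per bar.
    spans:  (start, end) bar indices per sample, inclusive, with start >= 1
            (assumed representation of a sample's lifespan).
    """
    n = len(prices)
    # Log returns: log_ret[t] = log(p[t]) - log(p[t-1]); index 0 is unused.
    log_ret = [0.0] * n
    for t in range(1, n):
        log_ret[t] = math.log(prices[t]) - math.log(prices[t - 1])

    # Concurrency: number of samples whose lifespan contains bar t.
    concurrency = [0] * n
    for start, end in spans:
        if not 1 <= start <= end < n:
            raise ValueError("each span must satisfy 1 <= start <= end < len(prices)")
        for t in range(start, end + 1):
            concurrency[t] += 1

    # Absolute value of the concurrency-adjusted log returns over each lifespan.
    raw = []
    for start, end in spans:
        attributed = sum(log_ret[t] / concurrency[t] for t in range(start, end + 1))
        raw.append(abs(attributed))

    total = sum(raw)
    if total == 0:
        raise ValueError("all attributed returns are zero; weights are undefined")
    return [w / total for w in raw]


# Three partially overlapping samples on a toy price series.
toy_prices = [100.0, 101.0, 103.0, 102.0, 106.0]
toy_spans = [(1, 2), (2, 3), (3, 4)]
toy_weights = return_attribution_weights(toy_prices, toy_spans)
```

Dividing each return by the concurrency splits it evenly among the samples that share it, so a sample only receives full credit for the returns it alone covers.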
Time Decay
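It is possible to assign more weight to recent samples than to older samples by calculating a time decay factor $d$ for each sample. To calculate these time decay factors, we use the array with the average uniqueness $\bar{u}_i$ of each sample: the decay is a function of the cumulative average uniqueness $x \in [0, \sum_{i} \bar{u}_i]$, where the most recent sample always receives a weight of 1, i.e. $d(\sum_{i} \bar{u}_i) = 1$. The user can control the amount of time decay with the parameter $c \in (-1, 1]$. For $c \in [0, 1]$, the decay is linear with $d(0) = c$, so $c$ is effectively the weight of the oldest sample. When $c < 0$, the decay factor is 0 for the samples with $x \le -c \sum_{i} \bar{u}_i$, which implies the model will fully ignore these samples. For other samples, the decay factor can be computed with a piecewise linear function defined as

$$d(x) = \max(0,\; a + b x)$$

with

$$b = \begin{cases} \dfrac{1 - c}{\sum_{i} \bar{u}_i} & \text{if } c \ge 0 \\[2ex] \dfrac{1}{(c + 1) \sum_{i} \bar{u}_i} & \text{if } c < 0 \end{cases} \qquad a = 1 - b \sum_{i} \bar{u}_i$$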
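The piecewise linear decay can be sketched as below. This is an illustrative sketch under assumptions rather than a definitive implementation: the function name `time_decay_factors` is hypothetical, the average uniqueness values are assumed to be ordered from oldest to newest, and the decay parameter is called `c` and assumed to lie in (-1, 1].

```python
def time_decay_factors(avg_uniqueness, c=1.0):
    """Piecewise linear time decay factors, one per sample.

    avg_uniqueness: average uniqueness per sample, ordered oldest to newest
                    (assumed ordering).
    c:              decay parameter in (-1, 1]; c = 1 means no decay.
    """
    if not -1.0 < c <= 1.0:
        raise ValueError("c must lie in the interval (-1, 1]")
    if not avg_uniqueness:
        return []

    # Cumulative average uniqueness, used as the "time" axis of the decay.
    cumulative = []
    running = 0.0
    for u in avg_uniqueness:
        running += u
        cumulative.append(running)
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("average uniqueness must sum to a positive number")

    # The slope depends on the sign of c; the intercept pins the newest sample to 1.
    if c >= 0:
        slope = (1.0 - c) / total
    else:
        slope = 1.0 / ((c + 1.0) * total)
    intercept = 1.0 - slope * total

    # Negative values are clipped to zero: those samples are ignored entirely.
    return [max(0.0, intercept + slope * x) for x in cumulative]


toy_uniqueness = [0.5, 0.5, 1.0, 1.0]
no_decay = time_decay_factors(toy_uniqueness, c=1.0)
mild_decay = time_decay_factors(toy_uniqueness, c=0.5)
drop_oldest = time_decay_factors(toy_uniqueness, c=-0.5)
```

Because the decay runs over cumulative uniqueness rather than calendar time, highly overlapping (redundant) samples advance the decay more slowly than unique ones.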