Downsampling with CUSUM Filter

Filtering out the noise and keeping only the informative parts of your data.


Last updated 4 years ago


Financial time series typically suffer from a low signal-to-noise ratio. When the entire dataset is used, the model focuses too much on noisy samples and not enough on highly informative ones. One way to improve the signal-to-noise ratio is to downsample the dataset, but random downsampling is ineffective because the ratio of noisy to informative samples persists. Instead, one can apply a CUSUM filter, which creates a sample only when the series has cumulatively deviated far enough from its expected value.

Consider a locally stationary process generating IID observations $\{y_t\}_{t=1,\dots,T}$. The cumulative sums can then be defined as

$$
S_t = \max(0, S_{t-1} + y_t - E_{t-1}[y_t])
$$

with boundary condition $S_0 = 0$. A sample is created only when $S_t \ge h$ for some threshold $h$. This can be further extended to a symmetric CUSUM filter that includes both run-ups and run-downs, such that

$$
S^+_t = \max(0, S^+_{t-1} + y_t - E_{t-1}[y_t]), \quad S^+_0 = 0 \\
S^-_t = \min(0, S^-_{t-1} + y_t - E_{t-1}[y_t]), \quad S^-_0 = 0 \\
S_t = \max(S^+_t, -S^-_t)
$$
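
The symmetric filter can be sketched in a few lines of Python. This is a minimal illustration, not Mizar's implementation. It rests on two assumptions that the text above does not state: the expectation $E_{t-1}[y_t]$ is approximated by the previous observation $y_{t-1}$, and both cumulative sums are reset to zero once a sample is created. The function name `symmetric_cusum_events` is hypothetical.

```python
from typing import List, Sequence


def symmetric_cusum_events(y: Sequence[float], h: float) -> List[int]:
    """Return the indices t at which the symmetric CUSUM filter creates a sample.

    Assumptions (not taken from the text):
      * E_{t-1}[y_t] is approximated by y_{t-1}, so each increment is y_t - y_{t-1};
      * S^+ and S^- are both reset to zero after a sample is created.
    """
    if h <= 0:
        raise ValueError("threshold h must be positive")

    events: List[int] = []
    s_pos = 0.0  # S^+_t: accumulates run-ups, never below zero
    s_neg = 0.0  # S^-_t: accumulates run-downs, never above zero

    for t in range(1, len(y)):
        deviation = y[t] - y[t - 1]  # y_t - E_{t-1}[y_t] under the assumption above
        s_pos = max(0.0, s_pos + deviation)
        s_neg = min(0.0, s_neg + deviation)
        s_t = max(s_pos, -s_neg)  # S_t = max(S^+_t, -S^-_t)

        if s_t >= h:
            events.append(t)
            s_pos = 0.0  # assumed reset after sampling
            s_neg = 0.0

    return events


prices = [100, 101, 102, 104, 103, 100, 100.5, 101]
print(symmetric_cusum_events(prices, h=3.0))
```

With this toy series and $h = 3$, the filter samples at index 3 (a cumulative run-up of 4) and at index 5 (a cumulative run-down of 4), and it skips the small moves that follow. A larger $h$ yields fewer samples. In practice the threshold is often tied to a volatility estimate, which is a design choice outside the scope of this sketch.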