The streaming SMA quietly lies on long crypto backtests — and how Backticks fixes it

The fast running-sum SMA accumulates IEEE-754 error on long crypto backtests — enough to flip crossover signals. Here's how the bug works, and four ways to fix it.


The streaming SMA is one of the most-used patterns in algorithmic trading. Instead of recomputing the average every bar, it keeps a running sum, adds the incoming close, subtracts the leaving one, divides:

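class SMA {
  constructor(period) {
    this.period = period;
    this.arr = []; // closes currently in the window
    this.sum = 0; // running sum of this.arr
    this.filled = false; // true once the window has held `period` values
  }
  // Returns the SMA, or undefined while warming up. Note that the first
  // value comes back on bar period + 1, one bar after the window fills.
  nextValue(value) {
    this.filled = this.filled || this.arr.length === this.period;
    this.arr.push(value);
    if (this.filled) {
      this.sum -= this.arr.shift(); // drop the close leaving the window
      this.sum += value; // add the incoming close
      return this.sum / this.period;
    }
    this.sum += value;
  }
}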

Compared to the naive every-bar resum:

function sma(data, period) {
  const result = [];
  for (let i = 0; i <= data.length - period; i++) {
    const chunk = data.slice(i, i + period);
    const sum = chunk.reduce((acc, num) => acc + num, 0);
    result.push(sum / period);
  }
  return result;
}

The streaming version replaces `period` additions per bar with one subtraction and one addition, so for a 200-bar window it does roughly two orders of magnitude less arithmetic. It’s the right call for real-time trading and most backtests.

But it has a subtle issue almost nobody catches until it bites them: on long series with large-magnitude values (think BTC m1 over a year), the running sum accumulates floating-point error, and after enough bars the streaming SMA disagrees with the naive resum SMA by enough to flip crossover signals.

Why the running sum drifts

Every + and - on a JS Number (IEEE-754 float64) rounds to the nearest representable value. The error per op is ~10⁻¹⁶ relative — tiny. But it compounds when:

  1. There are a lot of operations on the same accumulator (a year of m1 data is 525,600 bars, and each bar costs one subtraction and one addition).
  2. The values added or subtracted have a very different magnitude from the running total.

BTC at $90k with sub-dollar tick variations hits #2 hard. Every this.sum -= this.arr.shift() subtracts a value whose low-order bits matter, but the result is rounded to the precision of the much larger accumulator, so those bits are silently lost. Equities backtests over 5–10 years have the same shape; crypto just exposes it faster because of the magnitude gap.
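To put a number on "the accumulator’s magnitude", the spacing between adjacent float64 values (one ulp) near a typical running sum can be measured directly. This is a minimal sketch: `ulp` is a helper name introduced here for illustration, and the running sum of 200 × 90,000 is an assumed round figure, not measured data.

```javascript
// Distance from |x| to the next representable float64 above it.
// Assumes x is finite.
function ulp(x) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, Math.abs(x));
  // Incrementing the raw bit pattern of a non-negative finite double
  // yields the next larger double.
  view.setBigUint64(0, view.getBigUint64(0) + 1n);
  return view.getFloat64(0) - Math.abs(x);
}

const runningSum = 200 * 90_000; // illustrative SMA(200) sum at ~$90k
console.log(ulp(runningSum)); // 2^-28, about 3.7e-9
console.log(runningSum + 1e-10 === runningSum); // true: the addend is absorbed
```

Each addition or subtraction on a sum of this size can therefore be off by up to half that spacing, and it is these per-operation roundings that pile up over a long series.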

How bad, concretely

The experiment fits in 30 lines:

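const closes = loadBtcM1Closes(); // hypothetical loader: ~525k floats
const period = 200;
const streaming = new SMA(period);
const avg = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;

let maxDiff = 0;
let phantomCrossings = 0;
let prevDiffSign = 0;

// Start at 0 so the streaming SMA also sees the warm-up bars.
for (let i = 0; i < closes.length; i++) {
  const fast = streaming.nextValue(closes[i]);
  if (fast === undefined) continue; // still warming up
  const slow = avg(closes.slice(i - period + 1, i + 1));
  const d = fast - slow;
  if (Math.abs(d) > Math.abs(maxDiff)) maxDiff = d;
  const sign = Math.sign(d);
  // Count sign changes of (streaming - reference), ignoring exact ties.
  if (sign !== 0 && prevDiffSign !== 0 && sign !== prevDiffSign) phantomCrossings++;
  if (sign !== 0) prevDiffSign = sign;
}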

Representative numbers on BTC m1, period 200, one year of data: a maximum absolute difference on the order of 10⁻⁴ to 10⁻³ in price units, with sign-flip events scattered along the series. The exact figures depend on the data, the period and the price level, so measure them on your own series. If a strategy uses an SMA crossover and it’s optimised over a long history, the optimiser is partly fitting to floating-point noise.

This is exactly the class of bug that “passes locally”, “passes on a recent slice”, and only shows up in a long-horizon optimisation run when one of the candidate parameter sets gets a slightly better score because two indicators crossed in a way they wouldn’t have in real life.
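The experiment relies on a loadBtcM1Closes helper that is not shown. For a dry run without market data, a synthetic series can stand in. The sketch below is an assumption-laden placeholder: `seededRandom` and `syntheticCloses` are names invented here, the step size and starting price are arbitrary, and a random walk is not a model of real BTC behaviour. It only provides values of the right magnitude and count.

```javascript
// Small seeded PRNG (mulberry32-style) so runs are reproducible.
function seededRandom(seed) {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // in [0, 1)
  };
}

// Synthetic stand-in for a year of m1 closes: a random walk around
// `start`, rounded to cents. Not real market data.
function syntheticCloses(n = 525_600, start = 90_000, seed = 1) {
  const rand = seededRandom(seed);
  const closes = new Array(n);
  let price = start;
  for (let i = 0; i < n; i++) {
    price += (rand() - 0.5) * 50; // step of up to +/- $25 per bar
    price = Math.max(1, price); // keep the walk positive
    closes[i] = Math.round(price * 100) / 100;
  }
  return closes;
}
```

Swapping `loadBtcM1Closes()` for `syntheticCloses()` makes the loop runnable end to end; results on synthetic data say nothing about any particular market, only about the arithmetic.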

Four ways to fix it

Kahan-compensated streaming sum

Track the bits each addition rounds away in a separate compensation variable and fold them back on the next op:

class SMAKahan {
  constructor(period) {
    this.period = period;
    this.arr = [];
    this.sum = 0;
    this.c = 0; // compensation
    this.filled = false;
  }
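  _add(x) {
    const y = x - this.c; // apply the correction carried over from the last add
    const t = this.sum + y; // low-order bits of y may be rounded away here
    // (t - this.sum) is what was actually added; its difference from y
    // is the rounding error, kept for the next call.
    this.c = (t - this.sum) - y;
    this.sum = t;
  }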
  nextValue(value) {
    this.filled = this.filled || this.arr.length === this.period;
    this.arr.push(value);
    if (this.filled) {
      this._add(-this.arr.shift());
      this._add(value);
      return this.sum / this.period;
    }
    this._add(value);
  }
}

Cost: ~4 ops per add instead of 1. Error growth becomes effectively independent of the number of bars N, instead of linear in N in the worst case. Handles most cases.
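The effect of the compensation term is easy to see on a toy input. The sketch below sums 0.1 one million times with and without the Kahan correction and compares both against 100,000. The functions `naiveSum` and `kahanSum` are illustrative helpers written for this example, not part of the SMA class.

```javascript
// Sum the same value n times without compensation.
function naiveSum(x, n) {
  let sum = 0;
  for (let i = 0; i < n; i++) sum += x;
  return sum;
}

// Same loop with Kahan compensation.
function kahanSum(x, n) {
  let sum = 0;
  let c = 0; // rounding error carried between iterations
  for (let i = 0; i < n; i++) {
    const y = x - c;
    const t = sum + y;
    c = (t - sum) - y;
    sum = t;
  }
  return sum;
}

const n = 1_000_000;
const naiveErr = Math.abs(naiveSum(0.1, n) - 100_000);
const kahanErr = Math.abs(kahanSum(0.1, n) - 100_000);
console.log({ naiveErr, kahanErr }); // kahanErr is far smaller than naiveErr
```

The same mechanism is what keeps the streaming SMA’s accumulator from wandering, at the cost of the extra three operations per add.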

Neumaier (improved Kahan)

Neumaier’s variant handles the edge case where the value being added is larger in magnitude than the accumulator. Same cost as Kahan, strictly better:

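_add(x) {
  const t = this.sum + x;
  if (Math.abs(this.sum) >= Math.abs(x)) {
    // sum dominates: the low-order bits of x were rounded away
    this.c += (this.sum - t) + x;
  } else {
    // x dominates: the low-order bits of sum were rounded away
    this.c += (x - t) + this.sum;
  }
  this.sum = t;
}
// Final corrected sum: this.sum + this.c, so nextValue should
// return (this.sum + this.c) / this.period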

If you’re rewriting anyway, use Neumaier. Same complexity, and the only extra step is adding this.c back in when you read the sum.
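Dropped into the streaming class, the Neumaier update might look like the following. This is a sketch under the assumption that the class keeps the same shape and warm-up behaviour as the SMA and SMAKahan classes above (first value on bar period + 1); the name `SMANeumaier` is introduced here for illustration.

```javascript
class SMANeumaier {
  constructor(period) {
    this.period = period;
    this.arr = [];
    this.sum = 0;
    this.c = 0; // accumulated low-order bits that this.sum has lost
    this.filled = false;
  }
  _add(x) {
    const t = this.sum + x;
    if (Math.abs(this.sum) >= Math.abs(x)) {
      this.c += (this.sum - t) + x; // bits of x were rounded away
    } else {
      this.c += (x - t) + this.sum; // bits of sum were rounded away
    }
    this.sum = t;
  }
  nextValue(value) {
    this.filled = this.filled || this.arr.length === this.period;
    this.arr.push(value);
    if (this.filled) {
      this._add(-this.arr.shift());
      this._add(value);
      // The correction is applied only when the sum is read.
      return (this.sum + this.c) / this.period;
    }
    this._add(value);
  }
}
```

Unlike Kahan, the correction term is never folded back into `this.sum`, so every read of the sum has to include `this.c`.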

Pairwise (cascade) summation

Maintain a small binary tree of partial sums instead of a single accumulator. Adds are O(log N), error growth is O(log N) instead of O(N). Useful when inputs are pathological (alternating signs, residual sums) but overkill for a plain SMA.
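One way to realise the tree for a sliding window is a fixed-size segment tree over a circular buffer, where each new close overwrites the oldest leaf and only the path to the root is recomputed. The sketch below is one possible layout, not a canonical one; `SMAPairwise` and its fields are names chosen here, and the warm-up mirrors the SMA class above (first value on bar period + 1).

```javascript
class SMAPairwise {
  constructor(period) {
    this.period = period;
    // Leaves live at indices [size, 2 * size); unused leaves stay 0.
    this.size = 1;
    while (this.size < period) this.size *= 2;
    this.tree = new Float64Array(2 * this.size);
    this.pos = 0; // next leaf slot to overwrite (circular)
    this.count = 0; // bars seen so far
  }
  nextValue(value) {
    let i = this.size + this.pos;
    this.tree[i] = value; // overwrite the oldest close
    // Recompute every ancestor of the changed leaf: O(log period).
    for (i >>= 1; i >= 1; i >>= 1) {
      this.tree[i] = this.tree[2 * i] + this.tree[2 * i + 1];
    }
    this.pos = (this.pos + 1) % this.period;
    this.count++;
    // tree[1] is the pairwise sum of the current window.
    if (this.count > this.period) return this.tree[1] / this.period;
  }
}
```

Because the root is always rebuilt from the values currently in the window, no rounding error is carried from one bar to the next; the price is the extra memory and the logarithmic update.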

Hybrid: streaming + periodic resum

The pragmatic answer. Use the streaming sum (Kahan or naive), but every K bars (e.g., 10,000) recompute the sum from scratch over the actual window. Resets the error.

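// Assumes the constructor also initialises this.tick = 0;
// without that, this.tick++ yields NaN and the resum never fires.
nextValue(value) {
  this.filled = this.filled || this.arr.length === this.period;
  this.arr.push(value);
  if (this.filled) {
    this.sum -= this.arr.shift();
    this.sum += value;
    this.tick++;
    if (this.tick % 10_000 === 0) {
      // Rebuild the sum from the window, discarding accumulated drift.
      // With a Kahan accumulator, reset this.c = 0 here as well.
      this.sum = this.arr.reduce((a, b) => a + b, 0);
    }
    return this.sum / this.period;
  }
  this.sum += value;
}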

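Cost: each resum is O(period) and runs once every K bars, so the amortised overhead is O(period/K) per bar; for period 200 and K=10k that is negligible. Bounds the worst-case error to whatever accumulates within K bars. Combine with Kahan for belt-and-suspenders.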

This is the most realistic answer for production code: simple to reason about, bounded by construction, no exotic math.
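For the belt-and-suspenders combination, the two ideas compose directly. The following sketch assumes the same class shape as the SMA and SMAKahan classes above; `SMAKahanHybrid` and the `resumEvery` parameter are names introduced here, with 10,000 as the default only because that is the example value used in the text.

```javascript
class SMAKahanHybrid {
  constructor(period, resumEvery = 10_000) {
    this.period = period;
    this.resumEvery = resumEvery; // K: bars between full recomputations
    this.arr = [];
    this.sum = 0;
    this.c = 0; // Kahan compensation
    this.tick = 0; // bars processed since the window filled
    this.filled = false;
  }
  _add(x) {
    const y = x - this.c;
    const t = this.sum + y;
    this.c = (t - this.sum) - y;
    this.sum = t;
  }
  nextValue(value) {
    this.filled = this.filled || this.arr.length === this.period;
    this.arr.push(value);
    if (this.filled) {
      this._add(-this.arr.shift());
      this._add(value);
      this.tick++;
      if (this.tick % this.resumEvery === 0) {
        // Rebuild from the window and drop the stale compensation.
        this.sum = this.arr.reduce((a, b) => a + b, 0);
        this.c = 0;
      }
      return this.sum / this.period;
    }
    this._add(value);
  }
}
```

Resetting `this.c` alongside `this.sum` matters: a compensation term left over from the old sum would otherwise be applied to the freshly rebuilt one.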

Speed vs precision — what to actually use

| Method | Cost vs naive streaming | Error growth | Notes |
|---|---|---|---|
| Naive streaming sum | 1× (baseline) | O(N) | Don’t, on long series |
| Kahan | ~4× | ~O(1) | Sufficient for most cases |
| Neumaier | ~4× | ~O(1) | Strictly better than Kahan |
| Pairwise | ~log N× | O(log N) | Overkill for SMA |
| Hybrid (streaming + resum every 10k) | ~1× amortised | bounded by K | Best practical default |
| Slow resum every bar | ~N× | 0 | For verification only |

Beyond SMA

The same problem applies to:

  • EMA: exponential decay damps old contributions, rounding errors included, so it is far less exposed than the SMA, but it is still worth checking against a reference on long runs with extreme prices.
  • Variance / std dev / Bollinger: naive E[X²] - E[X]² is a precision disaster on BTC, because it subtracts two huge, nearly equal numbers. Use Welford’s online algorithm, which accumulates squared deviations from the running mean and so avoids that cancellation.
  • Rolling correlation / covariance: same family.
  • PnL aggregation in optimisation: summing micro-PnLs across thousands of generations of millions of trades. Compensate everywhere.
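The variance case is worth seeing side by side. The sketch below compares the naive E[X²] - E[X]² formula with Welford’s update on four values near 10⁹, a standard stress input chosen here because the arithmetic can be checked by hand; they are not prices. Both function names are illustrative, and this is the expanding (whole-series) form of Welford: a rolling Bollinger window would additionally need a step that removes the value leaving the window.

```javascript
// Naive population variance: E[X^2] - E[X]^2.
function naiveVariance(xs) {
  let sum = 0;
  let sumSq = 0;
  for (const x of xs) {
    sum += x;
    sumSq += x * x; // squares near 1e18 exceed 2^53: low digits are lost
  }
  const mean = sum / xs.length;
  return sumSq / xs.length - mean * mean;
}

// Welford's online algorithm: tracks the running mean and the sum of
// squared deviations from it (m2), never the raw squares.
function welfordVariance(xs) {
  let n = 0;
  let mean = 0;
  let m2 = 0;
  for (const x of xs) {
    n++;
    const delta = x - mean;
    mean += delta / n;
    m2 += delta * (x - mean);
  }
  return m2 / n; // population variance
}

const xs = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16];
console.log(welfordVariance(xs)); // 22.5
console.log(naiveVariance(xs)); // not 22.5: cancellation wipes out the signal
```

The true population variance of these four values is 22.5, which Welford reproduces exactly here, while the naive formula can only return values that the coarse spacing of doubles near 10¹⁸ allows.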

If an indicator implementation maintains any long-lived float accumulator, it needs to be audited.

A Backticks backtest report: BTCUSDT 1h with smaTrend (SMA-50), Bollinger upper / middle / lower, and RSI(14) — every indicator value lines up with the slow-resum reference at every bar.

How Backticks handles this

Every indicator in the Backticks node catalogue — SMA, EMA, RSI, MACD, Bollinger, ATR, Stochastic, SuperTrend, Aroon, PSAR — is implemented with the long-horizon precision profile in mind:

  • Compensated accumulators (Neumaier-class) for any rolling sum.
  • Welford’s algorithm for variance / std-dev–derived indicators (Bollinger, Z-score, etc.).
  • Periodic-resum hybrid as a default safety net for the running-window family.
  • Float drift is tested against a slow-resum reference on a long BTC m1 corpus as part of indicator regression suites.

When a strategy in Backticks reads an SMA(200) value 200,000 bars deep, the value matches the slow-resum reference to within float precision — not “close enough plus accumulated noise”. That stability is what makes parameter search on long histories meaningful: the optimiser fits real strategy structure, not floating-point ghosts.

Audit checklist for any backtester

If you’re using something other than Backticks (or you’ve rolled your own indicators), here’s the audit:

  1. For every streaming indicator, write the slow resum reference.
  2. Run both on a long real series (1+ year of BTC m1 — pathological enough to expose bugs fast).
  3. Compute max absolute diff and the count of sign-flip events. A running max never decreases, so look at how it evolves: if it keeps climbing as bar count increases instead of levelling off, the error is unbounded.
  4. Replace the naive accumulator with Kahan / Neumaier / hybrid resum.
  5. Re-run, confirm diff is bounded and small enough that strategy logic doesn’t see phantom signals.
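The checklist can be wrapped in a small harness. The sketch below assumes the indicator under test exposes `nextValue(value)` and returns `undefined` while warming up, as the classes in this article do; `auditStreamingSMA`, `makeNaiveSMA` and the checkpoint interval are names and defaults invented for this example.

```javascript
// Compare a streaming SMA against a from-scratch reference and record
// how the maximum difference evolves with bar count.
function auditStreamingSMA(makeIndicator, closes, period, checkpointEvery = 50_000) {
  const indicator = makeIndicator(period);
  let maxAbsDiff = 0;
  let signFlips = 0;
  let prevSign = 0;
  const checkpoints = [];
  for (let i = 0; i < closes.length; i++) {
    const fast = indicator.nextValue(closes[i]);
    if (fast !== undefined) {
      let sum = 0; // slow reference: resum the whole window
      for (let j = i - period + 1; j <= i; j++) sum += closes[j];
      const d = fast - sum / period;
      maxAbsDiff = Math.max(maxAbsDiff, Math.abs(d));
      const sign = Math.sign(d);
      if (sign !== 0 && prevSign !== 0 && sign !== prevSign) signFlips++;
      if (sign !== 0) prevSign = sign;
    }
    if ((i + 1) % checkpointEvery === 0) {
      checkpoints.push({ bars: i + 1, maxAbsDiff });
    }
  }
  return { maxAbsDiff, signFlips, checkpoints };
}

// Usage with a deliberately simple streaming SMA as the indicator under audit.
function makeNaiveSMA(period) {
  const window = [];
  let sum = 0;
  return {
    nextValue(value) {
      window.push(value);
      sum += value;
      if (window.length > period) sum -= window.shift();
      return window.length === period ? sum / period : undefined;
    },
  };
}

const report = auditStreamingSMA(makeNaiveSMA, [1, 2, 3, 4, 5, 6], 3, 2);
console.log(report);
```

On a real series, the `checkpoints` array is the part to read: a `maxAbsDiff` that is still rising at the last checkpoint points to unbounded drift, while one that flattens early suggests the accumulator is under control.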

An afternoon’s work that heads off a class of bug that’s invisible until it isn’t.