This post is a working draft where I summarize key ideas, practical tricks, and findings from my two papers. I will update this page progressively with experiments and insights.
Data augmentation improves model generalization by training on transformed samples that preserve the label-relevant structure of the data. In forecasting settings, this can be viewed as sampling from a transformation family $T_\theta$ and optimizing the expected risk:
\[ \min_f \; \mathbb{E}_{(x,y)\sim \mathcal{D}}\;\mathbb{E}_{\theta\sim p(\theta)}\left[\ell\big(f(T_\theta(x)), y\big)\right]. \]
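As a minimal sketch of this objective, the inner expectation over $\theta$ can be estimated by Monte Carlo sampling. The names here (`augmented_risk`, `sample_theta`, `apply_T`) are illustrative, not from the papers:

```python
import numpy as np

def augmented_risk(f, loss, X, y, sample_theta, apply_T, n_samples=10):
    """Monte Carlo estimate of E_theta[ loss(f(T_theta(X)), y) ]
    for a fixed batch (X, y). Illustrative helper."""
    total = 0.0
    for _ in range(n_samples):
        theta = sample_theta()                  # draw theta ~ p(theta)
        total += loss(f(apply_T(X, theta)), y)  # risk under this draw
    return total / n_samples
```

Minimizing this estimate over $f$ (e.g. with SGD over mini-batches) approximates the expected-risk objective above.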
import numpy as np

def jitter(x, sigma=0.02):
    """Add Gaussian noise to a series (shape-preserving jitter)."""
    noise = np.random.normal(0.0, sigma, size=x.shape)
    return x + noise

def scaling(x, low=0.9, high=1.1):
    """Multiply a series by a single random global scaling factor."""
    factor = np.random.uniform(low, high)
    return x * factor
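In practice, such transforms are usually composed stochastically rather than applied one at a time. A minimal sketch of that idea, with the `augment` helper and the application probability `p` being my assumptions rather than anything from the papers:

```python
import numpy as np

def augment(x, transforms, p=0.5, seed=None):
    """Apply each transform independently with probability p.
    Illustrative composition helper, not from the papers."""
    rng = np.random.default_rng(seed)
    out = np.asarray(x, dtype=float)
    for t in transforms:
        if rng.random() < p:
            out = t(out)
    return out

# Example: compose jitter- and scaling-style transforms on a toy series.
transforms = [
    lambda s: s + np.random.normal(0.0, 0.02, size=s.shape),  # jitter
    lambda s: s * np.random.uniform(0.9, 1.1),                # scaling
]
series = np.sin(np.linspace(0, 2 * np.pi, 64))
augmented = augment(series, transforms, p=0.8)
```

Because each transform fires independently, the same pipeline yields clean, singly-transformed, and doubly-transformed samples across epochs.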
I will add concise bullet-point takeaways from both papers here, including which augmentations worked best under distribution shift, and when they did not help.