Preprocessing Pipeline
Before data enters the brain foundation model, raw sensor signals must be transformed into a standardised, denoised representation. The pipeline consists of three stages, each of which is modular and replaceable:
- Stage 1: Artifact Rejection and Denoising
- Stage 2: Bandpass Filtering
- Stage 3: Windowing and Normalisation
Artifact Rejection and Denoising
Raw physiological signals are corrupted by a variety of artifacts:
EEG artifacts:
- Ocular artifacts: eye blinks and saccades produce large slow potentials at frontal electrodes. Removed via independent component analysis (ICA) or regression-based eye-movement correction.
- Muscle artifacts: electromyographic noise from jaw clenching or facial movements. Addressed by high-frequency filtering and ICA.
- Motion artifacts: electrode movement during physical activity. Particularly relevant for WAUC.
- Channel dropouts: individual electrodes with poor contact produce flat or saturated signals. Detected and interpolated from neighbouring channels.
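As a rough illustration of the channel-dropout step, the sketch below flags flat or saturated channels and replaces them with the mean of their good neighbours. The function names, the thresholds `flat_std` and `sat_fraction`, and the `neighbours` lookup are all assumptions made for this example. Neighbour averaging is a deliberately simple stand-in, and a production pipeline may prefer a montage-aware method such as spherical spline interpolation.

```python
import numpy as np

def find_bad_channels(eeg, flat_std=1e-7, sat_fraction=0.2):
    """Return indices of channels that look flat or saturated.

    eeg: array of shape (n_channels, n_samples).
    flat_std and sat_fraction are illustrative thresholds that depend on
    the recording units and hardware; they are not tuned values.
    """
    # Flat channel: almost no variation over the segment.
    flat = eeg.std(axis=1) < flat_std
    # Saturated channel: a large share of samples sit exactly at the
    # channel's minimum or maximum, as happens when an ADC clips.
    lo = eeg.min(axis=1, keepdims=True)
    hi = eeg.max(axis=1, keepdims=True)
    at_rail = (eeg == lo) | (eeg == hi)
    saturated = at_rail.mean(axis=1) > sat_fraction
    return np.flatnonzero(flat | saturated)

def interpolate_bad_channels(eeg, bad, neighbours):
    """Replace each bad channel with the mean of its good neighbours.

    neighbours: dict mapping a channel index to a list of adjacent channel
    indices (a hypothetical, montage-specific lookup supplied by the caller).
    """
    repaired = eeg.copy()
    bad_set = {int(b) for b in bad}
    for ch in bad_set:
        good = [n for n in neighbours.get(ch, []) if n not in bad_set]
        if good:  # leave the channel untouched if no good neighbour exists
            repaired[ch] = eeg[good].mean(axis=0)
    return repaired
```

A channel whose neighbours are all bad is left unchanged here; in practice such a segment would more likely be rejected outright.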
PPG artifacts:
- Motion artifacts from wrist movement dominate PPG noise. Accelerometry data from the same wristband is typically used in adaptive filtering to subtract motion-correlated components.
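One common form of the adaptive filtering mentioned above is a normalised least-mean-squares (NLMS) noise canceller, sketched below under simplifying assumptions: a single accelerometer axis (or the acceleration magnitude) serves as the reference, and the function name, tap count, and step size are illustrative choices rather than values taken from this pipeline.

```python
import numpy as np

def nlms_motion_cancel(ppg, accel, n_taps=8, mu=0.1, eps=1e-8):
    """Subtract the accelerometer-predictable part of a PPG signal.

    ppg:   array of shape (n_samples,)
    accel: array of shape (n_samples,), one accelerometer axis or magnitude
    Returns the error signal, i.e. the PPG with the motion estimate removed.
    """
    ppg = np.asarray(ppg, dtype=float)
    accel = np.asarray(accel, dtype=float)
    weights = np.zeros(n_taps)
    cleaned = np.zeros_like(ppg)
    for t in range(len(ppg)):
        # Most recent reference samples, newest first, zero-padded at the start.
        k = min(n_taps, t + 1)
        x = np.zeros(n_taps)
        x[:k] = accel[t - k + 1:t + 1][::-1]
        # Current estimate of the motion component in the PPG.
        motion_estimate = weights @ x
        error = ppg[t] - motion_estimate
        # NLMS update: the step is scaled by the reference signal power.
        weights += mu * error * x / (x @ x + eps)
        cleaned[t] = error
    return cleaned
```

The filter can only remove components that are linearly predictable from the reference, so the cardiac pulse survives as long as it is uncorrelated with the wrist motion.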
Bandpass Filtering
EEG signals are filtered to retain clinically and cognitively meaningful frequency bands:
| Band | Range | Cognitive Association |
|---|---|---|
| Delta | 0.5 – 4 Hz | Deep sleep, unconscious processing |
| Theta | 4 – 8 Hz | Cognitive load, working memory, frontal theta |
| Alpha | 8 – 13 Hz | Relaxed wakefulness; suppressed during tasks |
| Beta | 13 – 30 Hz | Active concentration, motor activity |
| Gamma | 30 – 100 Hz | High-level processing, binding |
For cognitive workload monitoring, theta (frontal) and alpha (occipital) bands are most informative.
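The band definitions in the table can be applied with a standard zero-phase Butterworth filter. The following is a minimal sketch using SciPy; the filter order, the helper names `bandpass` and `band_power`, and the choice of a Butterworth design are assumptions for illustration, not a statement of the filter this pipeline uses.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

# Band edges in Hz, copied from the table above.
EEG_BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 100.0),
}

def bandpass(eeg, fs, low, high, order=4):
    """Zero-phase Butterworth band-pass along the last (time) axis."""
    nyquist = fs / 2.0
    if not 0 < low < high < nyquist:
        raise ValueError("band edges must satisfy 0 < low < high < fs/2")
    # Second-order sections stay numerically stable for narrow, low bands.
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    # Forward-backward filtering cancels the phase delay.
    return sosfiltfilt(sos, eeg, axis=-1)

def band_power(eeg, fs, band):
    """Mean squared amplitude per channel within one named band."""
    low, high = EEG_BANDS[band]
    filtered = bandpass(eeg, fs, low, high)
    return np.mean(filtered ** 2, axis=-1)
```

For example, `band_power(eeg, fs=256.0, band="theta")` returns one value per channel for an array of shape `(n_channels, n_samples)`. Note that the gamma band as defined requires a sampling rate above 200 Hz, since its upper edge must lie below the Nyquist frequency.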
Windowing and Normalisation
Windowing
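Continuous recordings are segmented into overlapping windows. Window length governs a trade-off: longer windows give finer frequency resolution, which is needed to resolve the low-frequency bands, while shorter windows make it more plausible that the cognitive state is stationary within the window. Typical window lengths are 2–10 seconds with 50% overlap.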
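The segmentation can be sketched as follows. The function name, the `(n_channels, n_samples)` input layout, and the 4-second default are assumptions for this example. As a reference point for the trade-off, a window of T seconds gives a frequency resolution of 1/T Hz, so a 4-second window resolves 0.25 Hz.

```python
import numpy as np

def sliding_windows(recording, fs, window_s=4.0, overlap=0.5):
    """Split (n_channels, n_samples) into (n_windows, n_channels, window_len).

    window_s is the window length in seconds; overlap is the fraction of
    each window shared with the next one (0.5 means 50% overlap).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    window_len = int(round(window_s * fs))
    # Hop between window starts; at least one sample so the loop advances.
    step = max(1, int(round(window_len * (1.0 - overlap))))
    n_channels, n_samples = recording.shape
    if n_samples < window_len:
        return np.empty((0, n_channels, window_len), dtype=recording.dtype)
    starts = range(0, n_samples - window_len + 1, step)
    return np.stack([recording[:, s:s + window_len] for s in starts])
```

Trailing samples that do not fill a complete window are dropped in this sketch; padding the final window is an alternative if every sample must be covered.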
Normalisation
Each window is independently normalised to remove slow drift and inter-subject amplitude differences. Common options are:
- Per-channel z-score normalisation within each window.
- Robust normalisation using median absolute deviation to resist outlier contamination.
- Riemannian covariance-matrix normalisation, which is inherently scale-invariant and hardware-agnostic.
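The first two options can be sketched in a few lines of NumPy. The function names and the small `eps` guard against division by zero are assumptions for this example; the Riemannian variant is not shown because it operates on covariance matrices rather than on raw samples.

```python
import numpy as np

def zscore_per_channel(window, eps=1e-8):
    """Scale each channel of (n_channels, n_samples) to zero mean, unit variance."""
    mean = window.mean(axis=-1, keepdims=True)
    std = window.std(axis=-1, keepdims=True)
    return (window - mean) / (std + eps)

def robust_per_channel(window, eps=1e-8):
    """Centre each channel on its median and scale by its MAD."""
    median = np.median(window, axis=-1, keepdims=True)
    mad = np.median(np.abs(window - median), axis=-1, keepdims=True)
    # 1.4826 makes the MAD comparable to a standard deviation for Gaussian data.
    return (window - median) / (1.4826 * mad + eps)
```

Because the median and the median absolute deviation are barely affected by a handful of extreme samples, the robust variant keeps a residual artifact spike from compressing the rest of the window, which is the failure mode of the plain z-score.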