Sensor Harmonization
Sensor harmonization refers to the set of techniques that enable the Brain Foundation Model to generalise across different EEG hardware and electrode configurations without requiring full retraining.
The Heterogeneity Problem
EEG recordings in the real world come from a wide variety of sources, and these sources differ along two independent axes:
Hardware axis (device heterogeneity): Different amplifiers have different noise characteristics, input impedance, sampling rates, and gain settings. A signal recorded on a research-grade Neuroscan system differs systematically from the same brain activity recorded on a consumer Emotiv Epoc X, even with identical electrode placement.
Spatial axis (topology heterogeneity): Different headsets place electrodes at different positions on the scalp and use different numbers of channels. A model expecting a 32-channel extended 10-20 layout cannot directly process 4-channel Muse S data (TP9, AF7, AF8, TP10, referenced to Fpz), even if both record the same participant.
Both axes must be addressed for the Brain FM to be practically deployable across diverse operational environments, where sensor choice is often dictated by comfort, cost, or task constraints rather than by research standards.
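To make the hardware axis concrete, the sketch below brings two recordings with different sampling rates and gains onto a common footing by resampling to a shared rate and z-scoring each channel. It is only an illustration under stated assumptions: the function name `harmonize_recording`, the 128 Hz target rate, and the synthetic 10 Hz signal are hypothetical and not part of the Brain FM. Linear interpolation is used to keep the example dependency-free; it applies no anti-aliasing filter, so a real pipeline would normally use a filtered resampler when downsampling.

```python
import numpy as np

def harmonize_recording(x, fs_in, fs_out=128.0):
    """Resample a (channels, samples) array to fs_out and z-score each channel.

    Assumption: linear interpolation stands in for a proper anti-aliased
    resampler; fs_out=128.0 is an arbitrary illustrative target rate.
    """
    x = np.asarray(x, dtype=float)
    n_in = x.shape[1]
    n_out = int(round(n_in / fs_in * fs_out))
    t_in = np.arange(n_in) / fs_in      # original sample times in seconds
    t_out = np.arange(n_out) / fs_out   # target sample times in seconds
    resampled = np.stack([np.interp(t_out, t_in, ch) for ch in x])
    # Per-channel z-scoring removes device-specific gain and offset.
    mean = resampled.mean(axis=1, keepdims=True)
    std = resampled.std(axis=1, keepdims=True)
    return (resampled - mean) / np.maximum(std, 1e-12)

# Same synthetic 10 Hz activity "recorded" by two hypothetical devices.
t_a = np.arange(512) / 256.0                            # device A: 256 Hz, unit gain
t_b = np.arange(256) / 128.0                            # device B: 128 Hz, 1e6 gain
x_a = np.sin(2 * np.pi * 10 * t_a)[None, :]
x_b = 1e6 * np.sin(2 * np.pi * 10 * t_b)[None, :]

a = harmonize_recording(x_a, fs_in=256.0)
b = harmonize_recording(x_b, fs_in=128.0)
```

After this step both arrays have the same shape and scale, so gain and sampling-rate differences no longer dominate the comparison. Device-specific noise characteristics are not removed by this kind of normalisation.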
Two Harmonization Goals
The Brain FM addresses these axes through two complementary properties:
| Property | Addresses | Mechanism |
|---|---|---|
| Device Agnosticism | Hardware heterogeneity | Normalisation, multi-hardware training, augmentation |
| Channel Topology Agnosticism | Spatial heterogeneity | Permutation-equivariant attention, coordinate-based positional encodings |
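The two topology mechanisms in the table can be illustrated with a small NumPy sketch: each channel becomes a token whose position is encoded from its 3-D scalp coordinates rather than from its index, and a single self-attention layer over those tokens is permutation-equivariant. Everything here is an assumption made for illustration: the Fourier-feature encoding, the helper names `coord_encoding` and `self_attention`, the random weights, and the random electrode positions are hypothetical and do not describe the Brain FM's actual architecture.

```python
import numpy as np

def coord_encoding(xyz, n_freqs=4):
    """Fourier features of 3-D electrode coordinates.

    (channels, 3) -> (channels, 6 * n_freqs). Hypothetical encoding choice.
    """
    xyz = np.asarray(xyz, dtype=float)
    freqs = 2.0 ** np.arange(n_freqs)                 # 1, 2, 4, 8
    angles = xyz[:, :, None] * freqs                  # (channels, 3, n_freqs)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(len(xyz), -1)

def self_attention(tokens, wq, wk, wv):
    """Single-head self-attention over the channel axis (no index-based terms)."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)

# Hypothetical electrode positions on the unit sphere (not a real montage).
xyz = rng.normal(size=(5, 3))
xyz /= np.linalg.norm(xyz, axis=1, keepdims=True)
signal_feats = rng.normal(size=(5, 8))                # per-channel signal features

# Each token = signal features + coordinate encoding of that electrode.
tokens = np.concatenate([signal_feats, coord_encoding(xyz)], axis=1)
d = tokens.shape[1]
wq, wk, wv = (rng.normal(size=(d, 16)) for _ in range(3))

out = self_attention(tokens, wq, wk, wv)

# Reordering the channels reorders the outputs in the same way.
perm = rng.permutation(5)
out_perm = self_attention(tokens[perm], wq, wk, wv)

# The same weights also accept a different channel count.
xyz32 = rng.normal(size=(32, 3))
tokens32 = np.concatenate([rng.normal(size=(32, 8)), coord_encoding(xyz32)], axis=1)
out32 = self_attention(tokens32, wq, wk, wv)
```

Because position information travels with each token, the layer does not depend on channel order or on a fixed channel count, which is the property the table refers to as channel topology agnosticism.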
Why Harmonization Matters for Deployment
In operational environments, the available EEG hardware will change over time as sensor technology advances. Without harmonization, a Brain FM trained only on 32-channel research-grade EEG could not be reused when a deployment moves to 8-channel consumer headsets.
Sensor harmonization future-proofs the foundation model: as long as the new sensor measures EEG (or a related biosignal), the pre-trained encoder can be adapted with minimal retraining, rather than discarding all learned representations.
Relationship to Pre-training Strategy
The most effective approach to sensor harmonization is generally to train on data from diverse hardware from the outset. A model that has seen 5-channel, 8-channel, 14-channel, and 32-channel recordings during pre-training is pushed to learn representations that do not depend on a fixed channel count. This is preferable to post-hoc harmonization (e.g. interpolating channels onto a fixed montage), because interpolation reconstructs signals that were never measured and can introduce artifacts.
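One way to mix recordings with different channel counts in a single pre-training batch, without interpolating any channels, is to pad to the largest channel count and carry a mask that marks the real channels. The sketch below is an assumption about how this could be done, not a description of the Brain FM's data pipeline; `pad_batch` and `masked_mean_pool` are hypothetical helper names, and the random arrays stand in for per-channel features.

```python
import numpy as np

def pad_batch(recordings):
    """Stack (n_channels_i, dim) arrays into a zero-padded batch plus a channel mask."""
    max_ch = max(r.shape[0] for r in recordings)
    dim = recordings[0].shape[1]
    batch = np.zeros((len(recordings), max_ch, dim))
    mask = np.zeros((len(recordings), max_ch), dtype=bool)
    for i, r in enumerate(recordings):
        batch[i, : r.shape[0]] = r
        mask[i, : r.shape[0]] = True      # True marks a real (non-padded) channel
    return batch, mask

def masked_mean_pool(batch, mask):
    """Average over real channels only -> (n_recordings, dim)."""
    weights = mask[:, :, None].astype(float)
    return (batch * weights).sum(axis=1) / weights.sum(axis=1)

rng = np.random.default_rng(0)
# Hypothetical per-channel features for 5-, 8-, 14-, and 32-channel recordings.
recordings = [rng.normal(size=(n_ch, 16)) for n_ch in (5, 8, 14, 32)]

batch, mask = pad_batch(recordings)
pooled = masked_mean_pool(batch, mask)    # one fixed-size vector per recording
```

Every recording yields a vector of the same size regardless of its channel count, and padded positions never contribute to it. In an attention-based encoder the same mask would typically also be used to stop real channels from attending to padding.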