Channel Topology Agnosticism
The Problem
EEG headsets differ not only in hardware quality but in the number and spatial placement of electrodes. A model trained on 32-channel 10-20 layout data cannot directly process 5-channel Muse S data, and vice versa:
| Headset | Channels | Example Positions |
|---|---|---|
| Research-grade (32 ch) | 32 | Full 10-20 international system |
| Emotiv Epoc X | 14 | AF3/4, F7/8, F3/4, FC5/6, T7/8, P7/8, O1/2 |
| Emotiv Insight | 5 | AF3/4, T7/8, Pz |
| Muse S | 5 | AF7/8, TP9/10, Fpz |
| Neuroelectrics Enobio 8 | 8 | Fp1/2, AF7/8, P3/4, T9/10 |
Classical approaches work around this by either interpolating missing channels to a standard montage (introducing artefacts) or training separate models per headset (requiring separate labelled datasets for each).
What Topology Agnosticism Means
A topology-agnostic model maps any variable-length set of electrode channels into the same latent space, while satisfying two key mathematical properties:
- Permutation equivariance - the output representation does not change (up to permutation of features) if the order of input channels is permuted. The representation of electrode set \(\{e_1, e_2, e_3\}\) is identical to \(\{e_3, e_1, e_2\}\) after accounting for the permutation. This means the model cannot exploit arbitrary channel ordering as a cue.
- Variable input cardinality - the architecture accepts any number of channels \(n \in \{1, 2, \ldots, N_{\max}\}\) without modification or padding.
Together, these properties mean the model treats electrodes as an unordered set rather than a fixed-size ordered vector.
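A minimal Deep Sets-style sketch makes both properties concrete. A random linear map stands in for a learned per-channel encoder, and mean pooling over the channel axis yields a representation that is order-invariant and defined for any channel count (all dimensions here are illustrative, not from any particular model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random linear map standing in for a trained per-channel encoder.
D_IN, D_LAT = 16, 8  # illustrative feature / latent dimensions
W = rng.standard_normal((D_IN, D_LAT))

def encode_set(channels: np.ndarray) -> np.ndarray:
    """Map an (n_channels, D_IN) array to one (D_LAT,) latent vector.

    Mean pooling over channels makes the output invariant to channel
    order and well-defined for any number of channels n >= 1.
    """
    return (channels @ W).mean(axis=0)

# Variable cardinality: 5- and 14-channel inputs use the same model.
x5 = rng.standard_normal((5, D_IN))
x14 = rng.standard_normal((14, D_IN))
z5, z14 = encode_set(x5), encode_set(x14)
assert z5.shape == z14.shape == (D_LAT,)

# Permutation invariance: shuffling channel order leaves the output unchanged.
perm = rng.permutation(5)
assert np.allclose(encode_set(x5), encode_set(x5[perm]))
```

Pooling gives full permutation *invariance* of the set-level output; the attention-based models discussed below instead keep per-channel outputs that are permutation *equivariant*.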
Technical Requirements
Spatial Position Encoding
Permutation equivariance alone is insufficient: the model must also know where each electrode is on the scalp. Without spatial information, an electrode at AF7 (frontal left) and an electrode at O1 (occipital left) are indistinguishable.
The solution is to condition each electrode's representation on its 3D scalp coordinates \((x, y, z)\), derived from the international 10-20 system. These coordinates are projected through a learned embedding, for example a small network that maps \((x, y, z)\) to the model's token dimension.
This positional embedding is added to the channel's signal representation before attention, ensuring that spatial information is preserved while the attention mechanism remains permutation-equivariant.
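A sketch of this conditioning, with a random projection standing in for the learned embedding and rough, hand-picked coordinates for AF7 and O1 (both the weights and the coordinate values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

D = 8  # token dimension (illustrative)
# Random weights stand in for a learned projection of scalp coordinates.
W_pos = rng.standard_normal((3, D))

def positional_embedding(xyz: np.ndarray) -> np.ndarray:
    """Project (n_channels, 3) scalp coordinates to (n_channels, D)."""
    return xyz @ W_pos

# Two electrodes carrying an identical signal but at different positions.
signal_tokens = np.tile(rng.standard_normal(D), (2, 1))  # same content
coords = np.array([[-0.3, 0.9, 0.2],     # roughly AF7 (frontal left)
                   [-0.3, -0.9, 0.1]])   # roughly O1 (occipital left)
tokens = signal_tokens + positional_embedding(coords)

# Without the positional term the two tokens are indistinguishable;
# with it, frontal and occipital electrodes get distinct representations.
assert np.allclose(signal_tokens[0], signal_tokens[1])
assert not np.allclose(tokens[0], tokens[1])
```

Because the embedding depends only on each electrode's physical coordinates, not on its index in the input tensor, adding it does not break permutation equivariance.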
Permutation-Equivariant Attention
Multi-head self-attention over channels is inherently permutation equivariant when position encodings are tied to physical coordinates rather than sequence indices. The attention weights between channels \(i\) and \(j\) depend on their content and spatial positions, not on whether electrode \(i\) appeared before electrode \(j\) in the input tensor.
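The equivariance claim can be checked directly: for single-head self-attention with no sequence-index encoding, permuting the input rows permutes the output rows identically. A minimal NumPy demonstration (random weights, single head for clarity):

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(a: np.ndarray, axis: int = -1) -> np.ndarray:
    e = np.exp(a - a.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

D = 8  # token dimension (illustrative)
Wq, Wk, Wv = (rng.standard_normal((D, D)) for _ in range(3))

def self_attention(X: np.ndarray) -> np.ndarray:
    """Single-head self-attention over channel tokens; no index encoding."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(D), axis=-1)
    return A @ V

X = rng.standard_normal((14, D))  # 14 channel tokens
perm = rng.permutation(14)
# Equivariance: attending to permuted channels permutes the output rows.
assert np.allclose(self_attention(X[perm]), self_attention(X)[perm])
```

Attention weights depend only on pairwise token content, so reordering the rows of \(X\) reorders queries, keys, and values consistently and the output follows the same permutation.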
STCPE (Sliding Temporal Conditional Positional Encoding)
DIVER-0 introduces STCPE to achieve simultaneous temporal translation equivariance and channel permutation equivariance. A sliding window conditional position encoder generates position embeddings conditioned on local temporal context, rather than using global absolute positions.
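One way to realise both properties at once is a sliding-window (convolutional) encoder over time, shared across channels. The sketch below illustrates this idea only; the window length and kernel are assumptions, and the actual STCPE design in DIVER-0 may differ in detail:

```python
import numpy as np

rng = np.random.default_rng(3)

K = 5  # sliding window length (assumption, not from the paper)
kernel = rng.standard_normal(K)  # random stand-in for learned weights

def sliding_pos_encoding(x: np.ndarray) -> np.ndarray:
    """Per-channel 'valid' convolution over time: (n_ch, T) -> (n_ch, T-K+1).

    The same kernel is applied to every channel, so the operator is
    channel permutation equivariant; convolution itself is translation
    equivariant in time.
    """
    return np.stack([np.convolve(ch, kernel, mode="valid") for ch in x])

x = rng.standard_normal((4, 64))  # 4 channels, 64 time samples
pe = sliding_pos_encoding(x)

# Temporal translation equivariance: shifting the input in time shifts
# the encoding by the same amount (within the valid region).
shift = 7
pe_shift = sliding_pos_encoding(x[:, shift:])
assert np.allclose(pe[:, shift:], pe_shift)

# Channel permutation equivariance: permuting channels permutes the rows.
perm = rng.permutation(4)
assert np.allclose(sliding_pos_encoding(x[perm]), pe[perm])
```

By contrast, a global absolute position table would break temporal translation equivariance, since each embedding would be tied to an absolute time index.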
Key Model Implementations
DIVER-0
Han et al. (ICML 2025 Workshop) - arXiv:2507.14141
DIVER-0 is the first fully channel-equivariant EEG foundation model. It is empirically validated to produce consistent representations across all channel-permutation conditions, and the strong inductive bias provided by equivariance lets it reach competitive performance with only 10% of the pre-training data.
LUNA
Döner et al. - ICML Workshop
LUNA is an efficient topology-agnostic foundation model that reconciles disparate electrode geometries with linear-scaling attention. Pre-trained on >21,000 hours, it achieves 300× fewer FLOPs and 10× less GPU memory than attention-based baselines while maintaining state-of-the-art performance.
LaBraM and CBraMod
Both LaBraM and CBraMod use asymmetric conditional positional encodings or patch-based tokenisation that handles variable channel counts, though they do not enforce full permutation equivariance.
Practical Implications
The benefits of topology agnosticism in operational deployment are significant:
- Hardware upgrades - when the lab replaces 8-channel Enobio 8 headsets with 14-channel Emotiv Epoc X headsets, the encoder requires no retraining. Only the downstream decoder head may need a brief fine-tuning step.
- Combined training - data from multiple headsets can be combined in a single pre-training run without channel-by-channel alignment preprocessing.
- Partial electrode failure - if an electrode loses contact during recording, the model gracefully handles the reduced channel set rather than failing.
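The electrode-failure case above follows directly from variable input cardinality: a set encoder simply processes the surviving channels, and the embedding dimension (hence the downstream decoder interface) is unchanged. A brief illustration, reusing a random linear map plus mean pooling as a stand-in for a trained topology-agnostic encoder:

```python
import numpy as np

rng = np.random.default_rng(4)

# Random stand-in for a trained topology-agnostic channel encoder.
W = rng.standard_normal((16, 8))

def encode(channels: np.ndarray) -> np.ndarray:
    """Pool any (n_channels, 16) set of channel features to one (8,) vector."""
    return (channels @ W).mean(axis=0)

recording = rng.standard_normal((14, 16))   # 14-channel session
dead = 6                                    # index of an electrode that lost contact
reduced = np.delete(recording, dead, axis=0)

# The same encoder handles the 13-channel set with no padding or retraining.
assert encode(reduced).shape == encode(recording).shape == (8,)
```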
Relationship to Device Agnosticism
Channel topology agnosticism and device agnosticism together define a fully hardware-agnostic Brain Foundation Model - one that generalises across both hardware signal characteristics and electrode placement configurations.