UNIVERSE Dataset
"Unobtrusive measurement of cognitive load and physiological signals in uncontrolled environments"
Overview
UNIVERSE is a ~315-hour multimodal psychophysiological dataset designed specifically for cognitive load monitoring in both controlled and uncontrolled environments.
It is one of the largest publicly available EEG-based cognitive load datasets and is a primary pre-training resource for the Brain FM system.
Study Design
- 24 participants each completed an eight-hour cognitive load elicitation protocol.
- Data is balanced across two environmental conditions and two workload levels.
Tasks
| Environment | Type | Example Tasks |
|---|---|---|
| Controlled (~half the data) | Abstract stimuli, standardised | Mental arithmetic, Stroop task, N-Back (1-back, 2-back, 3-back), Sudoku |
| Uncontrolled (~half the data) | Naturalistic, ecologically valid | Researching online, programming, writing emails (home-office simulation) |
The inclusion of uncontrolled tasks gives UNIVERSE greater ecological validity than lab-only datasets. Most EEG cognitive load datasets use only abstract stimuli (N-Back, Stroop) that differ substantially from real-world cognitive work. UNIVERSE's home-office simulation tasks (programming, email writing, online research) are directly relevant to knowledge-work environments and provide a bridge to operational deployment.
Cognitive Labels
UNIVERSE collects multiple complementary subjective measures:
| Instrument | Description | Scale |
|---|---|---|
| NASA-TLX | Composite and per-dimension workload ratings | 1–21 per dimension |
| Likert scales | Single-question workload and engagement ratings | 1–5 or 1–7 |
| Affective Sliders | Continuous valence and arousal ratings | 0–1 continuous |
| PANAS | Positive and Negative Affect Schedule | 1–5 per item, 20 items |
The combination of NASA-TLX (task-focused workload), Affective Sliders (continuous affect), and PANAS (mood) enables multi-target learning: a single Brain FM fine-tuning run can jointly predict workload, valence, and arousal, yielding a richer cognitive-state profile than a workload-only model.
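Joint prediction is easier when all targets share a common range. The following is a minimal sketch, not the project's actual label pipeline, of how the ratings could be rescaled to [0, 1] using the scale ranges from the table above. The function names, dictionary keys, and the choice of an unweighted (raw) NASA-TLX composite are assumptions made for illustration; the source does not state how the composite is computed.

```python
from statistics import mean

# Scale ranges copied from the instrument table; the keys are hypothetical names.
SCALE_RANGES = {
    "nasa_tlx": (1.0, 21.0),
    "affective_slider": (0.0, 1.0),
    "panas_item": (1.0, 5.0),
}


def to_unit_interval(value, instrument):
    """Linearly rescale a raw rating to [0, 1] using the instrument's range."""
    low, high = SCALE_RANGES[instrument]
    if not low <= value <= high:
        raise ValueError(f"{value} is outside the {instrument} range [{low}, {high}]")
    return (value - low) / (high - low)


def raw_tlx(dimension_ratings):
    """Assumed composite: unweighted mean of the rescaled per-dimension ratings."""
    return mean(to_unit_interval(r, "nasa_tlx") for r in dimension_ratings)


def build_targets(tlx_ratings, valence, arousal):
    """Bundle the three regression targets named in the text into one dict."""
    return {
        "workload": raw_tlx(tlx_ratings),
        "valence": to_unit_interval(valence, "affective_slider"),
        "arousal": to_unit_interval(arousal, "affective_slider"),
    }


# Mid-scale ratings on every NASA-TLX dimension map to a workload of 0.5.
targets = build_targets([11, 11, 11, 11, 11, 11], valence=0.25, arousal=0.75)
print(targets)  # {'workload': 0.5, 'valence': 0.25, 'arousal': 0.75}
```

Putting all targets on the same range keeps any one label from dominating a summed multi-target loss purely because of its scale.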
Sensor Specifications
Muse S Headband (EEG)
| Property | Value |
|---|---|
| Channels | 5 |
| Electrode positions | AF7, AF8, TP9, TP10, FpZ |
| Sampling rate | 256 Hz |
| Form factor | Consumer headband; no gel required |
| Reference | FpZ |
The Muse S is a consumer-grade wearable EEG headband with frontal (AF7, AF8) and temporoparietal (TP9, TP10) coverage. Its low channel count and sparse electrode placement, compared with a full 10-20 montage, represent a realistic deployment constraint: the Brain FM must work with this reduced coverage.
Additional Sensors
| Sensor | Signal | Sampling Rate |
|---|---|---|
| Empatica-compatible | EDA (electrodermal activity) | 4 Hz |
| Empatica-compatible | PPG (photoplethysmography) | 64 Hz |
| Empatica-compatible | Skin temperature | 4 Hz |
| Wrist accelerometer | Accelerometry (3-axis) | 32 Hz |
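Because each sensor samples at a different rate, one time window covers a different number of samples in each stream. The sketch below maps a window given in seconds to per-modality index ranges, using the rates from the two sensor tables. It assumes all streams share a common time origin (sample 0 at t = 0), which the source does not confirm; the modality keys and the function name are hypothetical.

```python
# Sampling rates in Hz, copied from the sensor tables; the keys are hypothetical names.
RATES_HZ = {"eeg": 256, "ppg": 64, "acc": 32, "eda": 4, "temp": 4}


def window_slices(start_s, duration_s, rates_hz=RATES_HZ):
    """Map a time window (in seconds) to one slice of sample indices per modality.

    Assumes every stream starts at t = 0 and has no dropped samples.
    """
    slices = {}
    for name, rate in rates_hz.items():
        first = round(start_s * rate)
        count = round(duration_s * rate)
        slices[name] = slice(first, first + count)
    return slices


# A 4-second window starting 10 seconds into a session.
slices = window_slices(start_s=10.0, duration_s=4.0)
print(slices["eeg"])  # slice(2560, 3584, None)
print(slices["ppg"])  # slice(640, 896, None)
print(slices["eda"])  # slice(40, 56, None)
```

If the recordings carry per-sample timestamps, aligning on those is more robust than index arithmetic, which breaks as soon as a stream starts late or drops samples.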
Why UNIVERSE is Valuable
| Property | Value for Brain FM |
|---|---|
| Scale | ~315 hours, far more than most cognitive load datasets; enough for meaningful self-supervised (SSL) pre-training |
| Ecological validity | Uncontrolled home-office tasks generalise beyond lab settings |
| Multi-label | NASA-TLX + Affective Sliders + PANAS enables multi-target cognitive state modelling |
| Consumer EEG | Muse S tests viability of low-cost, minimal-channel monitoring |
| Multimodal | Simultaneous EEG + PPG + EDA enables cross-modal alignment for PPG and EDA encoders |
Data Structure
UNIVERSE/
├── participant_XX/
│   ├── eeg/
│   │   └── session_YY.csv            # 5 channels × 256 Hz
│   ├── eda/
│   │   └── session_YY.csv            # 4 Hz
│   ├── ppg/
│   │   └── session_YY.csv            # 64 Hz
│   ├── labels/
│   │   ├── session_YY_nasatlx.csv
│   │   └── session_YY_affect.csv
│   └── metadata/
│       └── participant_info.json
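Given the layout above, sessions can be indexed by walking the participant folders. This is a minimal sketch that relies only on the folder and file naming shown in the tree; `list_sessions` and `MODALITIES` are hypothetical names, and the CSV column layout is not assumed because the source does not describe it. The demo runs on an empty throwaway tree, not on real data.

```python
import tempfile
from pathlib import Path

MODALITIES = ("eeg", "eda", "ppg")  # signal folders shown in the tree above


def list_sessions(root):
    """Index sessions that have a CSV file in every modality folder.

    Returns {(participant, session): {modality: Path}}.
    """
    index = {}
    for participant_dir in sorted(Path(root).glob("participant_*")):
        if not participant_dir.is_dir():
            continue
        for eeg_file in sorted((participant_dir / "eeg").glob("session_*.csv")):
            files = {m: participant_dir / m / eeg_file.name for m in MODALITIES}
            if all(path.is_file() for path in files.values()):
                index[(participant_dir.name, eeg_file.stem)] = files
    return index


# Demo on a temporary tree that mimics the layout.
with tempfile.TemporaryDirectory() as tmp:
    for modality in MODALITIES:
        folder = Path(tmp) / "participant_01" / modality
        folder.mkdir(parents=True)
        (folder / "session_01.csv").touch()
    # session_02 has EEG only, so it is skipped.
    (Path(tmp) / "participant_01" / "eeg" / "session_02.csv").touch()
    found = sorted(list_sessions(tmp))

print(found)  # [('participant_01', 'session_01')]
```

Skipping incomplete sessions is one possible policy; a pipeline that pre-trains on EEG alone would instead keep every session that has an EEG file.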
Relevance to the Brain FM Project
UNIVERSE is used for:
- SSL pre-training - 315 hours of Muse S EEG provides the largest consumer-grade EEG pre-training source in the project.
- Workload fine-tuning - NASA-TLX labels enable workload decoder training.
- Multi-state fine-tuning - Affective Sliders and PANAS enable joint workload + stress/affect decoder training.
- PPG cross-modal alignment - Simultaneous EEG + PPG enables EEG-to-PPG alignment for the Underrepresented Modalities research thread.
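The source does not say which objective is used for EEG-to-PPG alignment. As an illustration only, the sketch below uses a one-directional InfoNCE-style contrastive loss, a common choice for cross-modal alignment: each EEG embedding should be more similar to the PPG embedding from the same time window than to PPG embeddings from other windows. The function names, the temperature value, and the toy embeddings are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(eeg_emb, ppg_emb, temperature=0.1):
    """EEG-to-PPG InfoNCE loss; row i of each list comes from the same window."""
    losses = []
    for i, eeg_vec in enumerate(eeg_emb):
        logits = [cosine(eeg_vec, ppg_vec) / temperature for ppg_vec in ppg_emb]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Cross-entropy with the time-aligned PPG window as the correct class.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings: aligned pairs should give a much lower loss than mismatched pairs.
eeg = [[1.0, 0.0], [0.0, 1.0]]
loss_aligned = info_nce(eeg, [[1.0, 0.0], [0.0, 1.0]])
loss_mismatched = info_nce(eeg, [[0.0, 1.0], [1.0, 0.0]])
print(loss_aligned < loss_mismatched)  # True
```

A symmetric variant would average this loss with the PPG-to-EEG direction; which form suits the project depends on whether both encoders are trained or one is frozen.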