Tutorial 1: Introduction to NeuroVLM#

This tutorial provides an overview of NeuroVLM, a multimodal framework for text-to-brain and brain-to-text applications in neuroimaging. You’ll learn about:

Fetching models and datasets from HuggingFace
Types of models and architectures
Available datasets
Basic concepts: text-to-brain and brain-to-text

1. Setup and Installation#

First, let’s import the necessary modules and fetch the pre-trained models and datasets.

import os

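# Set these before the imports below so the Hugging Face libraries see them:
# skip the TensorFlow and Flax backends and silence the tokenizers parallelism warning.
os.environ["USE_TF"] = "0"
os.environ["USE_FLAX"] = "0"
os.environ["TOKENIZERS_PARALLELISM"] = "false"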

from neurovlm import NeuroVLM
from neurovlm.data import fetch_data, load_dataset, load_latent

# Fetch all models and datasets from HuggingFace
fetch_data()

Downloading dataset: neurovlm/neuro_image_papers

Downloading dataset: neurovlm/neuro_wiki

Downloading dataset: neurovlm/cognitive_atlas

Downloading dataset: neurovlm/embedded_text

Downloading model: neurovlm/encoder_and_proj_head

Downloading SPECTER: allenai/specter2_aug2023refresh_base

Downloading SPECTER: allenai/specter2_aug2023refresh_adhoc_query

Data fetch complete. Cache directory: /home/rph/.cache/huggingface/hub

'/home/rph/.cache/huggingface/hub'
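The last line of output is the cache directory that fetch_data() returns. The following is a minimal sketch for checking what landed there, assuming the standard Hugging Face hub cache layout in which each repository lives in a folder named `models--<org>--<name>` or `datasets--<org>--<name>`. The helper `list_cached_repos` is hypothetical and not part of NeuroVLM.

```python
from pathlib import Path


def list_cached_repos(cache_dir):
    """Return sorted (repo_type, repo_id) pairs found in a hub cache folder.

    Assumes the hub's folder naming: '<type>s--<org>--<name>'.
    """
    repos = []
    for entry in Path(cache_dir).iterdir():
        if not entry.is_dir():
            continue
        parts = entry.name.split("--")
        if len(parts) < 2 or parts[0] not in {"models", "datasets", "spaces"}:
            continue  # skip lock folders and anything unrelated
        repo_type = parts[0][:-1]  # 'models' -> 'model'
        if len(parts) == 2:
            repo_id = parts[1]  # repository without an organization prefix
        else:
            repo_id = parts[1] + "/" + "--".join(parts[2:])
        repos.append((repo_type, repo_id))
    return sorted(repos)
```

Calling `list_cached_repos(Path.home() / ".cache" / "huggingface" / "hub")` after the fetch should list the datasets and models named in the download log, provided the cache sits at that default location.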

2. Model Architectures#

NeuroVLM uses several model architectures working together:

Text Encoder (SPECTER2)#

Encodes scientific text (titles, abstracts) into 768-dimensional embeddings
Pre-trained on scientific literature for domain-specific understanding
Handles variable-length text input

Autoencoder#

Encoder: Compresses 28,542-dimensional brain activation maps into a 384-dimensional latent space
Decoder: Reconstructs brain maps from latent representations
Enables efficient storage and manipulation of neuroimaging data
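To make the storage claim concrete, here is a short worked calculation using the two dimensions quoted above. It assumes values are stored as 32-bit floats and uses an illustrative collection size of 30,000 maps; neither figure is taken from the NeuroVLM implementation.

```python
MAP_DIM = 28_542        # values per brain map (from the text above)
LATENT_DIM = 384        # autoencoder latent size (from the text above)
BYTES_PER_FLOAT32 = 4   # assumption: float32 storage


def storage_bytes(n_maps, dim, bytes_per_value=BYTES_PER_FLOAT32):
    """Bytes needed to store n_maps dense vectors of length dim."""
    return n_maps * dim * bytes_per_value


compression_ratio = MAP_DIM / LATENT_DIM
n_maps = 30_000  # illustrative collection size
raw_gb = storage_bytes(n_maps, MAP_DIM) / 1e9
latent_mb = storage_bytes(n_maps, LATENT_DIM) / 1e6

print(f"compression ratio: {compression_ratio:.1f}x")
print(f"{n_maps} raw maps: {raw_gb:.2f} GB, latents: {latent_mb:.1f} MB")
```

Each latent vector is roughly 74 times smaller than the map it encodes, which is what makes similarity search over tens of thousands of maps cheap.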

Projection Heads#

Contrastive (InfoNCE): Projects text and brain embeddings into a shared space for retrieval
Generative (MSE): Projects text embeddings into the brain latent space for generation
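The two objectives named above can be sketched in plain Python. This is a toy, one-directional (text-to-brain) InfoNCE with an assumed temperature of 0.1 and a standard mean squared error. It illustrates the loss functions only and is not NeuroVLM's training code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(text_emb, brain_emb, temperature=0.1):
    """Text-to-brain InfoNCE: row i of text_emb should match row i of brain_emb."""
    losses = []
    for i, t in enumerate(text_emb):
        logits = [cosine(t, b) / temperature for b in brain_emb]
        m = max(logits)  # subtract the max for numerical stability
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_denominator - logits[i])  # -log softmax at the true pair
    return sum(losses) / len(losses)


def mse(pred, target):
    """Mean squared error over all entries of two lists of vectors."""
    diffs = [(p - t) ** 2 for pv, tv in zip(pred, target) for p, t in zip(pv, tv)]
    return sum(diffs) / len(diffs)


# Toy 2-D embeddings: aligned pairs should score a lower contrastive loss.
text = [[1.0, 0.0], [0.0, 1.0]]
brain_aligned = [[0.9, 0.1], [0.1, 0.9]]
brain_swapped = [[0.1, 0.9], [0.9, 0.1]]
print(info_nce(text, brain_aligned) < info_nce(text, brain_swapped))
```

The contrastive loss only cares about which pairs rank highest, which suits retrieval. The MSE loss asks the projected text vector to land on the target latent itself, which is what a decoder needs in order to generate a map.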

Architecture Overview#

Text → SPECTER2 → Projection Head → Shared Space ← Projection Head ← Autoencoder.Encoder ← Brain
                                                                                             |
Text → SPECTER2 → Projection Head → Brain Latent → Autoencoder.Decoder → Generated Brain  ←┘

3. Available Datasets#

NeuroVLM includes several curated datasets:

Text Datasets#

PubMed: ~30K neuroimaging publications with titles and abstracts
NeuroWiki: Neuroscience concepts from Wikipedia
Cognitive Atlas: Cognitive concepts, tasks, and disorders
Networks: Canonical brain network descriptions

Brain Image Datasets#

PubMed Images: Brain activation maps from published studies
NeuroVault: Community-contributed brain maps
Network Atlases: Canonical brain networks (multiple atlases)

Let’s explore these datasets:

# Load text datasets
publications = load_dataset("pubmed_text")
print(f"PubMed publications: {len(publications)} papers")
print(f"Columns: {list(publications.columns)}")
print("\nExample publications:")
publications.head(10)

PubMed publications: 30826 papers
Columns: ['pmid', 'pmcid', 'doi', 'name', 'description', 'train', 'test', 'val']

Example publications:

	pmid	pmcid	doi	name	description	train	test	val
0	24911975	NaN	10.1371/journal.pone.0099222	Acute aerobic exercise increases cortical acti...	There is increasing evidence that acute aerobi...	True	False	False
1	22884992	NaN	10.1016/j.dcn.2012.07.001	Developmental differences in the neural correl...	Despite vast knowledge on the behavioral proce...	True	False	False
2	15722210	NaN	10.1016/j.cogbrainres.2004.09.011	The neural substrate of arithmetic operations ...	Recent functional neuroimaging studies have be...	True	False	False
3	21930137	NaN	10.1016/j.neuropsychologia.2011.09.006	Neural processing associated with comprehensio...	In daily communication, we often use indirect ...	True	False	False
4	21930160	NaN	10.1097/gme.0b013e3181cc49e9	Postmenopausal hormone use impact on emotion p...	Despite considerable evidence for potential ef...	True	False	False
5	21930304	NaN	10.1016/j.jad.2011.08.020	Neural correlates of disbalanced motor control...	BACKGROUND: Motor retardation is a common symp...	True	False	False
6	21932265	NaN	10.1002/hbm.21385	Propofol disrupts functional interactions betw...	Current theories suggest that disrupting corti...	True	False	False
7	21932266	NaN	10.1002/hbm.21373	Functionally distinct regions for spatial proc...	There has been much debate recently over the f...	True	False	False
8	21933718	NaN	10.1016/j.neuroimage.2011.09.004	Imitation components in the human brain: an fM...	Human ability to imitate movements is instanti...	False	True	False
9	21935391	NaN	10.1371/journal.pone.0024253	Spatial language processing in the blind: evid...	Neuropsychological and imaging studies have sh...	True	False	False
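The train, test, and val columns mark which split each paper belongs to. A small sketch using only the ten preview rows above shows how the flags can be read. It assumes exactly one flag is set per paper, which holds for these rows but is not confirmed here for the full dataset.

```python
from collections import Counter

# (pmid, train, test, val) copied from the ten preview rows above
preview = [
    (24911975, True, False, False),
    (22884992, True, False, False),
    (15722210, True, False, False),
    (21930137, True, False, False),
    (21930160, True, False, False),
    (21930304, True, False, False),
    (21932265, True, False, False),
    (21932266, True, False, False),
    (21933718, False, True, False),
    (21935391, True, False, False),
]


def split_name(train, test, val):
    """Return the single active split, or raise if the flags are ambiguous."""
    flags = {"train": train, "test": test, "val": val}
    active = [name for name, flag in flags.items() if flag]
    if len(active) != 1:
        raise ValueError(f"expected exactly one split flag, got {active}")
    return active[0]


counts = Counter(split_name(*row[1:]) for row in preview)
print(dict(counts))
```

If publications is a pandas DataFrame, as its .columns and .head() usage suggests, the same selection would typically be written with boolean indexing, for example publications[publications["train"]].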

# Cognitive Atlas concepts
cogatlas = load_dataset("cogatlas")
print(f"\nCognitive Atlas concepts: {len(cogatlas)}")
print("\nExample concepts:")
cogatlas.head(10)

Cognitive Atlas concepts: 911

Example concepts:

	term	definition	alias
0	abductive reasoning	The process of adopting an explanatory hypothe...	None
1	abstract analogy	High-level analogy that retains general inform...	None
2	abstract knowledge	Knowledge that is general and not tied to a sp...	None
3	acoustic coding	A type of short term memory coding that repres...	None
4	acoustic encoding	The processing and encoding of auditory input ...	None
5	acoustic phonetic processing	The cognitive ability to discriminate items on...	None
6	acoustic processing	The extraction of information from signals pro...	None
7	action	The bringing about of an alteration by force o...	None
8	action initiation	The facilitation or initiation of an act.	None
9	action perception	The perception of an action being performed by...	None

# NeuroWiki
neurowiki = load_dataset("wiki")
print(f"\nNeuroWiki entries: {len(neurowiki)}")
print("\nExample entries:")
neurowiki.head(10)

NeuroWiki entries: 38410

Example entries:

	title	summary	cos_sim_summary	cos_sim_title	cos_sim_avg	cos_sim_invalid	cos_sim	id
0	Membrane potential	Membrane potential (also transmembrane potenti...	0.772090	1.000000	0.886045	0.167787	0.718258	833aec91
1	Ion channel	Ion channels are pore-forming membrane protein...	0.703097	1.000000	0.851549	0.138511	0.713037	5736c326
2	Calcium imaging	Calcium imaging is a microscopy technique to o...	0.802223	1.000000	0.901111	0.232906	0.668205	0c744fe2
3	Signal processing	Signal processing is an electrical engineering...	0.757949	1.000000	0.878974	0.219474	0.659500	a5c71b00
4	Ion channel pore	There are two distinctive features of ion chan...	0.703097	0.777337	0.740217	0.101555	0.638663	f9d03533
5	Corpus callosum	The corpus callosum (Latin for "tough body"), ...	0.719148	1.000000	0.859574	0.226570	0.633003	1778086e
6	Complications of traumatic brain injury	Traumatic brain injury (TBI, physical trauma t...	0.714219	0.872134	0.793176	0.162419	0.630757	259064c0
7	Metacognition	Metacognition is an awareness of one's thought...	0.779177	1.000001	0.889589	0.264607	0.624982	a3d483d8
8	Huntington's disease-like syndrome	Huntington's disease-like syndromes (HD-like s...	0.727485	0.916222	0.821854	0.198028	0.623825	a1db5686
9	Fourier transform	In mathematics, the Fourier transform (FT) is ...	0.637224	1.000000	0.818612	0.194800	0.623812	34567425
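The meaning of the cos_sim columns is not documented in this tutorial, but the previewed values are consistent with two simple relations: cos_sim_avg is the mean of cos_sim_summary and cos_sim_title, and cos_sim is cos_sim_avg minus cos_sim_invalid. The check below, on rows copied from the preview, is an observation about the displayed numbers rather than a statement of how the dataset was built.

```python
# (title, cos_sim_summary, cos_sim_title, cos_sim_avg, cos_sim_invalid, cos_sim)
preview = [
    ("Membrane potential", 0.772090, 1.000000, 0.886045, 0.167787, 0.718258),
    ("Ion channel", 0.703097, 1.000000, 0.851549, 0.138511, 0.713037),
    ("Ion channel pore", 0.703097, 0.777337, 0.740217, 0.101555, 0.638663),
    ("Fourier transform", 0.637224, 1.000000, 0.818612, 0.194800, 0.623812),
]

TOL = 5e-6  # the table shows six decimals, so allow for rounding


def check_row(row):
    """True if the row matches both assumed relations within rounding."""
    _, summary, title, avg, invalid, final = row
    avg_ok = abs((summary + title) / 2 - avg) < TOL
    final_ok = abs((avg - invalid) - final) < TOL
    return avg_ok and final_ok


print(all(check_row(row) for row in preview))
```

The preview also appears to be sorted by cos_sim in descending order, so the first rows are the entries that scored highest on this combined measure.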

# Network atlases
networks = load_dataset("networks_canonical")
print(f"\nCanonical networks: {len(networks)}")
print("\nExample networks:")
networks.head(10)

Canonical networks: 8

Example networks:

	title	description
0	Language	Language network (LAN; perisylvian language ne...
1	Auditory	Auditory network (AUD; auditory cortex network...
2	Default Mode	Default mode network (DMN; default network; de...
3	Frontoparietal Control	Frontoparietal control network (FPCN; frontopa...
4	Attention	Dorsal attention network [SEP] Primary regions...
5	Visual	Visual network (VIS; occipital visual network)...
6	Motor	Motor network (motor/sensorimotor network; SMN...
7	Cingulo-Opercular	Salience network (SN; cingulo-opercular networ...

4. Text-to-Brain and Brain-to-Text#

NeuroVLM supports bidirectional querying:

Text-to-Brain#

Given a text query, NeuroVLM can:

Generate brain activation patterns (generative approach)
Retrieve similar brain maps from datasets (contrastive approach)

Brain-to-Text#

Given a brain activation map, NeuroVLM can:

Retrieve related scientific text, concepts, or descriptions
Generate text descriptions using language models (see Tutorial 4)
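Retrieval in either direction reduces to ranking candidates by cosine similarity to a query in the shared space. Here is a minimal sketch with toy 3-D vectors; the labels and vectors are made up for illustration, and the function `retrieve` is hypothetical rather than part of the NeuroVLM API.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query, candidates, k=3):
    """Rank (label, vector) candidates by similarity to the query, best first."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy shared-space embeddings (illustrative only)
candidates = [
    ("map A", [1.0, 0.0, 0.0]),
    ("map B", [0.0, 1.0, 0.0]),
    ("map C", [1.0, 1.0, 0.0]),
]
query = [1.0, 0.1, 0.0]
for label, score in retrieve(query, candidates, k=2):
    print(f"{label}: {score:.3f}")
```

The same ranking works whether the query is an embedded text and the candidates are brain maps, or the reverse, because both sides live in one space after projection.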

5. Quick Examples#

Let’s see both directions in action:

# Initialize the model
# Note: Models are lazy-loaded on first use (not here).
# The first call to .text() will load SPECTER (~500MB transformer) into RAM,
# which typically takes 1-3 minutes. All subsequent calls are < 5 seconds.
nvlm = NeuroVLM(device="cpu")

print("Model initialized successfully!")

Model initialized successfully!
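The lazy-loading behavior described in the comments above is a common pattern: construction is cheap, and the expensive component is built on first access and then reused. The toy class below sketches that pattern with functools.cached_property. It is an illustration of the idea, not how NeuroVLM implements it.

```python
from functools import cached_property


class LazyModel:
    """Toy stand-in for an object that defers expensive loading."""

    def __init__(self, loader):
        self._loader = loader  # callable that builds the heavy component
        self.load_count = 0

    @cached_property
    def encoder(self):
        # Runs only on first access; the result is then cached on the instance.
        self.load_count += 1
        return self._loader()


model = LazyModel(loader=lambda: "expensive encoder")
loads_after_init = model.load_count   # nothing has been loaded yet
first = model.encoder                 # first access triggers the load
second = model.encoder                # second access reuses the cached object
print(loads_after_init, model.load_count, first is second)
```

This is why the first call in a session is slow while later calls in the same kernel are fast: the cost is paid once, at first use.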

Text-to-Brain: Generate brain maps from text#

# Generate a brain map from text
# First run: loads SPECTER + projection heads + autoencoder into RAM (~1-3 min total)
# Subsequent runs in the same kernel session: < 5 seconds
# All computation runs on CPU - no MPS or CUDA used
result = nvlm.text("visual processing").to_brain(head="mse")

# Plot the generated brain map
result.plot(threshold=0.2);

There are adapters available but none are activated for the forward pass.

[Figure: generated brain map for the query "visual processing", plotted with threshold=0.2]
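The threshold argument controls which values are shown. Assuming it follows the convention used by nilearn's plotting functions, where values whose absolute value falls below the threshold are hidden, its effect can be sketched on a toy list of values. Whether result.plot applies exactly this rule is an assumption.

```python
def apply_threshold(values, threshold):
    """Zero out entries whose absolute value is below the threshold."""
    return [v if abs(v) >= threshold else 0.0 for v in values]


def fraction_kept(values, threshold):
    """Share of entries that survive thresholding."""
    kept = sum(1 for v in values if abs(v) >= threshold)
    return kept / len(values)


toy_map = [0.05, -0.30, 0.25, -0.10, 0.70]  # toy values, not real model output
print(apply_threshold(toy_map, 0.2))
print(fraction_kept(toy_map, 0.2))
```

Raising the threshold leaves fewer, stronger regions visible; lowering it shows more of the map at the cost of more visual noise.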

Text-to-Brain: Retrieve similar brain maps#

# Find brain maps similar to the text query
result = nvlm.text("working memory").to_brain(head="infonce")

# Show the top 3 matches from each dataset (9 rows in total)
top = result.top_k(3)
top

	dataset	dataset_index	title	description	cosine_similarity
0	networks	38	HCPICA	ICA11	0.322152
1	networks	10	Du	FPN-A	0.300826
2	networks	84	Shirer	Visuospatial	0.290263
3	neurovault	833	Exploring the role of the posterior middle tem...	Making sense of the world around us depends up...	0.403994
4	neurovault	1105	Identification of Two Distinct Working Memory-...	Working memory (WM) is an important cognitive ...	0.398966
5	neurovault	2900	Evidence for Hierarchical Cognitive Control in...	In non-habitual situations, cognitive control ...	0.377900
6	pubmed	1581	Control networks and hemispheric asymmetries i...	Neuropsychological research has consistently d...	0.405711
7	pubmed	6428	Salience maps in parietal cortex: imaging and ...	Models of spatial attention are often based on...	0.405312
8	pubmed	22999	Effects of in-Scanner Bilateral Frontal tDCS o...	Working memory is an executive memory process ...	0.399832
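The table has nine rows because top_k(3) keeps three matches from each of the three datasets instead of three overall. A minimal sketch of that per-dataset selection follows, using made-up records; the function `top_k_per_dataset` is hypothetical and only mirrors the behavior visible in the output.

```python
from collections import defaultdict


def top_k_per_dataset(records, k):
    """Keep the k highest-scoring records within each dataset."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["dataset"]].append(rec)
    result = []
    for name in sorted(groups):
        ranked = sorted(
            groups[name], key=lambda r: r["cosine_similarity"], reverse=True
        )
        result.extend(ranked[:k])
    return result


# Toy records (illustrative scores, not model output)
records = [
    {"dataset": "networks", "title": "n1", "cosine_similarity": 0.32},
    {"dataset": "networks", "title": "n2", "cosine_similarity": 0.21},
    {"dataset": "pubmed", "title": "p1", "cosine_similarity": 0.39},
    {"dataset": "pubmed", "title": "p2", "cosine_similarity": 0.41},
    {"dataset": "pubmed", "title": "p3", "cosine_similarity": 0.15},
]
for rec in top_k_per_dataset(records, k=1):
    print(rec["dataset"], rec["title"], rec["cosine_similarity"])
```

Grouping first keeps small collections such as the network atlases from being crowded out by the much larger PubMed set.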

# Visualize row 6, the top PubMed match (highest cosine similarity in the table)
top.plot_row(6, threshold=0.1);

[Figure: brain map for row 6 of the results table, plotted with threshold=0.1]

Brain-to-Text: Find text descriptions for brain maps#

# Load example network atlases
networks_neuro = load_latent("networks_neuro")

# Use the auditory network (AUD) from the Du atlas as a query
aud = networks_neuro["Du"]["AUD"]

# Find related text
result = nvlm.brain(aud).to_text()
top = result.top_k(5).query("cosine_similarity > 0.4")  # keep up to 5 matches per dataset, then drop any with similarity <= 0.4
top

	dataset	title	description	cosine_similarity
0	cogatlas	auditory stream segregation	The perceptual grouping of sounds to form cohe...	0.437669
1	cogatlas	auditory encoding	The process of storing auditory information in...	0.431347
2	cogatlas	music cognition	The processing of mental functions on auditory...	0.431245
3	cogatlas	acoustic phonetic processing	The cognitive ability to discriminate items on...	0.423635
4	cogatlas	auditory tone detection	Determining the presence of an auditory stimul...	0.417665
5	networks	Auditory	Auditory network (AUD; auditory cortex network...	0.470543
10	pubmed	The processing of temporal pitch and melody in...	An fMRI experiment was performed to identify t...	0.526884
11	pubmed	Heschl's gyrus, posterior superior temporal gy...	A part of the auditory system automatically de...	0.504547
12	pubmed	Hierarchical processing of sound location and ...	Horizontal sound localization relies on the ex...	0.499913
13	pubmed	Brain bases for auditory stimulus-driven figur...	Auditory figure-ground segregation, listeners'...	0.490401
14	pubmed	Dichotic pitch activates pitch processing cent...	Although several neuroimaging studies have rep...	0.485120
15	wiki	Temporal masking	Temporal masking or non-simultaneous masking o...	0.489581
16	wiki	Melodic expectation	In music cognition and musical analysis, the s...	0.454363
17	wiki	Search by sound	Search by sound is the retrieval of informatio...	0.452222
18	wiki	Beat (acoustics)	In acoustics, a beat is an interference patter...	0.447247
19	wiki	Harmonic series (music)	The harmonic series (also overtone series) is ...	0.446160
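The row labels in the table jump from 5 to 10 because filtering keeps each surviving row's original position: rows that fail the similarity cutoff are dropped, and the remaining labels are not renumbered. The sketch below reproduces that effect on toy scores. The data are illustrative and the helper `filter_keep_index` is hypothetical.

```python
def filter_keep_index(rows, min_score):
    """Return (original_index, row) pairs whose score exceeds min_score."""
    return [(i, row) for i, row in enumerate(rows) if row[1] > min_score]


# Toy (dataset, cosine_similarity) rows, already grouped by dataset
rows = [
    ("cogatlas", 0.44),
    ("cogatlas", 0.42),
    ("networks", 0.47),
    ("networks", 0.31),
    ("networks", 0.28),
    ("pubmed", 0.53),
]
for index, (dataset, score) in filter_keep_index(rows, 0.4):
    print(index, dataset, score)
```

In the real output, only one of the five network rows passes the 0.4 cutoff, so labels 6 through 9 are absent.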

6. Summary#

In this tutorial, you learned:

How to fetch models and datasets from HuggingFace using fetch_data()
Model architectures in NeuroVLM:
- SPECTER2 text encoder
- Autoencoder for brain compression
- Contrastive and generative projection heads
Available datasets:
- Text: PubMed, NeuroWiki, Cognitive Atlas, Networks
- Brain: PubMed images, NeuroVault, Network atlases
Text-to-brain and brain-to-text concepts

In the following tutorials, you’ll learn:

Tutorial 2: Contrastive retrieval for brain-to-text and text-to-brain
Tutorial 3: Generative text-to-brain mapping
Tutorial 4: Generative brain-to-text with LLMs