Predicting brain response to ads with TRIBE v2: what works, what doesn't, and the license cliff
Meta released a foundation model that predicts fMRI brain responses to video, audio, and text. Here's an honest walkthrough of what it can and can't tell you about ad creative - and why CC-BY-NC means most of you can't legally touch it.
Read this first: TRIBE v2 is CC-BY-NC.
Non-commercial use only. That includes using TRIBE outputs to inform paid creative work, in-house brand campaigns, freelance client deliverables, or any product feature. Most ad-industry use cases are excluded. There's a license walkthrough further down - start there if you're considering this for any client-facing work.
Read the full CC-BY-NC 4.0 license

What TRIBE actually does
Imagine you put someone in an MRI scanner and show them a 30-second ad. The scanner measures, second by second, which parts of their brain are working hardest: visual areas when there's motion, auditory areas when there's sound, language areas when there's text or VO.
TRIBE is a model that predicts what that scanner would show - without ever putting anyone in a scanner. You feed it a video; it gives you a brain map of what would have lit up if a real person had watched it.
This is genuinely cool. It's also several careful steps removed from 'will this ad convert' - brain activation is attention, not preference. And the license says you can only do this for research, not for paid creative work.
In one line: TRIBE predicts what brains do, not what people buy. Treat it like a research microscope, not a performance dashboard.
At a glance: ~20k cortical vertices predicted per timestep · a hemodynamic lag offset between stimulus and response · 3 modalities (vision, audio, language) · 0 commercial uses permitted (CC-BY-NC).
Can you actually use TRIBE for this?
TRIBE v2 is released under CC-BY-NC 4.0 - non-commercial use only. If the output informs anything billed, sold, or shipped in a product, assume you're outside the license. (Not legal advice; consult counsel for anything material.)
Running TRIBE yourself (academic / hobby use only).
The cleanest path: open the official Colab notebook (link in sources). Otherwise the local install is one line, and the weights pull from HuggingFace.
Install
pip install git+https://github.com/facebookresearch/tribev2

Load the pretrained model
from tribev2 import TribeModel
model = TribeModel.from_pretrained(
"facebook/tribev2",
cache_folder="./cache",
)

Predict from a video
Input can be an .mp4 video, audio, or text. Predictions land on the fsaverage5 cortical mesh (~20k vertices).

df = model.get_events_dataframe(
video_path="path/to/your_ad.mp4",
)
preds, segments = model.predict(events=df)
print(preds.shape)
# (n_timesteps, n_vertices)

Slice by ROI
# A sketch, not the library API - see tribev2/plotting and utils_fmri.py.
# `labels`: an ROI label per fsaverage5 vertex (load from a surface atlas);
# `names`: {label int: ROI name}. "V1"/"FFA" assume your atlas names them.
def aggregate_by_roi(preds, labels, names):
    return {n: preds[:, labels == lab].mean(axis=1) for lab, n in names.items()}

roi_means = aggregate_by_roi(preds, labels, names)
v1_response = roi_means["V1"]
ffa_response = roi_means["FFA"]
# ... per-region time series across the ad

Mind the offset

fMRI's BOLD signal peaks several seconds after the stimulus that drives it, so predicted activity at timestep t reflects creative from a few seconds earlier. Shift the time axis before attributing activation to a specific cut.
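A minimal alignment sketch: LAG_S and TR_S below are illustrative assumptions, not TRIBE constants - check the repo for the model's actual timestep length and lag handling.

LAG_S = 4.0  # assumed hemodynamic lag in seconds (BOLD typically peaks ~4-6 s post-stimulus)
TR_S = 2.0   # assumed seconds per prediction timestep
shift = int(LAG_S / TR_S)
aligned = preds[shift:]  # aligned[t] ~ response to the creative at timestep t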
Eight regions worth watching in ad analysis.
Each region recruits on different ad elements. For each: what it does, what likely activates it in advertising creative, and the honest caveat about reading too much into TRIBE's predictions for it.
Primary visual cortex (V1)
What it does: First stop for visual input. Edges, contrast, color, orientation.
Ad relevance: Recruited by every visible ad. High activation early in the cut indicates the brain is doing real visual work - pattern interrupts, high-contrast first frames, saturated color all hit this.
- Pattern interrupt with hard color shift in frame 1
- Bold text-on-image hooks
- High-contrast product reveals
Strong V1 activation is necessary but not sufficient. Lots of bad ads light up V1 just fine - it's a baseline of attention, not a predictor of conversion.
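A quick way to quantify that, reusing roi_means from the hypothetical ROI sketch above. The three-timestep hook window is an arbitrary assumption, so size it to your timestep length.

v1 = roi_means["V1"]  # predicted V1 time series, shape (n_timesteps,)
hook, rest = v1[:3].mean(), v1[3:].mean()  # opening window vs the rest of the cut
print(f"V1 hook/rest ratio: {hook / rest:.2f}")
# > 1: the opening frames do more visual work than the rest of the cut -
# a read on attention, not a prediction of conversion.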
Honest accounting
Five overclaims to avoid
Neuromarketing has a long history of overclaiming. TRIBE is a powerful tool but not a magic one - these are the five claims most likely to embarrass you in front of a neuroscientist.
Use it as a microscope, not a dashboard.
TRIBE is a research tool. Used in that frame - academic exploration, hypothesis generation about which cortical regions different creative grammars recruit, validating that an attention-grabbing ad actually grabs visual + face cortex - it's genuinely valuable. It can sharpen your intuitions about why certain hooks work neurally.
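Concretely, that kind of hypothesis generation can look like running the pipeline above on two cuts and comparing region means. The roi_means_a / roi_means_b names are hypothetical, and any gap is a hypothesis to ship and measure, not a verdict.

# roi_means_a, roi_means_b: the earlier ROI pipeline run on two cuts
for region in ("V1", "FFA"):
    print(f"{region}: cut A {roi_means_a[region].mean():.3f} "
          f"vs cut B {roi_means_b[region].mean():.3f}")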
For commercial creative work, the path forward is behavioral signal (Andromeda's hold rate, click-through, conversion) measured on actual variations shipped to actual audiences. Shuttergen ships those variations along the structural axes (hook archetype, format, audio source) the brain regions above suggest matter - but the measurement loop runs on real performance data, not modeled BOLD, because the license requires it and the science supports it.
The playbook
Eight rules for using TRIBE responsibly
Sources
What we read to build this
TRIBE v2 GitHub repository - facebookresearch / Meta
TRIBE v2 paper: A foundation model of vision, audition, and language for in-silico neuroscience - Meta AI Research
TRIBE v2 weights on HuggingFace - facebook/tribev2
Live demo (Meta AI Demos) - Meta AI
Colab demo notebook - Meta AI Research
CC-BY-NC 4.0 license summary - Creative Commons
The science is fascinating. The license is hard.
If you're researching, run TRIBE - it's an extraordinary tool. If you're shipping commercial creative, Shuttergen runs the structural-variation playbook the brain-region work above suggests matters, with the behavioral-signal loop that actually measures whether it works.
Get started free