News: You are welcome to join the STAR Challenge 2024.

A Benchmark for Situated Reasoning in Real-World Videos

What is STAR?

Reasoning in the real world is not divorced from situations. A key challenge is to capture the present knowledge from surrounding situations and reason accordingly. STAR is a novel benchmark for Situated Reasoning, which provides challenging question-answering tasks, symbolic situation descriptions and logic-grounded diagnosis via real-world video situations.

You are welcome to evaluate your models on the STAR Evaluation and check the results on the STAR Challenge Leaderboard.


The dataset consists of four question types for situated reasoning: Interaction, Sequence, Prediction, and Feasibility. Video situations are decomposed into bottom-up hypergraphs of atomic entities and relations (e.g., actions, objects, and relationships). Questions are procedurally generated by functional programs that operate on the situation hypergraphs.
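As a rough illustration of the idea, the sketch below represents a situation as a list of action hyperedges and answers two question types by composing small filter functions. Everything in it is an assumption made for illustration: the `Hyperedge` fields, the toy `SITUATION`, and the function names are hypothetical and do not reflect the actual STAR annotation schema or its QA programs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Hyperedge:
    """One atomic action linking an actor, a verb, and an object (hypothetical schema)."""
    frame: int
    actor: str
    verb: str
    obj: str


# Toy situation invented for this sketch; not real STAR data.
SITUATION: List[Hyperedge] = [
    Hyperedge(frame=3, actor="person", verb="open", obj="door"),
    Hyperedge(frame=9, actor="person", verb="take", obj="cup"),
    Hyperedge(frame=15, actor="person", verb="put_down", obj="cup"),
]


def filter_verb(edges: List[Hyperedge], verb: str) -> List[Hyperedge]:
    """Keep only the hyperedges whose action matches the given verb."""
    return [e for e in edges if e.verb == verb]


def filter_after(edges: List[Hyperedge], anchor: Hyperedge) -> List[Hyperedge]:
    """Keep only the hyperedges that occur after the anchor's frame."""
    return [e for e in edges if e.frame > anchor.frame]


def query_object(edges: List[Hyperedge]) -> Optional[str]:
    """Return the object of the first hyperedge, or None if there is none."""
    return edges[0].obj if edges else None


def interaction_question(edges: List[Hyperedge], verb: str) -> Tuple[str, Optional[str]]:
    """Generate an Interaction-style question and derive its answer from the graph."""
    question = f"Which object did the person {verb}?"
    return question, query_object(filter_verb(edges, verb))


def sequence_question(edges: List[Hyperedge], anchor_verb: str) -> Optional[Tuple[str, str]]:
    """Answer a Sequence-style question: what happened right after the anchor action?"""
    anchors = filter_verb(edges, anchor_verb)
    if not anchors:
        return None
    later = sorted(filter_after(edges, anchors[0]), key=lambda e: e.frame)
    return (later[0].verb, later[0].obj) if later else None


if __name__ == "__main__":
    print(interaction_question(SITUATION, "take"))
    print(sequence_question(SITUATION, "open"))
```

Because each answer is computed by the same program that produces the question, the chain of filters can double as a step-by-step trace for diagnosing where a model's reasoning goes wrong.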


Download & Repository

STAR Overview

Question Types:

  • Interaction Question
  • Sequence Question
  • Prediction Question
  • Feasibility Question

22K Situation Video Clips

60K Situated Questions

140K Situation Hypergraphs

Annotation Statistics

  • 111 action classes
  • 37 entity classes
  • 24 relationship classes

Data Download

Questions, Answers and Situation Graphs

Train (json), Val (json), Test (json)
Train/Val/Test Split File (json)

Question-Answer Templates and Programs

Question Templates (csv)
QA Programs (csv)

Situation Video Data

Video Segments (csv)
Video Keyframe IDs (csv)
Raw Videos from Charades, scaled to 480p (mp4)
Keyframe Dumping Tool from Action Genome


Classes Files (zip)
Object Bounding Boxes (pkl)
Human Poses (zip)
Human Bounding Boxes (pkl)

Download from Baidu Yunpan (百度云盘)

Data Download Access Code: 6v8u

STAR Code and Scripts

The code of the STAR benchmark is available on GitHub. With this code you can:

Visualize the STAR questions, options, and situation graphs

QA Visualization Script

Generate new STAR questions for situations

QA Generation Code


Link to Paper
@inproceedings{wu2021star_situated_reasoning,
  author    = {Wu, Bo and Yu, Shoubin and Chen, Zhenfang and Tenenbaum, Joshua B and Gan, Chuang},
  title     = {STAR: A Benchmark for Situated Reasoning in Real-World Videos},
  booktitle = {Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS)},
  year      = {2021}
}


Bo Wu
MIT-IBM Watson AI Lab

Shoubin Yu
Shanghai Jiao Tong University

Zhenfang Chen
MIT-IBM Watson AI Lab

Joshua B. Tenenbaum

Chuang Gan
MIT-IBM Watson AI Lab