The 2026 ACII Dyadic Contest (DaiKon) Workshop & Challenge

Introducing a newly emphasized dimension of affective behavior critical to social interaction

Panagiotis Tzirakis, Alice Baird, Jeffrey Brooks, Emilia Parada-Cabaleiro, Lukas Stappen, Sharath Rao, Theo Lebryk

Sponsored by Hume AI, the 2026 ACII Dyadic Contest (ACII-DaiKon) Workshop & Challenge introduces a novel benchmark for modeling interpersonal affect and social dynamics in dyadic conversations. While conversational affect modeling has advanced rapidly in recent years, most existing benchmarks and shared tasks remain largely speaker-centric, focusing on individual predictions rather than the coupled, time-evolving processes that arise between interacting individuals. In contrast, ACII-DaiKon emphasizes the relational and dynamic nature of human interaction, targeting phenomena such as directional influence and emotional contagion. The challenge is structured as a three-track benchmark designed to capture these dynamics at scale, incorporating both temporal structure and cross-cultural variability.

To the best of the organizers’ knowledge, this is the first ACII workshop, and one of the first efforts in the broader machine learning community, to provide a dedicated, large-scale benchmark specifically focused on dyadic interpersonal dynamics.

Other Topics

For those interested in submitting research to the DaiKon workshop outside of the competition, we encourage contributions related to the following topics:

  • Modeling interpersonal affect and social dynamics

  • Dyadic and multi-party interaction modeling

  • Temporal modeling of affect trajectories

  • Cross-modal fusion for social signals (audio, video, text, physiology)

  • Representation learning for social and emotional behavior

  • Cross-cultural dyadic interactions

Submission details TBA.

Important Dates (AoE)

  • Challenge Opening (data and baselines released): April 6, 2026

  • Baseline Paper released: April 20, 2026

  • Test set released: May 20, 2026

  • All Tracks submission deadline: May 25, 2026 (submit test set predictions to competitions@hume.ai)

  • Workshop paper submission: May 30, 2026

  • Notification of Acceptance/Rejection: July 3, 2026

  • Camera Ready: July 10, 2026

  • Conference: September 7-10, 2026, Puebla, Mexico

Challenge Tasks

The Influence Sub-Challenge, a dyadic affect prediction task (DaiKon Influence). In the DaiKon Influence sub-challenge, participants will predict a target speaker's affective state for each labeled speech segment in a dyadic conversation. Given the multimodal conversational context, systems will output continuous intensity estimates for 10 emotion dimensions for the target segment: anger, anxiety, uncertainty, confusion, doubt, boredom, surprise, curiosity, joy, and amusement. Participants will report Concordance Correlation Coefficient (CCC), as well as Pearson correlation, averaged across the target emotion dimensions.
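For clarity on the evaluation, the following is a minimal NumPy sketch of CCC and Pearson correlation averaged across the ten emotion dimensions. Function names and array layout (rows = segments, columns = emotion dimensions) are our assumptions, not part of the official evaluation code:

```python
import numpy as np

def ccc(y_true, y_pred):
    """Concordance Correlation Coefficient between two 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return 2 * cov / (y_true.var() + y_pred.var() + (mean_t - mean_p) ** 2)

def mean_ccc_pearson(true_mat, pred_mat):
    """Average CCC and Pearson r across emotion dimensions (columns)."""
    true_mat, pred_mat = np.asarray(true_mat), np.asarray(pred_mat)
    cccs = [ccc(true_mat[:, d], pred_mat[:, d]) for d in range(true_mat.shape[1])]
    rs = [np.corrcoef(true_mat[:, d], pred_mat[:, d])[0, 1]
          for d in range(true_mat.shape[1])]
    return float(np.mean(cccs)), float(np.mean(rs))
```

Note that, unlike Pearson correlation, CCC also penalizes differences in mean and scale, so a prediction that is perfectly correlated but systematically offset scores below 1.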

The Turn-Taking Sub-Challenge, a conversational timing and speaker prediction task (DaiKon Turn Taking). In the DaiKon Turn Taking sub-challenge, participants will predict who speaks next and when the next speech onset occurs. Systems will output (i) next-speaker prediction as a classification task and (ii) time-to-next-speech as a regression task. Participants will report Macro-F1 and accuracy for next-speaker prediction, and MAE for time-to-next-speech.
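The two turn-taking metrics can likewise be sketched in a few lines of NumPy (labels and function names are illustrative assumptions; e.g. speakers could be encoded as 0 and 1, and time-to-next-speech measured in seconds):

```python
import numpy as np

def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Macro-averaged F1 for next-speaker classification."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    f1s = []
    for c in labels:
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return float(np.mean(f1s))

def mae(t_true, t_pred):
    """Mean absolute error for time-to-next-speech regression."""
    return float(np.mean(np.abs(np.asarray(t_true) - np.asarray(t_pred))))
```

Macro averaging weights both speakers equally, which matters when one participant holds the floor for most of a conversation.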

The Rapport Sub-Challenge, a time-evolving interaction quality prediction task (DaiKon Rapport). In the DaiKon Rapport sub-challenge, participants will predict rapport for labeled windows throughout a dyadic conversation, rather than only a single conversation-level score. Given the multimodal conversational context, systems will output a continuous rapport score for each labeled window. Participants will report Concordance Correlation Coefficient (CCC), as well as Pearson correlation, between predicted and ground-truth rapport scores, averaged across conversations.
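Since rapport is scored per conversation and then averaged, the aggregation differs from the per-dimension averaging used in the Influence track. A minimal sketch, assuming predictions are grouped into (ground-truth, prediction) window sequences per conversation:

```python
import numpy as np

def ccc(y_true, y_pred):
    """Concordance Correlation Coefficient between two 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    cov = np.mean((y_true - y_true.mean()) * (y_pred - y_pred.mean()))
    return 2 * cov / (y_true.var() + y_pred.var()
                      + (y_true.mean() - y_pred.mean()) ** 2)

def rapport_score(conversations):
    """conversations: iterable of (true_windows, pred_windows) pairs,
    one pair per conversation; CCC and Pearson r averaged across conversations."""
    cccs = [ccc(t, p) for t, p in conversations]
    rs = [np.corrcoef(t, p)[0, 1] for t, p in conversations]
    return float(np.mean(cccs)), float(np.mean(rs))
```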

Challenge Baselines

The organizers have prepared a set of multimodal baselines for each of the three tasks. To reproduce baselines, please find the code at github.com/HumeAI/competitions.

Sub-Challenge                    Modality     Val           Test
Influence (CCC / Pearson)        Audio        0.39 / 0.50   0.40 / 0.50
                                 Video        0.17 / 0.29   0.19 / 0.30
                                 Multimodal   0.39 / 0.50   0.40 / 0.50
Turn-Taking (Macro-F1 / MAE)     Audio        0.61 / 1.53   0.66 / 1.50
                                 Video        0.51 / 1.57   0.51 / 1.55
                                 Multimodal   0.61 / 1.53   0.63 / 1.50
Rapport (CCC / Pearson)          Audio        0.65 / 0.67   0.68 / 0.70
                                 Video        0.23 / 0.28   0.26 / 0.31
                                 Multimodal   0.58 / 0.63   0.59 / 0.64

The Challenge Dataset

The DaiKon Challenge dataset consists of 945 sessions totaling 743 hours of recordings from 5 countries. The challenge data is a subset of a larger corpus curated by Hume AI, designed to capture rich, naturalistic human interactions across diverse settings and populations.

All sessions were originally captured as dual-channel audio recordings using a proprietary recording platform developed and hosted by Hume AI. This setup enables high-quality separation of speaker streams, supporting detailed analysis of conversational dynamics.

Participants are recruited from multiple countries, ensuring linguistic and cultural diversity within the dataset. All participants provided informed consent prior to data collection, and the dataset was assembled in accordance with applicable ethical guidelines and data protection standards.

For inquiries regarding Hume AI Dual Channel Data please reach out at: link.hume.ai/sales-partnerships-form

Split   Rooms   Hours   DE    EN    ES    NL    PL
Train   661     504.3   46    328   111   67    107
Val     142     118.9   10    75    26    11    18
Test    142     120.2   -     -     -     -     -

Figure 1. Aggregate emotion intensity of the training split, projected onto Russell's valence–arousal plane; bubble area is proportional to the summed soft-label intensity of each emotion across all segments.

Figure 2. A 10x10 grid of randomly selected frames from participant videos in the DaiKon Challenge dataset.

Team Registration

To gain access, register your team by emailing competitions@hume.ai with the following information:

Team Name, Researcher Name, Affiliation, and Research Goals

Restricted Access: After registering your team, you will receive an End User License Agreement (EULA) for signature. Please note that this dataset is provided only for ACII DaiKon Challenge use.

Results Submission

For each task, participants should submit their test set predictions as a zip file to competitions@hume.ai.

Organizers

Dr. Panagiotis Tzirakis. Hume AI, New York, USA. panagiotis@hume.ai. [Main Contact] Panagiotis Tzirakis is an AI research scientist working at the intersection of multimodal deep learning, affective computing, and audio-visual representation learning. He obtained his Ph.D. from Imperial College London (iBUG) in 2021, where he contributed to scalable, end-to-end multimodal emotion recognition and real-world affect modeling. He publishes in leading journals and conferences including Information Fusion, International Journal of Computer Vision, ICASSP, INTERSPEECH, and ACM Multimedia (i10-index: 38). He has co-organized several workshops and challenges, including ACII-VB’22, ICML ExVo’22, and CVPR ABAW’25, ’26.

Dr. Alice Baird. Hume AI, New York, USA. alice@hume.ai. Alice Baird is an AI researcher specializing in computational paralinguistics and affective computing, with a focus on stress and emotional well-being. She received her PhD in 2021 from the University of Augsburg’s Chair of Embedded Intelligence for Health Care and Wellbeing. Her work on emotion understanding from speech, physiological, and multimodal signals has been widely published in leading venues such as INTERSPEECH, ICASSP, IEEE Intelligent Systems, and the IEEE Journal of Biomedical and Health Informatics (i10-index: 68). She has also co-organized international workshops and challenges, including the 2022 ACII Affective Vocal Bursts Workshop and Challenge.

Dr. Jeffrey Brooks. Hume AI, New York, U.S.A. jeff@hume.ai. Jeffrey Brooks is a computational emotion scientist with expertise in emotional expression, computational affective neuroscience, and emotional AI. He completed his PhD at New York University in 2021. His work on emotional expression and recognition in the face and voice has been published in leading interdisciplinary journals such as Nature Human Behaviour and Proceedings of the National Academy of Sciences (i10-index: 20).

Dr. Emilia Parada-Cabaleiro. University of Music Nuremberg, Germany. emiliaparada.cabaleiro@hfm-nuernberg.de. Emilia Parada-Cabaleiro received her PhD in 2016 from the University of Rome Tor Vergata, Italy. She is a music therapist, elementary music educator, and musicologist. Her research interests lie at the intersection of psychology, musicology, and computer science, with a particular focus on affective computing. Her work on emotion and speech modeling has been widely published in leading international venues, including INTERSPEECH and other top conferences in speech and affective computing (i10-index: 33).

Dr. Lukas Stappen. BMW Group, Munich, Germany. lukas.stappen@bmw.de. Lukas Stappen is an AI researcher with expertise in multimodal learning, affective computing, and large language models. He obtained his PhD from the University of Augsburg in 2021, where he contributed to human-centric multimodal understanding and co-organized international workshops and challenges, including founding the MuSe Challenge series (2020-2024) for advancing multimodal sentiment analysis. His current interests focus on LLM-based voice assistants and AI safety. His work has been widely published in leading venues such as IEEE Transactions on Affective Computing, ACM Multimedia, ACL, INTERSPEECH, and ICASSP (i10-index: 26).