December, 2016 (exact day TBA)
Research advances in computer vision and pattern recognition have led to tremendous progress on many problems and applications. As a result, several visual analysis problems can be considered solved (e.g., face recognition), at least in certain scenarios and under specific circumstances. Despite these important advances, many open problems remain that receive much attention from the community because of their potential applications. We are organizing a contest around four such problems that require, in addition to effective visual analysis, dealing with multimodal information (e.g., audio, RGB-D video) in order to be solved. In addition, we focus on problems in which the aim is to recognize patterns that are not visually evident (e.g., personality traits). The contest is supported by three organizations with vast experience and prestige in organizing academic contests, namely ChaLearn, MediaEval, and ImageCLEF. The contest is also supported by the IAPR TC 12 on visual and multimedia information systems.
Workshop on Multimedia Challenges Beyond Visual Analysis:
Important dates (Workshop)
Please prepare your paper according to the ICPR guidelines & template:
and submit your paper through the CMT system at:
Please refer to the workshop section for more details.
The contest is composed of four tracks:
Track 1: First impressions challenge
We are organizing a challenge track on “first impressions”, in which participants will develop solutions for recognizing personality traits of people in short video sequences. We will make available a large, newly collected data set, sponsored by Microsoft, of at least 10,000 15-second videos collected from YouTube and annotated with personality traits by Amazon Mechanical Turk workers. The traits correspond to the “big five” personality traits used in psychology and well known to hiring managers who use standardized personality profiling: extroversion, agreeableness, conscientiousness, neuroticism, and openness to experience. First impressions are highly important in many contexts, such as job interviews and other human-resources settings. This work could become very relevant to training young people to present themselves better by changing their behavior in simple ways. The participants who obtain the best results in the challenge will be invited to submit a paper to the workshop.
Table 1. Sample videos that will be used for the first impression challenge.
Workshop/challenge participants will be invited to submit revised and extended versions of their papers to be considered for publication in a
Special Issue on Personality Analysis in the IEEE Transactions on Affective Computing
Please refer to the SI site for more details.
Track 2: Isolated gesture recognition track
We are organizing a track on isolated gesture recognition from RGB-D data, where the goal is to develop methods for recognizing the category of a human gesture from a segmented RGB-D video. A new data set, the ChaLearn LAP RGB-D Isolated Gesture Dataset (IsoGD), is used for this track (sample images taken from this data set are shown in Figure 1). The database includes 47,933 RGB-D gesture videos (about 9 GB). Each RGB-D video depicts a single gesture, and there are 249 gesture categories performed by 21 different individuals. Methods will be judged by their recognition rate.
Track 3: Continuous gesture recognition track
We are also organizing a track, complementary to Track 2, on continuous gesture recognition from RGB-D data. The goal in this track is to develop methods that can perform simultaneous segmentation and recognition of gesture categories from continuous RGB-D video. The newly created ChaLearn LAP RGB-D Continuous Gesture Dataset (ConGD) is used for this track (sample images taken from this data set are shown in Figure 1). The data set comprises a total of 47,933 gestures in 22,535 continuous RGB-D videos (about 4 GB). Each video depicts one or more gestures, and there are 249 gesture categories performed by 21 different individuals. Methods will be judged by the Jaccard index.
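For reference, the Jaccard index compares the set of frames a method assigns to a gesture against the ground-truth frames for that gesture. The following is only an illustrative sketch (the official evaluation script defines the exact per-video averaging protocol):

```python
def jaccard_index(pred_frames, gt_frames):
    """Intersection-over-union of two sets of frame indices."""
    pred, gt = set(pred_frames), set(gt_frames)
    union = pred | gt
    # Empty union (no prediction and no ground truth): define as 0.
    return len(pred & gt) / len(union) if union else 0.0

# A gesture predicted on frames 10-19 against ground truth 12-24:
score = jaccard_index(range(10, 20), range(12, 25))
print(round(score, 3))  # 8 overlapping frames / 15 frames in union -> 0.533
```

A score of 1 means perfect temporal overlap; mislocalized or missed gestures lower the index even when the predicted label is correct.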
Figure 1: Sample images taken from the depth video for a set of the considered gestures.
Track 4: Context of experience track
The aim is to explore the suitability of video content for watching in certain situations. Specifically, we look at the situation of watching movies on an airplane (see Figure 2). As a viewing context, airplanes are characterized by small screens and distracting viewing conditions. We assume that movies have properties that make them more or less suitable for this context. We are interested in developing systems that are able to reproduce a general judgment of viewers about whether a given movie is a good one to watch during a flight. We provide a data set including a list of movies and human judgments concerning their suitability for airplanes. The goal of the task is to use movie metadata and audio-visual features extracted from movie trailers to automatically reproduce these judgments. The provided data set comprises a total of 318 movies, together with metadata, audio (MFCC), textual (tf-idf), and visual (HOG, CM, LBP, GLRLM) features, and the movie trailers (full HD 1080p, around 2-5 minutes long). The considered categories are: latest, recent, the collection, family, world, Dutch, and European. As evaluation measures we will use a combination of precision, recall, and F1.
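Treating “suitable for airplane viewing” as the positive class, the three measures can be computed as in this minimal sketch (function and variable names are illustrative, not the task's official evaluation code):

```python
def precision_recall_f1(predicted_pos, true_pos):
    """Precision, recall, and F1 for a binary decision, given the set of
    items the system labeled positive and the set that truly is positive."""
    pred, true = set(predicted_pos), set(true_pos)
    tp = len(pred & true)                        # correctly labeled positives
    precision = tp / len(pred) if pred else 0.0  # fraction of flagged items that are right
    recall = tp / len(true) if true else 0.0     # fraction of true positives recovered
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)        # harmonic mean of the two
    return precision, recall, f1

# A system flags movies {1, 2, 3, 4} as airplane-suitable; truth is {2, 3, 5}:
p, r, f = precision_recall_f1({1, 2, 3, 4}, {2, 3, 5})
print(round(p, 3), round(r, 3), round(f, 3))  # 0.5 0.667 0.571
```

Reporting all three together is useful here because precision and recall trade off: a system that flags every movie as suitable gets perfect recall but poor precision, and F1 penalizes that imbalance.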
Figure 2. A set of conditions, including small screen and confined, crowded space, characterize the context of watching a movie on an airplane.
As in previous challenges organized by ChaLearn, all four tracks of the contest will be run on the CodaLab platform; information about registration and participation can be found on each track's CodaLab page.
In broad terms the challenge will proceed as follows (track 1 will adhere to a slightly different procedure):
The top three ranked participants of each track of the contest will receive an official certificate award, as well as travel awards as allowed by our budget.
In addition, top-ranked participants will be invited to follow the workshop submission guide so that a description of their system can be included in the ICPR 2016 workshop proceedings. Participants of the workshop or challenge (with papers of relevant thematic content) will be invited to submit revised and extended versions of their papers to be considered for publication in a special issue on personality analysis in the IEEE Transactions on Affective Computing.
Important dates (quantitative challenge):
We are very grateful to our sponsors: