Chicago Fingerspelling in the Wild Data Sets (ChicagoFSWild, ChicagoFSWild+)


Bowen Shi, Aurora Martinez Del Rio, Jonathan Keane, Diane Brentari, Greg Shakhnarovich, Karen Livescu

For questions contact Bowen Shi: bshi@ttic.edu

References

American Sign Language fingerspelling recognition in the wild

Bowen Shi, Aurora Martinez Del Rio, Jonathan Keane, Jonathan Michaux, Diane Brentari, Greg Shakhnarovich and Karen Livescu

SLT 2018

Fingerspelling recognition in the wild with iterative visual attention

Bowen Shi, Aurora Martinez Del Rio, Jonathan Keane, Diane Brentari, Greg Shakhnarovich and Karen Livescu

ICCV 2019

Overview

This is the home of a collaborative data collection effort by U. Chicago and TTI-Chicago researchers. To our knowledge, this is the first collection of American Sign Language (ASL) fingerspelling data "in the wild," that is, in naturally occurring (online) video. The collection consists of two data set releases, ChicagoFSWild and ChicagoFSWild+.

The ChicagoFSWild data set contains 7,304 ASL fingerspelling sequences signed by 160 signers, carefully annotated by students who have studied ASL. ChicagoFSWild+ contains 55,232 fingerspelling sequences signed by 260 signers.

Citing ChicagoFSWild

@article{fs18slt,
author = {B. Shi and A. Martinez Del Rio and J. Keane and J. Michaux and D. Brentari and G. Shakhnarovich and K. Livescu},
title = {American Sign Language fingerspelling recognition in the wild},
journal = {SLT},
year = {2018},
month = {December}
}

Citing ChicagoFSWild+

@article{fs18iccv,
author = {B. Shi and A. Martinez Del Rio and J. Keane and D. Brentari and G. Shakhnarovich and K. Livescu},
title = {Fingerspelling recognition in the wild with iterative visual attention},
journal = {ICCV},
year = {2019},
month = {October}
}

Publications


SLT'18 paper		ICCV'19 paper

Download

You can download the data sets here:

ChicagoFSWild.tgz (14 GB)

ChicagoFSWildPlus.tgz (82 GB)

Data format

Files are structured as follows:

ChicagoFSWild.csv - This is the main data description file. Each line corresponds to a single fingerspelling sequence.
- filename - name of the fingerspelling sequence
- url - URL of the video from which the sequence was obtained
- start_time - start time of the sequence in that video, in the format HH:MM:SS.xxx
- number_of_frames - number of frames of the fingerspelling sequence
- width - frame width
- height - frame height
- label_raw - raw labels from the annotators
- label_notes - annotator notes
- label_proc - processed labels, used for training and testing
- partition - partition (train/dev/test) the sequence belongs to
- signer - signer identity for this sequence
ChicagoFSWild-Frames.tgz - This file contains sequences of image frames (in .jpg), identified by filename in ChicagoFSWild.csv.
annotation_instructions.txt - This text file provides the instructions used by the annotators, which define the conventions used for the raw labels. This is provided for completeness. However, to reproduce our results, only the label_proc field in the CSV file is needed.
HandAnnotation.csv - Annotations of hand bounding boxes for a subset of the fingerspelling sequences in ChicagoFSWild
- filename - name of the fingerspelling sequence
- partition - partition (train/dev) the sequence belongs to, used to train and tune the hand detector
BBox - A folder of hand bounding boxes
- F/X.txt - hand bounding boxes in frame X of fingerspelling sequence F
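
As an illustration of how the ChicagoFSWild.csv fields listed above might be consumed, here is a minimal Python sketch that groups sequences by partition and converts start_time from HH:MM:SS.xxx to seconds. It assumes the file is comma-separated and begins with a header row that uses exactly the field names above; neither detail is stated here, so check the file before relying on it. The helper names (parse_start_time, load_sequences) and the two sample rows are hypothetical and exist only to make the sketch self-contained.

```python
import csv
import io
from collections import defaultdict


def parse_start_time(text):
    """Convert an HH:MM:SS.xxx string to seconds (float)."""
    hours, minutes, seconds = text.strip().split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def load_sequences(csv_file):
    """Group rows by partition.

    Assumption: the CSV has a header row whose column names match the
    field names in the data format description.
    """
    by_partition = defaultdict(list)
    for row in csv.DictReader(csv_file):
        by_partition[row["partition"]].append({
            "filename": row["filename"],
            "start_seconds": parse_start_time(row["start_time"]),
            "num_frames": int(row["number_of_frames"]),
            "label": row["label_proc"],  # processed label used for training/testing
            "signer": row["signer"],
        })
    return by_partition


# Synthetic rows for illustration only; these are not taken from the data set.
SAMPLE = """filename,url,start_time,number_of_frames,width,height,label_raw,label_notes,label_proc,partition,signer
example/seq_0001,https://example.com/video1,00:01:02.500,40,640,360,a s l,,asl,train,0
example/seq_0002,https://example.com/video2,01:00:00.000,25,320,240,f s,,fs,dev,1
"""

sequences = load_sequences(io.StringIO(SAMPLE))
for partition, rows in sorted(sequences.items()):
    print(partition, len(rows))
```

To run this against the real file, replace the io.StringIO(SAMPLE) argument with a file object opened via open("ChicagoFSWild.csv", newline=""), adjusting the path to wherever the archive was extracted.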