AslgPc12 (English-ASL Gloss Parallel Corpus 2012)
Synthetic English-ASL Gloss Parallel Corpus 2012
@kaggle.thedevastator_unlocking_the_power_of_cross_cultural_language_i
By Huggingface Hub
This dataset is a synthetic English-ASL gloss parallel corpus generated in 2012. It pairs American Sign Language (ASL) glosses with English text in two columns: gloss (an English-based written representation of the signs) and text (the English translation of the gloss). Researchers can use it to build and evaluate machine translation models between English and ASL gloss, and to explore interoperability between spoken and signed languages.
The dataset consists of two columns: gloss and text. The gloss column contains an English-based written representation of ASL signs. The text column contains the written English translation or interpretation of the corresponding entry in the gloss column.
Using this dataset, users can assemble examples that cover everyday conversation topics, such as greetings, family activities, and household chores, by pairing English words and phrases with their ASL gloss counterparts. With practice, learners may become more comfortable moving between written English and signed languages such as American Sign Language (ASL). Predictive models trained on this corpus could also help in studying communication barriers between the two language communities.
- Developing generative ASL-English bilingual chat bots
- Benchmarking different translation models to measure accuracy
- Using the parallel data to compare translation techniques and determine which performs best when translating from English to ASL
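As a rough illustration of the benchmarking idea, the sketch below scores model outputs against reference glosses with two simple measures: exact-match rate and unigram F1. Both function names are hypothetical, and these measures are stand-ins chosen for brevity, not the metrics used by the dataset authors; an established metric would normally replace them in a real evaluation.

```python
from collections import Counter


def exact_match_rate(predictions, references):
    """Fraction of predictions whose tokens equal the reference tokens."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    hits = sum(p.split() == r.split() for p, r in zip(predictions, references))
    return hits / len(references)


def unigram_f1(prediction, reference):
    """Harmonic mean of unigram precision and recall for one sentence pair."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        # Two empty strings count as a perfect match, otherwise a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both sentences.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

To compare two models, one could average `unigram_f1` over a held-out set of (gloss, text) pairs for each model and report it alongside `exact_match_rate`.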
If you use this dataset in your research, please credit the original authors.
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: train.csv
| Column name | Description |
|---|---|
| gloss | The ASL gloss: an English-based written representation of the signs for a given word or phrase. (String) |
| text | The written English translation of the corresponding gloss. (String) |
If you use this dataset in your research, please credit Huggingface Hub.
CREATE TABLE train (
"gloss" VARCHAR,
"text" VARCHAR
);
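A minimal sketch of reading the two-column file into Python, assuming train.csv has a header row with the column names gloss and text as in the schema above. The helper name `load_pairs` is hypothetical, and the sample rows are invented for illustration; they are not taken from the corpus.

```python
import csv
import io


def load_pairs(fileobj):
    """Read (gloss, text) pairs from a CSV file object with a header row."""
    reader = csv.DictReader(fileobj)
    return [(row["gloss"].strip(), row["text"].strip()) for row in reader]


# Invented sample rows, used only so the sketch runs without the real file.
sample = io.StringIO(
    "gloss,text\n"
    "HELLO HOW YOU,hello how are you\n"
    "ME GO STORE,i am going to the store\n"
)
pairs = load_pairs(sample)
print(len(pairs))
print(pairs[0])
```

With the real file, the same helper could be called as `load_pairs(open("train.csv", newline="", encoding="utf-8"))`; the UTF-8 encoding is an assumption.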