Defined Crowd

DefinedData

High Quality AI Datasets

Accelerate your AI roadmap with instant access to ethically sourced speech and text data.

Improve Your Automated Speech Recognition (ASR) Performance

Scripted Monologue Speech Training Data

Single-speaker speech following a set of prompts, recorded at 16kHz in a single-channel audio file.
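As a rough illustration of the recording specification above, the sketch below checks whether a WAV file's header matches an expected sample rate and channel count, using Python's standard `wave` module. The function name `check_wav_spec` and its parameters are hypothetical and not part of any DefinedData tooling; the defaults assume the 16kHz, single-channel layout stated for this dataset type.

```python
import wave


def check_wav_spec(path, expected_rate=16000, expected_channels=1):
    """Return a list of mismatches between a WAV header and the expected spec.

    Hypothetical helper for illustration; defaults assume 16 kHz mono audio.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != expected_channels:
        problems.append(f"channel count is {channels}, expected {expected_channels}")
    return problems
```

An empty list means the header matches. For narrowband, dual-channel recordings the same helper could be called with `expected_rate=8000` and `expected_channels=2`.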

100 Hours

English

Scripted Monologue

WER: 0.98%. WER (word error rate) is a measurement indicating errors in the alignment of the text representation of the audio (actual vs. perfect), taking into account words omitted, inserted, or wrongly replaced.

Environment: Quiet

Published date:

May 2020

Number of speakers:

275

Device:

Mobile

Communication band:

Broadband

Channels:

1

Sample rate:

16kHz

Audio format:

WAV

Age:

<35: 44% / >35: 56%

Gender:

68F / 32M

100 Hours

German

Scripted Monologue

WER: 0.6%

Environment: Quiet

Published date:

May 2020

Number of speakers:

463

Device:

Mobile

Communication band:

Broadband

Channels:

1

Sample rate:

16kHz

Audio format:

WAV

Age:

<35: 71% / >35: 29%

Gender:

49F / 51M

600 Hours

Italian

Scripted Monologue

WER: 0.5%

Environment: Quiet

Published date:

June 2020

Number of speakers:

2308

Device:

Mobile

Communication band:

Broadband

Channels:

1

Sample rate:

16kHz

Audio format:

WAV

Age:

<35: 53% / >35: 47%

Gender:

52F / 48M

Showing 3 from 58

Spontaneous Dialogue Speech Training Data

Multi-speaker spontaneous conversations following a given scenario, recorded at 8kHz in a dual-channel audio file and transcribed with marked speaker turns.

300 Hours

English-Britain

Spontaneous Dialogue

WER: 5.2%

Environment: Quiet

Published date:

December 2020

Device:

Telephone

Communication band:

Narrowband

Channels:

2

Sample rate:

8kHz

Audio format:

WAV

500 Hours

German

Spontaneous Dialogue

WER: 10%

Environment: Quiet

Published date:

February 2021

Device:

Telephone

Communication band:

Narrowband

Channels:

2

Sample rate:

8kHz

Audio format:

WAV

300 Hours

Japanese

Spontaneous Dialogue

WER: 10%

Environment: Quiet

Published date:

February 2021

Device:

Telephone

Communication band:

Narrowband

Channels:

2

Sample rate:

8kHz

Audio format:

WAV

Spontaneous IVR Speech Training Data

Single-speaker spontaneous conversations with a scripted IVR system following a set of scenarios, recorded at 8kHz in a single-channel audio file, and transcribed.

700 Hours

English

Spontaneous IVR

WER: 10%

Environment: Quiet

Published date:

February 2021

Device:

Telephone

Communication band:

Narrowband

Channels:

1

Sample rate:

8kHz

Audio format:

WAV

500 Hours

Spanish

Spontaneous IVR

WER: 10%

Environment: Quiet

Published date:

February 2021

Device:

Telephone

Communication band:

Narrowband

Channels:

1

Sample rate:

8kHz

Audio format:

WAV

300 Hours

Japanese

Spontaneous IVR

WER: 10%

Environment: Quiet

Published date:

February 2021

Device:

Telephone

Communication band:

Narrowband

Channels:

1

Sample rate:

8kHz

Audio format:

WAV

Data Quality Guaranteed

Our multifaceted approach ensures our AI datasets are both accurate and diverse.

Word Error Rate (WER)
WER is our primary quality metric to measure the accuracy of speech data by comparing spoken words with the corresponding transcriptions.
Language Testing
Language testing of each contributor ensures speech is representative of the target population.
Domain Specificity
Scripts gathered from industry specific sources enhance coverage of unique vocabulary.
Gender and Age Distribution
Proactively managed throughout collection to minimize and combat bias in the dataset.
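
As a rough illustration of the word error rate metric described in this section, the sketch below uses the common textbook formulation: the minimum number of word substitutions, deletions, and insertions needed to turn the hypothesis transcription into the reference, divided by the number of reference words. This is an assumption about the general metric, not a description of DefinedData's own scoring procedure, which the page does not specify (for example, any text normalization applied beforehand). The function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: splits on whitespace and applies no normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word omitted
            insertion = curr[j - 1] + 1  # extra word in the hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, comparing the reference "the cat sat on the mat" with the hypothesis "the cat sat on mat" yields one omitted word out of six reference words, a WER of about 16.7%.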

Advantages of DefinedData's Prebuilt Datasets

Simplify access to ethically sourced high quality datasets and AI solutions to accelerate go-to-market timelines.

Fast to Market
Accelerate AI model training, tuning, and testing with datasets available for immediate use.
Flexibility
Choose from numerous datasets curated for model training, benchmarking, or domain customization.
Variety
Browse an expansive library of fresh, high-quality data available in multiple languages, domains, and recording environments.
Ethically Sourced
Datasets collected with the explicit consent of contributors ensure compliance with global data privacy regulations.