The challenge

In the arms race for speech-enabled technologies, systems that can support the widest user base will win out in the end. That’s why, when a Fortune 500 tech company sought comprehensive speech training data in French, it needed to ensure that the collected data would truly account for the wide range of dialects present in the francophone world.

That meant DefinedCrowd needed to source 600 hours of speech from 1,000 unique speakers who represented a cross-section of genders, ages and regional dialects.