Now that Alexa knows how to speak Spanish in the US, there's a common question: how did it learn the language when it didn't have the benefit of legions of users issuing commands? Through new tools, it seems. Amazon has revealed a pair of systems that helped Alexa hone its español (and Hindi, and Brazilian Portuguese) using just a tiny amount of reference material. Effectively, they gave the natural language machine learning model a jumpstart.
The first tool studies a handful of "golden utterances" (that is, reference commands suggested by the developers) to learn general syntactic and semantic patterns. After that, it produces "rewrite expressions" that themselves generate thousands of new yet similar sentences to train on. The system works quickly: you could move from 50 utterances to a fully operational linguistic set in less than two days.
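To get a feel for how a single rewrite expression can fan out into many sentences, here is a toy sketch in Python. It is not Amazon's system, which learns its expressions from the golden utterances; the hand-written expression, the slot contents and the `expand` helper below are all assumptions made up for illustration.

```python
from itertools import product

# Hypothetical "rewrite expression": each slot lists interchangeable phrasings.
# An empty string marks a slot that can be left out entirely.
REWRITE_EXPRESSION = [
    ["pon", "reproduce", "toca"],
    ["", "la canción", "el álbum"],
    ["Despacito", "Vivir Mi Vida"],
    ["", "por favor"],
]

def expand(expression):
    """Yield every sentence the expression can generate."""
    for choice in product(*expression):
        # Skip empty optional slots, then join what is left with spaces.
        yield " ".join(part for part in choice if part)

sentences = list(expand(REWRITE_EXPRESSION))
print(len(sentences))
print(sentences[:3])
```

The count grows multiplicatively with each slot, which is why a few dozen reference commands could plausibly turn into thousands of training sentences.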
Amazon's other tool uses guided resampling to replace terms that can be safely swapped, further improving the AI's training. The technique draws on both data from existing Alexa languages and media sources like the Amazon Music catalog, and it's capable enough to be aware of context (it won't swap a musician's name for an audiobook, for example).
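The context-aware swapping can be pictured with another small Python sketch. Again, this is only an illustration under stated assumptions: the `CATALOG` dictionary stands in for something like a music catalog, and the type labels and the `resample` function are hypothetical rather than anything Amazon has described.

```python
import random

# Hypothetical catalog keyed by entity type (an assumed stand-in for real catalog data).
CATALOG = {
    "artist": ["Shakira", "Juanes", "Rosalía"],
    "audiobook": ["Don Quijote", "Cien años de soledad"],
}

def resample(tokens, rng):
    """Swap each typed slot for a different value of the same type.

    tokens is a list of (text, entity_type) pairs; entity_type is None for plain words.
    """
    out = []
    for text, entity_type in tokens:
        if entity_type is None:
            out.append(text)  # ordinary words are kept as-is
        else:
            # Only values of the same type are candidates, so an artist
            # is never replaced with an audiobook title.
            candidates = [v for v in CATALOG[entity_type] if v != text]
            out.append(rng.choice(candidates))
    return " ".join(out)

utterance = [("pon", None), ("música", None), ("de", None), ("Shakira", "artist")]
variant = resample(utterance, random.Random(0))
print(variant)
```

Restricting candidates by type is the simplest way to keep swaps "safe"; a real system would presumably use richer context than a single label.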
This doesn't mean that Alexa will master every known language in a matter of weeks. There are other factors to consider beyond the linguistic structure, such as accommodations for cultural differences and customer support systems. Still, it suggests that Amazon might have an easier time adding languages than it has in the past. It might just be a matter of which Alexa device you get, rather than whether you can get one in the first place.