MSHLT Student Laura Montenegro

Our Master’s in Human Language Technology program had one other student graduate in the fall semester: Laura Montenegro. For her internship with Cobalt Speech, Laura used Cobalt’s existing tools to evaluate a hybrid automatic speech recognition (ASR) model, one that combines a Hidden Markov Model with a Deep Neural Network (HMM-DNN), for low-resource languages.

Although “low-resource language” is the standard term for languages without large existing datasets, I find “data-scarce language” less prone to misunderstanding: the “resource” in short supply is prepared datasets, not necessarily money or people.

By definition, datasets for “low-resource” languages are hard to find, but Laura found a workable amount of audio data for the Ewe language on HuggingFace. Great practical experience for future work in NLP and speech technology!

Wishing you all the best in your future endeavors, Laura!



