POSITION: Speech Engineer
HOURS: Full-time, 40 hours per week
EMPLOYER: Seasalt AI, Inc.
2447 152nd Avenue NE
Redmond, WA 98052
Job Duties:
Provide engineering support and model development for Seasalt AI's various speech technologies and products. Design and develop the company's artificial intelligence (AI) text-to-speech engine by training a model that maps text into a spectrogram (a representation of the human voice) and a model that maps the spectrogram into real human voice, using knowledge of machine learning (ML) frameworks (TensorFlow, PyTorch). Maintain the company's speech-to-text engine, with emphasis on the language model algorithms, by applying software to crawl target-domain text data from various websites, clean the crawled data (text normalization, punctuation normalization), and select the most useful data for model training. Plan and develop full-scale, end-to-end AI systems for different speech technologies, and assist in backend development and cloud deployment at scale on major cloud providers, including Amazon Web Services (AWS), Azure, and Google Cloud Platform (GCP). Improve ML model performance (speech-to-text, text-to-speech, grapheme-to-phoneme models) for different languages through feature engineering, applying knowledge of computational phonology and phonetics. Create and collect training data for various languages, and feed the data through the spectrogram model and vocoder model training pipeline to create language models tuned for specific domains. Develop end-to-end, production-ready speech-to-text and diarization systems with speed and memory optimizations, using speech recognition technology (Kaldi). Review and analyze zh_tw training data, and adjust the Tacotron-based spectrogram model and the WaveGlow-based vocoder model accordingly. Perform model serving using RESTful application programming interfaces (APIs) so that the models can be used through API calls. Mentor and train the speech and frontend teams, consisting of junior engineers and interns. Draft commercialization plans with the CEO of Seasalt AI, and present them to clients and investors for review.
Use good oral and written communication skills to interact with customers, develop go-to-market strategies for speech services, and oversee all speech products.
Requirements:
1. Master's degree in Computer Engineering, Computational Linguistics, or related field, and 6 months of experience as a Computer/Software Engineer, Speech Engineer, or related engineering occupation.
2. Must have experience with the following:
Computational Phonology and Phonetics
Kaldi
Amazon Web Services (AWS)
Azure
Google Cloud Platform