Member of Technical Staff - Audio-Language Model Data
Liquid AI
Full-time
Boston / San Francisco
Posted 6 months ago
Job Description
Liquid AI, an MIT spin-off, is seeking a highly skilled Member of Technical Staff to develop Liquid Audio-Language models by gathering high-quality audio-text datasets for pre-training and supervised fine-tuning (SFT).
Responsibilities
Create and maintain data cleaning, filtering, and selection pipelines for audio-text data
Monitor the release of public high-quality audio datasets
Create and maintain a synthetic data generation pipeline for task-specific audio SFT data
Collaborate with the multimodal audio team to run ablations on new datasets
Requirements
Experience in data curation, cleaning, augmentation, and synthetic data generation
Ability to write and debug models in ML frameworks, and experience working with LLMs and VLMs
Strong programming skills in Python
B.S. + 5 years of experience, M.S. + 3 years of experience, or Ph.D. + 1 year of experience