Member of Technical Staff - Audio-Language Model Data

Liquid AI
Full-time
Boston / San Francisco
Posted 6 months ago

Job Description

Liquid AI, an MIT spin-off, is seeking a highly skilled Member of Technical Staff to develop Liquid Audio-Language models by gathering high-quality audio-text datasets for pre-training and supervised fine-tuning (SFT).

Responsibilities

  • Create and maintain data cleaning, filtering, and selection pipelines for audio-text data
  • Monitor the release of public high-quality audio datasets
  • Create and maintain a synthetic data generation pipeline for task-specific audio SFT data
  • Collaborate with the multimodal audio team to run ablations on new datasets

Requirements

  • Experience in data curation, cleaning, augmentation, and synthetic data generation
  • Ability to write and debug models in ML frameworks, and experience working with LLMs and VLMs
  • Strong programming skills in Python
  • B.S. + 5 years of experience, M.S. + 3 years of experience, or Ph.D. + 1 year of experience

Benefits

  • No benefits