
Software Engineer, Data Infrastructure & Acquisition
Speechify
Speechify seeks a Software Engineer to manage data collection and infrastructure for AI model training. The role involves building petabyte-scale datasets, operating GCP infrastructure with Terraform, and collaborating with scientists. Requires 5+ years of experience and proficiency in Python, Docker, and cloud platforms.
Key Responsibilities
- Be scrappy in finding new sources of audio data and bringing them into our ingestion pipeline.
- Operate and extend the cloud infrastructure for our ingestion pipeline, currently running on GCP and managed with Terraform.
- Collaborate closely with our Scientists to shift the cost/throughput/quality frontier, delivering richer data at larger scale and lower cost to power our next-generation models.
- Collaborate with others on the AI Team and Speechify Leadership to craft the AI Team’s dataset roadmap to power Speechify’s next-generation consumer and enterprise products.