About Synthesia
Synthesia is the world’s #1 AI video generation platform. Well, it’s actually a video production studio — in a browser. As in, no cameras or film crews at all. You simply choose an avatar, enter your script in one of 60 languages, and your video is ready in minutes. In Synthesia, you can build personalised on-the-fly videos, give your chatbot a human face or run 24/7 weather channels in different languages, to name just a few of the possibilities. 🎬
We believe the future of media is synthetic, and we are on a mission to turn cameras into code and make everyone a creator. Not sure what we’re talking about? Check out our brand video, which explains what we’re doing at Synthesia in a way that even our grandparents *kind of* understand.
About the role
We are looking for a Senior ML Platform Engineer - DataOps to help R&D manage large audio-video datasets at Synthesia. We are creating a new ML Platform team that will support 7+ teams developing cutting-edge solutions in generative video synthesis. You will join us to set up a world-class data function, managing a PB-scale data lake and building complex audio/visual data pipelines to bring order and make data consumption simple. You are going to super-charge our research.
🔬 You are someone who loves DevOps, loves data, and wants to work at scale. You pay close attention to detail, and you create and communicate clear, well-defined processes. You love to support and help others. Your happiest day is when you hear "it was so easy, just 1-click and everything worked". You love to build systems that unblock others and unlock scale.
👩‍💼 You will join a group of more than 30 Researchers and Engineers in the R&D department. This is an open, collaborative and highly supportive environment. We are all working together to build something big - the future of synthetic media and programmable video through Generative AI. We are proud of the culture, as well as the impact of the technology we are building.
What will you be doing?
🚀 In this position, you will set up and provide data management for our ML teams in R&D at Synthesia. You will help set up our audio-video data pipeline for the Video team and our speech data pipeline for the Voice team. You will be responsible for:
Data storage - our data lake for large-scale audio-visual datasets
Data sources - set up our ingest process, working with external data providers
Data annotation - manage data verification and annotation, working with external providers
Data pipelines - deploy custom ML data transformations, working with our ML teams
Data access - create transient datasets on demand to support ML model training
Data tracking - usage tracking and monitoring across all data sources
Who are you?
We are looking for candidates who can own the DataOps function. You will have:
A minimum of 3 years' experience in Data Engineering / Data Ops / Data Science.
Been involved in managing large-scale datasets: not just one-off data collection tasks, but continuous data collection.
Been responsible for setting up data ops (ingest / storage / transform / access) end-to-end for multiple teams.
Worked with audio/video data and understood how to manage it at PB scale.
A strong understanding of Data Ops, including dataset management, versioning, usage tracking, monitoring and logging.
In-depth experience working with AWS for data and compute. You will work side-by-side with DevOps to define our infra.
Experience supporting deep tech teams working with Python and containerised development with Docker.
Outstanding communication skills.
Nice to have…
If you have seen large-scale data management and data governance, multi-modal datasets, multi-stage data transform pipelines, and large model training with tens to hundreds of thousands of hours of content, and if you have worked with ML Ops to provide data sources for world-class research teams, spanning tech planned to go direct to product as well as foundational research for top-tier academic conferences, then we would love to talk to you! We'd also love to talk to you if this is what you dream of doing. 😎
The good stuff...
💸 You will be compensated well (salary + stock options + bonus)
📍 You will work in a hybrid setting with an office in Amsterdam
🏝 You get 25 days of annual leave + public holidays
🥳 You will join an established company culture with regular socials and company retreats
🤩 You get 4 weeks paid sabbatical after 4 years at the company + $10,000!!
👉 You can participate in a generous referral scheme
💻 You get a brand new computer of your choice (if that still counts as a benefit in 2023 🤔)
🚀 You will have huge opportunities for your career growth