Stafnear

Remote

ML Ops Engineer

Description

We are looking for an ML Ops Engineer to build and maintain the infrastructure that runs ML systems in production.

What you will do

- Deploy and maintain ML systems in production
- Work with model serving frameworks and ensure their stable operation
- Operate and scale GPU workloads
- Set up and maintain MLOps processes (model registry, experiment tracking, deployment automation)
- Use infrastructure-as-code approaches to manage infrastructure
- Write and maintain Python code

What we expect

- 3+ years of experience in ML Ops, Platform Engineering, or SRE for ML systems
- Experience with model serving frameworks (vLLM, TGI, Triton, or similar)
- Strong experience with containerization and orchestration, including GPU workloads
- Hands-on experience with MLOps tools (model registries, experiment tracking, deployment automation)
- Proficiency in Python
- Experience with infrastructure-as-code

Nice to have / Important

- Ability to effectively use AI coding assistants for rapid prototyping, debugging, and refactoring
- Strong engineering judgment and the ability to maintain high standards when using AI tools

Skills

DevOps, Platform Engineering, SRE, Python, ML, Machine Learning, AI
