Senior ML Engineer (Evaluation)

kaiko.ai

Amsterdam

2 days ago

Kaiko.ai seeks a Senior ML Evaluation Engineer to own the evaluation stack for their clinical AI assistant. Based in Zurich or Amsterdam, the role requires expertise in Python, MLOps, distributed GPU workloads, and model serving. You'll build scalable evaluation pipelines, manage inference services, and drive engineering excellence.

Salary

Not specified

Work Location

Amsterdam, North Holland, Netherlands, NL

Work Model

Hybrid: ~50% in-office in Zurich or Amsterdam

Employment Type

Full-time

Experience Level

Senior

Core Qualifications

Technical (Must-have)

Python, Ray, CUDA, Docker, Kubernetes, vLLM, TensorRT-LLM, Triton Inference Server, Dagster, CI/CD

Soft Skills

Communication, collaboration, technical leadership, problem solving, ownership, ambition, curiosity

Tools (Must-have)

Dagster, Docker, Kubernetes, Ray, vLLM, TensorRT-LLM, Triton Inference Server

Preferred Qualifications

Technical (Nice-to-have)

lm-eval-harness, OpenAI Evals, HF Evaluate, Terraform, Pulumi, AWS, GCP, Azure

Tools (Nice-to-have)

Terraform, Pulumi

Key Responsibilities

• Own AI factory orchestration for evaluation workloads with Dagster
• Design, operate, and mature pipelines for large-scale evaluation jobs
• Maintain and evolve inference services for evaluation runs
• Ensure the functional integrity of the eval stack through testing and validation
• Own Eval/MLOps end-to-end: deployments, model registry, artifact versioning, rollout/rollback
• Grow into a technical lead role: set engineering direction and make architectural decisions

ML Engineer, Evaluation, MLLM, Clinical AI, Python, Dagster, MLOps, GPU, Hybrid, Amsterdam
