
Senior ML Engineer (Token Factory)
Jobgether
Senior Machine Learning Engineer role focused on optimizing inference and training of large language models at scale. The position involves GPU optimization, low-precision pipelines, and distributed systems. Based in the Netherlands with remote work.
AI, Remote, Full-time, Senior, Machine Learning, Transformer Architectures
Salary
Not specified
Core Qualifications
Technical (Must-have)
Machine Learning, Transformer architectures, Large Language Models, GPU profiling, Nsight, PyTorch Profiler, GPU architecture, Memory hierarchy, Attention mechanisms, RoPE, KV-cache, Flash Attention, Quantization, Distributed training, Python
Soft Skills
Communication, Collaboration
Key Responsibilities
- Drive inference optimization efforts by identifying bottlenecks and implementing performance improvements across diverse LLM architectures.
- Contribute to the design and evolution of inference engines, including techniques such as speculative decoding, KV-cache optimization, and support for dense and MoE models.
- Develop and productionize low-precision training and inference pipelines (e.g., FP8, MXFP4) to maximize efficiency on large GPU clusters.
- Profile and analyze GPU workloads using modern tooling to identify performance constraints and guide architectural improvements.
- Collaborate on scalable distributed training and inference systems, including sharding strategies, custom kernels, and hardware-aware optimizations.
- Contribute to engineering best practices including testing, CI/CD, and maintainable production-grade ML systems.
Senior ML Engineer, Machine Learning, LLM, GPU optimization, Inference, Training, Distributed systems, Python, Remote, Netherlands