GenAI Systems Lab
AI Engineering 8 min read

What Does an ML Engineer Actually Do in 2025?

The evolving ML Engineer role post-LLM revolution — what's changed, what's still core (training, MLOps, model serving), and how to position yourself.

The ML Engineer title covers a wide range, from building training pipelines for billion-parameter models to deploying fine-tuned classifiers in production microservices. Before you apply, it helps to understand what the role actually involves, how it differs from AI Engineer and Data Scientist, and what the career path looks like.

What ML engineers actually do

ML Engineers sit at the intersection of software engineering and machine learning research. They write production code, but the code trains and serves models. Day-to-day work includes: building and maintaining training pipelines, curating and versioning training datasets, running experiments and tracking results, deploying models to serving infrastructure, monitoring model performance in production, and collaborating with researchers to productionise new techniques.

The distinction from Data Scientists: ML Engineers own the production path. A Data Scientist builds a model in a notebook; an ML Engineer turns it into a service that handles 10K requests per minute, fails gracefully, and can be retrained and redeployed in an hour.
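To make the 10K requests per minute figure concrete, here is a rough capacity sketch based on Little's law (requests in flight = arrival rate × latency). The 50 ms latency, the per-replica concurrency of 4, and the 50% headroom are illustrative assumptions, not numbers from any real service.

```python
import math

def required_replicas(requests_per_minute, latency_s, per_replica_concurrency, headroom=0.5):
    """Estimate serving replicas with Little's law: in-flight = rate * latency.

    latency_s, per_replica_concurrency and headroom are assumed inputs;
    measure them on your own service before trusting the result.
    """
    rate_per_s = requests_per_minute / 60.0
    in_flight = rate_per_s * latency_s
    # Only plan to use (1 - headroom) of each replica so bursts do not saturate it.
    usable = per_replica_concurrency * (1.0 - headroom)
    return rate_per_s, in_flight, math.ceil(in_flight / usable)

rate, in_flight, replicas = required_replicas(
    10_000, latency_s=0.05, per_replica_concurrency=4
)
print(f"{rate:.1f} req/s, {in_flight:.1f} in flight, {replicas} replicas")
```

Under these assumptions, 10K requests per minute is roughly 167 requests per second, which is a modest load per replica. The harder parts of the production path are usually the graceful failure and fast retraining mentioned above.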

ML Engineer vs AI Engineer — the 2025 distinction

| Dimension | ML Engineer | AI Engineer |
|---|---|---|
| Primary work | Training + fine-tuning models | Building on top of foundation models |
| Core skill | PyTorch / JAX, distributed training | Prompt engineering, RAG, agents, evals |
| Output | Model weights + serving infrastructure | LLM-powered applications |
| Infra depth | Deep: owns GPUs, distributed systems | Moderate: uses managed APIs |
| Math depth | High: loss functions, gradients | Moderate: uses models as black boxes |
| 2025 demand | High at labs and large tech | Rapidly growing across all sectors |

Core technical skills

What companies want in 2025

Pre-2022, most ML engineering roles focused on classical workloads: tabular models, recommendation systems, NLP classifiers. Post-2022, the majority of new ML Engineering hiring is LLM-adjacent: fine-tuning foundation models, building RLHF pipelines, scaling infrastructure for frontier model training, or deploying and serving large models efficiently.

The most in-demand specialisations: LLM fine-tuning (LoRA, QLoRA, full fine-tune at scale), inference optimisation (quantisation, speculative decoding, vLLM deployment), and training infrastructure (GPU cluster management, distributed training debugging).
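Two pieces of back-of-envelope arithmetic sit behind these specialisations. The sketch below assumes a single 4096 × 4096 weight matrix and a 7-billion-parameter model purely for illustration. It counts weights only and ignores activations, optimizer state, KV cache, and the small overhead of quantisation scales.

```python
def lora_trainable_params(d_out, d_in, rank):
    """LoRA freezes W (d_out x d_in) and trains B (d_out x rank) and A (rank x d_in)."""
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora

def weight_memory_gb(n_params, bits_per_weight):
    """Memory needed to hold the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Illustrative sizes, not taken from any specific model.
full, lora = lora_trainable_params(4096, 4096, rank=8)
print(f"full: {full:,} params, LoRA: {lora:,} params ({100 * lora / full:.2f}%)")

for bits in (16, 8, 4):
    print(f"7B params at {bits}-bit: {weight_memory_gb(7e9, bits):.1f} GB")
```

With these assumed sizes, a rank-8 adapter trains well under 1% of the parameters in that matrix, and moving from 16-bit to 4-bit weights cuts weight memory by 4×. That is why LoRA-style fine-tuning and quantised inference fit on hardware that full fine-tuning and full-precision serving do not.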

Career progression

| Level | Scope | Key milestone |
|---|---|---|
| Junior MLE | Executes well-defined tasks on existing pipelines | Ships first model to production |
| Mid MLE | Owns a model or pipeline end-to-end | Reduces training time or serving cost by 2× |
| Senior MLE | Leads cross-functional ML projects | Designs the ML architecture for a new product |
| Staff MLE | Sets technical direction for an ML platform or area | Influence across multiple teams or products |
| Principal MLE | Org-level impact on ML strategy | Drives multi-year technical roadmap |

How to get in

The clearest path from SWE to MLE: build a project that requires training a model from scratch — not fine-tuning an existing one. Build the data pipeline, write the training loop, deploy the model, and monitor it. Show this project in interviews. Complement it with a strong understanding of transformers, backpropagation, and distributed systems.
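As a reminder of what "write the training loop" means at its smallest, here is a toy sketch in plain Python: full-batch gradient descent on a one-variable linear model. The synthetic dataset, learning rate, and epoch count are assumptions chosen for illustration. A real portfolio project would swap in a framework such as PyTorch, a real dataset, and a held-out evaluation set.

```python
import random

def make_dataset(n, seed=0):
    """Synthetic data from y = 3x + 0.5 plus small Gaussian noise (illustrative only)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 3.0 * x + 0.5 + rng.gauss(0.0, 0.01)) for x in xs]

def train(data, epochs=200, lr=0.1):
    """Fit y ~ w*x + b by minimising mean squared error with gradient descent."""
    w, b = 0.0, 0.0
    n = len(data)
    loss = 0.0
    for _ in range(epochs):
        grad_w = grad_b = loss = 0.0
        for x, y in data:
            err = (w * x + b) - y           # forward pass and residual
            loss += err * err / n           # mean squared error
            grad_w += 2.0 * err * x / n     # dLoss/dw
            grad_b += 2.0 * err / n         # dLoss/db
        w -= lr * grad_w                    # parameter update
        b -= lr * grad_b
    # loss is the value measured at the start of the final epoch
    return w, b, loss

w, b, loss = train(make_dataset(200))
print(f"w={w:.2f} b={b:.2f} loss={loss:.5f}")
```

The same four steps (forward pass, loss, gradients, update) appear in every training loop, whatever the model size. Scaling it up is mostly a matter of letting a framework compute the gradients and of distributing the data and the model across devices.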

The Karpathy path: watch Andrej Karpathy's 'Let's build GPT: from scratch, in code, spelled out', implement it yourself, then implement GPT-2 training on a small dataset. This project, described confidently in interviews, opens more MLE doors than any certification.
