Senior LLM Engineer: Text & Reasoning LLM / NLU
Omilia
- Italy
- Permanent contract
- Full time
- Technical Ownership: Hold final technical authority over all LLM/NLU services, including entity/intent classification, specialized LLMs, and agentic orchestration.
- System Quality: Ensure production stability, performance, and compliance (including PCI/PII) across the LLM/NLU domain.
- Delivery Monitoring: Commit to delivery dates, drive features from design through deployment, and proactively flag risks.
- Autonomy: Resolve technical ambiguity, structure loosely defined requirements, and make architectural decisions independently.
- Scope & Complexity: Lead the most complex, ambiguous, or cross-cutting features, including model research, agentic reasoning, and inference server development.
- Impact: Directly influence the quality and reliability of AI services serving millions of customer interactions in regulated industries.
- Influence/Mentorship: Guide and mentor mid-level and junior engineers through code reviews, pairing, and knowledge transfer; drive alignment between Product, Architecture, and Engineering.
- Lead research and experimentation on new model architectures, training strategies, and evaluation methodologies for LLM/NLU.
- Design, develop, fine-tune, and evaluate specialized LLMs for Concierge and Task Agents.
- Develop and optimize ML pipelines for training, evaluation, and deployment (AWS SageMaker).
- Architect and maintain inference servers, ensuring low latency and high reliability.
- Implement and evolve closed-loop self-learning systems for continuous model improvement.
- Drive benchmarking, experiment reproducibility, and documentation quality.
- Ensure compliance with data privacy standards throughout the ML lifecycle.
- Mentor and support the growth of team members; share expertise via tech talks and guides.
- 5+ years in applied LLM/ML/NLU/NLP, with ownership of production ML systems at scale.
- Strong hands-on skills in Python, PyTorch, and HuggingFace Transformers.
- Deep experience with LLMs: fine-tuning, distillation, prompt engineering, evaluation, and deployment (especially small/efficient models).
- Solid foundation in NLU: intent classification, entity extraction, etc.
- Experience with model serving infrastructure (e.g., Triton Inference Server, vLLM, TGI, FastAPI).
- Experience with cloud ML infrastructure (AWS SageMaker, Bedrock, or equivalent).
- Proven architectural decision-making and technical ownership across services/products.
- Ability to break down ambiguous problems and drive actionable plans.
- Excellent communication skills for both technical and non-technical audiences.
- Experience with agentic system design (tool use, reasoning chains, multi-step planning).
- Experience with self-learning/continuous improvement ML systems.
- Multilingual NLU or cross-lingual transfer experience.
- Familiarity with PCI/PII compliance in ML workflows.
- Experience with experiment tracking tools (Weights & Biases, MLflow).
- Open-source ML/NLP contributions or publications at top venues.
- Experience with speech or multimodal LLMs.
- Act as an Omilia ambassador in all interactions.
Benefits
- Fixed compensation;
- Long-term employment with vacation counted in working days;
- Professional development opportunities (courses, training, etc.);
- Being part of successful cutting-edge technology products that are making a global impact in the service industry;
- Proficient and fun-to-work-with colleagues;
- Apple gear