
Envestors Job Search

Explore opportunities in our UK network for top international talent

AI Language Engineer, Rufus LangEn

Amazon

Software Engineering, Data Science
London, UK
Posted on Mar 10, 2026

Description

The Conversational Shopping team is looking for a Language Engineer to bring efficiency and innovation to its efforts to deliver a seamless, fluent, and multilingual experience for AI-assisted shopping. This is an opportunity to join the high-performing team behind Amazon's Generative AI shopping initiatives, including its AI Shopping assistant. Our objective is to make it easy for customers worldwide to find and discover the best products by providing comparisons, recommendations, and answers to specific product questions. This role is cross-functional, requiring collaboration across global product, design, science, and engineering teams.

We are looking for candidates who are passionate about the intersection of language and technology and who are keen to use their technical abilities to develop automated, scalable solutions to challenges in the Large Language Model (LLM) space. Applying a combination of expertise in LLMs, coding, and linguistics (e.g., semantics, syntax, pragmatics), they will solve complex problems in model evaluation, automation, and context engineering for multilingual agentic systems.

In this role within the International Editorial team, the candidate will contribute to our evaluation-driven product development strategy, working in close collaboration with Language Editors, Product Managers, Applied Scientists, and Software Engineers on initiatives that drive editorial quality, speed, and consistency. They will design processes that facilitate the production of high-quality editorial data for evaluating and improving the AI Shopping experience across languages, and will create and develop LLM-assisted editorial tools and automated annotations (e.g., LLM-as-a-judge) to support the human-in-the-loop (HITL) work of the broader Editorial team.

Additionally, they will define requirements for internal tooling by developing prototypes, and will be responsible for authoring, optimizing, and managing system prompts for multilingual, customer-facing LLM systems. Drawing on data processing and analysis skills, they will evaluate model performance and annotation quality, producing regular reports for stakeholders. By creating and synthesizing quality metrics, they will also support Conversational Shopping teams in delivering both internal stakeholder requirements and the desired Amazon customer outcomes.

This role requires strong analytical and technical skills as well as experience in language technology to help us measure, analyze, and solve complex problems. The ideal candidate should have experience in creating technical solutions for automating and processing data workflows at scale while upholding the highest linguistic quality standards. They should also have exceptional writing and communication skills with the ability to interface between both technical and non-technical teams.

Key job responsibilities
- Develop LLM-as-a-judge systems to support human-in-the-loop (HITL) evaluations
- Automate operations and perform data analysis using scripting languages (e.g., Python)
- Author, optimize, and manage system prompts for multilingual, customer-facing LLM systems
- Integrate API calls into Retrieval Augmented Generation (RAG) systems
- Evaluate model performance and annotation quality to produce reports for stakeholders
- Produce, process, and manipulate different types of language data
- Contribute to defining platform requirements for internal tooling by developing prototypes
- Raise the quality bar on editorial workflows and SOPs through standardization, documentation, and periodic audits and investigations
- Support processes and mechanisms to onboard and upskill Editors and AI Tutors on an ongoing basis
- Support editorial data production and collection by defining project scope with internal teams
- Design, implement, and refine control mechanisms, metrics, and methodologies to ensure editorial and annotation quality
- Collaborate with editors, applied scientists, engineers, and product managers to deliver an optimal customer experience by defining metrics, guidelines, and workflows
- Deliver across parallel workstreams, balancing timelines, impact, and stakeholder requirements
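To give candidates a concrete sense of the first responsibility above, a minimal LLM-as-a-judge workflow might look like the following sketch. The rubric text, 1-5 scoring scale, and JSON verdict schema here are illustrative assumptions, not Amazon's actual evaluation setup, and the judge model's reply is mocked rather than produced by a real API call:

```python
import json

# Hypothetical judge rubric; a real deployment would localize and
# version this template per marketplace and evaluation dimension.
JUDGE_TEMPLATE = (
    "You are an impartial evaluator of AI shopping answers.\n"
    "Locale: {locale}\n"
    "Customer question: {question}\n"
    "Assistant answer: {answer}\n"
    'Reply with JSON: {{"score": 1-5, "rationale": "..."}}'
)

def build_judge_prompt(question: str, answer: str, locale: str = "en-GB") -> str:
    """Fill the judge template for one (question, answer) pair."""
    return JUDGE_TEMPLATE.format(locale=locale, question=question, answer=answer)

def parse_verdict(raw_reply: str) -> tuple[int, str]:
    """Extract a numeric score and rationale from the judge's JSON reply,
    rejecting scores outside the assumed 1-5 scale."""
    data = json.loads(raw_reply)
    score = int(data["score"])
    if not 1 <= score <= 5:
        raise ValueError(f"score out of range: {score}")
    return score, data["rationale"]

# Mocked judge reply, standing in for the real LLM API call.
mock_reply = '{"score": 4, "rationale": "Accurate comparison, minor fluency issue."}'
score, rationale = parse_verdict(mock_reply)
print(score, rationale)
```

Automated verdicts like these would typically be sampled and spot-checked by human Editors, which is what keeps the judge itself calibrated within the HITL loop.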