Lead SRE - Azure & GCP
Glasgow, UK
We have a Lead Site Reliability Engineer (SRE) opportunity within our Google Cloud Site Reliability Engineering team.
As a Lead Site Reliability Engineer at JPMorgan Chase within the Infrastructure Platform - Cloud Foundational Services SRE organization, you will join our Google Cloud Site Reliability Engineering team operating within a global follow-the-sun support model.
Job Responsibilities:
- Lead and implement SRE frameworks to support global Google Cloud environments and meet the highest SLO standards through operational excellence
- Demonstrate mastery of application, data, infrastructure, and agentic AI disciplines
- Apply a keen understanding of financial control and budget management, working in partnership with colleagues throughout the firm and leading collaborative teams to achieve common goals
- Use enterprise-authorized AI capabilities within the work environment to accelerate major-incident triage, troubleshooting, and post-incident analysis, validating outputs and handling operational data according to sensitivity and security requirements.
- Help develop and improve the quality of technical engineering documentation
- Provide technical supervision, oversight and problem resolution for engineering activities
- Champion a DevOps model so that services are automated and elastic across all platforms
Required qualifications, capabilities, and skills:
- Google Cloud and Azure expertise in a mission-critical production environment
- Strong understanding of container technologies such as Docker, Kubernetes, GKE and Helm
- Programming experience in one of the following: Python, shell scripting or Go, along with a good understanding of REST APIs
- Hands-on experience with cloud-based technologies and tools, especially for deployment, monitoring and operations, such as Google Cloud Observability, Azure Monitor, Datadog, Prometheus, Splunk, Elasticsearch and Grafana.
- Demonstrated experience using enterprise-authorized AI capabilities within the work environment to improve SRE workflows (e.g., incident investigation support and knowledge capture) with strong validation habits and awareness of data sensitivity.
- Ability to evaluate AI-assisted operational recommendations for correctness and risk, define appropriate guardrails for team usage, and ensure outcomes align to resiliency and security expectations.
- Strong understanding of Google Cloud governance, compliance and cost management
- Strong working knowledge of modern development technologies and tools such as Agile, CI/CD, Git, Infrastructure as Code, Terraform and Jenkins.
- Google Cloud certification or equivalent technical experience in the Public Cloud.
- Good understanding of Agentic AI SDKs and GitHub Copilot Skills.
Preferred qualifications, capabilities, and skills:
- Good understanding of operating systems such as Windows and Linux (Red Hat / Ubuntu)
- Good understanding of LLMs and other AI/ML frameworks that can be used in AIOps
J.P. Morgan is a global leader in financial services, providing strategic advice and products to the world’s most prominent corporations, governments, wealthy individuals and institutional investors. Our first-class business in a first-class way approach to serving clients drives everything we do. We strive to build trusted, long-term partnerships to help our clients achieve their business objectives.
Our professionals in our Corporate Functions cover a diverse range of areas from finance and risk to human resources and marketing. Our corporate teams are an essential part of our company, ensuring that we’re setting our businesses, clients, customers and employees up for success.


