Senior Cloud Site Reliability Engineer-March 2024
Multiple Locations
Mar 28, 2026
About Senior Cloud Site Reliability Engineer

  Working with one of the most exciting products in Microsoft Azure, you will help advance Microsoft's cloud-first strategy. The Azure Customer Experience (CXP) team is searching for a customer-obsessed Cloud Site Reliability Engineer who can drive reliability and observability engineering excellence and embody our culture of inclusiveness, growth mindset, and unwavering dedication to diversity.

  We are a fast-paced agile team with a start-up-like culture where you are empowered to help shape the future. We apply a software engineering approach to running operations. More specifically, we are responsible for defining, instrumenting, and measuring SLOs/SLIs/SLAs and for improving service availability, latency, scalability, performance, observability, and efficiency.

  Our “no dead-ends”, “whatever it takes”, “biased for action”, “make it better than ever” philosophy ensures that every customer can realize their full potential through the Microsoft Cloud. We are a fast-growing team, but we are committed to remaining agile. Customer first, nurturing trust, high responsiveness, automation, SLOs/SLIs/SLAs, blameless post-mortems, observability, monitoring, alerting, and toil reduction form the foundations of our code, and we work with teams across Microsoft and with external customers to ensure success. We work on exciting engineering challenges in a fun and supportive environment, with access to cutting-edge technology and surrounded by world-class engineers.

  Responsibilities

  Distributed systems architecture – understand and manage the most complex systems

  Continual reliability and performance optimisation – enhancing the observability stack to improve proactive detection and resolution of issues

  Working at the bleeding edge – adopting new approaches and technologies, iterating on existing tooling to drive improvements

  Problem solving capabilities – troubleshooting complex issues and proactively reducing toil through automation

  Collaboration skills – working across teams to drive change and provide guidance

  Technical expertise – deep skills and the ability to act as a subject matter expert in one or more of: Infrastructure as Code (IaC), observability, coding, reliability, debugging, system design

  Capacity planning – effectively forecasting demand and reacting to changes

  Incident response – rapidly detecting and resolving critical incidents, and minimising customer impact through effective collaboration, escalation (including periodic on-call shifts), and post-incident reviews.

  Regular travel to the customer site in the Southwest of England should be expected on at least a monthly basis.

  Candidates must be eligible for Security Clearance

  Qualifications

  Required qualifications/experience:

  Linux/OSS: Administration and scripting languages (e.g., Bash, Python, Perl)

  Infrastructure as Code: Terraform, Bicep, or similar.

  Azure Infrastructure including compute, networking, security, identity, governance and storage

  Reliability engineering knowledge: designing and implementing systems for fault tolerance, scalability, and resilience.

  Managing and utilizing version control systems such as Git, GitHub, or Azure DevOps.

  Demonstrable experience with continuous integration and continuous deployment (CI/CD) practices, including pipeline configuration and automation.

  Monitoring: demonstrable experience building dashboards and alerting infrastructure with tools such as Grafana, Prometheus, or Azure Monitor

  Working with service level agreements (SLAs), service level objectives (SLOs), and service level indicators (SLIs) to measure system performance and reliability

  Effective diagnosis of complex technical issues and creation of postmortem/RCA reports.

  Preferred qualifications/experience:

  Software development experience, particularly .NET

  Virtual Desktop Infrastructure configuration and troubleshooting

  Deployment and administration of containerization solutions, e.g., AKS, Docker, ECS

  Microsoft is an equal opportunity employer. Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations (https://careers.microsoft.com/v2/global/en/accessibility.html).
