Principal Site Reliability Engineering Manager - Viva Engage - March 2024
Mountain View
Mar 29, 2026
About Principal Site Reliability Engineering Manager - Viva Engage

  What is Viva Engage?

  Viva Engage is the industry-defining social network for the enterprise. We provide a platform for millions of employees, including those from 85% of Fortune 500 companies, to build community and culture, share knowledge, and connect with their leaders and each other.

  Why Viva Engage?

  Acquired by Microsoft in 2012, Viva Engage combines the benefits of a startup - rapid innovation, cutting-edge technology, outsized individual impact - with the advantages of working for one of the most successful software companies in the world. We believe in mission-driven work and our platform has become more indispensable than ever as it fosters connection and a sense of belonging among remote teams. #VivaEngage

  You will have:

  Autonomy and freedom to innovate

  Choice of the best of open source and Microsoft-internal technology

  The ability to experiment, A/B test, and make data-driven decisions

  Tons of opportunity for outsized impact as part of a small but mighty team on a rapidly-growing product needed now more than ever

  As Principal Site Reliability Engineering Manager in Viva Engage, you will have two critical accountabilities:

  The first is leading efforts to fully embrace site reliability engineering principles while building critical infrastructure, optimizing existing systems, and eliminating toil. You will oversee efforts that combine software and systems engineering to build, scale, and operate the large-scale conversation platform that powers Viva Engage experiences. With our origins as a startup but now part of Microsoft, your purview spans our own open-source-based tech stack, Azure managed services, and M365 technology.

  The second expectation is to improve overall reliability for Viva Engage. This means guiding engineering teams to develop missing capabilities, and driving changes to our culture and processes to make reliability a critical aspect of how we work. We have been growing rapidly to become a critical workload for many of the world’s largest organizations and are looking for you to help us get to the next level.

  You should have a well-established playbook developed through years of experience operating world-class systems on a huge scale. You should be able to paint a vision of the future and build consensus across the organization while still being able to dive into details. The day-to-day responsibilities include a blend of technical, hands-on leadership with demonstrated people management and partnership skills.

  Location: This is a U.S.-based position. Relocation is not provided for the role.

  Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

  Responsibilities

  Mentor engineers within the infrastructure team and in partner teams in improving service reliability and evangelize reliability practices across the organization

  Drive accountability across the entire engineering organization with well-defined processes, metrics, and goals for reliability. This may include retooling existing rituals and creating new ones.

  Collaborate across various teams to provide input into capacity planning; failure/reliability analysis; performance analysis; security and customer privacy analysis

  Participate in the incident manager on-call rotation to coordinate responses to incidents that impact the Service Level Agreement (SLA), keeping relevant stakeholders and leadership apprised of incident impact and resolution status

  In addition, you have people management responsibilities including driving employee growth and development, executing projects, and managing performance, while continuing to evolve our infrastructure

  Embody our culture (https://careers.microsoft.com/v2/global/en/culture) and values (https://www.microsoft.com/en-us/about/corporate-values)

  Qualifications

  Required/Minimum Qualifications:

  8+ years technical experience in software engineering, network engineering, systems administration, or Site Reliability Engineering

  OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 5+ years technical experience in software engineering, network engineering, systems administration, or Site Reliability Engineering

  OR Master's Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, systems administration, or Site Reliability Engineering

  OR Doctorate Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, systems administration, or Site Reliability Engineering

  3+ years of people management experience leading Site Reliability Engineers or livesite teams.

  6+ years of experience in a Site Reliability Engineering role building and operating systems with world-class reliability at huge scale (100M+ Monthly Active Usage).

  6+ years technical engineering experience building large-scale distributed systems using, but not limited to, Golang, Java, Python, containers and container orchestration systems (such as Docker, Kubernetes, Apache Mesos), infrastructure as code (such as Terraform), databases (such as Postgres, data sharding), and cloud platforms (such as Microsoft Azure, Amazon Web Services, Google Cloud Platform).

  Additional/Preferred Qualifications:

  Demonstrated experience growing and coaching people and acting as a role model for others.

  6+ years technical engineering experience coding in languages including, but not limited to, Golang, Java, or Python.

  6+ years of experience with containers and container orchestration systems

  6+ years of experience operating and evolving large-scale distributed systems in a cloud infrastructure (such as Kubernetes, Apache Mesos, Docker)

  6+ years of experience with infrastructure as code (Terraform)

  6+ years of experience with large-scale databases (Postgres, data sharding)

  6+ years of experience with Linux, Ubuntu, Microsoft Azure, Amazon Web Services, or Google Cloud Platform is preferred.

  Site Reliability Engineering M5 - The typical base pay range for this role across the U.S. is USD $133,600 - $256,800 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $173,200 - $282,200 per year.

  Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

  Microsoft is an equal opportunity employer. Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations (https://careers.microsoft.com/v2/global/en/accessibility.html).
