Principal Site Reliability Engineer-January 2024
Vancouver
Jan 29, 2026
About Principal Site Reliability Engineer

  What is Viva Engage?

  Viva Engage is the industry-defining social network for the enterprise. We provide a platform for millions of employees, including those from 85% of Fortune 500 companies, to build community and culture, share knowledge, and connect with their leaders and each other.

  Why Viva Engage?

  Acquired by Microsoft in 2012, Viva Engage combines the benefits of a startup (rapid innovation, cutting-edge technology, outsized individual impact) with the advantages of working for one of the most successful software companies in the world. We believe in mission-driven work, and in this post-COVID world our platform has become more indispensable than ever as it fosters connection and a sense of belonging among remote teams. #VivaEngage

  You will have:

  Autonomy and freedom to innovate

  Choice of the best of open source and Microsoft-internal technology

  The ability to experiment, A/B test, and make data-driven decisions

  Opportunity for outsized impact as part of a small but mighty team on a rapidly-growing product needed now more than ever.

  As a Principal Site Reliability Engineer in Viva Engage, you will have two critical accountabilities:

  The first is driving efforts to fully embrace site reliability engineering principles while building critical infrastructure, optimizing existing systems, and eliminating toil. You will lead efforts that combine software and systems engineering to build, scale, and operate the large-scale conversation platform that powers Viva Engage experiences.

  The second expectation is to improve overall reliability for Viva Engage. This means guiding and influencing peers to develop missing capabilities, and driving changes to our culture and processes to make reliability a critical aspect of how we work. We have been growing rapidly to become a critical workload for many of the world’s largest organizations and are looking for you to help us get to the next level.

  Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

  Responsibilities

  Develop and execute on the observability and telemetry strategy

  Own the telemetry and monitoring infrastructure

  Continually seek deeper insights into the performance, reliability & scalability of our systems

  Improve service reliability for the entire Yammer team by reducing mean time to recovery (MTTR)

  Help all of Yammer prevent service incidents altogether

  Qualifications

  Required/Minimum Qualifications

  8+ years technical experience in software engineering, network engineering, or systems administration

  OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 5+ years technical experience in software engineering, network engineering, or systems administration

  OR Master's Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration

  OR Doctorate Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration.

  6+ years of experience building large-scale distributed systems

  6+ years of experience in a Site Reliability Engineering role building and operating systems with world-class reliability at huge scale

  Preferred Qualifications/Attributes

  Knowledge of log and metrics pipelines (ELK stack or cloud services)

  Troubleshooting skills and the ability to trace a request through the entire stack

  Microservices development, deployment, and monitoring

  Curiosity about reliability and performance at all levels of the stack

  Experience with large datasets and data migrations

  Azure, AWS, or GCP automation

  Site Reliability Engineering IC5 - The typical base pay range for this role across Canada is CAD $132,800 - CAD $247,200 per year.

  Find additional pay information here:

  https://careers.microsoft.com/v2/global/en/canada-pay-information.html

  Microsoft is an equal opportunity employer. Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations (https://careers.microsoft.com/v2/global/en/accessibility.html).
