Microsoft has an exciting opportunity for a Principal Software Engineering Manager in the Cloud+AI Silver Core team. We run Azure Core services in the Microsoft airgapped clouds. These services include the Azure Compute platform, Azure Networking services, and Azure Kubernetes Service, among others. This team is the foundation of the airgapped clouds.
You will lead a team that will take Microsoft Azure Core for Government to the next level. You will solve hyperscale distributed systems problems. You will write software that is aware of network safety and fault domains. You will perform devops for the Compute Platform running VMs for all our Government customers. You will work on security initiatives that span the whole of Azure. You will build the next generation of AI technologies.
You will be responsible for leading and hiring a team of software engineers in the Reston, VA area and the Redmond, WA area. You will collaborate with teams in Azure Core and the broader Azure to ensure the success of the US Government clouds. The team is growing rapidly, and you’ll have the potential for tremendous career growth.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Responsibilities
You will work with other teams in Azure Silver, participating in architectural and design reviews, and delivering your components with high quality. You will also have opportunities to collaborate across the broader Azure organization and with external Microsoft customers. There is high visibility in an area of large and expanding investment for Azure, offering a terrific opportunity for technical and career growth.
In Azure Silver Core, you own features from the design phase through implementation to production. As part of this ownership, you are expected to help support your features by actively participating in an on-call rotation, which includes creating/updating Standard Operating Procedures (SOPs), Troubleshooting Guides, monitoring systems, mitigating/restoring network incidents, and deep-dive analysis of root causes of outages.
The scale of our operations is enormous. Microsoft's products and services are overwhelmingly consumed online, and billions of people use them every day. We need people who enjoy analyzing complicated problems, coming up with creative solutions, and working in focused teams to build things no one has thought of before, all in the service of production reliability.
Hire amazingly talented, cleared software engineers to make our customers’ missions successful.
Contributes to the development of automation within production and deployment of a complex product feature. Runs code in simulated, or other non-production environments to confirm functionality and error-free runtime for products.
Acts as a Designated Responsible Individual (DRI) and Incident Manager (IM) working on call to monitor service for degradation, downtime, or interruptions. Alerts stakeholders as to the status and gains approval to restore system/product/service for simple problems. Responds within Service Level Agreement (SLA) timeframe. Escalates issues to appropriate owners.
Contributes to efforts to collect, classify, and analyze data with little oversight on a range of metrics (e.g., health of the system, where bugs might be occurring). Contributes to the refinement of product features by escalating findings from analyses to inform decisions regarding the engineering of products.
Applies best practices to reliably build code that is based on well-established methods. Follows best practices for product development and scaling to customer requirements and applies best practices for meeting scaling needs and performance expectations.
Maintains communication with key partners across the Microsoft ecosystem of engineers. Considers partners across teams and their end goals for products to drive and achieve desirable user experiences and fit the dynamic needs of partners/customers through product development.
Maintains operations of live service as issues arise on a rotational, on-call basis. Implements solutions and mitigations to more complex issues impacting performance or functionality of Live Site service and escalates as necessary. Reviews and writes issue postmortems and shares insights with the team.
Drives efforts to integrate instrumentation for gathering telemetry data on system behavior such as performance, reliability, availability, usage, and safety mechanisms. Drives sustaining feedback loops from telemetry resulting in subsequent designs. Creates outputs of telemetry such as notifications or dashboards.
Builds, enhances, reuses, contributes to, and identifies new software developer tools to support other programs and applications to create, debug, and maintain code for products. Uses open source when possible. Begins to develop skills in other tools outside areas of expertise. Identifies internal tools and creates tools that will be useful for creating the product, determining if methods are still applicable for the current solution. Shares best practices and teaches others about new tools and strategies.
Qualifications
Required/Minimum Qualifications:
Bachelor's Degree in Computer Science, or related technical discipline AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
OR equivalent experience.
Other Requirements:
Security Clearance Requirements: Candidates must be able to meet Microsoft, customer and/or government security screening requirements for this role. These requirements include, but are not limited to, the following specialized security screenings:
The successful candidate must have an active U.S. Government Top Secret Clearance with access to Sensitive Compartmented Information (SCI) based on a Single Scope Background Investigation (SSBI) with Polygraph. Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. Failure to maintain or obtain the appropriate U.S. Government clearance and/or customer screening requirements may result in employment action up to and including termination.
Clearance Verification: This position requires successful verification of the stated security clearance to meet federal government customer requirements. You will be asked to provide clearance verification information prior to an offer of employment.
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Citizenship & Citizenship Verification: This position requires verification of U.S. citizenship due to citizenship-based legal restrictions. Specifically, this position supports United States federal, state, and/or local government agency customers and is subject to certain citizenship-based restrictions where required or permitted by applicable law. To meet this legal requirement, citizenship will be verified via a valid passport, other approved documents, or a verified U.S. government clearance.
Preferred/Additional Qualifications:
4+ years people management experience leading a team of engineers with outstanding collaboration skills
10+ years of experience with PowerShell, C#, C++ or Java.
Experience working on large-scale distributed services with on-call responsibilities.
Ability to build and influence broadly towards common goals and priorities and to own end-to-end project lifecycle with solid project management and communication skills.
Demonstrated experience shipping large scale software services.
Demonstrated experience being a technical lead for software projects.
Expertise in designing distributed systems and concurrent programming.
Familiarity with at least one flavor of Linux as well as knowledge of distributed storage technologies.
Knowledge of network switching and routing, Kubernetes, machine learning, and cloud security is a plus.
Software Engineering M5 - The typical base pay range for this role across the U.S. is USD $133,600 - $256,800 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $173,200 - $282,200 per year.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay
Microsoft is an equal opportunity employer. Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations (https://careers.microsoft.com/v2/global/en/accessibility.html).