Sr Site Reliability Engineer - Remote

A Staff Site Reliability Engineer on the Site Reliability Engineering team at Myriad Genetics incorporates aspects of software engineering and applies them to infrastructure and operations problems alongside our software developers and IT operations staff. This role acts as a lead between software engineering teams and the SRE team to ensure the needs of the software engineering team are met within established SLAs and that the engineering team is adopting SRE best practices and supported platforms.

A staff site reliability engineer on the SRE team architects and builds tooling for developing and deploying applications, enables software engineering teams to deploy, monitor, and maintain their applications on a Kubernetes platform using GitOps and GitHub Actions, and supports and modernizes legacy infrastructure.

Job Roles and Responsibilities:
* Participate in the SRE architecture team, contributing to an overall architectural roadmap.
This includes selecting appropriate standard technologies for applications and designing and developing tools for application development and deployment for use by software engineers and other SREs.
* Facilitate the migration of applications to cloud/containers and the on-boarding of development teams onto those technology platforms.
* Mentor a team of SREs to support software development teams deploying applications using the tools and technologies of the enterprise platform.
* Lead an embedded team of SREs assigned to a development team to provide technical support on systems architecture, performance, capacity planning, deployments, environment configuration and monitoring.
* Aid the assigned development team in maintaining highly available and stable production systems by implementing SLOs for applications, including metrics like uptime, performance/latency, and quality of service delivery.
* Address production-related issues and work with developers in the assigned development team to correct systemic issues.
* Spearhead the testing and evaluation of new technologies to increase performance, developer efficiency, and reliability for applications.
* Be on-call for critical outages in a scheduled 16x6 rotation.
* Establish effective working relationships between partners in SRE, IT Operations and the assigned development team.

Required Skills and Experience
* Experience in a technical field including SRE, DevOps, software development, or systems administration: 8-10 years.
* Experience with algorithms, data structures, complexity analysis and software design.
* Experience with at least one of Python, .NET, Java, JavaScript, or other programming language (5+ years).
* Experience with containerization (Docker / Kubernetes / OpenShift / etc.) (3 years).
* Experience with Amazon AWS (5 years).
* Experience with at least one CI/CD technology, e.g.
Tekton, ArgoCD, Jenkins, GoCD, TeamCity, etc.
* Experience with monitoring and metrics tools: Datadog, New Relic, Prometheus/Grafana, etc.
* Strong...

For full info, follow the application link.
We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. In hiring and all other employment decisions, we prohibit discrimination and harassment on the basis of any protected characteristic, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. In accordance with applicable law, we make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as any mental health or physical disability needs.
#LifeSciences