Who We Are
The name ThousandEyes was born from two big ideas: the power to see what’s not ordinarily possible, and the ability to collect intelligence from vantage points as diverse and global as the Internet. As organizations depend on cloud services, the Internet has become their de facto network connecting cloud applications to users. Our Internet and cloud intelligence platform is like a ‘Google Maps of the Internet’, providing the only collectively powered view of digital experiences end-to-end. We enable our customers, made up of the world’s largest and fastest-growing brands, to identify problems before they impact revenue, brand reputation, or employee productivity.
In August 2020, Cisco Systems completed the acquisition of ThousandEyes, which now forms the ThousandEyes Business Unit within Cisco’s Network Services Business Group, and is a foundational component of Cisco’s growing Observability business.
About The Role
This role is the Senior Site Reliability Engineering Manager for the Observability SRE team at ThousandEyes. The Observability team is responsible for providing a world-class experience for developers who need to understand and observe platform behavior. Beyond providing visibility, this team turns visibility into action, relentlessly pursuing the goal of a platform that is resilient, fault-tolerant, and self-healing.
What You'll Do
As a senior engineering manager leading the Observability team, you will be responsible for the design, development, and operations of our internal observability platform. Working with a team of strong, mission-focused engineers, you’ll bring a user-focused perspective to delivering observability as a platform for a team running the best observability platform in the industry.
Qualifications
Proven site reliability engineering management experience or experience delivering an internal developer platform focused on production operations, ideally managing 4+ engineers
Can provide strong technical vision for your team and ensure consistent delivery on objectives
Have experience formulating a team's technical strategy and roadmap; you've collaborated and partnered effectively with several other teams to execute on shared goals
5+ years of experience building and supporting mission-critical services with a focus on automation, observability, availability, and performance
Experience building infrastructure and operating services in production environments that require high availability and reliability
You have worked on large-scale distributed systems including multi-tiered architecture
Understand how to balance tactical needs with strategic growth and quality-based initiatives that can span multiple quarters
Preferred Qualifications
Cloud Native Observability via Kubernetes, Prometheus, OpenTelemetry, and other industry-standard or CNCF technologies
Operated a cloud service at significant scale
Delivered an engineering-wide platform for service visibility
Owned incident response process, post-mortem practices, or service best practice standards
Cisco is an Affirmative Action and Equal Opportunity Employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, national origin, genetic information, age, disability, veteran status, or any other legally protected basis.
Cisco will consider for employment, on a case-by-case basis, qualified applicants with arrest and conviction records.