Job#: 2009933
Job Description:
Sr. Observability Engineer
Summary
The Senior Observability Engineer will design, implement, and maintain comprehensive observability solutions for complex systems and applications. This position requires a deep understanding of monitoring and observability practices, as well as expertise in using various tools and technologies to collect and analyze performance, logging, and metrics data.
Experience and Qualifications:
Overall background in Infrastructure, Employee Experience, Synthetic Monitoring, and Application Performance Monitoring.
Experience working in or in close partnership with Site Reliability Engineering (SRE) teams is strongly desired.
Tools Experience Preferred: SolarWinds, Grafana Cloud, ThousandEyes, Big Panda IO, Azure Observability Stack, and other observability tools.
Experience with an Enterprise Event Management System.
Monitoring Setup and Configuration: Set up and configure the monitoring tools to collect data from various systems, applications, and network components. This involves defining monitoring metrics, configuring data collection agents, and ensuring proper connectivity and access.
Alert Management: Monitor alerts generated by the tools and perform triage to identify critical issues. Analyze alert patterns, fine-tune alert thresholds, and configure alert escalation workflows to ensure timely response and resolution.
Performance Analysis and Troubleshooting: Use the tools' features and functionality to analyze performance metrics, logs, and traces. Conduct investigations and root cause analysis to troubleshoot and resolve performance issues, identifying bottlenecks and areas for optimization.
Incident Response: Collaborate with cross-functional teams to respond to and resolve incidents in a timely manner. Engage in incident management processes, including incident triage, communication, and coordination with relevant stakeholders, and participate in post-incident reviews to identify areas for improvement.
Dashboard and Visualization: Create and maintain dashboards and visualizations using tools like Grafana, providing a consolidated view of system health, performance, and key metrics. Customize dashboards to meet specific business and operational requirements and share them with relevant teams and stakeholders.
Capacity Planning and Scalability: Monitor resource utilization and performance trends to forecast capacity requirements. Collaborate with capacity planning teams to plan and provision resources based on anticipated growth and workload patterns, ensuring scalability and optimal performance.
Tool Integration and Automation: Integrate observability tools with other systems and workflows, such as ticketing systems, incident management platforms, and automation frameworks. Automate monitoring configurations, data collection, and reporting processes to improve efficiency and reduce manual effort.
Continuous Improvement and Research: Stay updated with the latest developments in observability practices and technologies. Research and evaluate new tools and techniques that could enhance the monitoring and observability capabilities of the organization. Continuously improve existing monitoring setups, workflows, and processes to align with industry best practices.
EEO Employer
Apex Systems is an equal opportunity employer. We do not discriminate or allow discrimination on the basis of race, color, religion, creed, sex (including pregnancy, childbirth, breastfeeding, or related medical conditions), age, sexual orientation, gender identity, national origin, ancestry, citizenship, genetic information, registered domestic partner status, marital status, disability, status as a crime victim, protected veteran status, political affiliation, union membership, or any other characteristic protected by law. Apex will consider qualified applicants with criminal histories in a manner consistent with the requirements of applicable law.
Apex Systems is a world-class IT services company that serves thousands of clients across the globe. When you join Apex, you become part of a team that values innovation, collaboration, and continuous learning. We offer quality career resources, training, certifications, development opportunities, and a comprehensive benefits package. Our commitment to excellence is reflected in many awards, including ClearlyRated's Best of Staffing in Talent Satisfaction in the United States and Great Place to Work in the United Kingdom and Mexico.
VEVRAA Federal Contractor
We request Priority Protected Veteran and Disabled Referrals for all of our locations within the state.
We are an equal opportunity employer. We evaluate qualified applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, veteran status, or any other protected characteristic. The EEO is the Law poster is available here.