Senior Principal Site Reliability Developer
Senior Principal Site Reliability Developer-March 2024
Boston
Mar 28, 2026
About Senior Principal Site Reliability Developer

Job Description

Oracle is looking for a Senior Principal Site Reliability Developer with world-class experience in developing and supporting large-scale cloud deployments across the world. The candidate should have expert-level knowledge and hands-on experience in managing complex microservices architectures using service mesh technologies like Linkerd, alongside API gateways on Kubernetes. Strong proficiency in monitoring and visualization using Grafana to identify and troubleshoot performance issues within distributed systems is required, as is experience with Oracle WebLogic application administration, automation, and running production systems at an operational level. The position is part of the SaaS Reliability Engineering organization and provides a unique opportunity to work on the cutting edge of cloud technologies, tools, products, and cloud services. The candidate must be a US citizen and should be willing to work beyond regular business hours and during weekends/holidays as needed.

Organization: SaaS Engineering

The Oracle Cloud is a suite of Oracle applications, middleware and database offerings delivered in a self-service, subscription-based, elastically scalable, reliable, highly available and secure manner. The Oracle Cloud is an enterprise cloud for business. It is an integrated suite of services spanning Oracle's complete portfolio based on open Java and SQL standards offering flexible cloud deployment. The services offered in our cloud are based upon Oracle's complete portfolio of best-in-class solutions.

As part of SaaS Engineering, we consolidate and simplify IT operations and application instances across hosted services in Oracle Cloud. We partner with engineering teams to develop product modules to be offered as a service. We collaborate with the product quality assurance team to run test cases and to ensure high-quality service after the release of product versions. Our team works with best-in-class, next-generation Oracle Fusion technical-stack components such as Oracle Autonomous Database, Oracle WebLogic Server 12c/14c, Oracle Business Intelligence, Oracle Identity Management, Oracle Virtual Machine and Hybrid/Spectra Service. The team has excellent expertise in cutting-edge products and technologies like AI, ML, and Oracle Fusion applications, in hosting products as Software as a Service and Platform as a Service, and in supporting services in next-generation Oracle Cloud Infrastructure.

Key Responsibilities:

DevOps - Kubernetes administration, including installation, configuration, and troubleshooting. Grafana and Prometheus Administration - Develop and implement custom dashboards for monitoring key metrics. Data Analysis and Visualization. Strong understanding of monitoring best practices, alerting, and data analysis.

Middleware Technology Expert - Work as part of the Oracle WebLogic administration team to manage the server life cycle and monitor application services. Troubleshoot key problems across the layers of the SaaS application and infrastructure. Provide internal analysis, and enhance and maintain the capacity and capabilities of existing environments.

Automation – A clear understanding of automation and orchestration principles is key. Automate operational tasks and deployments, develop scalable solutions, and contribute to the transition to algorithmic IT operations. Develop solutions that enable fleet-wide deployment, tracking, and updates.

Ownership Scope – Good understanding of end-to-end configuration and technical dependencies. In partnership with Service Development and Operations, you will be responsible for ensuring that services are designed and delivered to be mission critical, with a focus on monitoring, telemetry, security, resiliency, scale, and performance. Engineer solutions so that services are compliant and meet or exceed service level agreements. Collaborate with various teams during service outages, capacity expansion, and infrastructure maintenance, and ensure adherence to production deployment standards.

Required Skills:

5+ years of experience in Oracle WebLogic along with automation skills (WLST/shell scripting), with a BS in Computer Science or equivalent qualification. A Master’s degree in Computer Science or Management is preferred.

Administration skills on a web server such as OHS (Oracle HTTP Server) or Apache.

Experience with cloud development tools and languages such as Kubernetes, Python, and Prometheus.

Experience in working in Linux OS environments

Experience in deploying and running large scale online systems built on Cloud platforms such as Oracle Cloud, AWS, Azure, Google Cloud Platform, and/or OpenStack.

Experience in designing and implementing solutions for platform and application layer telemetry, monitoring, scalability, performance and reliability.

Knowledge of a parallel job execution framework such as Marionette Collective (MCollective).

Technical skills and knowledge that extend across application, server, storage, and network technologies, to troubleshoot and provide system-level guidance and solutions.

Strong ability to solve operational problems, with ability to identify and automate common routines.

Excellent written and verbal communication skills.

Willingness to learn new technologies

Preferred Additional Skills:

Experience in AI and ML is preferred

Experience in Java programming and an understanding of structured SQL statements will help.

Prior experience as a Service Reliability Engineer or DevOps Engineer.

Experience with automated service deployment tools

A strong focus on business outcomes

Comfortable with collaboration, open communication, and reaching across teams in a boundaryless manner.

Knowledge of incident, request, and change management processes is a plus.

Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Develop designs, architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.

Career Level - IC5

Responsibilities

A BS or MS in Computer Science, or equivalent. Provides strategic and comprehensive solutions to complex business problems, drawing on knowledge of server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, and technology security and compliance, as well as an understanding of load balancing technologies and experience with development in programming languages, databases and big data stores, and container technologies. Work involves defining and documenting the technical architecture of complex and highly scalable products. A minimum of 12 years of experience running large-scale customer-facing web services.

About Us

As a world leader in cloud solutions, Oracle uses tomorrow’s technology to tackle today’s problems. True innovation starts with diverse perspectives and various abilities and backgrounds.

When everyone’s voice is heard, we’re inspired to go beyond what’s been done before. It’s why we’re committed to expanding our inclusive workforce that promotes diverse insights and perspectives.

We’ve partnered with industry leaders in almost every sector, and we continue to thrive after 40+ years of change by operating with integrity.

Oracle careers open the door to global opportunities where work-life balance flourishes. We offer a highly competitive suite of employee benefits designed on the principles of parity and consistency. We put our people first with flexible medical, life insurance and retirement options. We also encourage employees to give back to their communities through our volunteer programs.

We’re committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by calling +1 888 404 2494, option one.

Disclaimer:

Oracle is an Equal Employment Opportunity Employer*. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.

* Which includes being a United States Affirmative Action Employer
