Senior Software Engineer, DevOps and Infrastructure Automation-May 2024
Santa Clara
May 15, 2026
ABOUT NVIDIA
NVIDIA is a computing platform company, innovating at the intersection of graphics, HPC, and AI.
10,000+ employees
Technology
About Senior Software Engineer, DevOps and Infrastructure Automation

  NVIDIA is searching for a Senior Software Engineer in DevOps Infrastructure Engineering and Automation to bring up, develop, and prototype a class of products and services for our Metropolis platforms in multi-cloud and on-premises environments. Data is the lifeblood of the modern city. Today, it is captured by over 500 million cameras worldwide, and that number is growing exponentially. This is creating a tsunami of information that's impossible for humans to analyze. AI is the key to turning this information into insight. It's redefining how we collect, inspect, and analyze data to impact everything from public safety, traffic, and parking management to law enforcement and city services. NVIDIA Metropolis is leading this AI revolution, providing the tools, technologies, and expertise to meet every challenge with more thoughtful, faster applications.

  This exciting role requires someone who can build sophisticated artificial intelligence applications for streaming video and data analytics and deploy them to market. Practical experience in the use and administration of server virtualization technology will be highly beneficial. An understanding of complex applications built on both on-premises and cloud infrastructure, across operating systems, device classes, and cloud services, is a prerequisite. Your ability to automate all aspects of a modern application delivery and deployment pipeline, using source code management and build tools, test automation tools, containerization, configuration management tools, performance analysis tools, and monitoring tools, will be essential to your success.

  What you'll be doing:

  As a key member of our Metropolis team, you will build, deploy, and maintain GPU-based servers for use in Metropolis platforms and machine learning applications across test, development, and production environments, both on-premises and in the cloud.

  Lead the design of, and be responsible for, infrastructure components covering network topologies, streaming servers, and security.

  Collaborate with software, IT, security, and hardware teams across geographies to solve critical problems and performance issues.

  Establish the configuration environment for these servers by creating processes and tools that can be widely deployed in the industry for software development, debugging, testing, benchmarking, and documentation.

  Automate provisioning and management of bare-metal servers, internal cloud, Microsoft Azure, and Amazon AWS.

  Automate performance measurement of GPU-based AI applications.

  Implement automated monitoring and operating procedures for a range of domains across on-premises and cloud environments.

  Build and maintain the infrastructure for delivering software artifacts produced by Metropolis application development teams.

  Write detailed documentation that allows customers, partners, and system integrators to replicate the prototyped deployment architecture.

  What we need to see:

  BS or MS in Computer Science, Computer Engineering, Electrical Engineering, or a related field (or equivalent experience).

  A proven track record of 5+ years in configuration management and Linux server administration in an engineering hardware lab environment.

  Excellent programming skills in Python, shell scripting, Ansible, Terraform, and Helm templates.

  Application performance analysis, measurement, and reporting.

  Solid understanding of configuring and operating the Elasticsearch, Logstash, Kibana, and Kafka ecosystem.

  Software build, packaging, and delivery skills with Jenkins, pipeline scripting, Dockerfiles, Artifactory integration, container registries, and Helm package repositories.

  Good understanding of the Kubernetes ecosystem and Helm-based application deployment patterns.

  Cloud infrastructure provisioning automation on AWS, GCP, Azure, and OCI using Terraform, CloudFormation, etc.

  Ways to stand out from the crowd:

  Building configuration management, monitoring and automation tools

  Familiarity with managing edge servers at large scale, deployed in indoor and outdoor environments.

  Strong interpersonal skills

  With competitive salaries and a generous benefits package (www.nvidiabenefits.com), we are widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us and, due to outstanding growth, our best-in-class engineering teams are rapidly growing. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you!

  The base salary range is 136,000 USD - 253,000 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

  You will also be eligible for equity and benefits (https://www.nvidia.com/en-us/benefits/). NVIDIA accepts applications on an ongoing basis.

  NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

  NVIDIA is a Learning Machine

  NVIDIA pioneered accelerated computing to tackle challenges no one else can solve. Our work in AI and the metaverse is transforming the world's largest industries and profoundly impacting society.

  Learn more about NVIDIA.
