Delta Air Lines, Inc. Senior Site Reliability Engineer in ATLANTA, Georgia
Senior Site Reliability Engineer
United States, Georgia, Atlanta
Ref #: 9347
How you'll help us Keep Climbing (overview & key responsibilities)
The SRE works with developers to improve the Reliability and Resiliency of Delta IT applications to meet the business requirements by implementing SRE tools, processes, and best practices. SRE is what happens when you ask a software engineer to design an operations function. The SRE helps to design, develop, test, debug, and automate tasks for applications. They troubleshoot incidents to address failure patterns, automate remediation through runbooks, and document application optimization.
Be willing to learn whatever technologies, tools, or patterns are necessary to solve a problem. Be inquisitive and ask a lot of questions to learn from everyone.
Work collaboratively with business stakeholders, developers, managers, and leaders to create solutions for real-world problems. Build domain knowledge and understand the user & business problems you’re solving.
Use various tools and techniques to ensure the resiliency, availability, reliability, and performance of applications
Conduct blameless postmortems and provide training to other teams on SRE best practices and benefits
Gather and maintain SOPs, SATs, One Pagers, SLOs, SLIs, SLAs, How Tos and other useful SRE documentation on the SRE wiki page
Actively implement new features and maintain existing web applications created by the SRE team
Gather metrics from various sources and display them in meaningful and insightful ways on single page applications
Foster an environment of self-service by collaborating with members of other teams during the build phase of the SRE apps they will be consuming
Adapt quickly to changing business needs and tools ecosystem at Delta
Create systems that are maintainable, scalable, extensible, and well architected.
Take ownership of problems that come to your attention. Effectively communicate work, decisions, and ideas, and have good conversations with colleagues.
Look for ways to make the work environment better for everyone. Share knowledge generously, mentor new members. Innovate where solutions don’t exist.
What you need to succeed (minimum qualifications)
Requires a Bachelor's degree in Computer Science, Engineering, or Information Systems or any equivalent combination of experience, education, and/or training in the computer systems engineering field.
7 or more years of experience as an application developer or Site Reliability Engineer.
2 or more years of team lead experience.
Site Reliability Engineering: Knowledge of the theories and methodologies of reliability engineering; ability to design, develop and support various tools, services and applications to maintain a reliable site environment.
Performance Measurement and Tuning: Knowledge of system performance, testing and programming; ability to monitor, measure, and optimize system performance and network communication, including right sizing of application pods and probes
Experience in Runbook Automation to automate manual tasks and improve efficiencies in processes
Experience in building single page applications / dashboards for monitoring and reporting
Expertise with monitoring solutions: ELK, Sumo Logic, Prometheus, Dynatrace, Grafana
Knowledge of continuous integration/delivery ecosystem: GitLab, Maven/Gradle, Jenkins, Docker, Nexus, Selenium
Unix skills are required. Should have experience with Unix shell scripting.
Experience in cyber security with knowledge of DevSecOps pipeline tools for SAST/DAST.
Experience with modern container orchestration systems: Kubernetes, Swarm
Expertise with infrastructure configuration and automation processes and tools: Ansible
Experience with PagerDuty, ServiceNow
Experience leading support bridge calls for production systems issue resolution.
Experience in supporting application teams and troubleshooting across all environments, from DEV to Production
Experience with REST APIs including design, development and build tools supporting APIs
Embraces diverse people, thinking and styles.
Consistently makes safety and security, of self and others, the priority.
Where permitted by applicable law, must have received or be willing to receive the COVID-19 vaccine by date of hire to be considered for U.S.-based job, if not currently employed by Delta Air Lines, Inc.
What will give you a competitive edge (preferred qualifications)
Knowledge or related experience in the Travel, Tour, or Hospitality industries preferred
Experience working in an Agile environment is a plus
Delta Air Lines, Inc. is an Equal Employment Opportunity / Affirmative Action employer and provides reasonable accommodation in its application process for qualified individuals with disabilities and disabled veterans. If you are a qualified individual, you may request a reasonable accommodation if you are unable or limited in your ability to access job openings through this site, apply for jobs through Delta’s online system, or at any point in the selection process. To request a reasonable accommodation, please click here