Explore numerous Site Reliability Engineer (SRE) positions, focusing on system stability, automation, and performance. These roles demand expertise in software development, IT operations, and cloud computing. SREs work to improve system uptime, reduce incidents, and streamline processes through automation. They collaborate with development and operations teams to ensure reliable and scalable infrastructure.

Key responsibilities include monitoring system performance, responding to incidents, and implementing proactive solutions. Professionals in this field often use tools like Kubernetes, Docker, and various cloud platforms. They also engage in capacity planning, performance tuning, and security enhancements. The demand for skilled SREs is growing, reflecting the increasing reliance on robust and efficient IT systems.

Job boards list opportunities for Site Reliability Engineers across diverse industries. These positions offer competitive salaries and career growth potential. Companies seek candidates with strong problem-solving skills and a passion for maintaining high-availability systems. If you have a background in DevOps, software engineering, or system administration, a career as a Site Reliability Engineer might be a great fit.

What People Ask

What are the main responsibilities of a Site Reliability Engineer?

Site Reliability Engineers focus on ensuring the reliability, scalability, and performance of IT systems. They implement automation, monitor system health, and respond to incidents. They collaborate with development and operations teams to improve system uptime and efficiency.

What skills are important for a Site Reliability Engineer?

Important skills include proficiency in software development, IT operations, and cloud computing. Experience with tools like Kubernetes, Docker, and various monitoring systems is beneficial. Strong problem-solving and communication skills are also important for collaborating with different teams.

What is the typical salary range for a Site Reliability Engineer in the US?

The salary range for a Site Reliability Engineer in the US typically falls between $120,000 and $180,000 per year, depending on experience and location. Senior roles and positions in high-cost areas may offer higher compensation. This range reflects the demand for skilled professionals in this field.

What career paths are available for Site Reliability Engineers?

Site Reliability Engineers can advance to senior SRE roles, become team leads, or specialize in areas like cloud infrastructure or security. Some may transition into DevOps engineering or architecture positions. The career path offers opportunities for growth and specialization based on individual interests and skills.

Who are the top employers for Site Reliability Engineers in the US?

Top employers for Site Reliability Engineers in the US include Google, Amazon, and Microsoft. These companies invest heavily in maintaining reliable and scalable infrastructure. They offer numerous opportunities for SREs to work on cutting-edge technologies and complex systems.

5,520 Site Reliability Engineer jobs in the United States

Site Reliability Engineer

30081 Smyrna, Georgia Insight Global

Posted 1 day ago

Job Description
A client of Insight Global is looking for an SRE to join their infrastructure team, working in an API-heavy environment. The individual will be responsible for working on Home Depot's internal proxy, called Vantage. This includes working on OSI networking layers as well as automating the deployment and visualization of the proxy. Additional responsibilities include debugging, test building, and occasional customer support. The pay rate for this position will be between $65 and $70/hr.
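To compare the hourly rate above with the annual salary figures quoted elsewhere on this page, a rough conversion can help. The sketch below assumes a 40-hour week worked 52 weeks a year (2,080 hours). The posting does not state hours or contract length, so that figure is an assumption, and `annualize` is a hypothetical helper name.

```python
# Assumption: 40 hours/week for 52 weeks (2,080 hours per year).
# The posting states neither weekly hours nor contract length.
HOURS_PER_YEAR = 40 * 52


def annualize(hourly_rate: float, hours_per_year: int = HOURS_PER_YEAR) -> float:
    """Convert an hourly rate to a gross annual figure."""
    return hourly_rate * hours_per_year


low, high = annualize(65), annualize(70)
print(f"${low:,.0f} to ${high:,.0f} per year")  # $135,200 to $145,600 per year
```

Under that assumption, the range lands inside the $120,000 to $180,000 band cited in the FAQ. Unpaid time off or a shorter contract would lower the result.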
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy.
Requirements
· 7-10+ years of experience as an SRE in the networking space
· Heavy Terraform experience
  o Must know how to use variables within Terraform
· Experience with Kubernetes
  o Must be able to build, optimize, and modify containers within Docker
· GitHub experience
· Ansible experience
  o Building Ansible templates
· Experience with Bash/Shell scripting as well as low testing
· Experience building Grafana dashboards
· GCP experience
· Candidate must have a public GitHub account
· Golang experience
· Envoy proxy experience

Site Reliability Engineer

99811 Juneau, Alaska Oracle

Posted 2 days ago

**Job Description**
This role supports work done for the US Federal Government and requires US citizenship, among other qualifications outlined below, including a federal background investigation to gain Public Trust.
RTHS DevOps is responsible for the CareAware Cloud SaaS across all our cloud regions, both internal and client-facing. The team is responsible for keeping the lights on as well as other needed deployments, projects, and new implementations.
**Responsibilities**
As a member of the RTHS DevOps team, you will be responsible for the daily operational tasks required to run the service for all our cloud clients. You will monitor and maintain server performance and availability and ensure compliance with Service Level Agreements. You will address operational system issues as needed. You will deploy new code, onboard new clients or new solutions, and complete technology upgrades. Looking ahead, the team has critical involvement in our OCI cloud build-out and client migrations, giving you an opportunity to get involved from the ground up in these new regions and apply DevOps thinking from the beginning.
Qualifications:
+ Deep Linux Knowledge
+ Strong knowledge of Kubernetes
+ System Monitoring and troubleshooting
+ Networking Monitoring and troubleshooting
+ Cloud experience OCI or AWS preferred
Disclaimer:
**Certain US customer or client-facing roles may be required to comply with applicable requirements, such as immunization and occupational health mandates.**
**Range and benefit information provided in this posting are specific to the stated locations only**
US: Hiring Range in USD from: $79,800 to $178,100 per annum. May be eligible for bonus and equity.
Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions and locations, as well as reflect Oracle's differing products, industries and lines of business.
Candidates are typically placed into the range based on the preceding factors as well as internal peer equity.
Oracle US offers a comprehensive benefits package which includes the following:
1. Medical, dental, and vision insurance, including expert medical opinion
2. Short term disability and long term disability
3. Life insurance and AD&D
4. Supplemental life insurance (Employee/Spouse/Child)
5. Health care and dependent care Flexible Spending Accounts
6. Pre-tax commuter and parking benefits
7. 401(k) Savings and Investment Plan with company match
8. Paid time off: Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
9. 11 paid holidays
10. Paid sick leave: 72 hours of paid sick leave upon date of hire. Refreshes each calendar year. Unused balance will carry over each year up to a maximum cap of 112 hours.
11. Paid parental leave
12. Adoption assistance
13. Employee Stock Purchase Plan
14. Financial planning and group legal
15. Voluntary benefits including auto, homeowner and pet insurance
The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.
Career Level - IC3
**About Us**
As a world leader in cloud solutions, Oracle uses tomorrow's technology to tackle today's challenges. We've partnered with industry leaders in almost every sector, and we continue to thrive after 40+ years of change by operating with integrity.
We know that true innovation starts when everyone is empowered to contribute. That's why we're committed to growing an inclusive workforce that promotes opportunities for all.
Oracle careers open the door to global opportunities where work-life balance flourishes. We offer competitive benefits based on parity and consistency and support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
We're committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by email or by phone in the United States.
Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans' status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.

Site Reliability Engineer

99811 Juneau, Alaska iCIMS

Posted 2 days ago

Job Description

**Job Overview**
We are seeking a skilled Engineer, Site Reliability (SRE) to contribute to the reliability, scalability, and performance of our multi-cloud SaaS platform serving thousands of customers worldwide. This role involves hands-on technical work in incident response, system monitoring, automation, and continuous improvement of our platform reliability. The successful candidate will work within a global SRE team to ensure optimal system performance and customer satisfaction.
**About Us**
When you join iCIMS, you join the team helping global companies transform business and the world through the power of talent. Our customers do amazing things: design rocket ships, create vaccines, deliver consumer goods globally, overnight, with a smile. As the Talent Cloud company, we empower these organizations to attract, engage, hire, and advance the right talent. We're passionate about helping companies build a diverse, winning workforce and about building our home team. We're dedicated to fostering an inclusive, purpose-driven, and innovative work environment where everyone belongs.
**Responsibilities**
+ **System Monitoring & Reliability:**
  + Monitor multi-cloud infrastructure (AWS, Azure, GCP) using New Relic, Grafana, and Sumo Logic
  + Maintain reliability of AWS resources, Auth0/Okta authentication, databases, and legacy applications
  + Implement monitoring, alerting, and dashboards for assigned systems
+ **Incident Management & Response:**
  + Respond to alerts and incidents within SLA timeframes
  + Perform root cause analysis and document findings
  + Create and maintain runbooks and troubleshooting procedures
  + Participate in 24/7 on-call rotation
+ **Automation & Improvement:**
  + Develop scripts to reduce manual operational overhead
  + Build monitoring and alerting solutions
  + Support infrastructure-as-code initiatives
  + Implement automated remediation where possible
+ **Success Metrics:**
  + **Customer Impact:** Reduced MTTR and improved customer satisfaction scores
  + **Reliability:** Achievement of 99.9%+ uptime SLAs across all products and regions
  + **Proactive Prevention:** Reduction in incident frequency through automated detection and prevention
  + **Cross-functional Collaboration:** Improved partnership metrics with Product, Engineering, and Customer Success teams
  + **Automation Delivery:** Complete assigned automation projects to reduce manual tasks
  + **Knowledge Sharing:** Contribute to team knowledge base and mentor junior engineers
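For readers less familiar with the success metrics above, here is a minimal Python sketch of what a 99.9% uptime target and MTTR work out to in practice. The 30-day window, the sample incident durations, and the function names `downtime_budget` and `mean_time_to_recovery` are illustrative assumptions, not details from the posting.

```python
from datetime import timedelta


def downtime_budget(slo: float, window: timedelta) -> timedelta:
    """Return the downtime an availability target allows within a window."""
    if not 0.0 < slo <= 1.0:
        raise ValueError("slo must be a fraction in (0, 1]")
    return window * (1.0 - slo)


def mean_time_to_recovery(durations: list[timedelta]) -> timedelta:
    """Average incident duration (detection to recovery) across incidents."""
    if not durations:
        raise ValueError("need at least one incident")
    return sum(durations, timedelta()) / len(durations)


# Assumed 30-day measurement window; real SLAs define their own window.
budget = downtime_budget(0.999, timedelta(days=30))
print(f"Allowed downtime: {budget.total_seconds() / 60:.1f} minutes")

# Hypothetical incident durations, for illustration only.
incidents = [timedelta(minutes=30), timedelta(minutes=60), timedelta(minutes=90)]
print(f"MTTR: {mean_time_to_recovery(incidents)}")
```

With these assumptions, a 99.9% target leaves roughly 43 minutes of downtime per 30 days, which is why the posting pairs the uptime goal with reduced MTTR and automated remediation.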
**Qualifications**
+ 4+ years experience in SRE, DevOps, or Infrastructure Engineering
+ Hands-on experience with AWS (required) and Azure (preferred)
+ Strong Linux system administration skills
+ Experience with monitoring tools (New Relic, Grafana, Prometheus)
+ Scripting skills in Python, Bash, or similar
+ Knowledge of databases (SQL Server, PostgreSQL, MongoDB)
**Preferred**
**Technical Experience:**
+ SaaS experience in a global environment
+ Authentication and identity management systems knowledge
+ Cloud certifications (AWS, Azure, or Google Cloud)
+ Infrastructure-as-code tools (Terraform, CloudFormation)
**Education/Certifications/Licenses:**
+ Bachelor's degree in Computer Science, Engineering, Information Systems, or a related technical field
+ Equivalent combination of education and experience will be considered
**Working Conditions:**
+ Global role requiring flexibility for incident response and team coordination across time zones
+ Occasional client-facing responsibilities during critical incidents
+ Travel may be required for team building
+ Hybrid work environment with team members distributed globally
**EEO Statement**
iCIMS is a place where everyone belongs. We celebrate diversity and are committed to creating an inclusive environment for all employees. Our approach helps us to build a winning team that represents a variety of backgrounds, perspectives, and abilities. So, regardless of how your diversity expresses itself, you can find a home here at iCIMS.
We are proud to be an equal opportunity and affirmative action employer. We prohibit discrimination and harassment of any kind based on race, color, religion, national origin, sex (including pregnancy), sexual orientation, gender identity, gender expression, age, veteran status, genetic information, disability, or other applicable legally protected characteristics. If you would like to request an accommodation due to a disability, please contact us at
**Compensation and Benefits**
We accept applications for this position on an ongoing basis until the position is filled. Applications will be reviewed as they are received, and qualified candidates may be contacted throughout the posting period.
The anticipated base pay range for this position is $100,000 to $140,000 annually. Final compensation will be based on factors such as relevant experience, skills, education, internal equity, and market data. This range aligns with our commitment to equitable and transparent compensation practices, as required by applicable law.
Competitive health and wellness benefits include medical, dental, vision, 401(k), dependent care, short-term and long-term disability, life and AD&D insurance, bonding and parental leave, mindfulness resources, an open vacation policy, sick days, paid holidays, quiet hours each workday, and tuition reimbursement. Benefits and eligibility may vary by location, role, and tenure. Learn more here:

