1,755 Site Reliability Engineer jobs in the United States

Associate Site Reliability Engineer/Site Reliability Engineer (Redwood City)

94063 Woodside, California MedStar Health

Posted 12 days ago

Job Description

full time

C3 AI (NYSE: AI) is the Enterprise AI application software company. C3 AI delivers a family of fully integrated products, including the C3 Agentic AI Platform, an end-to-end platform for developing, deploying, and operating enterprise AI applications; C3 AI applications, a portfolio of industry-specific SaaS enterprise AI applications that enable the digital transformation of organizations globally; and C3 Generative AI, a suite of domain-specific generative AI offerings for the enterprise. Learn more at: C3 AI

We are looking for an Associate Site Reliability Engineer/Site Reliability Engineer to join our team at our HQ in Redwood City, CA.


Responsibilities:



  • Maximize system uptime and availability, ensuring functional and performance SLAs.

  • Establish end-to-end monitoring and alerting on all critical aspects.

  • Solve complex problems for critical services and build automation to prevent problem recurrence.

  • Influence and create new designs, architectures, standards, and methods for supporting the platform.

  • Initiate and lead scripting and automation to streamline system updates and upgrades.

  • Set up critical infrastructure, tools, and framework to streamline the deployment cycle.

  • Work cross-functionally with Services and Engineering teams.


Qualifications:



  • BS or MS in Computer Science, related field, or equivalent professional experience.

  • Demonstrated experience in deploying, managing, and operating scalable and fault-tolerant Linux/Kubernetes/JVM-based infrastructure in AWS, GCP, and other public clouds.

  • Expertise in Linux Operating Systems, Networking, and Database concepts.

  • Experience deploying, upgrading, and troubleshooting Kubernetes clusters and workloads.

  • Experience with Cassandra (or another NoSQL alternative).

  • Expertise in cloud providers, such as Amazon Web Services, Azure, and GCP.

  • Experience with configuration management systems such as Puppet.

  • Experience in Bash or Python to automate and monitor systems.

  • Experience with IaC tools like Ansible or Terraform.

  • Excellent problem-solving, critical thinking, and communication skills.

  • Experience as a DevOps engineer or system administrator supporting commercial SaaS solutions.

C3 AI provides excellent benefits, a competitive compensation package and generous equity plan.

California Base Pay Range

$116,000 - $168,000 USD

C3 AI is proud to be an Equal Opportunity and Affirmative Action Employer. We do not discriminate on the basis of any legally protected characteristics, including disabled and veteran status.

Site Reliability Engineer

99811 Juneau, Alaska Motion Recruitment Partners

Posted today

Job Description

Site Reliability Engineer
**Remote Only**
Contract
$50/hr - $100/hr
You'll closely collaborate with fellow cloud architects and engineers specializing in AWS to design, define, develop, test, and debug cloud solution components. You'll have the chance to work within a GitOps-based framework to create and manage container apps and use products like Kubernetes to further the mission. Use Python to automate across our ecosystem, and work to stabilize and monitor various types of cloud infrastructure following SRE guidelines and principles. Work to resolve immediate issues as a part of an incident response team.
**You Have:**
+ 2+ years of experience with basic SRE principles and guidelines.
+ Experience with troubleshooting, investigating, and resolving issues across many types of cloud infrastructure. Comfortable working during an emergency/incident.
+ 3+ years of experience with the development of tools and processes to drive DevSecOps maturity by automating builds, regression testing, monitoring, and pushing releases across environments
+ 3+ years of experience with troubleshooting, triaging, and resolving issues in CI/CD pipeline failures or latency
+ 3+ years of experience in working with Linux and troubleshooting and upgrading network configuration in RedHat, Ubuntu, and AL flavors
+ Experience with developing enterprise cloud-native platforms using Kubernetes, Docker, or CI/CD tools, including GitHub Actions or GitLab CI/CD
+ Experience with employing an Infrastructure as Code (IaC) approach to managing cloud environments, specifically using Terraform, Terragrunt, and CloudFormation
+ 2+ years of experience with Python automation and frameworks.
+ Ability to obtain a security clearance
+ Bachelor's degree
**Nice If You Have:**
+ Experience in working with GitOps tools (Flux, ArgoCD)
+ CKAD or CKA Certification
+ AWS Certification, including Solutions Architect, DevOps Engineer, Networking, or Security
+ Experience with Python frameworks like FastAPI and Typer, and other Python tools like pandas and boto3.
**You will receive the following benefits:**
+ Medical Insurance - Four medical plans to choose from for you and your family
+ Dental & Orthodontia Benefits
+ Vision Benefits
+ Health Savings Account (HSA)
+ Health and Dependent Care Flexible Spending Accounts
+ Voluntary Life Insurance, Long-Term & Short-Term Disability Insurance
+ Hospital Indemnity Insurance
+ 401(k) including match with pre and post-tax options
+ Paid Sick Time Leave
+ Legal and Identity Protection Plans
+ Pre-tax Commuter Benefit
+ 529 College Saver Plan
TG Federal is an Equal Opportunity Employer. All applicants must be currently authorized to work on a full-time basis in the country for which they are applying, and no sponsorship is currently available. Employment is subject to the successful completion of a pre-employment screening. Accommodation will be provided in all parts of the hiring process as required under MRP's Employment Accommodation policy. Applicants need to make their needs known in advance.

Site Reliability Engineer

62762 Springfield, Illinois Motion Recruitment Partners

Posted today

Job Description

Site Reliability Engineer
**Remote Only**
Contract
$50/hr - $100/hr
You'll closely collaborate with fellow cloud architects and engineers specializing in AWS to design, define, develop, test, and debug cloud solution components. You'll have the chance to work within a GitOps-based framework to create and manage container apps and use products like Kubernetes to further the mission. Use Python to automate across our ecosystem, and work to stabilize and monitor various types of cloud infrastructure following SRE guidelines and principles. Work to resolve immediate issues as a part of an incident response team.
**You Have:**
+ 2+ years of experience with basic SRE principles and guidelines.
+ Experience with troubleshooting, investigating, and resolving issues across many types of cloud infrastructure. Comfortable working during an emergency/incident.
+ 3+ years of experience with the development of tools and processes to drive DevSecOps maturity by automating builds, regression testing, monitoring, and pushing releases across environments
+ 3+ years of experience with troubleshooting, triaging, and resolving issues in CI/CD pipeline failures or latency
+ 3+ years of experience in working with Linux and troubleshooting and upgrading network configuration in RedHat, Ubuntu, and AL flavors
+ Experience with developing enterprise cloud-native platforms using Kubernetes, Docker, or CI/CD tools, including GitHub Actions or GitLab CI/CD
+ Experience with employing an Infrastructure as Code (IaC) approach to managing cloud environments, specifically using Terraform, Terragrunt, and CloudFormation
+ 2+ years of experience with Python automation and frameworks.
+ Ability to obtain a security clearance
+ Bachelor's degree
**Nice If You Have:**
+ Experience in working with GitOps tools (Flux, ArgoCD)
+ CKAD or CKA Certification
+ AWS Certification, including Solutions Architect, DevOps Engineer, Networking, or Security
+ Experience with Python frameworks like FastAPI and Typer, and other Python tools like pandas and boto3.
**You will receive the following benefits:**
+ Medical Insurance - Four medical plans to choose from for you and your family
+ Dental & Orthodontia Benefits
+ Vision Benefits
+ Health Savings Account (HSA)
+ Health and Dependent Care Flexible Spending Accounts
+ Voluntary Life Insurance, Long-Term & Short-Term Disability Insurance
+ Hospital Indemnity Insurance
+ 401(k) including match with pre and post-tax options
+ Paid Sick Time Leave
+ Legal and Identity Protection Plans
+ Pre-tax Commuter Benefit
+ 529 College Saver Plan
TG Federal is an Equal Opportunity Employer. All applicants must be currently authorized to work on a full-time basis in the country for which they are applying, and no sponsorship is currently available. Employment is subject to the successful completion of a pre-employment screening. Accommodation will be provided in all parts of the hiring process as required under MRP's Employment Accommodation policy. Applicants need to make their needs known in advance.

Site Reliability Engineer

80238 Denver, Colorado Motion Recruitment Partners

Posted today

Job Description

Site Reliability Engineer
**Remote Only**
Contract
$50/hr - $100/hr
You'll closely collaborate with fellow cloud architects and engineers specializing in AWS to design, define, develop, test, and debug cloud solution components. You'll have the chance to work within a GitOps-based framework to create and manage container apps and use products like Kubernetes to further the mission. Use Python to automate across our ecosystem, and work to stabilize and monitor various types of cloud infrastructure following SRE guidelines and principles. Work to resolve immediate issues as a part of an incident response team.
**You Have:**
+ 2+ years of experience with basic SRE principles and guidelines.
+ Experience with troubleshooting, investigating, and resolving issues across many types of cloud infrastructure. Comfortable working during an emergency/incident.
+ 3+ years of experience with the development of tools and processes to drive DevSecOps maturity by automating builds, regression testing, monitoring, and pushing releases across environments
+ 3+ years of experience with troubleshooting, triaging, and resolving issues in CI/CD pipeline failures or latency
+ 3+ years of experience in working with Linux and troubleshooting and upgrading network configuration in RedHat, Ubuntu, and AL flavors
+ Experience with developing enterprise cloud-native platforms using Kubernetes, Docker, or CI/CD tools, including GitHub Actions or GitLab CI/CD
+ Experience with employing an Infrastructure as Code (IaC) approach to managing cloud environments, specifically using Terraform, Terragrunt, and CloudFormation
+ 2+ years of experience with Python automation and frameworks.
+ Ability to obtain a security clearance
+ Bachelor's degree
**Nice If You Have:**
+ Experience in working with GitOps tools (Flux, ArgoCD)
+ CKAD or CKA Certification
+ AWS Certification, including Solutions Architect, DevOps Engineer, Networking, or Security
+ Experience with Python frameworks like FastAPI and Typer, and other Python tools like pandas and boto3.
**You will receive the following benefits:**
+ Medical Insurance - Four medical plans to choose from for you and your family
+ Dental & Orthodontia Benefits
+ Vision Benefits
+ Health Savings Account (HSA)
+ Health and Dependent Care Flexible Spending Accounts
+ Voluntary Life Insurance, Long-Term & Short-Term Disability Insurance
+ Hospital Indemnity Insurance
+ 401(k) including match with pre and post-tax options
+ Paid Sick Time Leave
+ Legal and Identity Protection Plans
+ Pre-tax Commuter Benefit
+ 529 College Saver Plan
TG Federal is an Equal Opportunity Employer. All applicants must be currently authorized to work on a full-time basis in the country for which they are applying, and no sponsorship is currently available. Employment is subject to the successful completion of a pre-employment screening. Accommodation will be provided in all parts of the hiring process as required under MRP's Employment Accommodation policy. Applicants need to make their needs known in advance.

Site Reliability Engineer

96823 Honolulu, Hawaii Motion Recruitment Partners

Posted today

Job Description

Site Reliability Engineer
**Remote Only**
Contract
$50/hr - $100/hr
You'll closely collaborate with fellow cloud architects and engineers specializing in AWS to design, define, develop, test, and debug cloud solution components. You'll have the chance to work within a GitOps-based framework to create and manage container apps and use products like Kubernetes to further the mission. Use Python to automate across our ecosystem, and work to stabilize and monitor various types of cloud infrastructure following SRE guidelines and principles. Work to resolve immediate issues as a part of an incident response team.
**You Have:**
+ 2+ years of experience with basic SRE principles and guidelines.
+ Experience with troubleshooting, investigating, and resolving issues across many types of cloud infrastructure. Comfortable working during an emergency/incident.
+ 3+ years of experience with the development of tools and processes to drive DevSecOps maturity by automating builds, regression testing, monitoring, and pushing releases across environments
+ 3+ years of experience with troubleshooting, triaging, and resolving issues in CI/CD pipeline failures or latency
+ 3+ years of experience in working with Linux and troubleshooting and upgrading network configuration in RedHat, Ubuntu, and AL flavors
+ Experience with developing enterprise cloud-native platforms using Kubernetes, Docker, or CI/CD tools, including GitHub Actions or GitLab CI/CD
+ Experience with employing an Infrastructure as Code (IaC) approach to managing cloud environments, specifically using Terraform, Terragrunt, and CloudFormation
+ 2+ years of experience with Python automation and frameworks.
+ Ability to obtain a security clearance
+ Bachelor's degree
**Nice If You Have:**
+ Experience in working with GitOps tools (Flux, ArgoCD)
+ CKAD or CKA Certification
+ AWS Certification, including Solutions Architect, DevOps Engineer, Networking, or Security
+ Experience with Python frameworks like FastAPI and Typer, and other Python tools like pandas and boto3.
**You will receive the following benefits:**
+ Medical Insurance - Four medical plans to choose from for you and your family
+ Dental & Orthodontia Benefits
+ Vision Benefits
+ Health Savings Account (HSA)
+ Health and Dependent Care Flexible Spending Accounts
+ Voluntary Life Insurance, Long-Term & Short-Term Disability Insurance
+ Hospital Indemnity Insurance
+ 401(k) including match with pre and post-tax options
+ Paid Sick Time Leave
+ Legal and Identity Protection Plans
+ Pre-tax Commuter Benefit
+ 529 College Saver Plan
TG Federal is an Equal Opportunity Employer. All applicants must be currently authorized to work on a full-time basis in the country for which they are applying, and no sponsorship is currently available. Employment is subject to the successful completion of a pre-employment screening. Accommodation will be provided in all parts of the hiring process as required under MRP's Employment Accommodation policy. Applicants need to make their needs known in advance.

Site Reliability Engineer

19904 Rising Sun, Maryland Motion Recruitment Partners

Posted today

Job Description

Site Reliability Engineer
**Remote Only**
Contract
$50/hr - $100/hr
You'll closely collaborate with fellow cloud architects and engineers specializing in AWS to design, define, develop, test, and debug cloud solution components. You'll have the chance to work within a GitOps-based framework to create and manage container apps and use products like Kubernetes to further the mission. Use Python to automate across our ecosystem, and work to stabilize and monitor various types of cloud infrastructure following SRE guidelines and principles. Work to resolve immediate issues as a part of an incident response team.
**You Have:**
+ 2+ years of experience with basic SRE principles and guidelines.
+ Experience with troubleshooting, investigating, and resolving issues across many types of cloud infrastructure. Comfortable working during an emergency/incident.
+ 3+ years of experience with the development of tools and processes to drive DevSecOps maturity by automating builds, regression testing, monitoring, and pushing releases across environments
+ 3+ years of experience with troubleshooting, triaging, and resolving issues in CI/CD pipeline failures or latency
+ 3+ years of experience in working with Linux and troubleshooting and upgrading network configuration in RedHat, Ubuntu, and AL flavors
+ Experience with developing enterprise cloud-native platforms using Kubernetes, Docker, or CI/CD tools, including GitHub Actions or GitLab CI/CD
+ Experience with employing an Infrastructure as Code (IaC) approach to managing cloud environments, specifically using Terraform, Terragrunt, and CloudFormation
+ 2+ years of experience with Python automation and frameworks.
+ Ability to obtain a security clearance
+ Bachelor's degree
**Nice If You Have:**
+ Experience in working with GitOps tools (Flux, ArgoCD)
+ CKAD or CKA Certification
+ AWS Certification, including Solutions Architect, DevOps Engineer, Networking, or Security
+ Experience with Python frameworks like FastAPI and Typer, and other Python tools like pandas and boto3.
**You will receive the following benefits:**
+ Medical Insurance - Four medical plans to choose from for you and your family
+ Dental & Orthodontia Benefits
+ Vision Benefits
+ Health Savings Account (HSA)
+ Health and Dependent Care Flexible Spending Accounts
+ Voluntary Life Insurance, Long-Term & Short-Term Disability Insurance
+ Hospital Indemnity Insurance
+ 401(k) including match with pre and post-tax options
+ Paid Sick Time Leave
+ Legal and Identity Protection Plans
+ Pre-tax Commuter Benefit
+ 529 College Saver Plan
TG Federal is an Equal Opportunity Employer. All applicants must be currently authorized to work on a full-time basis in the country for which they are applying, and no sponsorship is currently available. Employment is subject to the successful completion of a pre-employment screening. Accommodation will be provided in all parts of the hiring process as required under MRP's Employment Accommodation policy. Applicants need to make their needs known in advance.

Site Reliability Engineer

06132 Hartford, Connecticut Motion Recruitment Partners

Posted today

Job Description

Site Reliability Engineer
**Remote Only**
Contract
$50/hr - $100/hr
You'll closely collaborate with fellow cloud architects and engineers specializing in AWS to design, define, develop, test, and debug cloud solution components. You'll have the chance to work within a GitOps-based framework to create and manage container apps and use products like Kubernetes to further the mission. Use Python to automate across our ecosystem, and work to stabilize and monitor various types of cloud infrastructure following SRE guidelines and principles. Work to resolve immediate issues as a part of an incident response team.
**You Have:**
+ 2+ years of experience with basic SRE principles and guidelines.
+ Experience with troubleshooting, investigating, and resolving issues across many types of cloud infrastructure. Comfortable working during an emergency/incident.
+ 3+ years of experience with the development of tools and processes to drive DevSecOps maturity by automating builds, regression testing, monitoring, and pushing releases across environments
+ 3+ years of experience with troubleshooting, triaging, and resolving issues in CI/CD pipeline failures or latency
+ 3+ years of experience in working with Linux and troubleshooting and upgrading network configuration in RedHat, Ubuntu, and AL flavors
+ Experience with developing enterprise cloud-native platforms using Kubernetes, Docker, or CI/CD tools, including GitHub Actions or GitLab CI/CD
+ Experience with employing an Infrastructure as Code (IaC) approach to managing cloud environments, specifically using Terraform, Terragrunt, and CloudFormation
+ 2+ years of experience with Python automation and frameworks.
+ Ability to obtain a security clearance
+ Bachelor's degree
**Nice If You Have:**
+ Experience in working with GitOps tools (Flux, ArgoCD)
+ CKAD or CKA Certification
+ AWS Certification, including Solutions Architect, DevOps Engineer, Networking, or Security
+ Experience with Python frameworks like FastAPI and Typer, and other Python tools like pandas and boto3.
**You will receive the following benefits:**
+ Medical Insurance - Four medical plans to choose from for you and your family
+ Dental & Orthodontia Benefits
+ Vision Benefits
+ Health Savings Account (HSA)
+ Health and Dependent Care Flexible Spending Accounts
+ Voluntary Life Insurance, Long-Term & Short-Term Disability Insurance
+ Hospital Indemnity Insurance
+ 401(k) including match with pre and post-tax options
+ Paid Sick Time Leave
+ Legal and Identity Protection Plans
+ Pre-tax Commuter Benefit
+ 529 College Saver Plan
TG Federal is an Equal Opportunity Employer. All applicants must be currently authorized to work on a full-time basis in the country for which they are applying, and no sponsorship is currently available. Employment is subject to the successful completion of a pre-employment screening. Accommodation will be provided in all parts of the hiring process as required under MRP's Employment Accommodation policy. Applicants need to make their needs known in advance.

Site Reliability Engineer

95054 Santa Clara, California Insight Global

Posted today

Job Description

Insight Global is looking for a seasoned SRE to join one of our largest technology clients' multifaceted and fast-paced Infrastructure, Planning and Processes organization where you will be working as a Senior SRE Engineer. The position will be part of a fast-paced crew that develops and maintains sophisticated internal cloud provisioning products. The team works with various other business units such as Graphics Processors, Mobile Processors, Deep Learning, Artificial Intelligence and Driverless Cars to cater to their infrastructure & systems needs.
As an SRE, you'll also work with teams such as software engineering to deploy these new products and manage our infrastructure and its associated processes and systems. Keen attention to detail, problem-solving abilities, and a solid knowledge base are essential.
This role pays between $60-$65/hour depending on skillset and years of experience.
We are a company committed to creating inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity employer that believes everyone matters. Qualified candidates will receive consideration for employment opportunities without regard to race, religion, sex, age, marital status, national origin, sexual orientation, citizenship status, disability, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request using the Human Resources Request Form. The EEOC "Know Your Rights" Poster is also available.
To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy.
Skills and Requirements
-4+ years of proven experience.
-Bachelor's degree in Computer Science, Information Technology, or related field, or equivalent experience.
-System admin and Windows admin experience in an on prem infrastructure environment
-Proficient with Kubernetes, Docker, and virtualization.
-OpenStack and Ansible experience.
-Background in databases like SQL (MySQL) and time-series DBs like Prometheus.
-Experience with data analytics/visualization tools like Kibana, Grafana, Splunk, etc.
-Strong knowledge of networking principles and protocols, including TCP/IP, DNS, DHCP, and VLANs.
-Proficient using source code management and binary repository systems like GitLab, GitHub, Artifactory, Perforce, etc.
-Knowledge of monitoring systems such as Zabbix, Prometheus, PagerDuty, and/or similar systems.
-Advanced knowledge of standard methodologies related to security.
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal employment opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment without regard to race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request to

Site Reliability Engineer

95115 San Jose, California IBM

Posted today

Job Description

**Introduction**
**Your role and responsibilities**
Site Reliability Engineer, IBM Corporation, San Jose, CA:
* Ensure the reliability, scalability, and performance of the data analysis product.
* Design, develop, and optimize scalable data collection and visualization pipelines to enable efficient analysis and insights.
* Build and refine advanced forecasting and anomaly detection models to drive data-driven decision-making and improve system performance.
* Design and implement real-time dashboards using React and Node.js to provide critical performance insights.
* Respond to and manage incidents as part of an on-call rotation, troubleshoot system issues, and implement resolutions to enhance reliability.
* Automate infrastructure provisioning, configuration, and deployment through scripting and CI/CD pipelines, while establishing robust monitoring and alerting systems to proactively address anomalies.
* Employ Infrastructure as Code (IaC) principles to provision and maintain servers, databases, networking, and cloud resources, while planning for capacity and scalability to meet growing demands.
* Ensure security and compliance by implementing best practices, managing updates, and participating in audits.
* Collaborate with software developers, data analysts, and other stakeholders to optimize system performance, while maintaining accurate documentation and creating runbooks for operational excellence.
* Drive continuous learning and innovation by identifying opportunities for optimization, leveraging emerging technologies, and implementing solutions to improve system reliability and efficiency.
* Utilize: Data Analytics, Python, PL/SQL, Data Warehousing, ETL (Extract, Transform and Load), Git, NumPy, Pandas, Scikit-learn.
Required: Master's degree or equivalent in Computer Science, Engineering or related (employer will accept a Bachelor's degree plus five (5) years of progressive experience in lieu of a Master's degree) and one (1) year of experience as a Software Developer or related. One (1) year of experience must include utilizing Data Analytics, Python, PL/SQL, Data Warehousing, ETL (Extract, Transform and Load), Git, NumPy, Pandas, Scikit-learn. $226,158 to $267,100 per year. Full time. AV161.
**Required technical and professional expertise**
Master's degree or equivalent in Computer Science, Engineering or related (employer will accept a Bachelor's degree plus five (5) years of progressive experience in lieu of a Master's degree) and one (1) year of experience as a Software Developer or related. One (1) year of experience must include utilizing Data Analytics, Python, PL/SQL, Data Warehousing, ETL (Extract, Transform and Load), GIT, NumPy, Pandas, Scikit-learn.
IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.

Site Reliability Engineer

33322 Sunrise, Florida American Express

Posted today

Job Description

**Description**
At American Express, our culture is built on a 175-year history of innovation, shared values and Leadership Behaviors, and an unwavering commitment to back our customers, communities, and colleagues. As part of Team Amex, you'll experience this powerful backing with comprehensive support for your holistic well-being and many opportunities to learn new skills, develop as a leader, and grow your career.
Here, your voice and ideas matter, your work makes an impact, and together, you will help us define the future of American Express.
**How will you make an impact in this role?**
Most of our software development focuses on delivering new features while optimizing existing systems, building infrastructure, and eliminating work through automation. As part of the SRE team, you'll have the opportunity to manage the complex challenges at scale which are unique to American Express, while using your expertise in coding, algorithms, complexity analysis and large-scale system design. SRE's culture of diversity, intellectual curiosity, problem solving, and willingness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences, and perspectives. We encourage them to collaborate, think big and try new things in a blameless environment. We promote self-direction to work on relevant projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow.
Our Software Engineers not only understand how technology works, but how that technology intersects with the people who count on it every single day. Today, creative ideas, insight and new points of view are at the core of how we craft a more powerful, personal and fulfilling experience for all our customers. You will be a member of a Site Reliability Engineering team reporting to a Senior Engineer, Senior Engineering Manager or Engineering Director. We are looking for someone who is passionate about good design and excellent code; someone who can find solutions to hard technical and functional challenges and can work well within and across teams.
**Responsibilities:**
+ Serve as a member of an agile development team that drives discovery, build and implementation of non-functional requirements.
+ Actively participate in architecture and engineering discussions.
+ Participate in code reviews and ad hoc pair programming; contribute to iterative improvement of the tools, automation and practices used by the team.
+ Provide operational support by building platform monitoring tools, dashboards, and ad hoc reports.
+ Build, implement and advise on recovery tooling to adhere to enterprise standards and/or frameworks.
+ Responsible for availability, proactive monitoring / alerting, capacity planning, performance (reducing latency and increasing efficiency) to include testing for technical platforms.
+ Ensure application data flows are accurate and up to date with the objective to increase the knowledge base of all support teams and drive reliability.
+ Facilitate the resolution of non-application issues (3rd party upstream issues, infrastructure issues, storage, database, network, file transfer, etc.).
**Qualifications:**
+ Experience in REST API design and implementation, microservices, event-based architecture, and stream processing/queues (Solace, Kafka).
+ Experience with databases: Couchbase (or a different document DB/NoSQL), PostgreSQL.
+ Familiar with developer tools like Git, Jenkins, IntelliJ IDEA, Jira, Confluence.
+ Have at least 5 years of experience with Java backend (J2EE).
+ 2 years of experience in Reactive Programming (asynchronous programming paradigm), ideally with Vert.x and RxJava.
+ Proven understanding of cloud technologies (e.g., Docker, Kubernetes, Jaeger, OpenTracing, Prometheus).
+ Demonstrated experience in using modern software engineering tools: Git workflows, Gradle, load testing tools, mock frameworks.
+ A BS or MS degree in Computer Science, Computer Engineering or similar discipline, or equivalent work experience
+ Working in an environment which includes modern web frameworks and complex transaction processing systems leveraging a broad set of technology stacks
+ Write clean code, perform peer code reviews and architecture reviews.
+ Good communication skills - able to explain concepts to product managers and business partners in ways that are relevant to them
+ High levels of energy, engagement, and ownership. Positive attitude is always welcome.
+ Curiosity to learn new technologies and code them into working prototypes
+ Excellent communication skills, with the ability to learn fast and adapt.
+ Attention to detail with strong thought leadership and analytical abilities
Salary Range: $85,000.00 to $150,000.00 annually, plus bonus and benefits
The above represents the expected salary range for this job requisition. Ultimately, in determining your pay, we'll consider your location, experience, and other job-related factors.
We back you with benefits that support your holistic well-being so you can be and deliver your best. This means caring for you and your loved ones' physical, financial, and mental health, as well as providing the flexibility you need to thrive personally and professionally:
+ Competitive base salaries
+ Bonus incentives
+ 6% Company Match on retirement savings plan
+ Free financial coaching and financial well-being support
+ Comprehensive medical, dental, vision, life insurance, and disability benefits
+ Flexible working model with hybrid, onsite or virtual arrangements depending on role and business need
+ 20 weeks paid parental leave for all parents, regardless of gender, offered for pregnancy, adoption or surrogacy
+ Free access to global on-site wellness centers staffed with nurses and doctors (depending on location)
+ Free and confidential counseling support through our Healthy Minds program
+ Career development and training opportunities
For a full list of Team Amex benefits, visit our Colleague Benefits Site.
American Express is an equal opportunity employer and makes employment decisions without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran status, disability status, age, or any other status protected by law. American Express will consider for employment all qualified applicants, including those with arrest or conviction records, in accordance with the requirements of applicable state and local laws, including, but not limited to, the California Fair Chance Act, the Los Angeles County Fair Chance Ordinance for Employers, and the City of Los Angeles' Fair Chance Initiative for Hiring Ordinance. For positions covered by federal and/or state banking regulations, American Express will comply with such regulations as it relates to the consideration of applicants with criminal convictions.
We back our colleagues with the support they need to thrive, professionally and personally. That's why we have Amex Flex, our enterprise working model that provides greater flexibility to colleagues while ensuring we preserve the important aspects of our unique in-person culture. Depending on role and business needs, colleagues will either work onsite, in a hybrid model (combination of in-office and virtual days) or fully virtually.
US Job Seekers - Click to view the "Know Your Rights" poster. Eligibility to work with American Express in the U.S. is required as the company will not pursue visa sponsorship for these positions.
**Job:** Technologies
**Primary Location:** US-Florida-Sunrise
**Schedule:** Full-time
**Req ID:** 25014483