289 HPC Professionals jobs in the United States
HPC Engineer
Posted 1 day ago
Job Description
Playing a key role in defining and operating some of the most complex compute platforms that the client has to bring to bear against complex problems. These systems enable complex analysis, simulation and modeling leveraging massively parallel computing and disparate holding of very large data sets, to answer difficult questions. To do this you will assist the users in deploying jobs to these systems to harness the capabilities of these systems producing answers in the form of analytic product, models and simulations. This mission enablement is the heart of the hardest problems to solve.
- Responsible for the normal day-to-day HPC operations and maintenance of the HPC systems
- Provide day to day systems administration duties for Nvidia GPUs, Commodity Cluster Systems and Cray HPC environments
- Perform system monitoring, software installations, debug, upgrades, health checks, and identification/implementation of automated business processes
- Provide assessments, on-going performance analysis and recommendations for future architectures
- Responsible for operating all the host systems for the analysis
- Works in a liaison role, linking the analysts and their specialty codes and applications, to the computing systems that are focused on yielding in-depth technically sound results.
- Oversees analytic applications running on a clustered HPC fabric including CPU and GPU systems
- Managing job submission to clients' applications and codes using MPI/OpenMPI
- Provide in-depth analytic results, to achieve a best-tool-for-the-job approach.
- Partners with data scientists, engineers, and analysts conducting specialized scientific and engineering analysis.
- Escalate issues and problems to hardware support and/or engineering management as necessary
- Responsible for continuous performance analysis and tuning the HPC environment
- Assist with the identification, troubleshooting, and repair of software problems impacting performance of implemented HPC solutions
- Perform installation of software patches including upgrades to operating systems and firmware
- Assist with the resolution of trouble tickets and software problems identified by system's users
- Identify and expand services and functionalities offered in HPC environment
- Be a primary point of contact to resolve any hardware or software malfunctions, including working with service personnel as necessary
- Review system logs to identify and resolve software and systems related issues
- Prepare reports related to the operational efficiency of the hardware and execution of users' jobs
- Experience with MPI/OpenMPI, SLURM, and Linux Operating Systems essential
- Prior experience as a Systems Administrator essential, with a preference for experience working with clustered systems including GPUs in the hardware stack
- Experience with high speed networking, and CUDA preferred
- Software integration experience a plus
- Other duties could be required to support the customer's mission
- Minimum of 6 years demonstrated on-the-job experience
- Demonstrated on-the-job experience with integrating functionality from disparate systems via scripting/tooling/automation
- Demonstrated on-the-job experience with the Sponsor's system security environment and requirements
- Demonstrated experience leading systems architecture, operations, maintenance and administration
Benefits
Eligibility requirements apply.
- Employer-Paid Health Care Plan (Medical, Dental & Vision)
- Retirement Plan (401k, IRA) with a generous matching program
- Life Insurance (Basic, Voluntary & AD&D)
- Paid Time Off (Vacation, Sick & Public Holidays)
- Short Term & Long Term Disability
- Training & Development
- Employee Assistance Program
HPC Engineer
Posted 1 day ago
Job Description
Role Summary:
Our client, a global quantitative trading firm, is seeking an experienced HPC Engineer to join their dynamic Platform Development team. As an HPC Engineer, you will play a pivotal role in designing and enhancing high-performance trading systems, research compute clusters, and databases. This position offers the opportunity to work with cutting-edge technologies in a fast-paced and collaborative environment.
Responsibilities:
- Automate monitoring and maintenance tasks using Python and Bash scripts
- Collaborate with cross-functional teams to develop scalable solutions for complex problems
- Optimize operating systems and batch workflows for enhanced performance
- Manage and optimize the HPC environment, including storage solutions like Lustre and GPFS
- Conduct capacity planning and design to ensure optimal resource allocation
- Troubleshoot and tune systems using monitoring and diagnostic tools
- Bachelor's degree in Engineering, Computer Science, Information Systems, or related field
- 5-7 years of experience in building Linux and/or Windows-based HPC platforms
- Proficiency in kernel-level and I/O subsystem tweaks and tools such as sysctl, strace, and netstat
- Hands-on experience with automation in Python or similar tools
- Previous experience administering Lustre, GPFS, VAST, or similar parallel filesystems
- Familiarity with resource scheduling tools like HTCondor, SLURM, or equivalent
- Knowledge of Windows-based HPC platforms
- Experience with storage solutions such as Lustre, VAST, and GPFS
- Understanding of market-making and trading strategies
If you possess the required skills and are excited about the opportunity to work in a collaborative and innovative environment, we encourage you to apply by submitting your CV today. Join our client's team and be part of a culture driven by problem-solving and achieving winning results together.
HPC Engineer
Posted 3 days ago
Job Description
Wolverine is seeking an experienced HPC Engineer to join our team and help build our next-generation HPC infrastructure. In this critical role, you will be instrumental in designing, building, optimizing, and maintaining the high-performance computing and storage systems that are the backbone of our research operations. You will empower quantitative researchers to develop cutting-edge trading strategies and train models on vast amounts of data. The ideal candidate is extremely self-motivated, loves getting their hands dirty, and is passionate about designing cutting-edge systems.
What You'll Do:
- Collaborate with quants, developers, and other infrastructure teams to design and evolve our next-generation HPC infrastructure
- Continuously identify and resolve performance bottlenecks within our HPC clusters to ensure maximum throughput for research workloads.
- Develop robust automation tools and scripts for provisioning, configuration management, and system monitoring.
- Provide expert-level troubleshooting and root cause analysis for complex hardware and software issues across the HPC stack.
- Work cross-functionally with quantitative researchers, traders, software developers, and other engineering teams to understand their computational needs and provide effective solutions.
- Work with quantitative researchers and software engineers to develop and expand simulated trading environments.
- Research, evaluate, and integrate new technologies and methodologies to enhance our HPC capabilities.
- Degree in Computer Science, Electrical Engineering, or a related highly quantitative or technical field.
- 3+ years of hands-on experience designing, implementing, and managing large-scale HPC environments.
- Exceptional proficiency in Linux system administration and preferably a deep understanding of the Linux kernel.
- Strong programming and scripting skills in Python and Bash are essential. Experience with C++ is a significant plus.
- In-depth knowledge of parallel file systems (e.g., Lustre, GPFS).
- Experience with job scheduling systems (e.g., Slurm, Grid Engine).
- Solid understanding of networking fundamentals and experience with high-speed interconnects (e.g., InfiniBand, RDMA).
- Familiarity with performance profiling tools and techniques.
- Excellent problem-solving, analytical, and debugging skills with a keen attention to detail.
- Strong communication and interpersonal skills, with the ability to collaborate effectively with both technical and non-technical stakeholders.
- Familiarity with GPU and Machine Learning workloads is a plus.
$140,000 - $200,000 a year
The base compensation range for this role is approximately $140,000 to $200,000, contingent on experience. Wolverine Trading's total compensation model includes base salary and an annual discretionary bonus.
A Statement on Prior Trading Experience:
With an above average rate of tenure for our engineers, we value individuals who innately strive to push boundaries and pursue constant improvement. Given a long-term focus, the ability to innovate, challenge limits, and deliver lasting impact matters far more to us than prior exposure to the trading ecosystem.
⎯⎯⎯
Why Wolverine?
Wolverine Culture:
Our flat organizational structure promotes teamwork across the Firm and offers easy access to senior staff (don't worry, they won't be wearing a suit either). While we work exceptionally well as a team in the office, our bonds are further strengthened through company events, activities and giving back. Volleyball, soccer, hockey, 5K runs, picnic, parties, and trivia nights provide friendly competition and build better relationships. By getting out of our usual environment and doing out-of-the-ordinary things together, we foster creativity and broaden our imaginations to accomplish new challenges.
Wolverine Benefits:
• Highly competitive salary & bonus opportunity
• Generous paid time off and flexible scheduling
• 100% coverage of medical, dental, vision, life, and disability benefits for single coverage
• Generous Paid Parental Leave
• Retirement Plans: 401K and Roth 401K
• Profit sharing plan
• Long- and short-term disability
Perks of being at Wolverine:
• Free breakfast and lunch from our in-house kitchen with rotating menus (including snacks!)
• On-site gym with a subsidized membership
• Frequent company outings
• Opportunity to give back to organizations that help individuals in need in the Chicagoland area
Professional Development:
• In-house education team - classes and resources are offered for continuous learning opportunities
• Mentorship Program through your first six months of employment
About Us:
Founded in 1994, the Wolverine companies comprise a number of diversified financial institutions specializing in proprietary trading, asset management, order execution services, and technology solutions. We are recognized as a market leader in derivatives valuation, trading, and value-added order execution across global equity, options, and futures markets. With a focus on innovation, achievement, and integrity, we take pride in serving the interests of both our clients and colleagues. The Wolverine companies are headquartered in Chicago with an office in New York and a proprietary trading affiliate office located in London.
HPC Engineer
Posted 5 days ago
Job Description
As a Senior Systems Engineer at OPS Consulting, you will provide portfolio-level advisory support to the customer, facilitating the development, acquisition, and support of complex systems. You will liaise with various technical and non-technical stakeholders in support of system requirements. You will contribute technical data and expertise to strategic team documents, including the team charter, acquisition strategy, and technical development strategy, as well as to schedule development, briefing material development, and action tracking. When needed, you will provide support for the development, review, and execution of contract documentation (Contract Strategy, Competition in Contracting Act (CICA), Contract Data Requirements Lists (CDRLs), and Data Item Descriptions (DIDs)).
Additional Experience:
Contribute to the development of sections of systems engineering documentation such as system Engineering Plans, Initial Capabilities Documents, Requirements specifications, and Interface Control Documents
Manage system requirements and derived requirements to ensure the delivery of production systems that are compatible with the defined system architecture(s) - Department of Defense Architecture Framework (DoDAF), service-oriented Architecture (SOA), etc.
Assist with the development of system requirements, functional requirements, and allocation of the same to individual hardware, software, facility and personnel components
Coordinate the resolution of action items from Configuration Control Board (CCB) meetings, design reviews, program reviews, and test reviews that require cross-discipline coordination
Participate in an Integrated Product Team to design new capabilities based upon evaluation of all necessary development and operational considerations
Participate in the development of system engineering documentation, such as System Engineering Plans, Initial Capabilities Documents, Requirements Specifications, and Interface Control Documents
Participate in interface definition design, and changes to the configuration between affected groups and individuals throughout the life cycle
Allocate real-time process budgets and error budgets to systems and subsystem components
Derive from the system requirements an understanding of stakeholder needs, functions that may be logically inferred and implied as essential to system effectiveness
Derive lower-level requirements from higher-level allocated requirements that describe in detail the functions that a system component must fulfill, and ensure these requirements are complete, correct, unique, unambiguous, realizable, and verifiable
Generate alternative system concepts, physical architectures, and design solutions
Participate in establishing and gaining approval of the definition of a system or component under development (requirements, designs, interfaces, test procedures, etc.) that provides a common reference point for hardware and software developers
Define the methods, processes, and evaluation criteria by which the systems, subsystems and work products are verified against their requirements in a written plan
Develop system design solution that satisfies the system requirements and fulfills the functional analysis
Develop derived requirements for Information Assurance Services (Confidentiality, Integrity, Non-repudiation, and Availability); Basic Information Assurance Mechanisms (e.g., Identification, Authentication, Access Control, Accountability); and Security Mechanism Technology (Passwords, cryptography, discretionary access control, mandatory access control, hashing, key management, etc.)
Review and provide input to program and contract work breakdown structure (WBS), work packages and the integrated master plan (IMP)
Provide technical direction for the development, engineering, interfacing, integration, and testing of specific components of complex hardware/software systems to include requirements elicitation, analysis and functional allocation, conducting systems requirements reviews, developing concepts of operation and interface standards, developing system architectures, and performing technical/non-technical assessment and management as well as end-to-end flow analysis
Implement comprehensive SOA solutions
Implement operational view, technical standards view, and system and services view for architectures using applicable DoDAF standards
Develop scenarios (threads) and an Operational Concept that describes the interactions between the system, the user, and the environment, that satisfies operational, support, maintenance, and disposal needs.
Review and/or approve system engineering documentation to ensure that processes and specifications meet system needs and are accurate, comprehensive, and complete
Conduct quantitative analysis in non-functional system performance areas (Reliability, Maintainability, Vulnerability, Survivability, Producibility, etc.)
Establish and follow a formal procedure for coordinating system integration activities among multiple teams, ensuring complete coverage of all interfaces
Capture all interface designs in a common interface control format, and store interface data in a commonly accessible repository
Prepare time-line analysis diagrams illustrating the flow of time-dependent functions
Establish a process to formally and proactively control and manage changes to requirements, consider impacts prior to commitment to change, gain stakeholder buy-in, eliminate ambiguity, ensure traceability to source requirements, and track and settle open actions
Assess each risk to the program and determine the probability of occurrence and quantified consequence of failure in accordance with an approved risk management plan
Manage and ensure the technical integrity of the system baseline over time, continually updating it as various changes are imposed on the system during the lifecycle from development through deployment and operations & maintenance
In conjunction with system stakeholders, plan the verification efforts of new and unproven designs early in the development life cycle to ensure compliance with established requirements
Support the planning and test analysis of the DoD Certification/Accreditation Process (as well as other Government Certification and Accreditation (C&A) processes)
Support the development and review of Joint Capability Integration Development System (JCIDS) documents (i.e., Initial Capability Document, Capabilities Description Document, IA Strategy)
Required Experience:
A minimum of 14 years of experience and a bachelor's degree in Engineering, Systems Engineering, Computer Science, Information Systems, Engineering Science, Engineering Management, or a related discipline from an accredited college or university is required.
Desired Experience:
High Performance Computing systems engineering experience
Systems Architecture development experience
Hardware and Software Design
Custom Engineering experience
Microelectronics Experience
Advanced degree in Engineering, Systems Engineering or other relevant technical degree.
DAWIA Certification in Systems Engineering or Program Management
Security Clearance: A current government clearance, background investigation, and polygraph are required.
The Swift Group and Subsidiaries are an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status, or any other protected class.
Pay Range: $49,996.80 - $290,004.00
Pay ranges are a general guideline and not intended as a guaranteed and/or implied final compensation or salary for this job opening. Determination of official compensation or salary relies on several different factors including, but not limited to: level of position, complexity of job responsibilities, geographic location, work experience, education, certifications, Federal Government contract labor categories, and contract wage rates.
At The Swift Group and Subsidiaries, you will receive comprehensive benefits including but not limited to: healthcare, wellness, financial, retirement, education, and time off benefits.
HPC Engineer
Posted 19 days ago
Job Description
HPC Engineer
Location
US-TX-Dallas
ID
2025-2486
Category
Software & Systems Development
Position Type
Full-Time
Remote
No
Clearance Required
Top Secret/SCI
Overview
AMERICAN SYSTEMS is an employee-owned federal government contractor supporting national priority programs through our strategic solutions in the areas of Information Technology, Test & Evaluation, Program Mission Support, Engineering & Analysis, and Training.
Responsibilities
As an HPC Engineer with AMERICAN SYSTEMS you will have the opportunity to do the following:
- Utilize a wide variety of skills in system and network monitoring; large-scale systems administration; scripting and automation; security compliance; network distributed services; storage and backups; and hardware and software problem diagnosis and resolution.
- Diagnose and troubleshoot technical problems, often of a complex nature, associated with computer hardware and software interrelationships and dependencies.
- Conduct needs analysis, planning, and scheduling the installation of a wide variety of new or modified hardware/software.
- Develop functional and technical IT system requirements and specifications. Configure and optimize system tools and applications, to include job schedulers (Slurm and PBSPro) and system resources (GitLab, LUA/TCL modules, and system support applications).
- Create and brief technical presentations to technical and non-technical stakeholders. Maintain detailed documentation of system configurations, procedures, and troubleshooting guides. Develop user facing documentation.
Qualifications
- DoD Top Secret (TS) clearance with SCI eligibility
- Bachelor's in Computer Engineering, Computer Science, or related field and ten or more years of job related experience.
- Thorough knowledge of complex concepts, practices, and troubleshooting associated with HPC cluster systems design, installation, and maintenance.
- Advanced knowledge in distributed computing theory, parallel processing, applications, and associated infrastructure is required.
- Extensive experience with Linux/Unix systems including installation, configuration, networking, backups, updates and patching, data archiving, and system security. Functional knowledge of HPC middleware, and platform managers such as Bright Cluster Manager; employing job schedulers such as PBS, Slurm, Torque, etc.; and, optimizing job queues.
- Experience with HPC or large-scale distributed computing environments and technologies such as high-speed low-latency interconnects (e.g. InfiniBand), parallel file systems (e.g. Lustre), and virtualization environments and tools (e.g. VMware).
- Experience developing Python/bash/Perl scripts and employing automation frameworks such as Ansible.
- General knowledge employing Docker containers and Kubernetes ecosystems.
- Working knowledge in one or more programming languages (e.g. C/C++, Fortran, etc.)
- Must be able and willing to travel to northern Virginia approximately 25% of the time
Pay Transparency Statement
AMERICAN SYSTEMS is committed to pay transparency for our applicants and employee-owners. The salary range for this position is USD $129,800.00/Yr. - USD $216,700.00/Yr. Actual compensation will be determined based on several factors permitted by law. AMERICAN SYSTEMS provides for the welfare of its employees and their dependents through a comprehensive benefits program by offering healthcare benefits, paid leave, retirement plans, insurance programs, and education and training assistance.
EEO Statement
EEO Race/Sex/Disability Status/Veteran Status
HPC Engineer
Posted 20 days ago
Job Description
As an HPC Engineer, you will be tasked with building and managing high-performance computing (HPC) systems. You will ensure the efficient operation of HPC clusters and contribute to performance tuning.
Essential Functions
• Design and deploy high-performance computing systems and clusters.
• Monitor and maintain the performance of HPC resources.
• Troubleshoot hardware, software, and network issues within HPC environments.
• Optimize and tune applications for performance in HPC systems.
• Introduce and apply new technologies for improving computational effectiveness.
• Manage security protocols and ensure data integrity and confidentiality.
Requirements
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Proven experience with HPC systems and parallel computing.
- Strong understanding of Linux operating systems.
- Experience with job scheduling and queuing systems such as SLURM & Kubernetes.
- Knowledge of network architectures and high-speed interconnects.
- Excellent problem-solving and analytical skills.
- Parallel Computing
- Cluster Management
- Linux
- SLURM
- PBS
- Networking
- Scripting (Bash, Python)
- MPI
- CUDA
- Performance Tuning
Benefits
- Medical Insurance
- Dental Insurance
- Vision Insurance
- 401(k)
- Flexible spending account
- Commuter benefits
- Disability insurance
We also have a perfect location for all types of commuters: AMAX is located right between I-680 and I-880. Warm Springs/South Fremont BART station and bus stops are within a 10-minute walking distance. 5 grocery stores, 6+ coffee/tea places, and numerous restaurants within 1 mile. Feel free to try the delicious fusions or grab your daily groceries after work!
About AMAX
Established in 1979, AMAX is a globally recognized leader in GPU-accelerated IT infrastructure, specializing in transforming standard IT systems into advanced, high-performance computing solutions. Catering to industries such as AI, cloud computing, autonomous vehicles, and high-performance computing, AMAX has set benchmarks in innovation, including pioneering liquid-cooled HPC systems for the semiconductor industry. With a global footprint spanning North America, Europe, and Asia, AMAX offers end-to-end services from design and manufacturing to deployment. Committed to addressing the growing demands of AI, AMAX delivers advanced solutions that help organizations achieve their technology goals and drive progress on a global scale. To learn more about AMAX’s advanced AI solutions, visit amax.com.
Join Us
Become part of a diverse and inclusive team that values your technical expertise and innovative thinking. Together, we’ll push the boundaries of what’s possible in the hardware industry.
AMAX is proud to be an equal-opportunity employer. We welcome all applicants and provide equal employment opportunities regardless of age, race, gender, or other legally protected characteristics.
HPC Admin
Posted 22 days ago
Job Description
Location: Collegeville, PA (100% Remote)
Job Type: Long Term Contract
Role Description:
- Associates should have a deep understanding of Posit Workbench, Connect, and Package Manager administration.
- The candidate should have strong problem-solving skills, including the ability to identify, analyze, and resolve complex technical issues.
- The candidate should have a strong understanding of HPC (Slurm) including the installation and configuration
- The candidate should have strong communication skills, as they will need to communicate with various stakeholders, including developers, system administrators, and end users.
- Expert in building Docker containers
- Ability to write automation with Packer and Ansible
- Experience with Kubernetes
- Experience in Linux Administration
- Azure Fundamentals.
- Posit Workbench, Connect and Package Manager Administration.
- Expert in Linux Administration.
- Deep understanding and implementation experience with Slurm.
- Expert knowledge in managing Kubernetes.
- Work experience in building HPC platform.
- Expert in building Docker containers.
- Ability to write automation with Packer and Ansible.
- Ability to understand basic R scripts.
- Working knowledge of cgroups.
- Working knowledge of software deployment, such as MATLAB, MONOLIX, NONMEM, and PsN installation.
- Knowledge of Lustre.
- Good experience in integrating decoupled infrastructure components.
Diverse Lynx LLC is an Equal Employment Opportunity employer. All qualified applicants will receive due consideration for employment without any discrimination. All applicants will be evaluated solely on the basis of their ability, competence and their proven capability to perform the functions outlined in the corresponding role. We promote and support a diverse workforce across all levels in the company.
HPC Consultant
Posted 22 days ago
Job Description
Role: HPC support Consultant
Location: Fremont, CA (onsite from day 1)
Hire type: FTE/Contract
Customer: LAM RESEARCH CORPORATION
PW ID: TechM136203
• We are seeking an experienced High Performance Computing platform consultant to provide support to India/Asia/EU region users and carry out platform enhancements and reliability improvement projects as aligned with the HPC architect
Minimum qualifications:
• Bachelor's or Master's degree in Computer Science or equivalent with ~8-10 years of experience in High Performance Computing technologies
• Experienced in
  - HPC environment: familiar with the use of HPC (Ansys/Fluent over MPI), helping users to tune their jobs in an HPC environment
  - Linux administration
  - Parallel file systems (e.g., Gluster, Lustre, ZFS, NFS, CIFS)
  - MPI (OpenMPI, MPICH2, Intel MPI), InfiniBand
  - Parallel computing
  - Monitoring tools (e.g., Nagios)
  - Programming skills such as in Python would be nice to have, especially using MPI
• Hands-on experience with cloud technologies, preferably Azure and Terraform for VM creation and maintenance
• Effective communication skills (the resource would independently engage and address user requests and resolve incidents for global regions - Asia, EU included)
• Ability to work independently with minimal supervision
Preferred Qualifications:
• Experience with ANSYS, COMSOL and complex simulation Products for engineering analysis
HPC Engineer
Posted 24 days ago
Job Description
What you will be doing
Playing a key role in defining and operating some of the most complex compute platforms that the client has to bring to bear against complex problems. These systems enable complex analysis, simulation and modeling leveraging massively parallel computing and disparate holding of very large data sets, to answer difficult questions. To do this you will assist the users in deploying jobs to these systems to harness the capabilities of these systems producing answers in the form of analytic product, models and simulations. This mission enablement is the heart of the hardest problems to solve.
- Responsible for the normal day-to-day HPC operations and maintenance of the HPC systems
- Provide day to day systems administration duties for Nvidia GPUs, Commodity Cluster Systems and Cray HPC environments
- Perform system monitoring, software installations, debug, upgrades, health checks, and identification/implementation of automated business processes
- Provide assessments, on-going performance analysis and recommendations for future architectures
- Responsible for operating all the host systems for the analysis
- Works in a liaison role, linking the analysts and their specialty codes and applications, to the computing systems that are focused on yielding in-depth technically sound results.
- Oversees analytic applications running on a clustered HPC fabric including CPU and GPU systems
- Managing job submission to clients' applications and codes using MPI/OpenMPI
- Provide in-depth analytic results, to achieve a best-tool-for-the-job approach.
- Partners with data scientists, engineers, and analysts conducting specialized scientific and engineering analysis.
- Escalate issues and problems to hardware support and/or engineering management as necessary
- Responsible for continuous performance analysis and tuning the HPC environment
- Assist with the identification, troubleshooting, and repair of software problems impacting performance of implemented HPC solutions
- Perform installation of software patches including upgrades to operating systems and firmware
- Assist with the resolution of trouble tickets and software problems identified by system’s users
- Identify and expand services and functionalities offered in HPC environment
- Be a primary point of contact to resolve any hardware or software malfunctions, including working with service personnel as necessary
- Review system logs to identify and resolve software and systems related issues
- Prepare reports related to the operational efficiency of the hardware and execution of users' jobs
- Experience with MPI/OpenMPI, SLURM, and Linux Operating Systems essential
- Prior experience as a Systems Administrator essential, with a preference for experience working with clustered systems including GPUs in the hardware stack
- Experience with high-speed networking and CUDA preferred
- Software integration experience a plus
- Other duties could be required to support the customer’s mission
Requirements
- Minimum of 6 years of demonstrated on-the-job experience
- Demonstrated on-the-job experience with integrating functionality from disparate systems via scripting/tooling/automation
- Demonstrated on-the-job experience with the Sponsor's system security environment and requirements
- Demonstrated experience leading systems architecture, operations, maintenance and administration
Clearance: An active TS/SCI with an appropriate current polygraph is required to be considered for this role, as is the ability to receive privileged access rights.
Benefits
Eligibility requirements apply.
- Employer-Paid Health Care Plan (Medical, Dental & Vision)
- Retirement Plan (401k, IRA) with a generous matching program
- Life Insurance (Basic, Voluntary & AD&D)
- Paid Time Off (Vacation, Sick & Public Holidays)
- Short Term & Long Term Disability
- Training & Development
- Employee Assistance Program
HPC Engineer
Posted 24 days ago
Job Description
Optiver is seeking an HPC Engineer to contribute significantly to the development and management of our research infrastructure across both on-premises and cloud platforms.
This role involves hands-on work in scaling and supporting high-performance computing (HPC) and storage systems, which are critical to meeting our growing demand for research across various trading-related domains, including quantitative and options research.
What You'll Do:
- Assist in designing and supporting HPC and storage systems on-premises and in cloud clusters (AWS, GCP, Azure).
- Help troubleshoot issues related to OS, storage, network, and other infrastructure components, in collaboration with other infrastructure teams.
- Contribute to innovation in research infrastructure by supporting new technologies and methodologies.
- Support management of diverse compute capabilities, including CPU, GPU, and specialized processing solutions.
- Work directly with software engineers to provide a fully integrated research platform for the company.
- Knowledge of Linux systems and environment.
- Exposure to Infrastructure as Code tools (Ansible, Terraform, etc.).
- Understanding of cloud solutions like AWS, GCP, or Azure.
- Interest in high-performance computing (HPC) environments.
- Ability to troubleshoot in infrastructure scenarios.
- Ability to effectively collaborate with diverse and global teams.
You'll join a culture of collaboration and excellence, surrounded by curious thinkers and creative problem-solvers. Motivated by a passion for continuous improvement, you'll thrive in a supportive, high-performing environment alongside talented colleagues, collectively tackling some of the toughest challenges in the financial markets.
In addition, you'll receive:
- The opportunity to work alongside best-in-class professionals from over 40 different countries
- A highly competitive compensation package
- Global profit-sharing pool and performance-based bonus structure
- 401(k) match up to 50%
- Comprehensive health, mental, dental, vision, disability, and life coverage
- 25 paid vacation days alongside market holidays
- Extensive office perks, including breakfast, lunch and snacks, regular social events, clubs, sporting leagues and more
Who we are:
At Optiver, our mission is to improve the market by injecting liquidity, providing accurate pricing, increasing transparency and stabilizing the market no matter the conditions. With a focus on continuous improvement, we prioritize safeguarding the health and efficiency of the markets for all participants. As one of the largest market making institutions, we are a respected partner on 100+ exchanges across the globe.
Our differences are our edge. Optiver does not discriminate on the basis of race, religion, color, sex, gender identity, sexual orientation, age, physical or mental disability, or other legally protected characteristics.
Below is the expected base salary for this position. This is a good-faith estimate of the base pay scale for this position and offers will ultimately be determined based on experience, education, skill set, and performance in the interview process. This position will also be eligible for a discretionary bonus (if determined by Optiver) and Optiver's benefits package with the benefits listed above.
Base Salary Range
$150,000-$200,000 USD