289 HPC Professionals jobs in the United States

HPC Engineer

21403 Annapolis, Maryland Avalore, LLC

Posted 1 day ago

Job Description

What you will be doing

You will play a key role in defining and operating some of the most complex compute platforms the client brings to bear against hard problems. These systems enable complex analysis, simulation, and modeling, leveraging massively parallel computing and disparate holdings of very large data sets to answer difficult questions. You will assist users in deploying jobs to these systems and harnessing their capabilities to produce answers in the form of analytic products, models, and simulations. This mission enablement is at the heart of solving the client's hardest problems.
  • Responsible for the normal day-to-day HPC operations and maintenance of the HPC systems
  • Provide day to day systems administration duties for Nvidia GPUs, Commodity Cluster Systems and Cray HPC environments
  • Perform system monitoring, software installations, debug, upgrades, health checks, and identification/implementation of automated business processes
  • Provide assessments, on-going performance analysis and recommendations for future architectures
  • Responsible for operating all the host systems for the analysis
  • Works in a liaison role, linking the analysts and their specialty codes and applications, to the computing systems that are focused on yielding in-depth technically sound results.
  • Oversees analytic applications running on a clustered HPC fabric including CPU and GPU systems
  • Manage job submission for clients' applications and codes using MPI/OpenMPI
  • Provide in-depth analytic results, to achieve a best-tool-for-the-job approach.
  • Partners with data scientists, engineers, and analysts conducting specialized scientific and engineering analysis.
  • Escalate issues and problems to hardware support and/or engineering management as necessary
  • Responsible for continuous performance analysis and tuning the HPC environment
  • Assist with the identification, troubleshooting, and repair of software problems impacting performance of implemented HPC solutions
  • Perform installation of software patches including upgrades to operating systems and firmware
  • Assist with the resolution of trouble tickets and software problems identified by system users
  • Identify and expand services and functionalities offered in HPC environment
  • Be a primary point of contact to resolve any hardware or software malfunctions, including working with service personnel as necessary
  • Review system logs to identify and resolve software and systems related issues
  • Prepare reports related to the operational efficiency of the hardware and execution of users' jobs
  • Experience with MPI/OpenMPI, SLURM, and Linux Operating Systems essential
  • Prior experience as a Systems Administrator essential, with a preference for experience working with clustered systems including GPUs in the hardware stack
  • Experience with high speed networking, and CUDA preferred
  • Software integration experience a plus
  • Other duties could be required to support the customer's mission
Requirements
  • Minimum of 6 years demonstrated on-the-job experience
  • Demonstrated on-the-job experience with integrating functionality from disparate systems via scripting/tooling/automation
  • Demonstrated on-the-job experience with the Sponsor's system security environment and requirements
  • Demonstrated experience leading systems architecture, operations, maintenance and administration
Clearance: Active TS/SCI with an appropriate current polygraph is required to be considered for this role; Ability to receive privileged access rights.

Benefits

Eligibility requirements apply.
  • Employer-Paid Health Care Plan (Medical, Dental & Vision)
  • Retirement Plan (401k, IRA) with a generous matching program
  • Life Insurance (Basic, Voluntary & AD&D)
  • Paid Time Off (Vacation, Sick & Public Holidays)
  • Short Term & Long Term Disability
  • Training & Development
  • Employee Assistance Program

HPC Engineer

15689 United, Pennsylvania iO Associates

Posted 1 day ago

Job Description

Our Client is Hiring a Skilled HPC Engineer!

Role Summary:

Our client, a global quantitative trading firm, is seeking an experienced HPC Engineer to join their dynamic Platform Development team. As an HPC Engineer, you will play a pivotal role in designing and enhancing high-performance trading systems, research compute clusters, and databases. This position offers the opportunity to work with cutting-edge technologies in a fast-paced and collaborative environment.

Responsibilities:

  • Automate monitoring and maintenance tasks using Python and Bash scripts
  • Collaborate with cross-functional teams to develop scalable solutions for complex problems
  • Optimize operating systems and batch workflows for enhanced performance
  • Manage and optimize the HPC environment, including storage solutions like Lustre and GPFS
  • Conduct capacity planning and design to ensure optimal resource allocation
  • Troubleshoot and tune systems using monitoring and diagnostic tools
Essential Skills & Experience:
  • Bachelor's degree in Engineering, Computer Science, Information Systems, or related field
  • 5-7 years of experience in building Linux and/or Windows-based HPC platforms
  • Proficiency in kernel-level and I/O subsystem tweaks and tools such as sysctl, strace, and netstat
  • Hands-on experience with automation in Python or similar tools
  • Previous experience administering Lustre, GPFS, VAST, or similar parallel filesystems
  • Familiarity with resource scheduling tools like HTCondor, SLURM, or equivalent
Desirable Skills & Experience:
  • Knowledge of Windows-based HPC platforms
  • Experience with storage solutions such as Lustre, VAST, and GPFS
  • Understanding of market-making and trading strategies


If you possess the required skills and are excited about the opportunity to work in a collaborative and innovative environment, we encourage you to apply by submitting your CV today. Join Our Client's team and be part of a culture driven by problem-solving and achieving winning results together.

HPC Engineer

60290 Chicago, Illinois Wolverine Trading

Posted 3 days ago

Job Description

Wolverine is seeking an experienced HPC Engineer to join our team and help build our next-generation HPC infrastructure. In this critical role, you will be instrumental in designing, building, optimizing, and maintaining the high-performance computing and storage systems that are the backbone of our research operations. You will empower quantitative researchers to develop cutting-edge trading strategies and train models on vast amounts of data. The ideal candidate is extremely self-motivated, loves getting their hands dirty, and is passionate about designing cutting-edge systems.

What You'll Do:

    • Collaborate with quants, developers, and other infrastructure teams to design and evolve our next-generation HPC infrastructure
    • Continuously identify and resolve performance bottlenecks within our HPC clusters to ensure maximum throughput for research workloads.
    • Develop robust automation tools and scripts for provisioning, configuration management, and system monitoring.
    • Provide expert-level troubleshooting and root cause analysis for complex hardware and software issues across the HPC stack.
    • Work cross-functionally with quantitative researchers, traders, software developers, and other engineering teams to understand their computational needs and provide effective solutions.
    • Work with quantitative researchers and software engineers to develop and expand simulated trading environments.
    • Research, evaluate, and integrate new technologies and methodologies to enhance our HPC capabilities.
What We're Looking For:
    • Degree in Computer Science, Electrical Engineering, or a related highly quantitative or technical field.
    • 3+ years of hands-on experience designing, implementing, and managing large-scale HPC environments.
    • Exceptional proficiency in Linux system administration and preferably a deep understanding of the Linux kernel.
    • Strong programming and scripting skills in Python and Bash are essential. Experience with C++ is a significant plus.
    • In-depth knowledge of parallel file systems (e.g., Lustre, GPFS).
    • Experience with job scheduling systems (e.g., Slurm, Grid Engine).
    • Solid understanding of networking fundamentals and experience with high-speed interconnects (e.g., InfiniBand, RDMA).
    • Familiarity with performance profiling tools and techniques.
    • Excellent problem-solving, analytical, and debugging skills with a keen attention to detail.
    • Strong communication and interpersonal skills, with the ability to collaborate effectively with both technical and non-technical stakeholders.
    • Familiarity with GPU and Machine Learning workloads is a plus.


$140,000 - $200,000 a year

The base compensation range for this role is approximately $140,000-$200,000, contingent on experience. Wolverine Trading's total compensation model includes base salary and an annual discretionary bonus.

A Statement on Prior Trading Experience:

With an above average rate of tenure for our engineers, we value individuals who innately strive to push boundaries and pursue constant improvement. Given a long-term focus, the ability to innovate, challenge limits, and deliver lasting impact matters far more to us than prior exposure to the trading ecosystem.

⎯⎯⎯

Why Wolverine?

Wolverine Culture:

Our flat organizational structure promotes teamwork across the Firm and offers easy access to senior staff (don't worry, they won't be wearing a suit either). While we work exceptionally well as a team in the office, our bonds are further strengthened through company events, activities and giving back. Volleyball, soccer, hockey, 5K runs, picnics, parties, and trivia nights provide friendly competition and build better relationships. By getting out of our usual environment and doing out-of-the-ordinary things together, we foster creativity and broaden our imaginations to accomplish new challenges.

Wolverine Benefits:

• Highly competitive salary & bonus opportunity

• Generous paid time off and flexible scheduling

• 100% coverage of medical, dental, vision, life, and disability benefits for single coverage

• Generous Paid Parental Leave

• Retirement Plans: 401K and Roth 401K

• Profit sharing plan

• Long- and short-term disability

Perks of being at Wolverine:

• Free breakfast and lunch from our in-house kitchen with rotating menus (including snacks!)

• On-site gym with a subsidized membership

• Frequent company outings

• Opportunity to give back to organizations that help individuals in need in the Chicagoland area

Professional Development:

• In-house education team - classes and resources are offered for continuous learning opportunities

• Mentorship Program through your first six months of employment

About Us:

Founded in 1994, the Wolverine companies comprise a number of diversified financial institutions specializing in proprietary trading, asset management, order execution services, and technology solutions. We are recognized as a market leader in derivatives valuation, trading, and value-added order execution across global equity, options, and futures markets. With a focus on innovation, achievement, and integrity, we take pride in serving the interests of both our clients and colleagues. The Wolverine companies are headquartered in Chicago with an office in New York and a proprietary trading affiliate office located in London.

HPC Engineer

20724 Maryland City, Maryland The Swift Group

Posted 5 days ago

Job Description

OPS Consulting is seeking an HPC Engineer Level 2 to work in Laurel, MD.

As a Senior Systems Engineer at OPS Consulting, you will provide portfolio-level advisory support to the customer, facilitating the development, acquisition, and support of complex systems. You will liaise with various technical and non-technical stakeholders in support of system requirements. You will contribute technical data and expertise to strategic team documents, including the team charter, acquisition strategy, and technical development strategy, as well as to schedule development, briefing material development, and action tracking. When needed, you will provide support for the development, review, and execution of contract documentation (Contract Strategy, Competition in Contracting Act (CICA), Contract Data Requirements Lists (CDRLs), and Data Item Descriptions (DIDs)).

Additional Experience:
Contribute to the development of sections of systems engineering documentation such as System Engineering Plans, Initial Capabilities Documents, Requirements Specifications, and Interface Control Documents
Manage system requirements and derived requirements to ensure the delivery of production systems that are compatible with the defined system architecture(s) - Department of Defense Architecture Framework (DoDAF), Service-Oriented Architecture (SOA), etc.
Assist with the development of system requirements, functional requirements, and allocation of the same to individual hardware, software, facility and personnel components
Coordinate the resolution of action items from Configuration Control Board (CCB) meetings, design reviews, program reviews, and test reviews that require cross-discipline coordination
Participate in an Integrated Product Team to design new capabilities based upon evaluation of all necessary development and operational considerations
Participate in the development of system engineering documentation, such as System Engineering Plans, Initial Capabilities Documents, Requirements Specifications, and Interface Control Documents
Participate in interface definition, design, and changes to the configuration between affected groups and individuals throughout the life cycle
Allocate real-time process budgets and error budgets to systems and subsystem components
Derive from the system requirements an understanding of stakeholder needs and functions that may be logically inferred and implied as essential to system effectiveness
Derive lower-level requirements from higher-level allocated requirements that describe in detail the functions that a system component must fulfill, and ensure these requirements are complete, correct, unique, unambiguous, realizable, and verifiable
Generate alternative system concepts, physical architectures, and design solutions
Participate in establishing and gaining approval of the definition of a system or component under development (requirements, designs, interfaces, test procedures, etc.) that provides a common reference point for hardware and software developers
Define the methods, processes, and evaluation criteria by which the systems, subsystems and work products are verified against their requirements in a written plan
Develop system design solution that satisfies the system requirements and fulfills the functional analysis
Develop derived requirements for Information Assurance Services (Confidentiality, Integrity, Non-repudiation, and Availability); Basic Information Assurance Mechanisms (e.g., Identification, Authentication, Access Control, Accountability); and Security Mechanism Technology (Passwords, cryptography, discretionary access control, mandatory access control, hashing, key management, etc.)
Review and provide input to program and contract work breakdown structure (WBS), work packages and the integrated master plan (IMP)
Provide technical direction for the development, engineering, interfacing, integration, and testing of specific components of complex hardware/software systems to include requirements elicitation, analysis and functional allocation, conducting systems requirements reviews, developing concepts of operation and interface standards, developing system architectures, and performing technical/non-technical assessment and management as well as end-to-end flow analysis
Implement comprehensive SOA solutions
Implement operational view, technical standards view, and system and services view for architectures using applicable DoDAF standards
Develop scenarios (threads) and an Operational Concept that describes the interactions between the system, the user, and the environment, that satisfies operational, support, maintenance, and disposal needs.
Review and/or approve system engineering documentation to ensure that processes and specifications meet system needs and are accurate, comprehensive, and complete
Conduct quantitative analysis in non-functional system performance areas (e.g., Reliability, Maintainability, Vulnerability, Survivability, Producibility)
Establish and follow a formal procedure for coordinating system integration activities among multiple teams, ensuring complete coverage of all interfaces
Capture all interface designs in a common interface control format, and store interface data in a commonly accessible repository
Prepare time-line analysis diagrams illustrating the flow of time-dependent functions
Establish a process to formally and proactively control and manage changes to requirements, consider impacts prior to commitment to change, gain stakeholder buy-in, eliminate ambiguity, ensure traceability to source requirements, and track and settle open actions
Assess each risk to the program and determine the probability of occurrence and quantified consequence of failure in accordance with an approved risk management plan
Manage and ensure the technical integrity of the system baseline over time, continually updating it as various changes are imposed on the system during the lifecycle from development through deployment and operations & maintenance
In conjunction with system stakeholders, plan the verification efforts of new and unproven designs early in the development life cycle to ensure compliance with established requirements
Support the planning and test analysis of the DoD Certification/Accreditation Process (as well as other Government Certification and Accreditation (C&A) processes)
Support the development and review of Joint Capability Integration Development System (JCIDS) documents (i.e., Initial Capability Document, Capabilities Description Document, IA Strategy)

Required Experience:

A minimum of 14 years of experience and a bachelor's degree in Engineering, Systems Engineering, Computer Science, Information Systems, Engineering Science, Engineering Management, or related discipline from an accredited college or university is required.

Desired Experience:
High Performance Computing systems engineering experience
Systems Architecture development experience
Hardware and Software Design
Custom Engineering experience
Microelectronics Experience
Advanced degree in Engineering, Systems Engineering or other relevant technical degree.
DAWIA Certification in Systems Engineering or Program Management

Security Clearance: A current government clearance, background investigation, and polygraph are required.

The Swift Group and Subsidiaries are an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status, or any other protected class.

Pay Range: $49,996.80 - $290,004.00

Pay ranges are a general guideline and not intended as a guaranteed and/or implied final compensation or salary for this job opening. Determination of official compensation or salary relies on several different factors including, but not limited to: level of position, complexity of job responsibilities, geographic location, work experience, education, certifications, Federal Government contract labor categories, and contract wage rates.

At The Swift Group and Subsidiaries, you will receive comprehensive benefits including but not limited to: healthcare, wellness, financial, retirement, education, and time off benefits.

HPC Engineer

75215 Park Cities, Texas American Systems

Posted 19 days ago

Job Description



HPC Engineer

Location

US-TX-Dallas

ID

2025-2486

Category

Software & Systems Development

Position Type

Full-Time

Remote

No

Clearance Required

Top Secret/SCI

Overview

AMERICAN SYSTEMS is an employee-owned federal government contractor supporting national priority programs through our strategic solutions in the areas of Information Technology, Test & Evaluation, Program Mission Support, Engineering & Analysis, and Training.

Responsibilities

As an HPC Engineer with AMERICAN SYSTEMS you will have the opportunity to do the following:

  • Utilize a wide variety of skills in system and network monitoring; large-scale systems administration; scripting and automation; security compliance; network distributed services; storage and backups; and hardware and software problem diagnosis and resolution.
  • Diagnose and troubleshoot technical problems, often of a complex nature, associated with computer hardware and software interrelationships and dependencies.
  • Conduct needs analysis, planning, and scheduling the installation of a wide variety of new or modified hardware/software.
  • Develop functional and technical IT system requirements and specifications. Configure and optimize system tools and applications, to include job schedulers (Slurm and PBSPro) and system resources (GitLab, LUA/TCL modules, and system support applications).
  • Create and brief technical presentations to technical and non-technical stakeholders. Maintain detailed documentation of system configurations, procedures, and troubleshooting guides. Develop user facing documentation.

Qualifications

  • DoD Top Secret (TS) clearance with SCI eligibility
  • Bachelor's in Computer Engineering, Computer Science, or related field and ten or more years of job related experience.
  • Thorough knowledge of complex concepts, practices, and troubleshooting associated with HPC cluster systems design, installation, and maintenance.
  • Advanced knowledge in distributed computing theory, parallel processing, applications, and associated infrastructure is required.
  • Extensive experience with Linux/Unix systems including installation, configuration, networking, backups, updates and patching, data archiving, and system security. Functional knowledge of HPC middleware, and platform managers such as Bright Cluster Manager; employing job schedulers such as PBS, Slurm, Torque, etc.; and, optimizing job queues.
  • Experience with HPC or large-scale distributed computing environments and technologies such as high-speed low-latency interconnects (e.g. InfiniBand), parallel file systems (e.g. Lustre), and virtualization environments and tools (e.g. VMware).
  • Experience developing Python/bash/Perl scripts and employing automation frameworks such as Ansible.
  • General knowledge employing Docker containers and Kubernetes ecosystems.
  • Working knowledge in one or more programming languages (e.g. C/C++, Fortran, etc.)
  • Must be able and willing to travel to northern Virginia approximately 25% of the time


Pay Transparency Statement

AMERICAN SYSTEMS is committed to pay transparency for our applicants and employee-owners. The salary range for this position is USD $129,800.00/Yr. - USD $216,700.00/Yr. Actual compensation will be determined based on several factors permitted by law. AMERICAN SYSTEMS provides for the welfare of its employees and their dependents through a comprehensive benefits program by offering healthcare benefits, paid leave, retirement plans, insurance programs, and education and training assistance.

EEO Statement

EEO Race/Sex/Disability Status/Veteran Status

HPC Engineer

94537 Fremont, California AMAX

Posted 20 days ago

Job Description

As an HPC Engineer, you will be tasked with building and managing high-performance computing (HPC) systems. You will ensure the efficient operation of HPC clusters and contribute to performance tuning.

Essential Functions

•    Design and deploy high-performance computing systems and clusters.

•    Monitor and maintain the performance of HPC resources.

•    Troubleshoot hardware, software, and network issues within HPC environments.

•    Optimize and tune applications for performance in HPC systems.

•    Introduce and apply new technologies for improving computational effectiveness.

•    Manage security protocols and ensure data integrity and confidentiality.

Requirements

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • Proven experience with HPC systems and parallel computing.
  • Strong understanding of Linux operating systems.
  • Experience with job scheduling and queuing systems such as SLURM & Kubernetes.
  • Knowledge of network architectures and high-speed interconnects.
  • Excellent problem-solving and analytical skills.
  • Parallel Computing
  • Cluster Management
  • Linux
  • SLURM
  • PBS
  • Networking
  • Scripting (Bash, Python)
  • MPI
  • CUDA
  • Performance Tuning

Benefits

  • Medical Insurance
  • Dental Insurance
  • Vision Insurance
  • 401(k)
  • Flexible spending account
  • Commuter benefits
  • Disability insurance

We also have a perfect location for all types of commuters: AMAX is located right between I-680 and I-880. Warm Springs/South Fremont BART station and bus stops are within a 10-minute walking distance. 5 grocery stores, 6+ coffee/tea places, and numerous restaurants within 1 mile. Feel free to try the delicious fusions or grab your daily groceries after work!

About AMAX

Established in 1979, AMAX is a globally recognized leader in GPU-accelerated IT infrastructure, specializing in transforming standard IT systems into advanced, high-performance computing solutions. Catering to industries such as AI, cloud computing, autonomous vehicles, and high-performance computing, AMAX has set benchmarks in innovation, including pioneering liquid-cooled HPC systems for the semiconductor industry. With a global footprint spanning North America, Europe, and Asia, AMAX offers end-to-end services from design and manufacturing to deployment. Committed to addressing the growing demands of AI, AMAX delivers advanced solutions that help organizations achieve their technology goals and drive progress on a global scale. To learn more about AMAX’s advanced AI solutions, visit amax.com.

Join Us

Become part of a diverse and inclusive team that values your technical expertise and innovative thinking. Together, we’ll push the boundaries of what’s possible in the hardware industry.

AMAX is proud to be an equal-opportunity employer. We welcome all applicants and provide equal employment opportunities regardless of age, race, gender, or other legally protected characteristics.


HPC Admin

19426 Collegeville, Pennsylvania Diverse Lynx

Posted 22 days ago

Job Description

Position: HPC Admin

Location: Collegeville, PA (100% Remote)

Job Type: Long Term Contract

Role Description:
  • Associates should have a deep understanding of Posit Workbench, Connect, and Package Manager administration.
Problem-Solving Skills:
  • The candidate should have strong problem-solving skills, including the ability to identify, analyze, and resolve complex technical issues.
HPC Skills:
  • The candidate should have a strong understanding of HPC (Slurm), including its installation and configuration.
Communication Skills:
  • The candidate should have strong communication skills, as they will need to communicate with various stakeholders, including developers, system administrators, and end users.
  • Expert in building Docker containers
  • Ability to write automation with Packer and Ansible
  • Experience with Kubernetes
  • Experience in Linux Administration
Essential Skills:
  • Azure Fundamentals.
  • Posit Workbench, Connect and Package Manager Administration.
  • Expert in Linux Administration.
  • Deep understanding and implementation experience with Slurm.
  • Expert knowledge in managing Kubernetes.
Desirable Skills:
  • Work experience in building HPC platform.
  • Expert in building Docker containers.
  • Ability to write automation with Packer and Ansible.
  • Ability to understand basic R scripts.
Nice to have:
  • Working knowledge of cgroups.
  • Working knowledge of software deployment, such as MATLAB, MONOLIX, NONMEM, and PsN installation.
  • Knowledge of Lustre.
  • Good experience in integrating decoupled infrastructure components.


Diverse Lynx LLC is an Equal Employment Opportunity employer. All qualified applicants will receive due consideration for employment without any discrimination. All applicants will be evaluated solely on the basis of their ability, competence and their proven capability to perform the functions outlined in the corresponding role. We promote and support a diverse workforce across all levels in the company.

HPC Consultant

94537 Fremont, California Keylent Inc

Posted 22 days ago

Job Description

HPC Consultant MAHIN-JOB-36702

Role: HPC support Consultant


Location: Fremont CA-day 1 onsite

Hire type: FTE/Contract

Customer: LAM RESEARCH CORPORATION

PW ID: TechM136203
• We are seeking an experienced High Performance Computing platform consultant to provide support to India/Asia/EU region users and carry out platform enhancements and reliability improvement projects as aligned with the HPC architect
Minimum qualifications:
• Bachelor's or Master's degree in Computer Science or equivalent with ~8-10 years of experience in High Performance Computing technologies
• Experienced in:
    • HPC environment: familiar with the use of HPC (Ansys/Fluent over MPI), helping users tune their jobs in an HPC environment
    • Linux administration
    • Parallel file systems (e.g., Gluster, Lustre, ZFS, NFS, CIFS)
    • MPI (OpenMPI, MPICH2, Intel MPI), InfiniBand
    • Parallel computing
    • Monitoring tools (e.g., Nagios)
    • Programming skills, such as in Python, would be nice to have, especially using MPI
• Experienced and hands on with Cloud technologies: Prefer using Azure and Terraform for VM creations and maintenance
• Effective communication skills (the resource would independently engage and address user requests and resolve incidents for global regions - Asia, EU included)
• Ability to work independently with minimal supervision
Preferred Qualifications:
• Experience with ANSYS, COMSOL and complex simulation Products for engineering analysis

HPC Engineer

20701 Annapolis Junction, Maryland Avalore, LLC

Posted 24 days ago

Job Description

What you will be doing

You will play a key role in defining and operating some of the most complex compute platforms the client brings to bear against hard problems. These systems enable complex analysis, simulation, and modeling, leveraging massively parallel computing and disparate holdings of very large data sets to answer difficult questions. You will assist users in deploying jobs to these systems and harnessing their capabilities to produce answers in the form of analytic products, models, and simulations. This mission enablement is at the heart of solving the client's hardest problems.

  • Responsible for the normal day-to-day HPC operations and maintenance of the HPC systems
  • Provide day to day systems administration duties for Nvidia GPUs, Commodity Cluster Systems and Cray HPC environments
  • Perform system monitoring, software installations, debug, upgrades, health checks, and identification/implementation of automated business processes
  • Provide assessments, on-going performance analysis and recommendations for future architectures
  • Responsible for operating all the host systems for the analysis
  • Work in a liaison role, linking the analysts and their specialty codes and applications to the computing systems that are focused on yielding in-depth, technically sound results
  • Oversees analytic applications running on a clustered HPC fabric including CPU and GPU systems
  • Manage job submission for client applications and codes using MPI/OpenMPI
  • Provide in-depth analytic results, to achieve a best-tool-for-the-job approach.
  • Partners with data scientists, engineers, and analysts conducting specialized scientific and engineering analysis.
  • Escalate issues and problems to hardware support and/or engineering management as necessary
  • Responsible for continuous performance analysis and tuning the HPC environment
  • Assist with the identification, troubleshooting, and repair of software problems impacting performance of implemented HPC solutions
  • Perform installation of software patches including upgrades to operating systems and firmware
  • Assist with the resolution of trouble tickets and software problems identified by system users
  • Identify and expand services and functionalities offered in HPC environment
  • Be a primary point of contact to resolve any hardware or software malfunctions, including working with service personnel as necessary
  • Review system logs to identify and resolve software and systems related issues
  • Prepare reports related to the operational efficiency of the hardware and the execution of users' jobs
  • Experience with MPI/OpenMPI, SLURM, and Linux Operating Systems essential
  • Prior experience as a Systems Administrator essential, with a preference for experience working with clustered systems including GPUs in the hardware stack
  • Experience with high speed networking, and CUDA preferred
  • Software integration experience a plus
  • Other duties could be required to support the customer’s mission

Requirements

  • Minimum of 6 years demonstrated on-the-job experience
  • Demonstrated on-the-job experience with integrating functionality from disparate systems via scripting/tooling/automation
  • Demonstrated on-the-job experience with the Sponsor's system security environment and requirements
  • Demonstrated experience leading systems architecture, operations, maintenance and administration

Clearance: Active TS/SCI with an appropriate current polygraph is required to be considered for this role; Ability to receive privileged access rights.

Benefits

Eligibility requirements apply.

  • Employer-Paid Health Care Plan (Medical, Dental & Vision)
  • Retirement Plan (401k, IRA) with a generous matching program
  • Life Insurance (Basic, Voluntary & AD&D)
  • Paid Time Off (Vacation, Sick & Public Holidays)
  • Short Term & Long Term Disability
  • Training & Development
  • Employee Assistance Program

HPC Engineer

10261 New York, New York Optiver

Posted 24 days ago

Job Description

Optiver is seeking an HPC Engineer to contribute significantly to the development and management of our research infrastructure across both on-premises and cloud platforms.

This role involves hands-on work in scaling and supporting high-performance computing (HPC) and storage systems, which are critical for our growing demand in research across various trading-related domains, including quantitative and options research.

What You'll Do:

  • Assist in designing and supporting HPC and storage systems on-premises and in cloud clusters (AWS, GCP, Azure).
  • Help troubleshoot issues related to OS, storage, network, and other infrastructure components, in collaboration with other infrastructure teams.
  • Contribute to innovation in research infrastructure by supporting new technologies and methodologies.
  • Support management of diverse compute capabilities, including CPU, GPU, and specialized processing solutions.
  • Work directly with software engineers to provide a fully integrated research platform for the company.
Who You Are:
  • Knowledge of Linux systems and environment.
  • Exposure to Infrastructure as Code tools (Ansible, Terraform, etc.).
  • Understanding of cloud solutions like AWS, GCP, or Azure.
  • Interest in high-performance computing (HPC) environments.
  • Ability to troubleshoot in infrastructure scenarios.
  • Ability to effectively collaborate with diverse and global teams.
What You'll Get:

You'll join a culture of collaboration and excellence, surrounded by curious thinkers and creative problem-solvers. Motivated by a passion for continuous improvement, you'll thrive in a supportive, high-performing environment alongside talented colleagues, collectively tackling some of the toughest challenges in the financial markets.

In addition, you'll receive:
  • The opportunity to work alongside best-in-class professionals from over 40 different countries
  • A highly competitive compensation package
  • Global profit-sharing pool and performance-based bonus structure
  • 401(k) match up to 50%
  • Comprehensive health, mental, dental, vision, disability, and life coverage
  • 25 paid vacation days alongside market holidays
  • Extensive office perks, including breakfast, lunch and snacks, regular social events, clubs, sporting leagues and more

Who we are:

At Optiver, our mission is to improve the market by injecting liquidity, providing accurate pricing, increasing transparency and stabilizing the market no matter the conditions. With a focus on continuous improvement, we prioritize safeguarding the health and efficiency of the markets for all participants. As one of the largest market making institutions, we are a respected partner on 100+ exchanges across the globe.

Our differences are our edge. Optiver does not discriminate on the basis of race, religion, color, sex, gender identity, sexual orientation, age, physical or mental disability, or other legally protected characteristics.

Below is the expected base salary for this position. This is a good-faith estimate of the base pay scale for this position and offers will ultimately be determined based on experience, education, skill set, and performance in the interview process. This position will also be eligible for a discretionary bonus (if determined by Optiver) and Optiver's benefits package with the benefits listed above.

Base Salary Range

$150,000-$200,000 USD