15,584 AI Infrastructure jobs in the United States
AI Infrastructure Engineer
Posted today
Job Description
Location: Preferably Greater Chicago Area
Duration: 12 weeks, 20 hrs/week
Note: Must have unrestricted authorization to work in the US. We don't sponsor visas.
About DALAVE
DALAVE is a leading cloud consulting company that offers a comprehensive suite of cloud solutions and services. We specialize in cutting-edge cloud technologies on Google Cloud to safeguard customer environments. As a small but mighty team of engineers, we punch above our weight - delivering enterprise-grade solutions with startup speed.
What drives us: Making cloud security accessible, not intimidating. Effective security should accelerate innovation, not hinder it.
The Role
This isn't your typical "fetch coffee and shadow meetings" internship. You'll write production code in your first week, own real features, and directly impact how enterprises secure their AI infrastructure.
AI is transforming every industry, but most companies are rushing to adopt it without understanding the security implications. That's where we come in. Working directly with our founding engineering team, you'll build the MLOps and security tools that become our competitive advantage. Your code won't sit in a sandbox - it'll run in production environments processing terabytes of data, protecting AI systems that power real businesses.
What You'll Build
- ML pipelines that scale from prototype to production on GCP
- Security automation that catches vulnerabilities before they hit production
- Infrastructure tools in Python and Go that our team uses daily
- Cost optimization frameworks that save real money (think $100K+ monthly)
- Technical content that establishes our thought leadership in cloud security
You Should Have
Required:
- Currently pursuing a Bachelor's/Master's in CS, Data Science, or related field
- Strong Python skills and experience with (or excitement to learn) Go
- SQL mastery - complex queries, optimization, the works
- Understanding of machine learning concepts and the model lifecycle
- Git proficiency and collaborative development experience
- Cloud computing fundamentals
- Solid grasp of computer science fundamentals (data structures, algorithms, networking)
- Understanding of the software development lifecycle (SDLC) and agile methodologies
- Familiarity with the Linux/Unix command line
Bonus Points:
- Security mindset or certifications
- Database design and optimization experience
- Projects on GitHub we can geek out over
- GCP experience (especially Vertex AI, BigQuery)
- Experience with Terraform
- Docker/Kubernetes knowledge
- Understanding of CI/CD concepts
We're looking for hungry engineers who want to build, not just learn. If you'd rather write code that impacts real systems than sit through videos, we should talk.
This is perfect if you:
- Prefer shipping code over writing reports
- Are excited about AI/ML but care about doing it securely
- Want to say "I built that" instead of "I helped with that"
DALAVE LLC is an equal opportunity employer committed to building a diverse and inclusive team.
Company Description
DALAVE is a leading cloud security company offering a comprehensive suite of solutions and services. We specialize in cutting-edge cloud technologies on Google Cloud to safeguard customer environments. Our team comprises highly skilled security professionals dedicated to protecting your digital assets.
AI Infrastructure Engineer
Posted 1 day ago
Job Description
AI Infrastructure Engineer
Category: Engineering
Employment Type: Direct Hire
Reference: BH-
About the Position
We are looking for a senior AI Inference Infrastructure Software Engineer with strong hands-on experience building, optimizing, and deploying high-performance, scalable inference systems. This position is focused on designing, implementing, and delivering production-grade software that powers real-world applications of Large Language Models (LLMs) and Vision-Language Models (VLMs).
This is an exciting opportunity for an engineer who thrives at the intersection of AI systems, hardware acceleration, and large-scale robust deployment, and who wants to see their contributions ship in production, at scale.
In this role, you will directly shape the architecture, roadmap, and performance of the AI capabilities of our AIOS platform, driving innovations that make LLM/VLM systems fast, efficient, and scalable across cloud, edge, and hybrid edge-cloud environments. You will work closely with system, hardware, and product teams to deliver high-performance inference kernels for hardware accelerators, design scalable inference serving systems, and integrate optimizations such as tensor parallelism and custom kernels into production pipelines. Your work will have immediate impact, powering intelligent automotive systems in the next generation of electric vehicles.
Roles and Responsibilities:
- Design and implement high-performance, scalable inference systems for LLMs and VLMs across cloud, edge, and edge-cloud hybrid platforms.
- Develop and optimize custom kernels and operators for specific hardware accelerators (GPU, NPU, DSP, etc.), improving throughput, latency, and memory efficiency.
- Integrate advanced optimization techniques such as KV-cache management, tensor/model parallelism, quantization, and memory-efficient execution into production inference systems.
- Partner with system and hardware teams to ensure tight hardware-software integration and optimal performance across diverse compute environments.
- Translate architectural requirements into robust, maintainable, production-ready software that meets performance, safety, and reliability standards.
- Define and drive the evolution roadmap for LLM/VLM inference in the AIOS stack, ensuring scalability and adaptability to new workloads.
- Stay ahead of industry trends and competitor solutions, applying best practices from both AI and large-scale systems engineering.
Qualifications:
- 5+ years of hands-on software development experience in building and optimizing AI inference systems at scale.
- Direct experience in LLM/VLM model internals, including Transformer-based architectures, inference bottlenecks, and optimization techniques.
- Strong expertise in performance engineering: kernel development, parallelism strategies, memory optimization, and distributed inference systems.
- Proficiency with GPU/NPU programming (CUDA, or vendor-specific SDKs), compiler toolchains, and deep learning frameworks (PyTorch, or TensorFlow).
- Strong programming skills in C/C++, with a track record of delivering high-performance, production-grade software.
- Solid foundation in computer architecture, systems programming (CPU/GPU pipelines, memory hierarchy, scheduling), and embedded systems.
- BS/MS in Computer Science, Computer Engineering, or related technical field.
- Excellent communication and collaboration skills, with the ability to work across cross-functional teams.
Preferred Qualifications:
- Master's or PhD degree in Computer Science, Electrical/Computer Engineering, or related fields, plus 5 years industry experience
- Experience building inference serving systems for large models, including batching, scheduling, caching, and load balancing.
- Expertise in hardware-aware model optimization (e.g., kernel fusion, mixed precision, quantization, pruning).
- Familiarity with edge and embedded AI, including real-time constraints and limited-resource optimization.
- Contributions to widely used AI frameworks, libraries, or performance-critical software (open source or proprietary).
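One of the optimization techniques named above, KV-cache management, is easy to sketch: instead of recomputing attention keys and values for the whole prefix at every decode step, cache them and append only the new token's entries. A minimal single-head NumPy toy (all weights and tokens here are random stand-ins, not a real model):

```python
import numpy as np

def attention(q, K, V):
    """Single-query scaled dot-product attention over cached keys/values."""
    scores = K @ q / np.sqrt(q.shape[0])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

d = 8
rng = np.random.default_rng(0)
Wk, Wv, Wq = (rng.standard_normal((d, d)) for _ in range(3))

# KV cache: grows by one row per decoded token instead of being rebuilt.
K_cache = np.empty((0, d))
V_cache = np.empty((0, d))

tokens = rng.standard_normal((5, d))  # stand-in for embedded tokens
outputs = []
for x in tokens:
    # Only the new token's key/value are computed; the prefix is reused.
    K_cache = np.vstack([K_cache, x @ Wk])
    V_cache = np.vstack([V_cache, x @ Wv])
    outputs.append(attention(x @ Wq, K_cache, V_cache))

# Without the cache, step t would recompute t key/value projections,
# making decoding cost quadratic in sequence length instead of linear.
```

Production systems layer paged allocation, prefix sharing, and eviction policies on top of this basic idea.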
Estimated Max Rate: $300,000.00
What's In It for You?
We welcome you to join one of the largest and most established global staffing companies and meet your career aspirations. Yoh's network of client companies has been employing professionals like you for over 65 years in the U.S., UK, and Canada. Joining Yoh's extensive talent community gives you access to Yoh's vast network of opportunities, including this exclusive one. Benefit eligibility is in accordance with applicable laws and client requirements. Benefits include:
- Medical, Prescription, Dental & Vision Benefits (for employees working 20+ hours per week)
- Health Savings Account (HSA) (for employees working 20+ hours per week)
- Life & Disability Insurance (for employees working 20+ hours per week)
- MetLife Voluntary Benefits
- Employee Assistance Program (EAP)
- 401K Retirement Savings Plan
- Direct Deposit & weekly epayroll
- Referral Bonus Programs
- Certification and training opportunities
Note: Any pay ranges displayed are estimations. Actual pay is determined by an applicant's experience, technical expertise, and other qualifications as listed in the job description. All qualified applicants are welcome to apply.
Yoh, a Day & Zimmermann company, is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.
Contact us if you are an individual with a disability and require accommodation in the application process.
For California applicants, qualified applicants with arrest or conviction records will be considered for employment in accordance with the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act. All of the material job duties described in this posting are job duties for which a criminal history may have a direct, adverse, and negative relationship potentially resulting in the withdrawal of a conditional offer of employment.
By applying and submitting your resume, you authorize Yoh to review and reformat your resume to meet Yoh's hiring clients' preferences. To learn more about Yoh's privacy practices, please see our Candidate Privacy Notice:
AI Infrastructure Engineer
Posted 3 days ago
Job Description
LifeMD is a leading digital healthcare company committed to expanding access to virtual care, pharmacy services, and diagnostics by making them more affordable and convenient for all. Focused on both treatment and prevention, our unique care model is designed to optimize the patient experience and improve outcomes across more than 200 health concerns.
To support our expanding patient base, LifeMD leverages a vertically-integrated, proprietary digital care platform, a 50-state affiliated medical group, a 22,500-square-foot affiliated pharmacy, and a U.S.-based patient care center. Our company, with offices in New York City; Greenville, SC; and Huntington Beach, CA, is powered by a dynamic team of passionate professionals. From clinicians and technologists to creatives and analysts, we're united by a shared mission to revolutionize healthcare. Employees enjoy a collaborative and inclusive work environment, hybrid work culture, and numerous opportunities for growth. Want your work to matter? Join us in building a future of accessible, innovative, and compassionate care.
We're looking for an AI Infrastructure Engineer to own and maintain our AI infrastructure on GCP, which powers our voice assistants, message routing, and knowledge retrieval systems. This role will manage the technical backbone that enables our AI services to operate safely and effectively in healthcare settings. You will be the technical steward of our AI operations infrastructure and a partner to clinical and product stakeholders. Your job is to build new AI capabilities, maintain existing systems, and ensure our AI services meet LifeMD's standards. This role will operate across our entire AI stack: from service development and deployment to testing frameworks to production monitoring systems.
Responsibilities
- Own and maintain the AI infrastructure, including deployment pipelines, testing frameworks, and evaluation systems
- Build new AI capabilities and use cases while maintaining existing services and systems
- Develop and maintain AI evaluation models for regression testing
- Implement monitoring and alerting systems for AI service performance, cost management, and safety compliance
- Collaborate with engineering teams to integrate AI testing into deployment pipelines and ensure proper governance
- Manage BigQuery datasets, data pipelines, and cloud infrastructure supporting AI operations
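The "AI evaluation models for regression testing" responsibility above can be pictured as a rollout gate: replay a golden set of cases against the candidate model and block the deploy if the pass rate drops. A deliberately simple keyword-based sketch (all case names and thresholds are illustrative; a production evaluator would use a judge model or structured assertions):

```python
def passes_regression(responses, golden, min_pass_rate=0.9):
    """Gate a model rollout: each golden case lists phrases the response
    must contain; the rollout is blocked if too many cases regress."""
    passed = 0
    for case_id, required_phrases in golden.items():
        text = responses.get(case_id, "").lower()
        if all(p.lower() in text for p in required_phrases):
            passed += 1
    rate = passed / len(golden)
    return rate >= min_pass_rate, rate

# Hypothetical golden set for a healthcare assistant.
golden = {
    "refill-policy": ["contact your provider", "refill"],
    "triage-chest-pain": ["911", "emergency"],
}
responses = {
    "refill-policy": "For a refill, please contact your provider first.",
    "triage-chest-pain": "Chest pain can be an emergency; call 911 now.",
}
ok, rate = passes_regression(responses, golden)  # -> (True, 1.0)
```

Wiring a check like this into the deployment pipeline turns safety expectations into an automated, repeatable gate rather than a manual review.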
AI Infrastructure Engineer
Posted 3 days ago
Job Description
At HeyGen, our mission is to make visual storytelling accessible to all. Over the last decade, visual content has become the preferred method of information creation, consumption, and retention. But the ability to create such content, in particular videos, continues to be costly and challenging to scale. Our ambition is to build technology that equips more people with the power to reach, captivate, and inspire audiences.
Learn more at our website and in our Mission and Culture doc.
About HeyGen
HeyGen stands at the forefront of cutting-edge AI-powered platforms, revolutionizing the realm of video creation.
Position Summary:
At HeyGen, we are at the forefront of developing applications powered by our cutting-edge AI research. As an AI Infrastructure Engineer, you will lead the development of fundamental AI systems and infrastructure. These systems are essential for powering our innovative applications, including Photo Avatar, Instant Avatar, Streaming Avatar, and Video Translation. Your role will be crucial in enhancing the efficiency and scalability of these systems, which are vital to HeyGen's success.
Key Responsibilities:
- Design, build, and maintain the AI infrastructure and systems needed to support our AI applications. Examples include:
- AI workflow scheduling system to improve GPU efficiency and throughput of our batch inference systems
- Model optimization to improve inference performance
- Auto Train systems to power our avatar models
- Large scale model evaluation systems
- Online model serving systems
- Collaborate with data scientists and machine learning engineers to understand their computational and data needs and provide efficient solutions.
- Stay up-to-date with the latest industry trends in AI infrastructure technologies and advocate for best practices and continuous improvement.
- Assist in budget planning and management of cloud resources and other infrastructure expenses.
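The first responsibility above, scheduling batch inference work for GPU efficiency, usually starts with dynamic batching: group incoming requests into a batch up to a size cap, waiting briefly for stragglers. A minimal stdlib sketch (the batch size and deadline values are illustrative):

```python
import time
from queue import Queue, Empty

def collect_batch(q, max_batch=8, max_wait_s=0.01):
    """Drain up to max_batch requests, waiting at most max_wait_s for
    stragglers. Larger batches raise GPU utilization; the deadline
    bounds the extra latency paid by the first request in the batch."""
    batch = [q.get()]  # block for the first request
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(q.get(timeout=remaining))
        except Empty:
            break
    return batch

q = Queue()
for i in range(20):
    q.put(f"req-{i}")

batches = []
while not q.empty():
    batches.append(collect_batch(q))
# 20 queued requests are grouped into batches of sizes 8, 8, and 4.
```

Serving frameworks add continuous batching, per-request priorities, and preemption on top, but the size-or-deadline tradeoff is the core of the idea.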
Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- 5+ years of experience
- Proven experience in managing infrastructure for large-scale AI or machine learning projects
- Excellent problem-solving skills and the ability to work independently or as part of a team.
- Proficiency in Python and C++
- Experience with GPU computing and optimizing computational workflows
- Familiarity with AI and machine learning frameworks like TensorFlow or PyTorch.
- Experience with CUDA
- Experience optimizing large deep learning model performance
- Experience building large-scale batch inference systems
- Prior experience in a startup or fast-paced tech environment.
What We Offer:
- Competitive salary and benefits package.
- Dynamic and inclusive work environment.
- Opportunities for professional growth and advancement.
- Collaborative culture that values innovation and creativity.
- Access to the latest technologies and tools.
HeyGen is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
AI Infrastructure Engineer
Posted 3 days ago
Job Description
Primary Location: Remote
Must be local to one of the following locations: San Francisco, CA; Boston, MA; Chicago, IL; Portland, ME
V-Soft Consulting is currently hiring for an AI Infrastructure Engineer (Remote) for our premier client.
Education and Experience
Must-Have Skills:
- Senior-level experience building, training, and deploying ML models.
- Python.
- A combination of Pandas, NumPy, PyTorch, and TensorFlow.
- Experience with transformer models - specifically GPT (Generative Pre-trained Transformer).
- Cloud experience - AWS, Azure, or GCP.
- CI/CD pipeline implementation.
Nice to Have Skills:
- Kubernetes and Docker.
WHAT YOU'LL DO:
Job Responsibilities:
- You will work on both teams/projects.
- Responsible for assisting in the design and implementation of ML models (you need to be a doer in addition to a designer).
- Should be able to look at the product and connect the business need with the technology capabilities.
- The group is split into different teams: some focus on GenAI and ML, others on IaC.
- Collaborate with operational teams, Product Owners, Product engineering, and leadership on business needs and enabling various AI features.
Projects:
- Building two horizontal AI platforms:
- MLOps - using predictive models for detecting fraudulent transactions and determining credit lines/amounts for small business loans.
- GenAI - claims processing: upload documentation, have AI read and analyze it, auto-populate the claims form, and launch validation/classification for claims; auto approvals.
Interested?
Qualified candidates should send their resumes to
Recognized among the top 100 fastest-growing staffing companies in North America, V-Soft Consulting Group is headquartered in Louisville, KY, with strategic locations in India, Canada, and the U.S. V-Soft is known as an agile, innovative technology services company holding several awards and distinctions, and it has a wide variety of partnerships across diverse technology stacks.
As a valued V-Soft Consultant, you're eligible for full benefits (Medical, Dental, Vision), a 401(k) plan, competitive compensation, and more. V-Soft is partnered with numerous Fortune 500 companies and is exceptionally positioned to advance your career growth.
V-Soft Consulting provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
For more information or to view all our open jobs, please visit or call .
#LI-SR3
AI Infrastructure Engineer
Posted 3 days ago
Job Description
About NIO
NIO is a pioneer and a leading company in the premium smart electric vehicle market. Founded in November 2014, NIO's mission is to shape a joyful lifestyle. NIO aims to build a community starting with smart electric vehicles to share joy and grow together with users.
NIO designs, develops, jointly manufactures and sells premium smart electric vehicles, driving innovations in next-generation technologies in autonomous driving, digital technologies, electric powertrains and batteries. NIO differentiates itself through its continuous technological breakthroughs and innovations, such as its industry-leading battery swapping technologies, Battery as a Service, or BaaS, as well as its proprietary autonomous driving technologies and Autonomous Driving as a Service, or ADaaS.
NIO's product portfolio consists of the ES8, a six-seater smart electric flagship SUV, the ES7 (or the EL7), a mid-large five-seater smart electric SUV, the ES6, a five-seater all-round smart electric SUV, the EC7, a five-seater smart electric flagship coupe SUV, the EC6, a five-seater smart electric coupe SUV, the ET7, a smart electric flagship sedan, and the ET5, a mid-size smart electric sedan.
About the Position
We are looking for a senior AI Inference Infrastructure Software Engineer with strong hands-on experience building, optimizing, and deploying high-performance, scalable inference systems. This position is focused on designing, implementing, and delivering production-grade software that powers real-world applications of Large Language Models (LLMs) and Vision-Language Models (VLMs).
This is an exciting opportunity for an engineer who thrives at the intersection of AI systems, hardware acceleration, and large-scale robust deployment, and who wants to see their contributions ship in production, at scale.
In this role, you will directly shape the architecture, roadmap, and performance of the AI capabilities of our AIOS platform, driving innovations that make LLM/VLM systems fast, efficient, and scalable across cloud, edge, and hybrid edge-cloud environments. You will work closely with system, hardware, and product teams to deliver high-performance inference kernels for hardware accelerators, design scalable inference serving systems, and integrate optimizations such as tensor parallelism and custom kernels into production pipelines. Your work will have immediate impact, powering intelligent automotive systems in the next generation of electric vehicles.
Roles and Responsibilities:
- Design and implement high-performance, scalable inference systems for LLMs and VLMs across cloud, edge, and edge-cloud hybrid platforms.
- Develop and optimize custom kernels and operators for specific hardware accelerators (GPU, NPU, DSP, etc.), improving throughput, latency, and memory efficiency.
- Integrate advanced optimization techniques such as KV-cache management, tensor/model parallelism, quantization, and memory-efficient execution into production inference systems.
- Partner with system and hardware teams to ensure tight hardware-software integration and optimal performance across diverse compute environments.
- Translate architectural requirements into robust, maintainable, production-ready software that meets performance, safety, and reliability standards.
- Define and drive the evolution roadmap for LLM/VLM inference in the AIOS stack, ensuring scalability and adaptability to new workloads.
- Stay ahead of industry trends and competitor solutions, applying best practices from both AI and large-scale systems engineering.
Qualifications:
- 5+ years of hands-on software development experience in building and optimizing AI inference systems at scale.
- Direct experience in LLM/VLM model internals, including Transformer-based architectures, inference bottlenecks, and optimization techniques.
- Strong expertise in performance engineering: kernel development, parallelism strategies, memory optimization, and distributed inference systems.
- Proficiency with GPU/NPU programming (CUDA, or vendor-specific SDKs), compiler toolchains, and deep learning frameworks (PyTorch, or TensorFlow).
- Strong programming skills in C/C++, with a track record of delivering high-performance, production-grade software.
- Solid foundation in computer architecture, systems programming (CPU/GPU pipelines, memory hierarchy, scheduling), and embedded systems.
- BS/MS in Computer Science, Computer Engineering, or related technical field.
- Excellent communication and collaboration skills, with the ability to work across cross-functional teams.
Preferred Qualifications:
- Master's or PhD degree in Computer Science, Electrical/Computer Engineering, or related fields, plus 5 years industry experience
- Experience building inference serving systems for large models, including batching, scheduling, caching, and load balancing.
- Expertise in hardware-aware model optimization (e.g., kernel fusion, mixed precision, quantization, pruning).
- Familiarity with edge and embedded AI, including real-time constraints and limited-resource optimization.
- Contributions to widely used AI frameworks, libraries, or performance-critical software (open source or proprietary).
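Among the hardware-aware optimizations listed above, quantization is the simplest to show concretely: store weights as int8 plus a single float scale, cutting memory roughly 4x versus float32 and enabling integer matmul kernels. A minimal symmetric per-tensor NumPy sketch (real deployments use per-channel scales, calibration, and framework-specific kernels):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: int8 weights + one scale.
    The cost is a bounded rounding error of at most half a step."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
q, scale = quantize_int8(w)

# Round-trip error is bounded by half a quantization step.
err = np.abs(dequantize(q, scale) - w).max()
assert err <= scale / 2 + 1e-6
```

The same recipe extends to activations (with runtime calibration) and mixes with techniques like kernel fusion and pruning mentioned in the posting.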
Compensation:
The US base salary range for this full-time position is $192,100.00 - $49,600.00.
- Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training.
- Please note that the compensation details listed in US role postings reflect the base salary only. It does not include discretionary bonus, equity, or benefits.
Benefits:
Along with competitive pay, as a full-time NIO employee, you are eligible for the following benefits on the first day you join NIO:
- CIGNA EPO, HSA, and Kaiser HMO medical plans with $0 for Employee Only Coverage.
- Dental (including orthodontic coverage) and vision plans. Both provide options with a $0 paycheck contribution covering you and your eligible dependents.
- Company Paid HSA (Health Savings Account) Contribution when enrolled in the High Deductible CIGNA medical plan
- Healthcare and Dependent Care Flexible Spending Accounts (FSA)
- 401(k) with Brokerage Link option
- Company paid Basic Life, AD&D, short-term and long-term disability insurance
- Employee Assistance Program
- Sick and Vacation time
- 13 Paid Holidays a year
- Paid Parental Leave for the first 8 weeks at full pay (eligible after 90 days of employment with NIO)
- Paid Disability Leave for the first 6 weeks at full pay (eligible after 90 days of employment with NIO)
- Voluntary benefits including: Voluntary Life and AD&D options for you, your spouse/domestic partner and dependent child(ren), pet insurance
- Commuter benefits
- Mobile Cell Phone Credit
- Healthjoy mobile benefit app supporting you and your dependents with benefit questions on the go & support with benefit billing questions
- Free lunch and snacks
- Onsite gym
- Employee discounts and perks program
AI Infrastructure Engineer, Agents
Posted 3 days ago
Job Description
As a Software Engineer on the ML Infrastructure team, you will design and build our agent sandboxing platform: the secure, high-performance code execution layer powering our agentic workflows. This system underpins critical applications and research initiatives, and is deployed across both internal and customer-managed environments.
This position requires deep expertise in systems engineering: operating systems, virtualization, networking, containers, and performance optimization. Your work will directly enable agents to execute untrusted or user-submitted code safely, efficiently, and repeatedly, and with fast startup times, strong isolation guarantees, and support for snapshotting and inspection.
You will:
- Design and build the sandboxing platform for code execution across containerized and virtualized environments.
- Ensure strong isolation, security, and reproducibility of execution across user sessions and workloads.
- Optimize for cold-start latency, memory footprint, and resource utilization at scale.
- Collaborate across security, infra, and product teams to support both internal research use cases and enterprise customer deployments.
- Lead architecture reviews and own projects from design through deployment in fast-paced, cross-functional settings.
Requirements:
- 3+ years of experience building high-performance systems software (e.g., an OS, container runtime, VMM, or networking stack).
- Deep understanding of Linux internals, process isolation, memory management, cgroups, namespaces, etc.
- Experience with containerization and virtualization technologies (e.g., Docker, Firecracker, gVisor, QEMU, Kata Containers).
- Proficiency in a systems programming language such as Go, Rust, or C/C++.
- Familiarity with networking, security hardening, sandboxing techniques, and kernel-level performance tuning.
- Comfort working across infrastructure layers, from kernel modules to orchestration frameworks (e.g., Kubernetes).
- Strong debugging skills and the ability to make performance/security tradeoffs in production systems.
Nice to have:
- Familiarity with LLM agents and agent frameworks (e.g., OpenHands, Agent2Agent, MCP).
- Experience running secure workloads in multi-tenant or untrusted environments (e.g., FaaS, CI sandboxes, remote notebooks).
- Exposure to snapshotting and restore techniques (e.g., CRIU, VM snapshots, overlayfs).
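The isolation requirements above (cgroups, namespaces, microVMs) are heavyweight machinery, but the core contract is simple: run untrusted code in a separate process with hard resource limits. A Linux-oriented toy sketch using only the Python standard library; the limit values are illustrative, and a real sandbox would add namespaces, seccomp filters, and filesystem isolation on top:

```python
import resource
import subprocess
import sys

def run_untrusted(code, timeout_s=5.0, mem_bytes=512 * 1024 * 1024):
    """Run a Python snippet in a child process under CPU, memory, and
    wall-clock limits. A toy stand-in for the much stronger isolation a
    production sandbox gets from namespaces, cgroups, or microVMs."""
    def apply_limits():
        # Cap address space: allocations past the limit raise MemoryError.
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
        # Cap CPU time: the kernel kills the child after ~1 CPU-second.
        resource.setrlimit(resource.RLIMIT_CPU, (1, 1))

    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True,
        timeout=timeout_s,        # wall-clock guard
        preexec_fn=apply_limits,  # applied in the child before exec
    )
    return proc.returncode, proc.stdout

rc, out = run_untrusted("print(2 + 2)")         # well-behaved snippet
rc_spin, _ = run_untrusted("while True: pass")  # CPU hog, gets killed
```

The well-behaved snippet exits cleanly while the spinner is terminated by the CPU limit; technologies like Firecracker or gVisor enforce the same contract at the kernel or VMM boundary instead of via rlimits.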
Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position, determined by work location and additional factors, including job-related skills, experience, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for equity grant. You'll also receive benefits including, but not limited to: Comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend.
Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in San Francisco, New York, and Seattle is $156,000 - $225,600 USD.
PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants.
About Us:
At Scale, our mission is to develop reliable AI systems for the world's most important decisions. Our products provide the high-quality data and full-stack technologies that power the world's leading models, and help enterprises and governments build, deploy, and oversee AI applications that deliver real impact. We work closely with industry leaders like Meta, Cisco, DLA Piper, Mayo Clinic, Time Inc., the Government of Qatar, and U.S. government agencies including the Army and Air Force. We are expanding our team to accelerate the development of AI applications.
We believe that everyone should be able to bring their whole selves to work, which is why we are proud to be an inclusive and equal opportunity workplace. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability status, gender identity or Veteran status.
We are committed to working with and providing reasonable accommodations to applicants with physical and mental disabilities. If you need assistance and/or a reasonable accommodation in the application or recruiting process due to a disability, please contact us. Please see the United States Department of Labor's Know Your Rights poster for additional information.
We comply with the United States Department of Labor's Pay Transparency provision.
PLEASE NOTE: We collect, retain and use personal data for our professional business purposes, including notifying you of job opportunities that may be of interest and sharing with our affiliates. We limit the personal data we collect to that which we believe is appropriate and necessary to manage applicants' needs, provide our services, and comply with applicable laws. Any information we collect in connection with your application will be treated in accordance with our internal policies and programs designed to protect personal data. Please see our privacy policy for additional information.
AI Infrastructure Engineer - Autonomy
Posted 3 days ago
Job Description
Applied Intuition is the vehicle intelligence company that accelerates the global adoption of safe, AI-driven machines. Founded in 2017, Applied Intuition delivers the toolchain, Vehicle OS, and autonomy stacks to help customers build intelligent vehicles and shorten time to market. Eighteen of the top 20 global automakers and major programs across the Department of Defense trust Applied Intuition's solutions to deliver vehicle intelligence. Applied Intuition services the automotive, defense, trucking, construction, mining, and agriculture industries and is headquartered in Mountain View, CA, with offices in Washington, D.C., San Diego, CA, Ft. Walton Beach, FL, Ann Arbor, MI, London, Stuttgart, Munich, Stockholm, Seoul, and Tokyo. Learn more at appliedintuition.com.
We are an in-office company, and our expectation is that employees primarily work from their Applied Intuition office 5 days a week. However, we also recognize the importance of flexibility and trust our employees to manage their schedules responsibly. This may include occasional remote work, starting the day with morning meetings from home before heading to the office, or leaving earlier when needed to accommodate family commitments. (Note: For EpiSci job openings, fully remote work will be considered by exception.)
About the role
We are looking for both infrastructure engineers with expertise in machine learning pipelines and ML engineers who want to work beyond modeling to join our AI Infrastructure group. This role spans the entire AI lifecycle (dataset generation, training frameworks, compute, evaluation, and deployment) and works directly with modeling teams. This team is a good fit if you are excited to work on broad, ambiguous problems and develop across the entire ML stack. At Applied Intuition, we encourage all engineers to take ownership of technical and product decisions, closely interact with external and internal users to collect feedback, and contribute to a thoughtful, dynamic team culture.
At Applied Intuition, you will:
- Design and build training, inference, and evaluation infrastructure to support our current autonomy stack development, orchestrating massive GPU clusters to process petabytes of multimodal sensor data
- Optimize multimodal data ingestion and preprocessing pipelines (LiDAR, camera, radar, map priors) to support cutting-edge perception and planning model development
- Work across cloud environments to support high-throughput distributed training
- Collaborate closely with the AI research team and autonomy teams
- Technologies: PyTorch, CUDA, Ray, Flyte, K8s
We're looking for:
- Experience building software components to address production, full-stack machine learning challenges
- Opinions about building a company-wide platform for ML training, evaluation, and deployment
- Knowledge of the open source landscape with judgment on when to choose open source versus build in-house
- Excellent analytical and problem-solving skills
Compensation at Applied Intuition for eligible roles includes base salary, equity, and benefits. Base salary is a single component of the total compensation package, which may also include equity in the form of options and/or restricted stock units, comprehensive health, dental, vision, life and disability insurance coverage, 401k retirement benefits with employer match, learning and wellness stipends, and paid time off. Note that benefits are subject to change and may vary based on jurisdiction of employment.
Applied Intuition pay ranges reflect the minimum and maximum intended target base salary for new hire salaries for the position. The actual base salary offered to a successful candidate will additionally be influenced by a variety of factors including experience, credentials & certifications, educational attainment, skill level requirements, interview performance, and the level and scope of the position.
Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in the location listed is: $153,000 - $222,000 USD annually.
Don't meet every single requirement? If you're excited about this role but your past experience doesn't align perfectly with every qualification in the job description, we encourage you to apply anyway. You may be just the right candidate for this or other roles.
Applied Intuition is an equal opportunity employer and federal contractor or subcontractor. Consequently, the parties agree that, as applicable, they will abide by the requirements of 41 CFR 60-1.4(a), 41 CFR 60-300.5(a) and 41 CFR 60-741.5(a) and that these laws are incorporated herein by reference. These regulations prohibit discrimination against qualified individuals based on their status as protected veterans or individuals with disabilities, and prohibit discrimination against all individuals based on their race, color, religion, sex, sexual orientation, gender identity or national origin. These regulations require that covered prime contractors and subcontractors take affirmative action to employ and advance in employment individuals without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status or disability. The parties also agree that, as applicable, they will abide by the requirements of Executive Order 13496 (29 CFR Part 471, Appendix A to Subpart A), relating to the notice of employee rights under federal labor laws.
AI Infrastructure Engineer - PlayerZero
Posted 3 days ago
Job Description
A stealth-stage AI infrastructure company is building a self-healing system for software that automates defect resolution and development. The platform is used by engineering and support teams to:
- Autonomously debug problems in production software
- Fix issues directly in the codebase
- Prevent recurring issues through intelligent root-cause automation
We believe that as software development accelerates, the burden of maintaining quality and reliability shifts heavily onto engineering and support teams. This challenge creates a rare opportunity to reimagine how software is supported and sustained, with AI-powered systems that respond autonomously.
About the Role
We're looking for an experienced backend/infrastructure engineer who thrives at the intersection of systems and AI - and who loves turning research prototypes into rock-solid production services. You'll design and scale the core backend that powers our AI inference stack - from ingestion pipelines and feature stores to GPU orchestration and vector search.
If you care deeply about performance, correctness, observability, and fast iteration, you'll fit right in.
What You'll Do
- Own mission-critical services end-to-end - from architecture and design reviews to deployment, observability, and service-level objectives.
- Scale LLM-driven systems : build RAG pipelines, vector indexes, and evaluation frameworks handling billions of events per day.
- Design data-heavy backends : streaming ETL, columnar storage, time-series analytics - all fueling the self-healing loop.
- Optimize for cost and latency across compute types (CPUs, GPUs, serverless); profile hot paths and squeeze out milliseconds.
- Drive reliability : implement automated testing, chaos engineering, and progressive rollout strategies for new models.
- Work cross-functionally with ML researchers, product engineers, and real customers to build infrastructure that actually matters.
About You
- Have 2-5+ years of experience building scalable backend or infra systems in production environments
- Bring a builder mindset - you like owning projects end-to-end and thinking deeply about data, scale, and maintainability
- Have transitioned ML or data-heavy prototypes to production , balancing speed and robustness
- Are comfortable with data engineering workflows : parsing, transforming, indexing, and querying structured or unstructured data
- Have some exposure to search infrastructure or LLM-backed systems (e.g., document retrieval, RAG, semantic search)
Nice to Have
- Experience with vector databases (e.g., pgvector, Pinecone, Weaviate) or inverted-index search (e.g., Elasticsearch, Lucene)
- Hands-on with GPU orchestration (Kubernetes, Ray, KServe) or model-parallel inference tuning
- Familiarity with Go / Rust (primary stack), with some TypeScript for light full-stack tasks
- Deep knowledge of observability tooling (OpenTelemetry, Grafana, Datadog) and profiling distributed systems
- Contributions to open-source ML or systems infrastructure projects
AI Infrastructure Engineer, Agents
Posted 3 days ago
Job Description
As a Software Engineer on the ML Infrastructure team, you will design and build our agent sandboxing platform: the secure, high-performance code execution layer powering our agentic workflows. This system underpins critical applications and research initiatives, and is deployed across both internal and customer-managed environments.
This position requires deep expertise in systems engineering: operating systems, virtualization, networking, containers, and performance optimization. Your work will directly enable agents to execute untrusted or user-submitted code safely, efficiently, and repeatably, with fast startup times, strong isolation guarantees, and support for snapshotting and inspection.
You will:
- Design and build the sandboxing platform for code execution across containerized and virtualized environments.
- Ensure strong isolation, security, and reproducibility of execution across user sessions and workloads.
- Optimize for cold-start latency, memory footprint, and resource utilization at scale.
- Collaborate across security, infra, and product teams to support both internal research use cases and enterprise customer deployments.
- Lead architecture reviews and own projects from design through deployment in fast-paced, cross-functional settings.
Ideally you'd have:
- 3+ years of experience building high-performance systems software (e.g., OS, container runtime, VMM, networking stack).
- Deep understanding of Linux internals, process isolation, memory management, cgroups, namespaces, etc.
- Experience with containerization and virtualization technologies (e.g., Docker, Firecracker, gVisor, QEMU, Kata Containers).
- Proficiency in a systems programming language such as Go, Rust, or C/C++.
- Familiarity with networking, security hardening, sandboxing techniques, and kernel-level performance tuning.
- Comfort working across infrastructure layers, from kernel modules to orchestration frameworks (e.g., Kubernetes).
- Strong debugging skills and the ability to make performance/security tradeoffs in production systems.
Nice to haves:
- Familiarity with LLM agents and agent frameworks (e.g., OpenHands, Agent2Agent, MCP).
- Experience running secure workloads in multi-tenant or untrusted environments (e.g., FaaS, CI sandboxes, remote notebooks).
- Exposure to snapshotting and restore techniques (e.g., CRIU, VM snapshots, overlayfs).
Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position, determined by work location and additional factors, including job-related skills, experience, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for equity grant. You'll also receive benefits including, but not limited to: Comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend.
Please reference the job posting's subtitle for where this position will be located. For pay transparency purposes, the base salary range for this full-time position in the locations of San Francisco, New York, Seattle is:
$156,000-$225,600 USD
PLEASE NOTE: Our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants.
About Us:
At Scale, our mission is to develop reliable AI systems for the world's most important decisions. Our products provide the high-quality data and full-stack technologies that power the world's leading models, and help enterprises and governments build, deploy, and oversee AI applications that deliver real impact. We work closely with industry leaders like Meta, Cisco, DLA Piper, Mayo Clinic, Time Inc., the Government of Qatar, and U.S. government agencies including the Army and Air Force. We are expanding our team to accelerate the development of AI applications.