Machine Learning Performance Engineer - CUDA Python - REMOTE with Travel

32099 Jacksonville, Florida Simple Solutions

Posted 15 days ago

Job Description

Duration: 6-month contract with the likelihood to extend
Location: Remote, but candidates must be willing to travel to different customer sites.

* Must be willing to travel
* Must have strong pre-sales abilities, i.e. presentation skills, communication skills, etc.
* Must be willing to help train WWT employees and customers

We need very strong technical CUDA Python ML engineers. They can sit anywhere in the US but must be willing to travel 30% of the time. The role is very client-facing, so professionalism and presentation skills are key.

Your part here is optimizing the performance of our models, both training and inference. We care about efficient large-scale training, low-latency inference in real-time systems, and high-throughput inference in research. Part of this is improving straightforward CUDA, but the interesting part needs a whole-systems approach, including storage systems, networking, and host- and GPU-level considerations. Zooming in, we also want to ensure our platform makes sense even at the lowest level: is all that throughput actually goodput? Does loading that vector from the L2 cache really take that long?

* An understanding of modern ML techniques and toolsets
* The experience and systems knowledge required to debug a training run's performance end to end
* Low-level GPU knowledge of PTX, SASS, warps, cooperative groups, Tensor Cores, and the memory hierarchy
* Debugging and optimization experience using tools like CUDA-GDB, Nsight Systems, and Nsight Compute
* Library knowledge of Triton, CUTLASS, CUB, Thrust, cuDNN, and cuBLAS
* Intuition about the latency and throughput characteristics of CUDA graph launch, Tensor Core arithmetic, warp-level synchronization, and asynchronous memory loads
* Background in InfiniBand, RoCE, GPUDirect, PXN, rail optimization, and NVLink, and how to use these networking technologies to link up GPU clusters
* An understanding of the collective algorithms supporting distributed GPU training in NCCL or MPI
* An inventive approach and the willingness to ask hard questions about whether we're taking the right approaches and using the right tools