Derived from job-description analysis by Serendipath's career intelligence engine.
Original posting from Fintal Partners via LinkedIn
A market-leading high-frequency trading firm is seeking Senior Machine Learning Engineers to join a specialist performance engineering team focused on low-level optimization for large-scale AI workloads.
This role is heavily focused on GPU performance, CUDA kernel optimization, and systems-level acceleration work later in the ML pipeline. The team works on extracting maximum performance from modern hardware architectures to support highly demanding training and inference workloads.
You will work close to the metal, optimizing critical components across CUDA, C++, memory management, and GPU execution paths. The work combines deep systems engineering with cutting-edge machine learning infrastructure.
Key responsibilities:
- Develop and optimize CUDA kernels for high-performance ML workloads
- Improve GPU utilization, memory efficiency, and execution performance
- Profile and optimize bottlenecks across training and inference pipelines
- Work on compiler/runtime-level optimizations and kernel fusion strategies
- Collaborate with ML systems and infrastructure teams on end-to-end acceleration
- Build highly optimized C++ components for latency and throughput-sensitive systems
Requirements:
- Strong C++ and CUDA development experience
- Deep understanding of GPU architecture and performance optimization
- Experience profiling and debugging GPU workloads using tools such as Nsight
- Knowledge of PyTorch internals, Triton, NCCL, CUTLASS, or similar frameworks
- Strong systems programming background with focus on performance engineering
- Experience working on high-throughput or low-latency distributed systems
- Computer Science, Mathematics, Physics, Engineering, or related technical degree preferred
This is an opportunity to work on some of the most technically challenging AI infrastructure problems in the industry, within an environment that values engineering excellence, autonomy, and performance.