Math PhD + NVIDIA NIM/CUDA Engineer
Description
Our team is tackling high-complexity challenges in computational modeling and GenAI-driven analytics. We are building a next-generation acceleration layer for predictive simulations, which requires a blend of deep theoretical knowledge and low-level GPU optimization. You will be responsible for ensuring that our mathematical models don't just work, but run at the theoretical limits of the available hardware.
Requirements
- PhD in Computer Science, Physics, Mathematics, or a related quantitative field, with a focus on high-performance computing or numerical methods
- 3+ years of experience in Python engineering (mid/senior level), with a deep understanding of asynchronous programming and system architecture
- 1+ year of hands-on experience with NVIDIA NIM and Triton Inference Server for deploying optimized LLMs or specialized AI models
- Strong proficiency in CUDA C++ and CuPy for developing and accelerating custom GPU kernels and parallel algorithms
- Proven track record of translating complex theoretical papers and models into production-ready, GPU-accelerated code
Job responsibilities
- Design and architect high-performance inference pipelines using NVIDIA NIM to serve LLMs and custom generative models at scale
- Develop and optimize custom GPU-accelerated operators using CuPy and raw CUDA kernels to bypass CPU bottlenecks in mathematical computations
- Profile and debug GPU memory utilization and compute kernels using NVIDIA Nsight Systems and Nsight Compute to hit aggressive latency targets
- Bridge the gap between research-grade prototypes and production systems, ensuring code is modular, tested, and scalable
- Implement GPU-efficient data structures for real-time processing of large-scale industrial or scientific datasets
- Collaborate with cross-functional teams to integrate specialized AI microservices into broader cloud-native architectures (Kubernetes/Azure)
- Stay at the forefront of GPU computing, evaluating new NVIDIA hardware features (such as the H100 Transformer Engine) for project applicability
Tools
- NVIDIA: NIM, CUDA, CuPy, TensorRT, Triton Inference Server, Nsight, DCGM
- Backend/AI: Python (Expert), FastAPI, PyTorch, NumPy/SciPy, Numba
- Platform: Docker, Kubernetes, Helm, gRPC/REST, Prometheus/Grafana