Design and implement high-intensity stress workloads using PyTorch and Triton to identify performance bottlenecks and improve platform stability and maturity
Roles and Responsibilities:
- Design and implement high-intensity stress workloads using PyTorch and Triton.
- Exercise core MAIA execution paths, including compute, memory, DMA, and collectives.
- Enable early detection of performance cliffs, stability issues, and system bottlenecks across simulator and real hardware.
- Improve platform maturity, reduce late-stage escapes, and increase confidence for broader internal and external adoption.
- Develop PyTorch workloads stressing model-level execution, such as large GEMMs, attention patterns, MoE-like behavior, mixed precision, and long-running loops.
- Author custom Triton kernels to stress hardware execution units, memory hierarchies, and synchronization paths.
- Build parameterized stress harnesses scalable by problem size, number of devices, and runtime duration.
- Integrate workloads with existing profiling, monitoring, and failure triage tooling.
- Collaborate with platform, firmware, and SDK teams to target known risk areas and emerging issues.
- Document usage patterns and provide reproducible scripts for lab and continuous integration (CI) usage.
For applications and inquiries, contact: [email protected]