MCP servers from io.github.RightNow-AI
Turn PyTorch code into fast CUDA/Triton kernels on real datacenter GPUs, with speedups of up to 14x.