facebook/KernelLLM

Zacharias030 committed 15 days ago

Commit 86d9bc4 · verified · 1 Parent(s): 1019e2b

Update README.md

Files changed (1)

README.md +1 -1

@@ -49,7 +49,7 @@ Our 8B parameter model achieves competitive or superior performance compared to
 The resulting model is competitive with state-of-the-art LLMs despite its small size. We evaluate our model on KernelBench, an open-source benchmark that evaluates the ability of LLMs to write efficient GPU kernels. It contains 250 selected PyTorch modules organized into difficulty levels, from single torch operators such as Conv2D or Swish (level 1) to full model architectures (level 3). The benchmark measures both correctness (by comparing against reference PyTorch outputs) and performance (by measuring speedup over baseline implementations). We implemented a new KernelBench-Triton variant that evaluates an LLM's ability to generate Triton kernels, making it an ideal benchmark for evaluating KernelLLM's capabilities. All our measurements were done on NVIDIA H100 GPUs.
 ![pass at k analysis plot](media/kernelllm_pass_at_k_scaling.png)
+*KernelLLM shows quasi log-linear scaling behavior during pass@k analysis.*
 For more information, please see [Project Popcorn](https://gpu-mode.github.io/popcorn/).
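
The added caption refers to pass@k, but the README does not state how that metric was computed. As a minimal sketch only, the snippet below uses the commonly used unbiased estimator, which for `n` generated samples of which `c` are correct gives `1 - C(n-c, k) / C(n, k)`. The function name `pass_at_k` and the example numbers are hypothetical and are not taken from the KernelLLM evaluation.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n samples, c of which passed the correctness check.

    Assumption: this is the standard unbiased estimator; the README does not
    confirm that KernelLLM's evaluation used exactly this formula.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) counts the size-k subsets that contain no correct sample;
    # it is 0 whenever fewer than k incorrect samples exist.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical numbers for illustration: 20 samples for one task, 3 correct.
scores = {k: pass_at_k(20, 3, k) for k in (1, 5, 10, 20)}
```

Under this reading, "quasi log-linear scaling" would mean that the task-averaged score grows roughly linearly in `log k` as more samples are drawn.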