Benjamin Consolvo committed on
Commit ad676d5 · 1 Parent(s): 7645d86

doc updates 3

Files changed (2)
  1. app.py +1 -1
  2. info/deployment.py +7 -1
app.py CHANGED
@@ -30,7 +30,7 @@ with demo:
  follow the instructions and complete the form in the 🏎️ Submit tab. Models submitted to the leaderboard are evaluated
  on the Intel Developer Cloud ☁️. The evaluation platform consists of Gaudi Accelerators and Xeon CPUs running benchmarks from
  the [Eleuther AI Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness).""")
- gr.Markdown("""![DevHub-image](assets/DevHub_Logo.png) Join 5000+ developers on the [Intel DevHub Discord](https://discord.gg/yNYNxK2k) to get support with your submission and
+ gr.Markdown("""Join 5000+ developers on the [Intel DevHub Discord](https://discord.gg/yNYNxK2k) to get support with your submission and
  talk about everything from GenAI, HPC, to Quantum Computing.""")
  gr.Markdown("""A special shout-out to the 🤗 [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
  team for generously sharing their code and best
info/deployment.py CHANGED
@@ -95,9 +95,11 @@ The Intel® Data Center GPU Max Series is Intel's highest performing, highest de
 
  ### INT4 Inference (GPU) with Intel Extension for Transformers and Intel Extension for Python
  Intel® Extension for Transformers is an innovative toolkit designed to accelerate GenAI/LLM everywhere with the optimal performance of Transformer-based models on various Intel platforms, including Intel Gaudi2, Intel CPU, and Intel GPU.
+
  👍 [Intel Extension for Transformers GitHub](https://github.com/intel/intel-extension-for-transformers)
 
  Intel® Extension for PyTorch* extends PyTorch* with up-to-date features optimizations for an extra performance boost on Intel hardware. Optimizations take advantage of Intel® Advanced Vector Extensions 512 (Intel® AVX-512) Vector Neural Network Instructions (VNNI) and Intel® Advanced Matrix Extensions (Intel® AMX) on Intel CPUs as well as Intel Xe Matrix Extensions (XMX) AI engines on Intel discrete GPUs. Moreover, Intel® Extension for PyTorch* provides easy GPU acceleration for Intel discrete GPUs through the PyTorch* xpu device.
+
  👍 [Intel Extension for PyTorch GitHub](https://github.com/intel/intel-extension-for-pytorch)
 
  ```python
@@ -125,6 +127,7 @@ The Intel® Xeon® CPUs have the most built-in accelerators of any CPU on the ma
 
  ### Optimum Intel and Intel Extension for PyTorch (no quantization)
  🤗 Optimum Intel is the interface between the 🤗 Transformers and Diffusers libraries and the different tools and libraries provided by Intel to accelerate end-to-end pipelines on Intel architectures.
+
  👍 [Optimum Intel GitHub](https://github.com/huggingface/optimum-intel)
 
  Requires installing/updating optimum `pip install --upgrade-strategy eager optimum[ipex]`
@@ -179,6 +182,7 @@ Intel® Core™ Ultra Processors are optimized for premium thin and powerful lap
 
  ### Intel® NPU Acceleration Library
  The Intel® NPU Acceleration Library is a Python library designed to boost the efficiency of your applications by leveraging the power of the Intel Neural Processing Unit (NPU) to perform high-speed computations on compatible hardware.
+
  👍 [Intel NPU Acceleration Library GitHub](https://github.com/intel/intel-npu-acceleration-library)
 
  ```python
@@ -214,6 +218,7 @@ _ = model.generate(**generation_kwargs)
 
  ### OpenVINO Tooling with Optimum Intel
  OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference.
+
  👍 [OpenVINO GitHub](https://github.com/openvinotoolkit/openvino)
 
  ```python
@@ -235,12 +240,13 @@ pipe("In the spring, beautiful flowers bloom...")
  # Intel® Gaudi Accelerators
  The Intel Gaudi 2 accelerator is Intel's most capable deep learning chip. You can learn about Gaudi 2 [here](https://habana.ai/products/gaudi2/).
 
- Habana's SDK, Intel Gaudi Software, supports PyTorch and DeepSpeed for accelerating LLM training and inference.
+ Intel Gaudi Software supports PyTorch and DeepSpeed for accelerating LLM training and inference.
  The Intel Gaudi Software graph compiler will optimize the execution of the operations accumulated in the graph
  (e.g. operator fusion, data layout management, parallelization, pipelining and memory management,
  and graph-level optimizations).
 
  Optimum Habana provides covenient functionality for various tasks. Below is a command line snippet to run inference on Gaudi with meta-llama/Llama-2-7b-hf.
+
  👍[Optimum Habana GitHub](https://github.com/huggingface/optimum-habana)
 
  The "run_generation.py" script below can be found [here on GitHub](https://github.com/huggingface/optimum-habana/tree/main/examples/text-generation)