and run with
!python /content/mamba/benchmarks/benchmark_generation_mamba_simple.py --model-name "mistralai/Mamba-Codestral-7B-v0.1" --prompt "My cat wrote all this CUDA code for a new language model and" --topp 0.9 --temperature 0.7 --repetition-penalty 1.2
/usr/local/lib/python3.10/dist-packages/mamba_ssm/ops/selective_scan_interface.py:164: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
  def forward(ctx, xz, conv1d_weight, conv1d_bias, x_proj_weight, delta_proj_weight,
/usr/local/lib/python3.10/dist-packages/mamba_ssm/ops/selective_scan_interface.py:240: FutureWarning: torch.cuda.amp.custom_bwd(args...) is deprecated. Please use torch.amp.custom_bwd(args..., device_type='cuda') instead.
  def backward(ctx, dout):
/usr/local/lib/python3.10/dist-packages/mamba_ssm/ops/triton/layer_norm.py:986: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
  def forward(
/usr/local/lib/python3.10/dist-packages/mamba_ssm/ops/triton/layer_norm.py:1045: FutureWarning: torch.cuda.amp.custom_bwd(args...) is deprecated. Please use torch.amp.custom_bwd(args..., device_type='cuda') instead.
  def backward(ctx, dout, *args):
/usr/local/lib/python3.10/dist-packages/mamba_ssm/distributed/tensor_parallel.py:26: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
  def forward(ctx, x, weight, bias, process_group=None, sequence_parallel=True):
/usr/local/lib/python3.10/dist-packages/mamba_ssm/distributed/tensor_parallel.py:62: FutureWarning: torch.cuda.amp.custom_bwd(args...) is deprecated. Please use torch.amp.custom_bwd(args..., device_type='cuda') instead.
  def backward(ctx, grad_output):
/usr/local/lib/python3.10/dist-packages/mamba_ssm/ops/triton/ssd_combined.py:758: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
  def forward(ctx, zxbcdt, conv1d_weight, conv1d_bias, dt_bias, A, D, chunk_size, initial_states=None, seq_idx=None, dt_limit=(0.0, float("inf")), return_final_states=False, activation="silu",
/usr/local/lib/python3.10/dist-packages/mamba_ssm/ops/triton/ssd_combined.py:836: FutureWarning: torch.cuda.amp.custom_bwd(args...) is deprecated. Please use torch.amp.custom_bwd(args..., device_type='cuda') instead.
  def backward(ctx, dout, *args):
Loading model mistralai/Mamba-Codestral-7B-v0.1
2024-11-26 02:45:02.180780: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-11-26 02:45:02.202236: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-11-26 02:45:02.208529: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-11-26 02:45:02.224834: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-11-26 02:45:03.636865: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Loading checkpoint shards: 100% 3/3 [01:26<00:00, 28.80s/it]
Number of parameters: 7285403648
[" My cat wrote all this CUDA code for a new language model and I'm trying to run it on my GPU but it fails with the following error:\nbash\nnvcc fatal : Unsupported gencode architecture 'compute_61', sm_37\n\nI have an Nvidia GeForce GTX 980 Ti. The compute capability of that card is sm_52. So why does it fail?\n\nThe problem was that I had installed the latest version (10) of"]
Prompt length: 16, generation length: 100
mistralai/Mamba-Codestral-7B-v0.1 prompt processing + decoding time: 9234ms
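The FutureWarning lines in the output above appear to be raised while mamba_ssm is imported and applies the older torch.cuda.amp.custom_fwd / custom_bwd decorators. They are deprecation notices, not failures, and the run still completes. If the noise is unwanted, Python's standard warning filters can hide that category. A minimal sketch using only the standard library follows; `noisy_import` and `call_quietly` are hypothetical names introduced for illustration and are not part of mamba_ssm or PyTorch:

```python
import warnings


def noisy_import():
    # Hypothetical stand-in for library code that still calls a deprecated API.
    warnings.warn(
        "torch.cuda.amp.custom_fwd(args...) is deprecated.",
        FutureWarning,
    )
    return "loaded"


def call_quietly(func):
    """Run func with FutureWarning suppressed; return (result, recorded warnings)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")                 # record every category...
        warnings.simplefilter("ignore", FutureWarning)  # ...except FutureWarning
        result = func()
    return result, caught


result, caught = call_quietly(noisy_import)
print(result, len(caught))  # prints "loaded 0"
```

For a script started from a notebook cell, as in the command above, the same filter can be set without touching any code by prefixing the command with the standard `PYTHONWARNINGS="ignore::FutureWarning"` environment variable. This hides every FutureWarning from the process, so it is best kept to benchmark runs rather than development.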
!python /content/mamba/benchmarks/benchmark_generation_mamba_simple.py --model-name "mistralai/Mamba-Codestral-7B-v0.1" --prompt "how is ai?" --topp 0.9 --temperature 0.7 --repetition-penalty 1.2 --genlen 512
Loading model mistralai/Mamba-Codestral-7B-v0.1
2024-11-26 03:13:14.757808: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-11-26 03:13:14.799402: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-11-26 03:13:14.811939: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-11-26 03:13:14.850803: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-11-26 03:13:17.430454: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Loading checkpoint shards: 100% 3/3 [01:27<00:00, 29.29s/it]
Number of parameters: 7285403648
[' how is ai?\n\n# How to Use AI in Your Business: 10 Examples of Successful Implementations\n\nArtificial intelligence (AI) has the potential to revolutionize every aspect of business. By automating repetitive tasks, improving decision-making processes, and providing valuable insights into customer behavior, AI can help businesses become more efficient, effective, and competitive. In this article, we'll explore ten examples of successful implementations of AI in various industries. From healthcare to retail, these use cases demonstrate the power of AI to transform businesses and create new opportunities for growth.\n\n## Healthcare Industry\n\n### Predictive Maintenance with IoT Sensors\n\nIn the healthcare industry, predictive maintenance refers to using data from Internet of Things (IoT) sensors to anticipate when equipment or machinery will fail before it breaks down completely. This approach helps prevent unexpected downtime, reduces repair costs, and extends the lifespan of assets. For example. GE Aviion uses sensor data collected by their Turbofan Engine to train machine learning models that detect anomalies weeks before a component fails. The company then sends alerts to engineers who can schedule repairs proactively, minimizing unplanned downtime and maximizing operational efficiency.\n\n### Personalized Medicine\n\nAnother application of AI in healthcare involves personalized medicine. By analyzing genomic data, researchers at Stanford University have developed an algorithm capable of identifying genetic mutations associated with specific diseases such as cancer. This breakthrough could lead to tailored treatments based on individual patient characteristics, potentially reducing treatment time and increasing response rates.\n\n### Diagnosing Rare Diseases\n\nRare diseases account for up to 35% of all medical conditions but only affect around 2 million people worldwide. 
Due to their rarity, they often go undiagnosed, leading to delayed access to appropriate care. To address this issue, IBM Watson launched a project called "Watson Health" aimed at accelerating diagnosis and treatment development for rare diseases. Leveraging advanced natural language processing (NLP), speech recognition, and cognitive computing technologies, Watson Health can analyze textual information extracted from scientific literature and clinical trial reports to identify patterns related to symptoms, causes, and treatments for these conditions. As a result, patients suffering from rare diseases may receive timely support and guidance earlier than ever before.\n\n## Retail Industry\n\n### Fraud Detection\n\nFraud detection systems play a crucial']
Prompt length: 5, generation length: 512
mistralai/Mamba-Codestral-7B-v0.1 prompt processing + decoding time: 40840ms
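The two summary lines give enough information to estimate end-to-end throughput. The sketch below copies the reported values into a string and divides the generation length by the reported time. The helper name `tokens_per_second` is an assumption made for this example. As the log label says, the time covers prompt processing plus decoding, so the figures are a lower bound on the pure decoding rate.

```python
import re

# Summary lines copied from the two runs above.
LOG = """\
Prompt length: 16, generation length: 100
mistralai/Mamba-Codestral-7B-v0.1 prompt processing + decoding time: 9234ms
Prompt length: 5, generation length: 512
mistralai/Mamba-Codestral-7B-v0.1 prompt processing + decoding time: 40840ms
"""


def tokens_per_second(log):
    """Pair each generation length with the timing line that follows it."""
    lengths = [int(n) for n in re.findall(r"generation length: (\d+)", log)]
    times_ms = [int(t) for t in re.findall(r"decoding time: (\d+)ms", log)]
    # Convert milliseconds to seconds before dividing.
    return [n / (ms / 1000) for n, ms in zip(lengths, times_ms)]


rates = tokens_per_second(LOG)
for rate in rates:
    print(f"{rate:.1f} generated tokens/s")
# prints:
# 10.8 generated tokens/s
# 12.5 generated tokens/s
```

The longer run comes out slightly faster per token, which would be consistent with fixed per-run overhead being spread over more generated tokens. Two runs are too few to draw a firm conclusion.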