python build.py --model_dir ./llama_7b/ --quant_ckpt_path ./llama-7b-4bit-gs128-awq.pt --dtype float16 --remove_input_padding --use_gpt_attention_plugin float16 --enable_context_fmha --use_gemm_plugin float16 --use_weight_only --weight_only_precision int4_awq --per_group --output_dir ./tmp/llama/7B/trt_engines/int8_kv_cache_int4_AWQ/1-gpu/ --int8_kv_cache --ft_model_dir ./int8_kv_cache/1-gpu/ --use_inflight_batching --max_batch_size 256
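
The command above is long enough that a single typo is easy to miss. As a minimal sketch, the same invocation can be assembled from an ordered list of flags in Python, so each option sits on its own line and can be changed or commented out on its own. The flag names and values are copied verbatim from the command. The names `BUILD_FLAGS` and `build_argv` are hypothetical helpers introduced here for illustration and are not part of `build.py`. Judging only by the flag names and the output directory, the command appears to build a single-GPU LLaMA 7B engine with INT4 AWQ weights and an INT8 KV cache; treat that reading as an assumption.

```python
import shlex

# Flags in the same order as the original command.
# A value of None marks a boolean switch that takes no argument.
# BUILD_FLAGS and build_argv are illustrative names, not part of build.py.
BUILD_FLAGS = [
    ("--model_dir", "./llama_7b/"),
    ("--quant_ckpt_path", "./llama-7b-4bit-gs128-awq.pt"),
    ("--dtype", "float16"),
    ("--remove_input_padding", None),
    ("--use_gpt_attention_plugin", "float16"),
    ("--enable_context_fmha", None),
    ("--use_gemm_plugin", "float16"),
    ("--use_weight_only", None),
    ("--weight_only_precision", "int4_awq"),
    ("--per_group", None),
    ("--output_dir", "./tmp/llama/7B/trt_engines/int8_kv_cache_int4_AWQ/1-gpu/"),
    ("--int8_kv_cache", None),
    ("--ft_model_dir", "./int8_kv_cache/1-gpu/"),
    ("--use_inflight_batching", None),
    ("--max_batch_size", "256"),
]


def build_argv(flags=BUILD_FLAGS, script="build.py"):
    """Return the argument vector for the build script as a list of strings."""
    argv = ["python", script]
    for name, value in flags:
        argv.append(name)
        if value is not None:
            argv.append(value)
    return argv


if __name__ == "__main__":
    # Print the command for inspection; nothing is executed here.
    print(shlex.join(build_argv()))
```

To run the build rather than print it, the list returned by `build_argv()` can be passed to `subprocess.run(..., check=True)` from the directory that contains `build.py`.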
