What would be the average inference time for this model using beam width = 4?
#31
opened by ashwin26
@ashwin26 thank you for testing out our model. Average inference time depends heavily on your hardware, batch size, and sequence length, so we recommend benchmarking on your own setup. For optimized serving, you may try vLLM (https://blog.vllm.ai/2023/06/20/vllm.html) or Text Generation Inference (https://huggingface.co/docs/text-generation-inference/en/index).
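To measure average latency with beam width 4 on your own hardware, a minimal sketch using the standard `transformers` `generate` API is shown below. The checkpoint name here is a small placeholder model, not the model from this thread; substitute your actual checkpoint, and adjust `max_new_tokens` to match your workload.

```python
import time

from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint for illustration -- swap in the model under discussion.
MODEL_NAME = "sshleifer/tiny-gpt2"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

inputs = tokenizer("The quick brown fox", return_tensors="pt")

# Warm-up run so the timed loop excludes one-time setup costs.
model.generate(
    **inputs,
    num_beams=4,
    max_new_tokens=32,
    pad_token_id=tokenizer.eos_token_id,
)

runs = 5
start = time.perf_counter()
for _ in range(runs):
    output = model.generate(
        **inputs,
        num_beams=4,  # beam width = 4, as in the question
        max_new_tokens=32,
        pad_token_id=tokenizer.eos_token_id,
    )
avg_latency = (time.perf_counter() - start) / runs

print(f"Average latency over {runs} runs: {avg_latency:.3f} s")
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Note that beam search roughly multiplies the decoding compute by the beam width, so expect noticeably higher latency than greedy decoding; serving stacks like vLLM or TGI amortize this much better than a plain `generate` loop.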

