	---
	license: other
	pipeline_tag: text-generation
	---


	<p align="center">
	<img src="logo_en.png" width="400"/>
	</p>

	<p align="center">
	<b><font size="6">InternLM-XComposer2</font></b>
	</p>

	<div align="center">

	[💻Github Repo](https://github.com/InternLM/InternLM-XComposer)

	[Paper](https://arxiv.org/abs/2401.16420)

	</div>

	InternLM-XComposer2 is a vision-language large model (VLLM) based on [InternLM2](https://github.com/InternLM/InternLM) for advanced text-image comprehension and composition.

	We release the InternLM-XComposer2 series in two versions:

	- InternLM-XComposer2-VL: The pretrained VLLM, with the LLM initialized from InternLM2, achieving strong performance on various multimodal benchmarks.
	- InternLM-XComposer2: The finetuned VLLM for Free-form Interleaved Text-Image Composition.

	This is the 4-bit version of InternLM-XComposer2-VL.

	## Quickstart
	We provide a simple example showing how to use the 4-bit InternLM-XComposer2-VL with 🤗 Transformers and AutoGPTQ.
	```python
	import torch
	import auto_gptq
	from transformers import AutoTokenizer
	from auto_gptq.modeling import BaseGPTQForCausalLM

	# Override AutoGPTQ's supported model list so the "internlm" model type is accepted
	auto_gptq.modeling._base.SUPPORTED_MODELS = ["internlm"]
	torch.set_grad_enabled(False)  # inference only

	class InternLMXComposer2QForCausalLM(BaseGPTQForCausalLM):
	    layers_block_name = "model.layers"
	    outside_layer_modules = [
	        'vit', 'vision_proj', 'model.tok_embeddings', 'model.norm', 'output',
	    ]
	    inside_layer_modules = [
	        ["attention.wqkv.linear"],
	        ["attention.wo.linear"],
	        ["feed_forward.w1.linear", "feed_forward.w3.linear"],
	        ["feed_forward.w2.linear"],
	    ]

	# init model and tokenizer
	model = InternLMXComposer2QForCausalLM.from_quantized(
	    'internlm/internlm-xcomposer2-vl-7b-4bit', trust_remote_code=True, device="cuda:0").eval()
	tokenizer = AutoTokenizer.from_pretrained(
	    'internlm/internlm-xcomposer2-vl-7b-4bit', trust_remote_code=True)

	query = '<ImageHere>Please describe this image in detail.'
	image = 'examples/image1.webp'
	with torch.cuda.amp.autocast():
	    response, _ = model.chat(tokenizer, query=query, image=image, history=[], do_sample=False)
	print(response)
	# The image features a quote by Oscar Wilde, "Live life with no excuses, travel with no regrets."
	# The quote is displayed in white text against a dark background. In the foreground, there are two silhouettes of people standing on a hill at sunset.
	# They appear to be hiking or climbing, as one of them is holding a walking stick.
	# The sky behind them is painted with hues of orange and purple, creating a beautiful contrast with the dark figures.
	```
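
The query in the example above begins with the `<ImageHere>` placeholder. As a minimal sketch, assuming the placeholder marks where the image is inserted into the prompt and that each query carries a single image, a small helper can make sure a prompt contains it exactly once. The name `with_image_placeholder` is hypothetical and is not part of the model's API.

```python
IMAGE_PLACEHOLDER = "<ImageHere>"  # placeholder string taken from the example above


def with_image_placeholder(prompt: str) -> str:
    """Return the prompt with exactly one image placeholder, prepending it if absent.

    Hypothetical helper: it assumes one image per query.
    """
    count = prompt.count(IMAGE_PLACEHOLDER)
    if count > 1:
        raise ValueError("expected at most one image placeholder in the prompt")
    if count == 1:
        return prompt  # placeholder already present; keep its position unchanged
    return IMAGE_PLACEHOLDER + prompt


query = with_image_placeholder("Please describe this image in detail.")
print(query)
```

The resulting string can be passed as `query` to `model.chat(...)` in the same way as in the example above.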

	### Import from Transformers
	To load the InternLM-XComposer2-VL-7B model using Transformers, use the following code:
	```python
	import torch
	from PIL import Image
	from transformers import AutoTokenizer, AutoModelForCausalLM
	ckpt_path = "internlm/internlm-xcomposer2-vl-7b"
	# The tokenizer runs on the CPU, so it is not moved to the GPU.
	tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)
	# Set `torch_dtype=torch.float16` to load the model in float16; otherwise it is loaded as float32, which might cause an OOM error.
	model = AutoModelForCausalLM.from_pretrained(ckpt_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()
	model = model.eval()
	```
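
The comment in the snippet above warns that loading in float32 may run out of memory. A rough back-of-the-envelope sketch of why precision matters is given below. It assumes about 7 billion parameters (inferred only from the "7b" in the model name; the exact count is not stated here) and counts weights alone, ignoring activations, the KV cache, quantization metadata, and any modules that stay unquantized in the 4-bit version. The function name `weight_memory_gib` is illustrative.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


ASSUMED_PARAMS = 7e9  # assumption: taken from "7b" in the model name

for name, bits in [("float32", 32), ("float16", 16), ("4-bit", 4)]:
    print(f"{name}: ~{weight_memory_gib(ASSUMED_PARAMS, bits):.1f} GiB")
```

Under these assumptions, float16 halves the weight footprint relative to float32, and 4-bit weights take roughly a quarter of the float16 footprint. Actual usage will be higher than these figures.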

	### Open Source License
	The code is licensed under Apache-2.0, while the model weights are fully open for academic research and also allow free commercial usage. To apply for a commercial license, please fill in the application form (English) or the application form (Chinese). For other questions or collaborations, please contact internlm@pjlab.org.cn.