Align-Anything Chameleon 7B Base

Introduction

This repository hosts Align-Anything Chameleon 7B Base, a model that accepts and produces interleaved text and images. It is based on the Chameleon model and trained with the Align-Anything framework to further unlock its image-generation capability.

Usage

To use this model, refer to the Align-Anything repository, which covers training, inference, and evaluation:

git clone https://github.com/PKU-Alignment/align-anything.git
cd align-anything/projects/text_image_to_text_image

Then follow the instructions in the README.md file to set up the environment and run the scripts.

Currently, the official Transformers repo does not support the Chameleon model with image output (see this PR for more details), so we rely on a fork of the repo.

After installing Align-Anything and setting up the environment, install the stable version of the fork by running:

pip install git+https://github.com/htlou/transformers.git@hantao_stable_cham
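Because the fork installs under the same package name as the official library, it can be useful to confirm which one is active. The following is a minimal sketch, not part of the official instructions. It assumes a recent pip that records VCS installs in `pip freeze` as `transformers @ git+https://...`; older pip versions may print only `transformers==<version>`, in which case the check reports a mismatch. The helper names `is_fork_install` and `check_transformers_fork` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: confirm that the installed `transformers` package came from the fork.
# Assumption: pip lists a VCS install in `pip freeze` as
#   transformers @ git+https://github.com/htlou/transformers.git@<commit>

# Hypothetical helper: succeeds if the given `pip freeze` line points at the fork.
is_fork_install() {
    case "$1" in
        transformers\ @\ git+https://github.com/htlou/transformers*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: finds the transformers entry in `pip freeze` and checks it.
check_transformers_fork() {
    local line
    line="$(pip freeze 2>/dev/null | grep -iE '^transformers( |=)' | head -n 1)"
    if is_fork_install "$line"; then
        echo "fork installed: $line"
    else
        echo "transformers is missing or not from the fork: ${line:-<not found>}" >&2
        return 1
    fi
}
```

After the install, running `check_transformers_fork` in the same environment prints the recorded source on success and returns a non-zero status otherwise.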

To generate images, follow the instructions in the mmsg_chameleon repo to run inference (pure text generation can be done directly with Transformers).

git clone https://github.com/htlou/mmsg_chameleon.git
cd mmsg_chameleon

Then set up the environment using:

pip install -e . 

After setting up the environment, set the correct paths in scripts/interleaved_gen.sh and then run:

bash scripts/interleaved_gen.sh