Esteban Cara de Sexo committed
Commit 979228f · 2 Parent(s): b2be736 b5772c1

Update README with comprehensive Mixture of Experts documentation and add transformers[agents] to requirements

Files changed (3):
  1. README.md +154 -2
  2. app.py +10 -0
  3. requirements.txt +2 -1
README.md CHANGED
@@ -1,13 +1,165 @@
  ---
  title: Mixture Of Experts
- emoji: 💬
  colorFrom: yellow
  colorTo: purple
  sdk: gradio
- sdk_version: 5.0.1
  app_file: app.py
  pinned: false
  license: mit
  ---

  An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
  ---
  title: Mixture Of Experts
+ emoji: 📚
  colorFrom: yellow
  colorTo: purple
  sdk: gradio
+ sdk_version: 5.19.0
  app_file: app.py
  pinned: false
  license: mit
+ models:
+ - rhymes-ai/Aria-Chat
+ short_description: Hugging Face Space with Gradio Interface
  ---

+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+ [![Python 3.9+](https://img.shields.io/badge/python-%3E%3D3.9-blue.svg)](https://www.python.org/downloads)
+ [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
+
+ ---
+
+ # Mixture of Experts
+
+ Welcome to **Mixture of Experts** – a Hugging Face Space built to interact with advanced multimodal conversational AI using Gradio. This Space leverages the Aria-Chat model, which excels at open-ended, multi-round dialog with text and image inputs.
+
+ ## Key Features
+
+ - **Multimodal Interaction:** Seamlessly integrate text and image inputs for rich conversational experiences.
+ - **Advanced Conversational Abilities:** Benefit from Aria-Chat's fine-tuned performance in generating coherent, context-aware responses.
+ - **Optimized Performance:** Designed for reliable long-format outputs, reducing common pitfalls like incomplete markdown or endless list outputs.
+ - **Multilingual Support:** Optimized to handle multiple languages, including Chinese, Spanish, French, and Japanese.
+
+ ## Quick Start
+
+ ### Installation
+
+ To run the Space locally or to integrate it into your workflow, ensure you have the following dependencies installed:
+
+ ```bash
+ pip install transformers==4.45.0 accelerate==0.34.1 sentencepiece==0.2.0 torchvision requests torch Pillow
+ pip install flash-attn --no-build-isolation
+
+ # Optionally, for improved inference performance:
+ pip install grouped_gemm==0.1.6
+ ```
+
+ ### Usage
+
+ Below is a simple code snippet demonstrating how to interact with the Aria-Chat model. Customize it further to suit your integration needs:
+
+ ```python
+ import requests
+ import torch
+ from PIL import Image
+ from transformers import AutoModelForCausalLM, AutoProcessor
+
+ model_id_or_path = "rhymes-ai/Aria-Chat"
+
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id_or_path,
+     device_map="auto",
+     torch_dtype=torch.bfloat16,
+     trust_remote_code=True,
+ )
+
+ processor = AutoProcessor.from_pretrained(
+     model_id_or_path,
+     trust_remote_code=True,
+ )
+
+ # Example image input
+ image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png"
+ image = Image.open(requests.get(image_url, stream=True).raw)
+
+ # Prepare a conversation message
+ messages = [
+     {
+         "role": "user",
+         "content": [
+             {"text": None, "type": "image"},
+             {"text": "What is the image?", "type": "text"},
+         ],
+     }
+ ]
+
+ # Format text input with the chat template
+ text = processor.apply_chat_template(messages, add_generation_prompt=True)
+ inputs = processor(text=text, images=image, return_tensors="pt")
+ inputs["pixel_values"] = inputs["pixel_values"].to(model.dtype)
+ inputs = {k: v.to(model.device) for k, v in inputs.items()}
+
+ # Generate the response
+ with torch.inference_mode(), torch.cuda.amp.autocast(dtype=torch.bfloat16):
+     output = model.generate(
+         **inputs,
+         max_new_tokens=500,
+         stop_strings=["<|im_end|>"],
+         tokenizer=processor.tokenizer,
+         do_sample=True,
+         temperature=0.9,
+     )
+ output_ids = output[0][inputs["input_ids"].shape[1]:]
+ result = processor.decode(output_ids, skip_special_tokens=True)
+
+ print(result)
+ ```
+
+ ### Running the Space with Gradio
+ Our Space leverages Gradio for an interactive web interface. Once the required dependencies are installed, simply run your Space to:
+
+ - Interact in real time with the multimodal capabilities of Aria-Chat.
+ - Test various inputs including images and text for a dynamic conversational experience.
+
+ ## Advanced Usage
+ For more complex use cases:
+
+ - Fine-tuning: Check out our linked codebase for guidance on fine-tuning Aria-Chat on your custom datasets.
+ - vLLM Inference: Explore advanced inference options to optimize latency and throughput.
+
+ ## Credits & Citation
+ If you find this work useful, please consider citing the Aria-Chat model:
+
+ ```bibtex
+ @article{aria,
+   title={Aria: An Open Multimodal Native Mixture-of-Experts Model},
+   author={Dongxu Li and Yudong Liu and Haoning Wu and Yue Wang and Zhiqi Shen and Bowen Qu and Xinyao Niu and Guoyin Wang and Bei Chen and Junnan Li},
+   year={2024},
+   journal={arXiv preprint arXiv:2410.05993},
+ }
+ ```
+
+ ## License
+ This Space is licensed under the MIT License (as declared in the frontmatter); the underlying Aria-Chat model is released under Apache-2.0.
+
+ Happy chatting and expert mixing! If you encounter any issues or have suggestions, feel free to open an issue or contribute to the repository.
+
  An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
app.py CHANGED
@@ -1,5 +1,15 @@
 
  # Load model directly
  from transformers import AutoProcessor, AutoModelForImageTextToText

  processor = AutoProcessor.from_pretrained("rhymes-ai/Aria-Chat", trust_remote_code=True)
  model = AutoModelForImageTextToText.from_pretrained("rhymes-ai/Aria-Chat", trust_remote_code=True)
 
  # Load model directly
  from transformers import AutoProcessor, AutoModelForImageTextToText
+ import gradio as gr
+ from huggingface_hub import InferenceClient
+
+ """
+ For more information on `huggingface_hub` Inference API support, please check the docs: https://huggingface.co/docs/huggingface_hub/v0.22.2/en/guides/inference
+ """
+ client = InferenceClient("rhymes-ai/Aria-Chat")
+
  processor = AutoProcessor.from_pretrained("rhymes-ai/Aria-Chat", trust_remote_code=True)
  model = AutoModelForImageTextToText.from_pretrained("rhymes-ai/Aria-Chat", trust_remote_code=True)
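For reference, a hedged sketch of how the merged `InferenceClient` could be used for chat: `build_messages` is a hypothetical helper (not in the commit), and the actual `chat_completion` network call is left commented out since it requires a reachable inference endpoint:

```python
from huggingface_hub import InferenceClient

# Client pointing at the Space's model; constructing it makes no network call.
client = InferenceClient("rhymes-ai/Aria-Chat")

def build_messages(user_text):
    # OpenAI-style message list accepted by InferenceClient.chat_completion
    return [{"role": "user", "content": user_text}]

# Network call (requires access to a deployed inference endpoint):
# response = client.chat_completion(messages=build_messages("Hello!"), max_tokens=100)
# print(response.choices[0].message.content)
```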
requirements.txt CHANGED
@@ -1 +1,2 @@
- huggingface_hub==0.25.2
+ huggingface_hub==0.25.2
+ transformers[agents]