Space: whackthejacker/mixture-of-experts

Esteban Cara de Sexo committed 13 days ago

Commit 979228f

2 Parent(s): b2be736 b5772c1

Update README with comprehensive Mixture of Experts documentation and add transformers[agents] to requirements

Files changed (3)

README.md +154 -2
app.py +10 -0
requirements.txt +2 -1

README.md CHANGED

@@ -1,13 +1,165 @@
 ---
 title: Mixture Of Experts
-emoji: 💬
 colorFrom: yellow
 colorTo: purple
 sdk: gradio
-sdk_version: 5.0.1
 app_file: app.py
 pinned: false
 license: mit
 ---
 An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).

 ---
 title: Mixture Of Experts
+emoji: 📚
 colorFrom: yellow
 colorTo: purple
 sdk: gradio
+sdk_version: 5.19.0
 app_file: app.py
 pinned: false
 license: mit
+models:
+- rhymes-ai/Aria-Chat
+short_description: Hugging Face Space with Gradio Interface
 ---
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+[![Python 3.9+](https://img.shields.io/badge/python-%3E%3D3.9-blue.svg)](https://www.python.org/downloads)
+[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
+---
+# Mixture of Experts
+Welcome to **Mixture of Experts** – a Hugging Face Space built to interact with advanced multimodal conversational AI using Gradio. This Space leverages the Aria-Chat model, which excels in handling open-ended, multi-round dialogs with text and image inputs.
+## Key Features
+- **Multimodal Interaction:** Seamlessly integrate text and image inputs for rich, conversational experiences.
+- **Advanced Conversational Abilities:** Benefit from Aria-Chat’s fine-tuned performance in generating coherent and context-aware responses.
+- **Optimized Performance:** Designed for reliable, long-format outputs, reducing common pitfalls like incomplete markdown or endless list outputs.
+- **Multilingual Support:** Optimized to handle multiple languages including Chinese, Spanish, French, and Japanese.
+## Quick Start
+### Installation
+To run the Space locally or to integrate into your workflow, ensure you have the following dependencies installed:
+```bash
+pip install transformers==4.45.0 accelerate==0.34.1 sentencepiece==0.2.0 torchvision requests torch Pillow
+pip install flash-attn --no-build-isolation
+# Optionally, for improved inference performance:
+pip install grouped_gemm==0.1.6
+```
+### Usage
+Below is a simple code snippet demonstrating how to interact with the Aria-Chat model. Customize it further to suit your integration needs:
+```python
+import requests
+import torch
+from PIL import Image
+from transformers import AutoModelForCausalLM, AutoProcessor
+model_id_or_path = "rhymes-ai/Aria-Chat"
+model = AutoModelForCausalLM.from_pretrained(
+    model_id_or_path,
+    device_map="auto",
+    torch_dtype=torch.bfloat16,
+    trust_remote_code=True
+)
+processor = AutoProcessor.from_pretrained(
+    model_id_or_path,
+    trust_remote_code=True
+)
+# Example image input
+image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png"
+image = Image.open(requests.get(image_url, stream=True).raw)
+# Prepare a conversation message
+messages = [
+    {
+        "role": "user",
+        "content": [
+            {"text": None, "type": "image"},
+            {"text": "What is the image?", "type": "text"},
+        ],
+    }
+]
+# Format text input with chat template
+text = processor.apply_chat_template(messages, add_generation_prompt=True)
+inputs = processor(text=text, images=image, return_tensors="pt")
+inputs["pixel_values"] = inputs["pixel_values"].to(model.dtype)
+inputs = {k: v.to(model.device) for k, v in inputs.items()}
+# Generate the response
+with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
+    output = model.generate(
+        **inputs,
+        max_new_tokens=500,
+        stop_strings=["<|im_end|>"],
+        tokenizer=processor.tokenizer,
+        do_sample=True,
+        temperature=0.9,
+    )
+    output_ids = output[0][inputs["input_ids"].shape[1]:]
+    result = processor.decode(output_ids, skip_special_tokens=True)
+print(result)
+```
+### Running the Space with Gradio
+Our Space leverages Gradio for an interactive web interface. Once the required dependencies are installed, simply run your Space to:
+- Interact in real time with the multimodal capabilities of Aria-Chat.
+- Test various inputs including images and text for a dynamic conversational experience.
+## Advanced Usage
+For more complex use cases:
+- **Fine-tuning:** Check out our linked codebase for guidance on fine-tuning Aria-Chat on your custom datasets.
+- **vLLM Inference:** Explore advanced inference options to optimize latency and throughput.
+## Credits & Citation
+If you find this work useful, please consider citing the Aria-Chat model:
+```bibtex
+@article{aria,
+  title={Aria: An Open Multimodal Native Mixture-of-Experts Model},
+  author={Dongxu Li and Yudong Liu and Haoning Wu and Yue Wang and Zhiqi Shen and Bowen Qu and Xinyao Niu and Guoyin Wang and Bei Chen and Junnan Li},
+  year={2024},
+  journal={arXiv preprint arXiv:2410.05993},
+}
+```
+## License
+This project is licensed under the MIT License.
+Happy chatting and expert mixing! If you encounter any issues or have suggestions, feel free to open an issue or contribute to the repository.
 An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
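
The usage snippet in the README passes the processor a `messages` list in which each user turn holds image placeholders followed by a text entry. A minimal sketch of a helper that builds that structure is shown here. `build_user_message` and its `num_images` parameter are hypothetical names, not part of the Space or of `transformers`, and repeating the placeholder once per image is an assumption that the README does not confirm.

```python
from typing import Any, Dict, List


def build_user_message(prompt: str, num_images: int = 1) -> Dict[str, Any]:
    """Build one user turn: image placeholders first, then the text prompt.

    Hypothetical helper. The one-placeholder-per-image layout is an
    assumption; the README only shows a single image.
    """
    content: List[Dict[str, Any]] = [
        {"text": None, "type": "image"} for _ in range(num_images)
    ]
    content.append({"text": prompt, "type": "text"})
    return {"role": "user", "content": content}


# Same structure as the single-image example in the README.
messages = [build_user_message("What is the image?")]
```

The resulting `messages` list can then be passed to `processor.apply_chat_template(messages, add_generation_prompt=True)` exactly as in the README snippet.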

app.py CHANGED

@@ -1,5 +1,15 @@
 # Load model directly
 from transformers import AutoProcessor, AutoModelForImageTextToText
 processor = AutoProcessor.from_pretrained("rhymes-ai/Aria-Chat", trust_remote_code=True)
 model = AutoModelForImageTextToText.from_pretrained("rhymes-ai/Aria-Chat", trust_remote_code=True)

 # Load model directly
 from transformers import AutoProcessor, AutoModelForImageTextToText
+import gradio as gr
+from huggingface_hub import InferenceClient
+# For more information on `huggingface_hub` Inference API support, see the docs:
+# https://huggingface.co/docs/huggingface_hub/v0.22.2/en/guides/inference
+client = InferenceClient("rhymes-ai/Aria-Chat")
 processor = AutoProcessor.from_pretrained("rhymes-ai/Aria-Chat", trust_remote_code=True)
 model = AutoModelForImageTextToText.from_pretrained("rhymes-ai/Aria-Chat", trust_remote_code=True)
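
Because this commit merges two parents (b2be736 and b5772c1), it can be worth scanning merged files for leftover Git conflict markers before pushing, since a marker line in a Python file is a syntax error. A minimal sketch using only the standard library is shown here. `find_conflict_markers` is a hypothetical helper, not part of the Space.

```python
import re
from typing import List, Tuple

# Git conflict markers are seven '<', '=', or '>' characters at the start
# of a line, followed by a space or the end of the line.
CONFLICT_MARKER = re.compile(r"^(<{7}|={7}|>{7})( |$)")


def find_conflict_markers(source: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for lines that look like conflict markers."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if CONFLICT_MARKER.match(line):
            hits.append((number, line))
    return hits


# Illustrative input only; the file contents here are made up.
sample = "<<<<<<< HEAD\nimport a\n=======\nimport b\n>>>>>>> b5772c1\nx = 1\n"
for number, line in find_conflict_markers(sample):
    print(f"line {number}: {line}")
```

To check a real file, read it with `pathlib.Path("app.py").read_text()` and pass the result to the helper.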

requirements.txt CHANGED

@@ -1 +1,2 @@
-huggingface_hub==0.25.2
+huggingface_hub==0.25.2
+transformers[agents]
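
The new `transformers[agents]` entry carries no version, while the README's installation command pins `transformers==4.45.0`. A minimal sketch for listing requirement lines that lack an exact `==` pin is shown here. `unpinned_requirements` is a hypothetical helper, and the check is deliberately simplified: it ignores URL requirements, environment markers, and `-r` includes.

```python
from typing import List


def unpinned_requirements(text: str) -> List[str]:
    """Return requirement lines that do not contain an exact '==' pin.

    Simplified on purpose: comments and blank lines are skipped, and any
    line without '==' is reported as unpinned.
    """
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if line and "==" not in line:
            unpinned.append(line)
    return unpinned


requirements = "huggingface_hub==0.25.2\ntransformers[agents]\n"
print(unpinned_requirements(requirements))
```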