@@ -53,8 +53,8 @@ analysis.
 ## News
-* 2026.04.27: 🎉🎉🎉 We have released [MOSS-Music](https://huggingface.co/OpenMOSS-Team/MOSS-Music-8B-Instruct).
-* 2026.04.27: 🎉🎉🎉 We have released [MOSS-Music-Data-Pipeline](https://github.com/wx9songs/MOSS-Music-Data-Pipeline) for large-scale music data annotation and processing.
 ## Contents
@@ -67,9 +67,8 @@ analysis.
 - [Evaluation](#evaluation)
 - [Quickstart](#quickstart)
   - [Environment Setup](#environment-setup)
-  - [Basic Usage](#basic-usage)
-  - [Gradio App](#gradio-app)
   - [SGLang Serving](#sglang-serving)
 - [More Information](#more-information)
 - [LICENSE](#license)
 - [Citation](#citation)
@@ -96,6 +95,10 @@ model.
   grounded in a full track, including chain-of-thought reasoning in the
   *Thinking* variant.
 ## Model Architecture
 MOSS-Music inherits the MOSS-Audio modular design, comprising three
@@ -337,55 +340,21 @@ command with:
 pip install --extra-index-url https://download.pytorch.org/whl/cu128 -e ".[torch-runtime,flash-attn]"
 ```
-### Basic Usage
-Download the model first:
-```bash
-hf download OpenMOSS-Team/MOSS-Music-8B-Instruct --local-dir ./weights/MOSS-Music-8B-Instruct
-hf download OpenMOSS-Team/MOSS-Music-8B-Thinking --local-dir ./weights/MOSS-Music-8B-Thinking
-```
-Then edit `MODEL_PATH` / `AUDIO_PATH` in `infer.py` as needed, and run:
-```bash
-python infer.py
-```
 > [!IMPORTANT]
-> To achieve the best generation quality and fully leverage the model’s capabilities, we
 > **strongly recommend using SGLang Serving for inference**.
-The default prompt in `infer.py` is
-`Please give a detailed musical description of this clip.`. You can directly
-edit that line if you want to try lyrics transcription, chord / key / tempo
-analysis, structural segmentation, or open-ended musical QA. Typical prompts:
-- `Describe this piece of music in terms of style and tempo, tonal quality and harmony, instrumentation and arrangement, structural organization, and overall emotional mood.`
-- `Please give a detailed musical description of this clip.`
-- `Transcribe the lyrics of this song (with timestamps).`
-- `Transcribe the chord progression of this piece of music with timestamps, and output it in JSON format.`
-- `What is the key, tempo and mood of this track?`
-- `Segment the song into verse / chorus / bridge sections.`
-### Gradio App
-Start the Gradio demo with:
 ```bash
-python app.py
 ```
-The server address and port can be overridden via the
-`MOSS_MUSIC_SERVER_NAME` and `MOSS_MUSIC_SERVER_PORT` environment variables,
-and the default model ID via `MOSS_MUSIC_MODEL_ID`.
-### SGLang Serving
-If you want to serve MOSS-Music with SGLang, see the full guide in
-`moss_music_usage_guide.md`.
 The shortest setup is:
 ```bash
@@ -405,6 +374,18 @@ You can replace `./weights/MOSS-Music-8B-Instruct` with
 If you use the default `torch==2.9.1+cu128` runtime, installing
 `nvidia-cudnn-cu12==9.16.0.29` is recommended before starting `sglang serve`.
 ## More Information
 - **MOSI.AI**: [https://mosi.cn](https://mosi.cn)

 ## News
+* 2026.05.01: 🎉🎉🎉 We have released [MOSS-Music](https://huggingface.co/OpenMOSS-Team/MOSS-Music-8B-Instruct).
+* 2026.05.01: 🎉🎉🎉 We have released [MOSS-Music-Data-Pipeline](https://github.com/wx9songs/MOSS-Music-Data-Pipeline) for large-scale music data annotation and processing.
 ## Contents
 - [Evaluation](#evaluation)
 - [Quickstart](#quickstart)
   - [Environment Setup](#environment-setup)
   - [SGLang Serving](#sglang-serving)
+  - [Gradio App](#gradio-app)
 - [More Information](#more-information)
 - [LICENSE](#license)
 - [Citation](#citation)
   grounded in a full track, including chain-of-thought reasoning in the
   *Thinking* variant.
+<p align="center">
+  <img src="./assets/moss-music_img.png" width="98%" alt="MOSS-Music overview" />
+</p>
 ## Model Architecture
 MOSS-Music inherits the MOSS-Audio modular design, comprising three
 pip install --extra-index-url https://download.pytorch.org/whl/cu128 -e ".[torch-runtime,flash-attn]"
 ```
+### SGLang Serving
 > [!IMPORTANT]
+> To achieve the best generation quality and fully leverage the model's capabilities, we
 > **strongly recommend using SGLang Serving for inference**.
+See the full SGLang guide in `moss_music_usage_guide.md`.
+Download the model first:
 ```bash
+hf download OpenMOSS-Team/MOSS-Music-8B-Instruct --local-dir ./weights/MOSS-Music-8B-Instruct
+hf download OpenMOSS-Team/MOSS-Music-8B-Thinking --local-dir ./weights/MOSS-Music-8B-Thinking
 ```
 The shortest setup is:
 ```bash
 If you use the default `torch==2.9.1+cu128` runtime, installing
 `nvidia-cudnn-cu12==9.16.0.29` is recommended before starting `sglang serve`.
+### Gradio App
+Start the Gradio demo with:
+```bash
+python app.py
+```
+The server address and port can be overridden via the
+`MOSS_MUSIC_SERVER_NAME` and `MOSS_MUSIC_SERVER_PORT` environment variables,
+and the default model ID via `MOSS_MUSIC_MODEL_ID`.
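As a minimal sketch of how the three overrides described above might be resolved, the snippet below reads them from the environment and falls back to defaults when they are unset. The fallback values (`0.0.0.0`, `7860`, and the Instruct repo ID) and the helper name `resolve_launch_settings` are assumptions for illustration; they are not confirmed by `app.py`, which may handle these variables differently.

```python
import os
from typing import Mapping, Optional, TypedDict


class LaunchSettings(TypedDict):
    server_name: str
    server_port: int
    model_id: str


# Hypothetical fallbacks; the real defaults in app.py may differ.
ASSUMED_DEFAULTS = {
    "MOSS_MUSIC_SERVER_NAME": "0.0.0.0",
    "MOSS_MUSIC_SERVER_PORT": "7860",
    "MOSS_MUSIC_MODEL_ID": "OpenMOSS-Team/MOSS-Music-8B-Instruct",
}


def resolve_launch_settings(env: Optional[Mapping[str, str]] = None) -> LaunchSettings:
    """Return server name, port, and model ID, preferring environment overrides."""
    env = os.environ if env is None else env
    # Take each variable from the environment, else use the assumed default.
    merged = {key: env.get(key, default) for key, default in ASSUMED_DEFAULTS.items()}
    return LaunchSettings(
        server_name=merged["MOSS_MUSIC_SERVER_NAME"],
        server_port=int(merged["MOSS_MUSIC_SERVER_PORT"]),  # env values are strings
        model_id=merged["MOSS_MUSIC_MODEL_ID"],
    )


if __name__ == "__main__":
    # Override only the port; the other two fall back to the assumed defaults.
    print(resolve_launch_settings({"MOSS_MUSIC_SERVER_PORT": "8080"}))
```

In a shell, the equivalent override is to set the variable before running `python app.py`.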
 ## More Information
 - **MOSI.AI**: [https://mosi.cn](https://mosi.cn)