Higgs Audio V2 LLM Model

This is the 3B-parameter Higgs Audio V2 text-to-speech generation model.

Model Description

The Higgs Audio V2 LLM is an end-to-end multimodal model that understands input text and generates high-quality speech audio from it.

Model Details

Model Type: Text-to-Speech Generation Model
Parameters: 3B
Framework: PyTorch
Architecture: Transformer-based multimodal architecture

Usage

from higgs_audio.serve.serve_engine import HiggsAudioServeEngine

# Load the LLM together with its companion audio tokenizer.
engine = HiggsAudioServeEngine(
    model_name_or_path="Toowired/higgs-v2-llm",
    audio_tokenizer_name_or_path="Toowired/higgs-v2-tokenizer",
    device="cuda",  # expects a CUDA-capable GPU
)
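
The snippet above hard-codes device="cuda". On machines that may not have a GPU, the device string can be chosen at runtime instead. The helper below is a minimal sketch: pick_device is a hypothetical name, not part of higgs_audio, and whether the engine runs acceptably on "cpu" is an assumption this card does not confirm.

```python
def pick_device(preferred: str = "cuda") -> str:
    """Return `preferred` if it is usable, otherwise fall back to "cpu".

    Hypothetical helper (not part of higgs_audio). PyTorch is imported
    lazily so the function still works where it is not installed.
    """
    if preferred != "cuda":
        return preferred
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no way to use a GPU here.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

The returned string could then be passed as the device argument when constructing the engine.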

Model Architecture

Text encoder for understanding input text
Audio decoder for generating speech
Multimodal fusion layers
3B parameters for high-quality generation

Intended Use

Text-to-speech synthesis
Voice generation applications
Audio content creation
Research in speech synthesis

Limitations

Requires significant computational resources
Optimized for English text
May require fine-tuning for specific voices
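
To put the first limitation in rough numbers, the memory needed just to hold the weights can be estimated from the parameter count. The sketch below takes "3B" as exactly 3,000,000,000 parameters (an assumption) and counts weights only; activations, the key-value cache, and the separate audio tokenizer are not included, so real usage will be higher.

```python
# Bytes used to store one parameter at common precisions.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}

# "3B" from this card, taken as exactly 3e9 (assumption).
NUM_PARAMS = 3_000_000_000


def weight_memory_gib(num_params: int, dtype: str = "float16") -> float:
    """Memory needed to store the weights alone, in GiB (2**30 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gib(NUM_PARAMS, dtype):.1f} GiB")
```

Under these assumptions the weights alone take about 5.6 GiB in 16-bit precision and about 11.2 GiB in 32-bit precision.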

License

Please check the original model repository for license information.

Contact

For questions about this model, please contact the repository owner.
