---
license: apache-2.0
library_name: transformers
tags:
- vision
- image-text-to-text
- multimodal
- test-model
- tiny-model
- openvino
- optimum-intel
pipeline_tag: image-text-to-text
---

# Tiny Random MiniCPM-o-2_6

## Model Description

This is a **tiny random-initialized version** of the [openbmb/MiniCPM-o-2_6](https://huggingface.co/openbmb/MiniCPM-o-2_6) multimodal vision-language model, designed specifically for **testing and CI/CD purposes** in the [optimum-intel](https://github.com/huggingface/optimum-intel) library.

**⚠️ Important**: This model has randomly initialized weights and is NOT intended for actual inference. It is designed solely for:
- Testing model loading and export functionality
- CI/CD pipeline validation
- OpenVINO conversion testing
- Quantization workflow testing

## Model Specifications

- **Architecture**: MiniCPM-o-2_6 (multimodal: vision + text + audio + TTS)
- **Parameters**: 1,477,376 (~1.48M parameters)
- **Model Binary Size**: 5.64 MB
- **Total Repository Size**: ~21 MB
- **Original Model**: [openbmb/MiniCPM-o-2_6](https://huggingface.co/openbmb/MiniCPM-o-2_6) (~18 GB)
- **Size Reduction**: ~853× smaller than the full model

## Architecture Details

### Language Model (LLM) Component
- `num_hidden_layers`: 2 (reduced from 40)
- `hidden_size`: 256 (reduced from 2048)
- `intermediate_size`: 512 (reduced from 8192)
- `num_attention_heads`: 4 (reduced from 32)
- `vocab_size`: 320 (reduced from 151,700)
- `max_position_embeddings`: 128 (reduced from 8192)

### Vision Component (SigLIP-based)
- `hidden_size`: 8
- `num_hidden_layers`: 1

### Audio Component (Whisper-based)
- `d_model`: 64
- `encoder_layers`: 1
- `decoder_layers`: 1

### TTS Component
- `hidden_size`: 8
- `num_layers`: 1

All architectural components are present but miniaturized to ensure API compatibility while drastically reducing compute requirements.
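
As a sanity check, the reported parameter count can be reproduced from the LLM hyperparameters above, assuming a Qwen2-style backbone layout (the original MiniCPM-o-2_6 builds on a Qwen2.5 LLM): separate Q/K/V projections with biases, an unbiased output projection, a gated three-matrix MLP, RMSNorm, and an untied `lm_head`. These layout details are assumptions, not stated in this card:

```python
# Back-of-the-envelope parameter count for the tiny LLM component, using the
# reduced hyperparameters listed above. The per-layer layout (Q/K/V biases,
# gated MLP, untied lm_head) is an assumption about the backbone.
hidden, inter, layers, vocab = 256, 512, 2, 320

embed = vocab * hidden                                   # token embedding table
attn = 3 * (hidden * hidden + hidden) + hidden * hidden  # Q/K/V (+bias) and O
mlp = 3 * hidden * inter                                 # gate, up, down projections
norms = 2 * hidden                                       # two RMSNorms per layer
per_layer = attn + mlp + norms

total = embed + layers * per_layer + hidden + vocab * hidden  # + final norm + lm_head
print(f"{total:,}")  # 1,477,376
```

The exact match with the reported 1,477,376 suggests the published count covers the LLM component under this layout.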

## Usage

### Loading with Transformers

```python
from transformers import AutoModelForCausalLM, AutoProcessor
import torch

model_id = "arashkermani/tiny-random-MiniCPM-o-2_6"

# Load model
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.float32,
    device_map="cpu"
)

# Load processor
processor = AutoProcessor.from_pretrained(
    model_id,
    trust_remote_code=True
)

# Test forward pass
input_ids = torch.randint(0, 320, (1, 5))
position_ids = torch.arange(5).unsqueeze(0)

data = {
    "input_ids": input_ids,
    "pixel_values": [[]],
    "tgt_sizes": [[]],
    "image_bound": [[]],
    "position_ids": position_ids,
}

with torch.no_grad():
    outputs = model(data=data)

print(f"Logits shape: {outputs.logits.shape}")  # (1, 5, 320)
```

### Using with Optimum-Intel (OpenVINO)

```python
from optimum.intel.openvino import OVModelForVisualCausalLM
from transformers import AutoProcessor

model_id = "arashkermani/tiny-random-MiniCPM-o-2_6"

# Load model for OpenVINO
model = OVModelForVisualCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True
)

processor = AutoProcessor.from_pretrained(
    model_id,
    trust_remote_code=True
)
```

### Export to OpenVINO

```bash
optimum-cli export openvino \
  -m arashkermani/tiny-random-MiniCPM-o-2_6 \
  --task image-text-to-text \
  --trust-remote-code \
  minicpm-o-openvino
```

## Intended Use

This model is intended **exclusively** for:
- ✅ Testing optimum-intel OpenVINO export functionality
- ✅ CI/CD pipeline validation
- ✅ Model loading and compatibility testing
- ✅ Quantization workflow testing
- ✅ Fast prototyping and debugging

**Not intended for**:
- ❌ Production inference
- ❌ Actual image-text-to-text tasks
- ❌ Model quality evaluation
- ❌ Benchmarking performance metrics

## Training Details

This model was generated by:
1. Loading the config from `optimum-intel-internal-testing/tiny-random-MiniCPM-o-2_6`
2. Reducing all dimensions to minimal viable values
3. Initializing weights randomly using `AutoModelForCausalLM.from_config()`
4. Copying all necessary tokenizer, processor, and custom code files

**No training was performed** - all weights are randomly initialized.

## Validation Results

The model has been validated to ensure that it:
- ✅ Loads with `trust_remote_code=True`
- ✅ Is compatible with transformers AutoModel APIs
- ✅ Supports a forward pass with the expected input format
- ✅ Is compatible with OpenVINO export via optimum-intel
- ✅ Includes all required custom modules and artifacts

See the [validation report](https://github.com/arashkermani/tiny-minicpm-o) for detailed technical analysis.

## Files Included

- `config.json` - Model configuration
- `pytorch_model.bin` - Model weights (5.64 MB)
- `generation_config.json` - Generation parameters
- `preprocessor_config.json` - Preprocessor configuration
- `processor_config.json` - Processor configuration
- `tokenizer_config.json` - Tokenizer configuration
- `tokenizer.json` - Fast tokenizer
- `vocab.json` - Vocabulary
- `merges.txt` - BPE merges
- Custom Python modules:
  - `modeling_minicpmo.py`
  - `configuration_minicpm.py`
  - `processing_minicpmo.py`
  - `image_processing_minicpmv.py`
  - `tokenization_minicpmo_fast.py`
  - `modeling_navit_siglip.py`
  - `resampler.py`
  - `utils.py`

## Related Models

- Original model: [openbmb/MiniCPM-o-2_6](https://huggingface.co/openbmb/MiniCPM-o-2_6)
- Previous test model: [optimum-intel-internal-testing/tiny-random-MiniCPM-o-2_6](https://huggingface.co/optimum-intel-internal-testing/tiny-random-MiniCPM-o-2_6)

## License

This model follows the same license as the original MiniCPM-o-2_6 model (Apache 2.0).

## Citation

If you use this test model in your CI/CD or testing infrastructure, please reference:

```bibtex
@misc{tiny-minicpm-o-2_6,
  author = {Arash Kermani},
  title = {Tiny Random MiniCPM-o-2_6 for Testing},
  year = {2026},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/arashkermani/tiny-random-MiniCPM-o-2_6}}
}
```

## Contact

For issues or questions about this test model, please open an issue in the [optimum-intel repository](https://github.com/huggingface/optimum-intel/issues).