The exact MappingAdapter architecture is available in representation_mapping.py.

Mapping "sentence-transformers/stsb-roberta-large"'s hidden representation to "mistralai/Mistral-7B-Instruct-v0.1"'s.

Training:

  • Steps: 114k

  • Gradient accumulation: 2

  • Batch size: 64

  • Warm-up steps: 100

  • Learning rate: 3e-5 with linear scheduling

  • Eval steps: every 8,000 steps

  • Training hours: ~98h

  • Eval hours: ~10h

  • Gradient updates: 57k

  • Train examples: 7.3M

  • Eval examples: 106k

  • Adapter: Decoder_dim (4096) → 4096 → LeakyReLU(0.1) → Encoder_dim (1024) (see the sketches below)
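
For reference, a minimal PyTorch sketch of the adapter architecture described above. The class and argument names here are illustrative assumptions; the authoritative definition is in representation_mapping.py.

```python
import torch
import torch.nn as nn

class MappingAdapter(nn.Module):
    # Projects decoder (Mistral-7B) hidden states, 4096-d, through a 4096-d
    # intermediate layer with LeakyReLU(0.1) into the encoder
    # (stsb-roberta-large) space, 1024-d.
    def __init__(self, decoder_dim: int = 4096, encoder_dim: int = 1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(decoder_dim, decoder_dim),   # 4096 -> 4096
            nn.LeakyReLU(0.1),                     # negative slope 0.1
            nn.Linear(decoder_dim, encoder_dim),   # 4096 -> 1024
        )

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return self.net(hidden_states)
```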
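And a sketch of how the listed training hyperparameters could be wired up. The optimizer choice (AdamW) is an assumption not stated on this card, and the scheduler is assumed to step once per gradient update (57k updates = 114k steps / gradient accumulation of 2).

```python
from transformers import get_linear_schedule_with_warmup

adapter = MappingAdapter()
optimizer = torch.optim.AdamW(adapter.parameters(), lr=3e-5)  # AdamW is an assumption
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=100,       # warm-up steps listed above
    num_training_steps=57_000,  # gradient updates: 114k steps / accumulation of 2
)
```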
