SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for retrieval.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-MiniLM-L6-v2
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
- Supported Modality: Text
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'BertModel'})
(1): Pooling({'embedding_dimension': 384, 'pooling_mode': 'mean', 'include_prompt': True})
(2): Normalize({})
)
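The Pooling and Normalize modules listed above can be sketched in plain Python. This is a minimal illustration, assuming toy 3-dimensional token vectors and a made-up attention mask (the real model produces 384-dimensional vectors); the helper names `mean_pool` and `l2_normalize` are hypothetical and are not part of the Sentence Transformers API.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    return [value / max(count, 1) for value in total]

def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

# Toy data (assumption): two real tokens and one padding token.
tokens = [[1.0, 2.0, 2.0], [3.0, 0.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)            # padding token is ignored
sentence_embedding = l2_normalize(pooled)   # unit-length sentence vector
print(pooled, sentence_embedding)
```

Because the final module normalizes every embedding, cosine similarity between two outputs reduces to a plain dot product.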
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("ronit01/final_golden_rag_tuned_minilm_mnr")
# Run inference
sentences = [
'What are all the Experiment class methods (experiment ops) provided by RapidFire AI, and what does each one do?',
'Run Evals\n------\n\nThe main function to launch LLM evaluation (evals), including with optional RAG, for a given config group in one go. \nSee :doc:`the Multi-Config Specification page</configs>` for more details on how to construct a config group. \n\n\n.. py:function:: run_evals(self, config_group: Any, dataset: Dataset, num_shards: int=4, num_actors: int, seed: int=42) -> dict[int, tuple[dict, dict]]:\n\n\t:param config_group: Single evals config knob dictionary, a generated config group, or a :code:`list` of configs or config groups\n\t:type config_group: Evals config-group or list as described in :doc:`the Multi-Config Specification page</configs>`\n\n\t:param dataset: Evaluation dataset to measure eval metrics\n\t:type dataset: Dataset\n\n\t:param num_shards: Number of logical splits of data to control degree of concurrency for multi-config execution (recommended: at least 4)\n\t:type num_shards: int\n\n\t:param num_actors: Number of parallel worker processes per machine to control degree of concurrency; (default: number of GPUs); (recommended max 16, if machine has no GPUs)\n\t:type num_actors: int, optional\n\n\t:param seed: Seed to control randomness for online aggregation (default: 42)\n\t:type seed: int, optional\n\n\t:return: Dictionary with a key being run/config ID and a value being a 2-tuple with a dictionary each for all aggregated metrics and all cumulative metrics\n\t:rtype: dict[int, tuple[dict, dict]]\n\n**Example:**\n\n.. code-block:: python\n\n\t# Based on FiQA RAG chatbot tutorial notebook\n\t>>> experiment.run_evals(configs=config_group, dataset=fiqa_dataset, num_shards=4, num_actors=8, seed=42)\n\tStarted 8 actor processes ...\n\n**Notes:**\n\nThis method auto-generates the ML metrics as per user specification and lists them in an auto-updated table \nshown on the notebook itself (and soon, on the ML metrics dashboard also).\nAlongside the metrics table, the Interactive Control (IC) Ops panel will also appear on the notebook itself.\nNote that :func:`run_evals()` must be actively running for you to be able to use IC Ops.\n\nWithin an experiment, you can rerun :func:`run_evals()` as many times as you want. All of them \nwill be overlaid on the same plots on the ML metrics dashboard.\n\nThe :code:`config_group` argument allows you to construct various knob combinations for inference pipelines \nand launch them in one go. These pipelines can involve LLMs running on your GPUs, or OpenAI API calls, or both. \n\nJust like with :func:`run_fit()` above, you can provide a single config dictionary, a :code:`list` of config \ndictionaries, a config group generator output (:func:`RFGridSearch()` or :func:`RFRandomSearch()` for now), \nor even a :code:`list` with mix of configs or config group generator outputs as its elements.\nPlease see the :doc:`the Multi-Config Specification page</search>` for more details. \n\nThe :code:`num_shards` argument is identical to the :code:`num_chunks` argument of :func:`run_fit()` above. \nThat is, it let you balance the degree of concurrency for cross-config comparisons against the (minor) \nextra swapping overhead incurred. Again, we recommend at least 4, which means you will see results being \nupdated for all runs on 1/4th of the data at a time.\n\nUnlike :func:`run_fit()`, this function does have a return value. In particular, it will return a dictionary \nwith the run/config ID as the key. The value is a 2-tuple with a dictionary each for all aggregated metrics \nand all cumulative metrics.',
'External Vector Stores: Pinecone and PGVector\n-------\n\nRapidFire AI also supports external persistent vector stores beyond the default in-memory FAISS.\nThis allows you to scale to larger corpora, persist indexes across runs and experiments, and leverage managed vector DBMS services.\nAs of this writing, **Pinecone** (hosted serverless or pod-based) and **PostgreSQL PGVector** (self-hosted or managed) are supported.\n\nEach external store supports three modes of operation:\n\n- **Create mode:** Build a new index from base documents from within RapidFire AI itself and use it for RAG.\n- **Read mode:** Retrieve from a pre-existing index and use it for RAG. \n- **Update mode:** Add new content to an existing index from additional base documents from within RapidFire AI itself and use it for RAG. \n\nSee the :doc:`API: LangChain RAG Spec page</ragspecs>` for more details on how to specify these external vector stores.\n\nThe FiQA RAG tutorial notebooks have also been extended to showcase the external stores as below:\n\n- **Pinecone**: `View on GitHub <https://github.com/RapidFireAI/rapidfireai/blob/main/tutorial_notebooks/rag-contexteng/rf-tutorial-rag-fiqa-pinecone.ipynb>`__\n- **PGVector**: `View on GitHub <https://github.com/RapidFireAI/rapidfireai/blob/main/tutorial_notebooks/rag-contexteng/rf-tutorial-rag-fiqa-pgvector.ipynb>`__',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.2191, 0.2401],
# [0.2191, 1.0000, 0.2900],
# [0.2401, 0.2900, 1.0000]])
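Since the model is intended for retrieval, a common next step is to rank candidate passages against a query. The sketch below shows only the ranking logic, using hand-written unit vectors in place of real embeddings; in practice the vectors would come from `model.encode(...)`. The function name `rank_documents` and the toy vectors are assumptions made for illustration.

```python
def dot(a, b):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))

def rank_documents(query_vec, doc_vecs, top_k=2):
    """Return (doc_index, score) pairs for the top_k highest-scoring documents."""
    scored = [(dot(query_vec, doc), idx) for idx, doc in enumerate(doc_vecs)]
    scored.sort(reverse=True)
    return [(idx, score) for score, idx in scored[:top_k]]

# Toy unit-length vectors (assumption): stand-ins for 384-dimensional embeddings.
query_vec = [0.6, 0.8, 0.0]
doc_vecs = [
    [0.0, 1.0, 0.0],  # partially aligned with the query
    [0.6, 0.8, 0.0],  # same direction as the query
    [0.0, 0.0, 1.0],  # orthogonal to the query
]

ranking = rank_documents(query_vec, doc_vecs)
for idx, score in ranking:
    print(f"doc {idx}: {score:.3f}")
```

The document pointing in the same direction as the query ranks first, and the orthogonal one is cut off by `top_k`.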
Training Details
Training Dataset
Unnamed Dataset
Size: 52 training samples
Columns: sentence_0 and sentence_1
Approximate statistics based on the first 52 samples:
| | sentence_0 | sentence_1 |
|:--------|:-----------|:-----------|
| type | string | string |
| details | min: 25 tokens; mean: 38.85 tokens; max: 70 tokens | min: 58 tokens; mean: 223.25 tokens; max: 256 tokens |
Samples:
- Sample 1
  - sentence_0: How does the run_fit() workflow for SFT training use the create_model_fn and formatting_func together to prepare models and data, and how does this compare to the run_evals() workflow's use of preprocess_fn and the generator config?
  - sentence_1: Formatting Function. Optional user-provided function to format each example (row) of the dataset to construct the prompt and completion with relevant roles and system prompt as expected by your model. Apart from adding the system prompt, for conversational data it should format the user instruction and assistant responses as separate message dictionary entries. It is passed to the :code:`formatting_func` argument of :class:`RFModelConfig`. Also read: :doc:`the LoRA and Model Configs page</models>`. You can create multiple variants of these functions and pass them all as a single :code:`List` to your :class:`RFModelConfig` to create a multi-config specification. This function is invoked by the underlying HF trainer on all examples of the train dataset and (if given) eval dataset on the fly. .. py:function:: sample_formatting_fn(row: Dict[str, Any]) -> Dict[str, List[Dict[str, str]]] :param row: Dictionary containing a single data example with keys like "instruction"...
- Sample 2
  - sentence_0: How does RapidFire AI's shard-based adaptive execution engine enable online aggregation of eval metrics with confidence intervals, and what specific mathematical strategies are available for computing those intervals?
  - sentence_1: RapidFire AI transforms the status quo by adapting the powerful idea of online aggregation from database systems research to LLM evals. Our adaptive execution engine, :doc:`as described on this page</difference>`, automatically shards the data and processes multiple configs in parallel, one shard at a time, with efficient swapping techniques. This means you get running metric estimates with confidence intervals in real time. So, you can confidently stop poor configs earlier, clone better configs on the fly, and perform more informed exploration to reach much better eval metrics in much less time. Example: Traditional Batch Evals vs. RapidFire AI. For instance, suppose you have an evals set with 400 queries. You decide to compare, say, 4 RAG configs in one go with RapidFire AI with number of shards set to 8. The illustration below contrasts traditional batch evals vs. RapidFire AI's approach for a simple eval metric. .. list-table:: :widths: 50 50 :clas...
- Sample 3
  - sentence_0: What is the default value of the text_key metadata field name used to store raw text content in Pinecone vector store configurations?
  - sentence_1: - :code:`"text_key"`: The metadata field name used to store the original raw text content associated with a vector in Pinecone. Optional; default is :code:`"text"`. Applicable to all modes. This is useful when the Pinecone index was populated by an external tool that stored text under a non-default metadata field name (e.g., :code:`"content"`, :code:`"original_text"`). - :code:`"vector_type"`: Vector type for the index. Accepts a :code:`VectorType` value or string. Optional for Create mode; default is :code:`"dense"`. N/A for Read/Update mode. - :code:`"tags"`: Arbitrary string key-value tags to attach to the index. Optional for Create mode; default is :code:`None`. N/A for Read/Update mode. - :code:`"timeout"`: Timeout in seconds for index operations. Optional for Create mode; default is :code:`None`. N/A for Read/Update mode. - :code:`"deletion_protection"`: Whether deletion protection is enabled. Accepts a :code:`DeletionProtection`...
Loss: MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false,
    "directions": [
        "query_to_doc"
    ],
    "partition_mode": "joint",
    "hardness_mode": null,
    "hardness_strength": 0.0
}
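To make the loss parameters above more concrete, here is a minimal pure-Python sketch of the in-batch-negatives idea behind MultipleNegativesRankingLoss in the query-to-document direction: each query's paired passage is treated as the positive, every other passage in the batch as a negative, and cross-entropy is applied to cosine similarities multiplied by `scale`. It is not the library's implementation; `mnr_loss` and the toy vectors are hypothetical.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(queries, docs, scale=20.0):
    """Mean cross-entropy where docs[i] is the positive for queries[i]."""
    losses = []
    for i, query in enumerate(queries):
        logits = [scale * cos_sim(query, doc) for doc in docs]
        # Numerically stable log-sum-exp over all candidates in the batch.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch (assumption): two queries with their paired passages.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned_docs = [[1.0, 0.0], [0.0, 1.0]]   # each positive matches its query
swapped_docs = [[0.0, 1.0], [1.0, 0.0]]   # positives point the wrong way

aligned_loss = mnr_loss(queries, aligned_docs)
swapped_loss = mnr_loss(queries, swapped_docs)
print(aligned_loss, swapped_loss)
```

With a scale of 20, well-aligned pairs drive the loss close to zero, while mismatched pairs are penalized by roughly the scale value.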
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 1
- multi_dataset_batch_sampler: round_robin
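The arithmetic implied by these settings is easy to check. The sketch below assumes a single training device, no gradient accumulation, and that the final partial batch is kept; none of the variable names come from the training script.

```python
import math

num_samples = 52   # training set size reported above
batch_size = 16    # per_device_train_batch_size (single device is an assumption)

# Optimizer steps in the single training epoch.
steps_per_epoch = math.ceil(num_samples / batch_size)

# Size of each batch, including the smaller final one.
batch_sizes = [
    min(batch_size, num_samples - start)
    for start in range(0, num_samples, batch_size)
]

# With in-batch negatives, each query sees (batch size - 1) negative passages.
negatives_per_query = [size - 1 for size in batch_sizes]

print(steps_per_epoch)       # 4
print(batch_sizes)           # [16, 16, 16, 4]
print(negatives_per_query)   # [15, 15, 15, 3]
```

Under these assumptions the whole run amounts to four optimizer steps, and the last batch provides far fewer in-batch negatives than the first three.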
All Hyperparameters
- do_predict: False
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_ratio: None
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- enable_jit_checkpoint: False
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- use_cpu: False
- seed: 42
- data_seed: None
- bf16: False
- fp16: False
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: -1
- ddp_backend: None
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_for_metrics: []
- eval_do_concat_batches: True
- auto_find_batch_size: False
- full_determinism: False
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- use_cache: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}
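The linear scheduler with zero warmup steps listed above can be illustrated with a small sketch. It assumes the run takes four optimizer steps in total (52 samples at batch size 16 on one device, which is an assumption about the training setup); `linear_lr` is a hypothetical helper that mirrors the usual linear-decay-with-warmup rule rather than calling the Transformers scheduler.

```python
def linear_lr(step, base_lr=5e-05, total_steps=4, warmup_steps=0):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp-up during warmup (unused here because warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate applied at each of the four assumed steps.
schedule = [linear_lr(step) for step in range(4)]
for step, lr in enumerate(schedule):
    print(f"step {step}: lr = {lr:.3e}")
```

Under this assumption the learning rate starts at 5e-05 and falls by a quarter of that value at every step, reaching zero once training ends.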
Training Time
- Training: 1.0 seconds
Framework Versions
- Python: 3.12.13
- Sentence Transformers: 5.4.1
- Transformers: 5.0.0
- PyTorch: 2.10.0+cu128
- Accelerate: 1.13.0
- Datasets: 4.0.0
- Tokenizers: 0.22.2
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{oord2019representationlearningcontrastivepredictive,
title={Representation Learning with Contrastive Predictive Coding},
author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
year={2019},
eprint={1807.03748},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/1807.03748},
}
Model tree for ronit01/final_golden_rag_tuned_minilm_mnr
Base model
nreimers/MiniLM-L6-H384-uncased