SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for retrieval.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-MiniLM-L6-v2
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
- Supported Modality: Text
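As a small illustration of the similarity function listed above, the following sketch computes cosine similarity in plain Python. The vectors `u` and `v` are made-up two-dimensional stand-ins, not real 384-dimensional model outputs. It also shows that for unit-length vectors the cosine similarity reduces to a plain dot product.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical unit-length vectors standing in for two embeddings.
u = [0.6, 0.8]
v = [1.0, 0.0]

score = cosine_similarity(u, v)
dot_only = sum(x * y for x, y in zip(u, v))

# Both values agree (up to floating-point rounding) because u and v have length 1.
print(score, dot_only)
```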
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'BertModel'})
(1): Pooling({'embedding_dimension': 384, 'pooling_mode': 'mean', 'include_prompt': True})
(2): Normalize({})
)
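To make the last two modules concrete, here is a minimal pure-Python sketch of mean pooling followed by L2 normalization. It is an illustration of the idea only, not the library's implementation; the helper names `mean_pool` and `l2_normalize` and the toy token vectors are assumptions made for this example.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, skipping padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [value / count for value in total]


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


# Toy input: three token vectors of dimension 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)   # the padding vector does not contribute
embedding = l2_normalize(pooled)   # unit-length sentence vector
print(pooled, embedding)
```

Because the final vector has unit length, comparing two such embeddings with a dot product gives the same result as cosine similarity.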
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("ronit01/golden_rag_tuned_minilm_mnr")
# Run inference
sentences = [
"How does RapidFire AI's concept of a 'config dictionary' with set-valued knobs relate to config groups and leaf configs, and why is this abstraction important for multi-config experimentation?",
'Run\n-----\n\nA central concept in RapidFire AI representing a single combination of configuration knob values\nfor a model trained with :func:`run_fit()`. \nIt is the same concept as in ML metrics dashboards such as MLflow and Weights & Biases. \nRapidFire AI assigns each run a unique integer :code:`run_id` within an experiment.\n',
' :param vector_store_cfg: The vector store type and args to store and possibly index embedding vectors for retrieval, provided as a single dictionary. \n \n - :code:`"type"`: The type of vector store to use. Must be one of :code:`"faiss"`, :code:`"pgvector"`, or :code:`"pinecone"`. Required.\n - :code:`"batch_size"`: Number of vectors per insert batch. Applies to all 3 types of stores. Optional; default is 128.\n\n The remaining keys are type-specific args as listed below. The vector store operates in one of 3 modes depending on the rest of the RAG spec:\n\n - **Create mode:** When :code:`document_loader` is provided and no pre-existing index/collection names are specified, a new vector store is *created* and populated from the loaded documents.\n - **Read mode:** When :code:`document_loader` is absent and pre-existing index/collection names are specified, the vector store is opened in *read-only* mode for retrieval against the existing index.\n - **Update mode:** When both :code:`document_loader` and pre-existing index/collection names are provided, the existing index/collection is *updated* with the new documents added to it.\n\n Supported vector store types and their arg keys:\n\n - **FAISS:** No additional keys. Uses a flat L2 index by default. Set :code:`enable_gpu_search=True` on the constructor to use GPU-accelerated FAISS. Only supports Create mode since it\'s an in-memory store that is not persistent. So, the notion of pre-existing indexes does not apply.\n\n - **Pinecone:**\n\n - :code:`"pinecone_api_key"`: Pinecone API key. Optional if the :code:`PINECONE_API_KEY` environment variable is set.\n - :code:`"index_namespace"`: A 2-tuple of strings (:code:`tuple[str, str]`) with index name and namespace. Required for Read/Update mode and must be a pre-existing index and namespace (NB: namespace can be empty string :code:`""` in Pinecone). 
N/A for Create mode.\n - :code:`"spec"`: A :code:`ServerlessSpec` or :code:`PodSpec` instance specifying the Pinecone deployment (e.g., cloud and region). Required for Create mode. N/A for Read/Update mode.\n - :code:`"metric"`: Distance metric for the index, must be one of :code:`"cosine"`, :code:`"euclidean"`, or :code:`"dotproduct"`. Optional for Create mode; default is :code:`"cosine"`. N/A for Read/Update mode.\n - :code:`"embedding_cfg"`: Embedding config dict (same format as the top-level :code:`embedding_cfg`). Required for any mode either here or in the top-level config for any mode. If provided here, *this takes precedence* over the top-level embedding config. For Create mode, we recommend providing it in the top-level config unless you want to couple different embedding configs with different vector stores.\n - :code:`"text_key"`: The metadata field name used to store the original raw text content associated with a vector in Pinecone. Optional; default is :code:`"text"`. Applicable to all modes. This is useful when the Pinecone index was populated by an external tool that stored text under a non-default metadata field name (e.g., :code:`"content"`, :code:`"original_text"`).\n - :code:`"vector_type"`: Vector type for the index. Accepts a :code:`VectorType` value or string. Optional for Create mode; default is :code:`"dense"`. N/A for Read/Update mode.\n - :code:`"tags"`: Arbitrary string key-value tags to attach to the index. Optional for Create mode; default is :code:`None`. N/A for Read/Update mode.\n - :code:`"timeout"`: Timeout in seconds for index operations. Optional for Create mode; default is :code:`None`. N/A for Read/Update mode.\n - :code:`"deletion_protection"`: Whether deletion protection is enabled. Accepts a :code:`DeletionProtection` value or string. Optional for Create mode; default is :code:`"disabled"`. 
N/A for Read/Update mode.\n\n To recap, for all 3 modes :code:`"pinecone_api_key"` is needed either here or as an environment variable; :code:`embedding_cfg` is also required either here or in the top-level config. The :code:`"text_key"` is optional for all modes and defaults to :code:`"text"`. \n \n For Create mode, :code:`"spec"` is required but the following are all optional: :code:`"metric"`, :code:`"vector_type"`, :code:`"tags"`, :code:`"timeout"`, and :code:`"deletion_protection"`. Although the argument :code:`"index_namespace"` is inapplicable, internally RapidFire AI creates an index name automatically with prefix "rf-" and an SHA hash per pre-processing worker to avoid naming conflicts; the namespace created is the default empty string.\n \n For Read/Update mode, :code:`"index_namespace"` is required and must point to a pre-existing index and namespace. All the other arguments are inapplicable.\n\n - **Postgres PGVector:**\n\n - :code:`"connection"`: DB connection string or engine. Required for all modes.\n - :code:`"collection_name"`: A pre-existing PGVector collection/table name to use for retrieval. Required for Read/Update mode. Inapplicable to Create mode; an SHA-based random name will be generated.\n - :code:`"embedding_cfg"`: Same explanation as above under Pinecone.\n - :code:`"pre_delete_collection"`: If :code:`True`, *deletes* the collection if it already exists before writing. **Use with caution.** Optional; default is :code:`False`. Applicable only to Update mode.\n\n The store is built from the documents provided via :code:`document_loader`. If this entire config is skipped, a default FAISS flat vector store will be created automatically.\n :type vector_store_cfg: dict[str, Any], optional',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.5984, 0.3058],
# [0.5984, 1.0000, 0.2532],
# [0.3058, 0.2532, 1.0000]])
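For retrieval, the similarity matrix is typically read row by row: the row for the query gives a score for every candidate passage. The sketch below reuses the values printed above and ranks the two passages against the question at index 0. The helper `rank_candidates` is a hypothetical name introduced for this example.

```python
# Similarity values copied from the printed output above.
similarities = [
    [1.0000, 0.5984, 0.3058],
    [0.5984, 1.0000, 0.2532],
    [0.3058, 0.2532, 1.0000],
]


def rank_candidates(sim_matrix, query_index):
    """Return (index, score) pairs for all other rows, best match first."""
    scores = [
        (idx, score)
        for idx, score in enumerate(sim_matrix[query_index])
        if idx != query_index  # skip the query's similarity with itself
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = rank_candidates(similarities, query_index=0)
print(ranking)  # [(1, 0.5984), (2, 0.3058)]
```

Here the passage describing a run (index 1) scores higher for the question than the vector store configuration passage (index 2).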
Training Details
Training Dataset
Unnamed Dataset
Size: 111 training samples
Columns: sentence_0 and sentence_1
Approximate statistics based on the first 111 samples:
| | sentence_0 | sentence_1 |
|:--|:--|:--|
| type | string | string |
| details | min: 15 tokens, mean: 41.97 tokens, max: 70 tokens | min: 36 tokens, mean: 227.35 tokens, max: 256 tokens |
Samples:
- sentence_0: What are all the Experiment class methods (experiment ops) provided by RapidFire AI, and what does each one do?
  sentence_1:
  Experiment Constructor
  ------
  Constructor to instantiate a new experiment.
  .. py:function:: __init__(self, experiment_name: str, mode: str = "fit", experiments_path: str = "./rapidfire_experiments") -> None
  :param experiment_name: Unique name for this experiment
  :type experiment_name: str
  :param mode: Mode of this experiment, either :code:`"fit"` or :code:`"eval"`; default is :code:`"fit"`
  :type mode: str
  :param experiments_path: Path to a folder to store this experiment's artifacts. Default is :code:`"./rapidfire_experiments"`)
  :type experiments_path: str, optional
  :return: None
  :rtype: None
- sentence_0: What are all the parameters accepted by the RFOpenAIAPIModelConfig class, and what does each one configure?
  sentence_1:
  RFOpenAIAPIModelConfig
  This is a wrapper around OpenAI's API client config and chat completion parameters. The full list of their arguments are available on `this page <https://platform.openai.com/docs/api-reference/chat/create>`__.
  The difference here is that the individual arguments (knobs) can be :class:`List` valued or :class:`Range` valued in an :class:`RFOpenAIAPIModelConfig`. That is how you can specify a base set of knob combinations from which a config group can be produced. Also read :doc:`the Multi-Config Specification page</configs>`.
  .. py:class:: RFOpenAIAPIModelConfig
  :param client_config: A dictionary necessary for initializing the AsyncOpenAI client. All knobs given in this dictionary are simply passed to the AsyncOpenAI client as is. We recommend listing at least the following knobs.
  * :code:`"api_key"`: Your OpenAI API key for authentication. Note that we are NOT able to provide a publicly visible API key.
  * :code:`"max_retries"`: Maximum ...
- sentence_0: How do RFvLLMModelConfig and RFOpenAIAPIModelConfig compare in terms of their configuration parameters, underlying systems, rate limiting capabilities, and typical use cases?
  sentence_1:
  RFOpenAIAPIModelConfig
  This is a wrapper around OpenAI's API client config and chat completion parameters. The full list of their arguments are available on `this page <https://platform.openai.com/docs/api-reference/chat/create>`__.
  The difference here is that the individual arguments (knobs) can be :class:`List` valued or :class:`Range` valued in an :class:`RFOpenAIAPIModelConfig`. That is how you can specify a base set of knob combinations from which a config group can be produced. Also read :doc:`the Multi-Config Specification page</configs>`.
  .. py:class:: RFOpenAIAPIModelConfig
  :param client_config: A dictionary necessary for initializing the AsyncOpenAI client. All knobs given in this dictionary are simply passed to the AsyncOpenAI client as is. We recommend listing at least the following knobs.
  * :code:`"api_key"`: Your OpenAI API key for authentication. Note that we are NOT able to provide a publicly visible API key.
  * :code:`"max_retries"`: Maximum ...
Loss:
MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false,
    "directions": [
        "query_to_doc"
    ],
    "partition_mode": "joint",
    "hardness_mode": null,
    "hardness_strength": 0.0
}
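The following is a minimal pure-Python sketch of the idea behind a multiple negatives ranking loss in the query-to-document direction: each query is scored against every document in the batch with scaled cosine similarity, and cross-entropy pushes the matching document to the top. It uses the scale of 20.0 listed above, but it illustrates the technique only and is not the library's implementation. The function name `mnr_loss` and the toy vectors are assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(queries, docs, scale=20.0):
    """Mean cross-entropy where docs[i] is the positive for queries[i]
    and every other document in the batch acts as a negative."""
    losses = []
    for i, query in enumerate(queries):
        logits = [scale * cosine(query, doc) for doc in docs]
        # Numerically stable log-sum-exp over the batch.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two (query, document) pairs.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each query matches its own document
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each query matches the wrong document

loss_aligned = mnr_loss(queries, aligned)
loss_swapped = mnr_loss(queries, swapped)
print(loss_aligned, loss_swapped)
```

The aligned batch yields a loss close to zero, while the swapped batch yields a loss close to the scale value, which is the gap the optimizer tries to close during fine-tuning.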
Training Hyperparameters
Non-Default Hyperparameters
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
num_train_epochs: 1
multi_dataset_batch_sampler: round_robin
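As a quick sanity check on these settings, the arithmetic below estimates how many optimizer steps one epoch takes. It assumes a single training device, and it relies on the values gradient_accumulation_steps: 1 and dataloader_drop_last: False from the full hyperparameter list, so the final partial batch is kept.

```python
import math

num_samples = 111       # training set size reported above
batch_size = 16         # per_device_train_batch_size
num_epochs = 1          # num_train_epochs

# Assumption: one device, no gradient accumulation, partial last batch kept.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * num_epochs
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size

# With in-batch negatives, each query in a full batch is contrasted
# against the other documents in that batch.
negatives_per_query_full_batch = batch_size - 1

print(total_steps, last_batch_size, negatives_per_query_full_batch)  # 7 15 15
```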
All Hyperparameters
do_predict: False
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 1
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: None
warmup_ratio: None
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
enable_jit_checkpoint: False
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
use_cpu: False
seed: 42
data_seed: None
bf16: False
fp16: False
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: -1
ddp_backend: None
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
group_by_length: False
length_column_name: length
project: huggingface
trackio_space_id: trackio
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_for_metrics: []
eval_do_concat_batches: True
auto_find_batch_size: False
full_determinism: False
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_num_input_tokens_seen: no
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: True
use_cache: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin
router_mapping: {}
learning_rate_mapping: {}
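To illustrate how learning_rate: 5e-05, lr_scheduler_type: linear, and warmup_steps: 0 interact, the sketch below computes a linear decay schedule by hand. The total of 7 steps is an assumption derived from 111 samples at batch size 16 on a single device, and `linear_lr` is a hypothetical helper that mirrors the usual linear-with-warmup shape rather than calling the trainer's scheduler.

```python
def linear_lr(step, base_lr=5e-5, total_steps=7, warmup_steps=0):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate before each of the 7 steps, plus the value after the last one.
schedule = [linear_lr(step) for step in range(8)]
for step, lr in enumerate(schedule):
    print(step, f"{lr:.2e}")
```

With no warmup, the rate starts at its peak and falls by one seventh of the peak at every step, reaching zero once training ends.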
Training Time
- Training: 1.6 seconds
Framework Versions
- Python: 3.12.13
- Sentence Transformers: 5.4.1
- Transformers: 5.0.0
- PyTorch: 2.10.0+cu128
- Accelerate: 1.13.0
- Datasets: 4.0.0
- Tokenizers: 0.22.2
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{oord2019representationlearningcontrastivepredictive,
title={Representation Learning with Contrastive Predictive Coding},
author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
year={2019},
eprint={1807.03748},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/1807.03748},
}
Model tree for ronit01/golden_rag_tuned_minilm_mnr
Base model
nreimers/MiniLM-L6-H384-uncased