Instructions for using ronit01/rag_tuned_minilm_mnr_5epoch with libraries, inference providers, notebooks, and local apps.

Libraries
How to use ronit01/rag_tuned_minilm_mnr_5epoch with sentence-transformers:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("ronit01/rag_tuned_minilm_mnr_5epoch")

sentences = [
    "How does RapidFire AI's adaptive execution engine differ from traditional sequential execution for multi-config experiments?",
    "Why Not Just Downsample?\n------------------------\n\nOne might wonder why downsampling the eval set does not suffice here. \n:doc:`As also explained on this page</difference>`, downsampling alone has \nsignificant disadvantages compared to the approach offered by RapidFire AI. \n\nFirst, you have to decide a downsample size upfront, which is not trivial if your\neval metrics have high variance across examples. Point estimates without confidence \nintervals can give false confidence in a sample. You can resample manually \nover and over, but that adds manual grunt work of juggling separate samples/files. \nFinally, downsampling alone does not offer you the power of IC Ops and automated \nparallelization to try new configs on the fly--you'd have reimplement those manually.\n\nRapidFire AI's online aggregation approach with IC Ops avoids all the above issues,\nwhile also being **complementary** to downsampling, i.e., you can use both in \nconjunction for even lower runtimes/costs.",
    "The crux of RapidFire AI's difference is in its *adaptive execution engine*: it enables \"interruptible\"\nexecution of configurations across GPUs/CPUs. To do so, it first shards the training and/or evaluation \ndataset randomly into \"chunks\" (also called \"shards\").\nThen instead of waiting for a run to see the whole dataset for all epochs (for SFT/RFT) or for full \neval metrics calculation (for RAG evals), RapidFire AI schedules all runs on *one shard at a time*, \nand then cycles through all shards.\n\nSuppose you have only 1 GPU, say an A100 or H100, and you want to run SFT on a Llama model. \nCurrent tools force you to run one config after another *sequentially* as shown in the (simplified) illustration below. \nIn contrast, by operating on shards, RapidFire AI offers a far more concurrent learning experience by \nautomatically *swapping* adapters (and base models, if needed) across GPU(s) and DRAM. \nIt does this via efficient shared memory-based caching mechanisms that can spill to disk when needed.\n\n.. image:: /images/gantt-1gpu.png\n   :width: 800px\n\nIn the above figure, all 3 model configs are shown for 1 epoch. RapidFire AI is set to use 4 chunks.\nSo, before model config 3 (M3) even starts in the sequential approach, RapidFire AI already shows you \nthe learning behaviors of all 3 configs on the first 2-3 chunks. \nThe overhead of swapping, represented by the thin gray box, is minimal, less than 5% of the runtime,\nas per our measurements--thanks to our new efficient memory management techniques.\n\nFor inference evals for RAG/context engineering, such sharded execution means RapidFire AI surfaces eval metrics \nsooner based on a statistical technique known as *online aggregation* from the database systems literature.\nBasically, see estimated values and confidence intervals for all eval metrics in real time as the shards \nget processed, ultimately converging to the exact metrics on the full dataset.",
    "Delete\n----\n\nThis IC Op earmarks the run to be deleted from the next chunk onward. \nOn the chart, you will see its curves vanish almost immediately. \nYou cannot do any further IC Ops on a deleted run because it will not be visible. \nNote that although a deleted run vanishes from the plots, its model checkpoints are still part of \nthe artifacts of that experiment so that you have post-hoc audibility.\n"
]
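# Encode each text into a fixed-size embedding vector (one row per input)
embeddings = model.encode(sentences)

# Pairwise similarity matrix between all inputs
# (cosine similarity unless the model's configuration overrides it)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([4, 4])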
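
In a retrieval setting, the first entry in `sentences` is a query and the remaining three are candidate passages, so the useful part of the similarity matrix is its first row. The following is a minimal sketch of that ranking step, not the model's own code. It uses made-up toy vectors in place of real embeddings and assumes cosine similarity. `cosine` and `rank_passages` are illustrative helper names, not part of sentence-transformers.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_passages(query_vec, passage_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy 3-dimensional vectors standing in for real embeddings (made up for illustration).
query = [1.0, 0.0, 1.0]
passages = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 0.1, 0.9],  # points in nearly the same direction as the query
    [0.5, 0.5, 0.0],  # partially aligned with the query
]

ranking = rank_passages(query, passages)
for index, score in ranking:
    print(index, round(score, 3))
```

With real embeddings, the same ordering could be read from the first row of `similarities`, skipping column 0, which compares the query with itself.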