"A Closer Look at the Limitations of Instruction Tuning" is a new paper that explores the efficacy and limitations of Instruction Tuning (IT) in Large Language Models (LLMs) for conversational agents. The authors conduct a series of experiments using both LoRA fine-tuning (LFT) and standard full-parameter fine-tuning (SFT) across various LLMs and IT datasets.
The key findings are:

* LoRA fine-tuning (LFT) preserves the pre-training token distribution, while SFT does not. This indicates that after LFT the model still relies heavily on its pre-training and does not acquire new knowledge.
* Dataset scaling is ineffective for LFT: experiments show that scaling the dataset size 52x or even 326x does not improve performance.
* LoRA fine-tuning mainly enhances response initiation and style without substantial knowledge gains.
* Full-parameter fine-tuning tends to degrade the LLM's knowledge base and increase hallucinations.
* Other popular methods and adjustments fail to significantly outperform simple LoRA fine-tuned models in conversational quality and accuracy.
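To make the LFT/SFT distinction concrete, here is a minimal sketch of how the two setups typically differ in code, using Hugging Face transformers and peft. The base model name, LoRA hyperparameters, and target modules are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch contrasting LoRA fine-tuning (LFT) with full-parameter
# fine-tuning (SFT) using Hugging Face transformers + peft. The model name,
# LoRA hyperparameters, and target modules are illustrative assumptions,
# not the paper's exact setup.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# --- LFT: freeze the base model and train only low-rank adapter matrices ---
lora_cfg = LoraConfig(
    r=16,                                  # adapter rank
    lora_alpha=32,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
lft_model = get_peft_model(base, lora_cfg)
lft_model.print_trainable_parameters()     # typically well under 1% of all weights

# --- SFT: every weight is trainable, so the pre-trained token distribution
#     can shift (and, per the paper, knowledge can degrade) ---
sft_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
num_trainable = sum(p.numel() for p in sft_model.parameters() if p.requires_grad)
```

Because LFT touches so few parameters, the paper's observation that it mostly shifts response style rather than injecting new knowledge is intuitive from this setup alone.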
Congrats to the authors @Sreyan88 and others for their work!
Why? The best closed chat models are built on top of multi-turn dialogue preference data. The OSS community lacks such datasets. This dataset is the first in a series aimed at closing that gap.
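For readers unfamiliar with the format, a multi-turn dialogue preference record typically pairs a conversation prefix with a preferred and a rejected continuation. The field names and contents below are assumptions for illustration only, not the dataset's actual schema; see the dataset card for the real format.

```python
# Illustrative shape of a multi-turn dialogue preference record.
# Field names ("conversation", "chosen", "rejected") are assumptions
# for this sketch -- check the dataset card for the actual schema.
example_record = {
    "conversation": [
        {"role": "user", "content": "How do I read a CSV file in Python?"},
        {"role": "assistant", "content": "You can use the csv module or pandas.read_csv..."},
        {"role": "user", "content": "And how do I filter rows by a column value?"},
    ],
    # Two candidate second-turn answers, ranked by preference:
    "chosen": {"role": "assistant", "content": "With pandas: df[df['score'] > 0] keeps only..."},
    "rejected": {"role": "assistant", "content": "Just loop over every line manually."},
}
```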
Is this dataset useful? To test it, we've built our virtual launching partner:
Welcome CapybaraHermes, a preference-tuned OpenHermes with improved second-turn capabilities on MTBench
As usual, models are the least important to us. We like to focus on the data. Our mission is to build and share high-quality datasets, sharing our methods in the open so the community can improve upon them.
That's why we took the time to describe the full methodology on the dataset card. Check it out and give us feedback! Data and methods are never perfect!
Finally, this is just a preview version, and we'd love to collaborate with you on adding more benchmarking results, figuring out which hyperparams work for DPO'ing models, which dataset mixes help, and so on.
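As a starting point for that collaboration, here is a minimal sketch of DPO training with TRL's DPOTrainer. The base model, dataset placeholder, and hyperparameters (beta, learning rate, batch size) are assumptions rather than recommended values, and some argument names vary across trl versions.

```python
# Minimal DPO fine-tuning sketch with TRL. Dataset name is a placeholder and
# all hyperparameters are assumptions; adapt them to your own setup.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "teknium/OpenHermes-2.5-Mistral-7B"   # illustrative base model
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# A preference dataset with "prompt", "chosen", and "rejected" columns
# (column names assumed; adapt to the actual dataset schema).
dataset = load_dataset("your-org/your-preference-dataset", split="train")

config = DPOConfig(
    output_dir="openhermes-dpo",       # hypothetical output path
    beta=0.1,                          # KL penalty strength, a common default
    learning_rate=5e-7,
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,        # `tokenizer=` in older trl versions
)
trainer.train()
```

If you try variations of beta, learning rate, or dataset mixes, sharing the resulting MTBench scores is exactly the kind of feedback we're hoping for.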
Expect some more datasets in the coming weeks. Let's build the best data for AI, together.