Hugging Face
Raman
TheDetective
0 followers · 1 following
ramaneswaran
AI & ML interests
Natural Language Processing
Recent Activity
authored a paper 2 days ago
Do Audio-Language Models Understand Linguistic Variations?
reacted to vladbogo's post with 👍 about 1 year ago
"A Closer Look at the Limitations of Instruction Tuning" is a new paper that explores the efficacy and limitations of Instruction Tuning (IT) in Large Language Models (LLMs) for conversational agents. The authors conduct a series of experiments using both LoRA fine-tuning (LFT) and standard full-parameter fine-tuning (SFT) across various LLMs and IT datasets. The key findings are:
* LoRA fine-tuning (LFT) preserves the pre-training token distribution while SFT doesn't. This indicates that using LFT, post fine-tuning the model still heavily relies on the pre-training and doesn't acquire new information.
* Dataset scaling is ineffective for LFT - experiments show that scaling the dataset size 52x or even 326x doesn't improve the performance.
* LoRA fine-tuning mainly enhances response initiation and style without substantial knowledge enhancement.
* Full-parameter fine-tuning tends to degrade LLM knowledge base and increase hallucination occurrences.
* Popular other methods and adjustments fail to significantly outperform simple LoRA fine-tuned models in terms of conversational quality and accuracy.
Congrats to the authors @Sreyan88 and others for their work!
Paper: https://huggingface.co/papers/2402.05119
updated a model over 1 year ago
TheDetective/cross_ner_ai
Organizations
Papers
4
arxiv:2410.16505
arxiv:2404.00415
arxiv:2310.15799
arxiv:2307.13720
models
5
TheDetective/cross_ner_ai • Feature Extraction • Updated Aug 10, 2023 • 8
TheDetective/cross_ner_science • Feature Extraction • Updated Aug 10, 2023 • 11
TheDetective/cross_ner_literature • Feature Extraction • Updated Aug 10, 2023 • 8
TheDetective/cross_ner_music • Feature Extraction • Updated Aug 10, 2023 • 8
TheDetective/cross_ner_politics • Feature Extraction • Updated Aug 10, 2023 • 11
datasets
None public yet