Eduardo Muñoz Sala

edumunozsala

AI & ML interests

Interested in NLP tasks such as summarization, classification, and tagging, and in deploying models to production

Recent Activity

upvoted a collection 4 months ago

SmolLM2

updated a model 4 months ago

edumunozsala/Qwen2-0.5B-mntp-simcse

updated a model 4 months ago

edumunozsala/Qwen2-0.5B-mntp-since-L2V

edumunozsala's activity

upvoted a collection 4 months ago

SmolLM2

Collection

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated Feb 20 • 251

updated 2 models 4 months ago

edumunozsala/Qwen2-0.5B-mntp-simcse

Updated Dec 1, 2024 • 7

edumunozsala/Qwen2-0.5B-mntp-since-L2V

Updated Nov 30, 2024

updated 2 models 10 months ago

edumunozsala/phi3-mini-4k-qlora-python-code-20k

Text Generation • Updated Jun 6, 2024 • 18

edumunozsala/phi3-mini-python-code-20k

Text Generation • Updated Jun 6, 2024 • 8

New activity in somosnlp/instruct-legal-refugiados-es 10 months ago

Code for the file creacion_datasets_refugiados_HFEndpoint.ipynb

#2 opened 10 months ago by enricarmengol

updated 2 models 11 months ago

edumunozsala/phi-3-mini-QLoRA

Updated May 12, 2024 • 176

edumunozsala/phi-3-mini-LoRA

Text Generation • Updated May 11, 2024 • 6 • 1

New activity in edumunozsala/phi-3-mini-LoRA 11 months ago

What do you mean "unknown dataset"

#1 opened 11 months ago by Delcos

reacted to Severian's post with 👍 11 months ago

Post

Create and Train Your Own Expert LLM: Generating Synthetic, Fact-Based Datasets with LMStudio/Ollama and then fine-tuning with MLX and Unsloth

Hey everyone!

I know there are tons of videos and tutorials out there already, but I've noticed a lot of questions popping up in community posts about using synthetic datasets for creative projects and how to transform personal content into more factual material. In my own work doing enterprise-level SFT and crafting my open-source models, I've enhanced a Python framework originally shared by the creator of the Tess models. This improved stack uses local language models and also integrates the Wikipedia dataset to make the generated content as accurate and reliable as possible.

I've been thinking of putting together a comprehensive, step-by-step course/guide on creating your own Expert Language Model, from dataset preparation and training to deployment on Hugging Face, and even using something like AnythingLLM for user interaction. I'll walk you through each phase, clarifying complex concepts and troubleshooting common pitfalls.

Let me know if this interests you!

Most of the datasets and models I've made have been using these scripts and my approach