I used this dataset: https://huggingface.co/datasets/Amod/mental_health_counseling_conversations/tree/main — it is also available on Kaggle: https://www.kaggle.com/datasets/melissamonfared/mental-health-counseling-conversations-k
Muhammad Imran Zaman PRO
ImranzamanML
AI & ML interests
Results-driven Machine Learning Engineer with 7+ years of experience leading teams and delivering advanced AI solutions that increased revenue by up to 40%. Proven track record in enhancing business performance through consultancy and expertise in NLP, Computer Vision, LLM models and end-to-end ML pipelines. Skilled in managing critical situations and collaborating with cross-functional teams to implement scalable, impactful solutions. Kaggle Grandmaster and top performer in global competitions, dedicated to staying at the forefront of AI advancements.
Recent Activity
replied to their post about 4 hours ago
Mental Health Chatbot by Fine-Tuning Llama 4
https://huggingface.co/blog/ImranzamanML/llama-4-fine-tuning-with-mental-health-counseling
reacted to DualityAI-RebekahBogdanoff's post with 👍 about 15 hours ago
We’re back—with higher stakes, new datasets, and more chances to stand out. Duality AI's Synthetic-to-Real Object Detection Challenge 2 is LIVE!🚦
✍ Sign up here: https://lnkd.in/g2avFP_X
After the overwhelming response to Challenge 1, we're pushing the boundaries even further in Challenge 2, where your object detection models will be put to the test in the real world after training only on synthetic data.
👉 Join our Synthetic-to-Real Object Detection Challenge 2 on Kaggle!
What’s Different This Time? Unlike our first challenge, we’re now diving deep into data manipulation. Competitors can:
🔹Access 4 new supplemental datasets via FalconCloud with varying lighting, occlusions, and camera angles.
🔹Generate your own synthetic datasets using FalconEditor to simulate edge cases.
🔹Mix, match, and build custom training pipelines for maximum mAP@50 performance.
This challenge isn’t just about using synthetic data—it’s about mastering how to craft the right synthetic data.
Ready to test your skills?
🏆The Challenge
Train an object detection model using synthetic images created with Falcon—Duality AI's cutting-edge digital twin simulation software—then evaluate your model on real-world imagery.
The Twist?
📈Boost your model’s accuracy by creating and refining your own custom synthetic datasets using Falcon!
Win Cash Prizes & Recognition
🔹Earn cash and public shout-outs from the Duality AI accounts
Enhance Your Portfolio
🔹Demonstrate your real-world AI and ML expertise in object detection to prospective employers and collaborators.
Expand Your Network
🔹Engage, compete, and collaborate with fellow ML engineers, researchers, and students.
🚀 Put your skills to the test and join our Kaggle competition today: https://lnkd.in/g2avFP_X
ImranzamanML's activity


posted an update about 15 hours ago
Mental Health Chatbot by Fine-Tuning Llama 4
https://huggingface.co/blog/ImranzamanML/llama-4-fine-tuning-with-mental-health-counseling

upvoted an article about 15 hours ago
Article: LLaMA 4 Fine-Tuning with Mental Health Counseling Data
published an article about 15 hours ago
Article: LLaMA 4 Fine-Tuning with Mental Health Counseling Data
posted an update 9 days ago
Llama 4 is here and it's making serious waves!
After diving into the latest benchmark results, it’s clear that Meta’s new Llama 4 lineup (Maverick, Scout, and Behemoth) is no joke.
Here are a few standout highlights🔍:
Llama 4 Maverick hits the sweet spot between cost and performance
- Outperforms GPT-4o in image tasks like ChartQA (90.0 vs 85.7) and DocVQA (94.4 vs 92.8)
- Beats others in MathVista and MMLU Pro too, and at a fraction of the cost ($0.19–$0.49 vs $4.38 🤯)
Llama 4 Scout is lean, cost-efficient, and surprisingly capable
- Strong performance across image and language tasks (e.g. ChartQA: 88.8, DocVQA: 94.4)
- More affordable than most competitors and still beats out larger models like Gemini 2.0 Flash-Lite
Llama 4 Behemoth is the heavy hitter.
- Tops the charts in LiveCodeBench (49.4), MATH-500 (95.0), and MMLU Pro (82.2)
- Even edges out Claude 3 Sonnet and Gemini 2 Pro in multiple areas
Meta didn’t just show up, they delivered across multimodal, coding, reasoning, and multilingual benchmarks.
And honestly? Seeing this level of performance, especially at lower inference costs, is a big deal for anyone building on LLMs.
Curious to see how these models do in real-world apps next.
#AI #Meta #Llama4 #LLMs #Benchmarking #MachineLearning #OpenSourceAI #GenerativeAI

posted an update 2 months ago
Hugging Face just launched the AI Agents Course – a free journey from beginner to expert in AI agents!
- Learn AI Agent fundamentals, use cases and frameworks
- Use top libraries like LangChain & LlamaIndex
- Compete in challenges & earn a certificate
- Hands-on projects & real-world applications
https://huggingface.co/learn/agents-course/unit0/introduction
You can join a live Q&A on Feb 12 at 5 PM CET to learn more about the course here:
https://www.youtube.com/live/PopqUt3MGyQ

upvoted an article 2 months ago
Article: Fine-Tuning 1B LLaMA 3.2: A Comprehensive Step-by-Step Guide with Code
posted an update 4 months ago
Deep understanding of the Concordance Index (C-index) evaluation measure for better models
Let's start with three patient groups:
Group A
Group B
Group C
For each patient, we will predict a risk score (a higher score means a higher risk of an early event).
Step 1: Understanding Concordance Index
The Concordance Index (C-index) evaluates how well the model ranks survival times.
Understand with sample data:
Group A has 3 patients with actual survival times and predicted risk scores:
Patient   Actual Survival Time   Predicted Risk Score
P1        5 months               0.8
P2        3 months               0.9
P3        10 months              0.2
Comparable pairs:
(P1, P2): P2 has a shorter survival time and a higher risk score → Concordant ✅
(P1, P3): P3 has a longer survival time and a lower risk score → Concordant ✅
(P2, P3): P3 has a longer survival time and a lower risk score → Concordant ✅
Total pairs = 3
Total concordant pairs = 3
C-index for Group A = Concordant pairs/Total pairs= 3/3 = 1.0
Step 2: Calculate C-index for All Groups
Repeat the process for all groups. For now, assume:
Group A: C-index = 1.0
Group B: C-index = 0.8
Group C: C-index = 0.6
Step 3: Stratified Concordance Index
The Stratified Concordance Index combines the C-index scores of all groups, focusing on the following:
Average performance across groups (mean of C-indices).
Consistency across groups (low standard deviation of C-indices).
Formula:
Stratified C-index = Mean(C-index scores) - Standard Deviation(C-index scores)
Calculate the mean:
Mean = (1.0 + 0.8 + 0.6) / 3 = 0.8
Calculate the standard deviation:
Standard Deviation = sqrt(((1.0-0.8)^2 + (0.8-0.8)^2 + (0.6-0.8)^2) / 3) ≈ 0.16
Stratified C-index:
Stratified C-index = 0.8 - 0.16 = 0.64
Step 4: Interpret the Results
A high Stratified C-index means:
The model predicts well overall (high mean C-index).
The model performs consistently across groups (low standard deviation of C-indices).
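The steps above can be sketched in a few lines of Python. The numbers are the toy values from the worked example, not real survival data, and ties in risk scores are ignored for simplicity:

```python
from itertools import combinations
from statistics import mean, pstdev

def c_index(survival_times, risk_scores):
    """Fraction of comparable patient pairs where the patient with the
    shorter survival time also has the higher predicted risk score."""
    concordant, comparable = 0, 0
    for (t_i, r_i), (t_j, r_j) in combinations(zip(survival_times, risk_scores), 2):
        if t_i == t_j:
            continue  # tied survival times are not comparable in this toy setup
        comparable += 1
        # concordant when the shorter-lived patient has the higher risk score
        if (t_i < t_j) == (r_i > r_j):
            concordant += 1
    return concordant / comparable

def stratified_c_index(group_c_indices):
    """Mean of per-group C-indices minus their population standard deviation."""
    return mean(group_c_indices) - pstdev(group_c_indices)

# Group A from the worked example: P1, P2, P3
print(c_index([5, 3, 10], [0.8, 0.9, 0.2]))           # 1.0
# Assumed per-group C-index scores from Step 2
print(round(stratified_c_index([1.0, 0.8, 0.6]), 2))  # 0.64
```

Note that `pstdev` (population standard deviation, dividing by N) matches the formula above; `stdev` (sample standard deviation, dividing by N-1) would give a slightly different result.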

reacted to dyyyyyyyy's post with 🔥 6 months ago
📊 We present ScaleQuest-Math-1M, a mathematical reasoning dataset of 1 million high-quality question-answer pairs.
🔥 We propose ScaleQuest, a scalable and novel data synthesis method that utilizes small-size open-source models to generate questions from scratch.
Project Page: https://scalequest.github.io/
Dataset: dyyyyyyyy/ScaleQuest-Math
Paper: Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch (2410.18693)
HF Collection: dyyyyyyyy/scalequest-670a7dc2623c91990f28913b

posted an update 6 months ago
Easy steps for an effective RAG pipeline with LLMs!
1. Document Embedding & Indexing
We can start by using embedding models to vectorize documents and store them in vector databases (Elasticsearch, Pinecone, Weaviate) for efficient retrieval.
2. Smart Querying
Then we can generate query embeddings, retrieve the top-K relevant chunks and apply hybrid search if needed for better precision.
3. Context Management
We can concatenate the retrieved chunks, optimize chunk order and keep within token limits to preserve response coherence.
4. Prompt Engineering
Then we can instruct the LLM to leverage the retrieved context, using clear instructions to prioritize the provided information.
5. Post-Processing
Finally we can implement response verification and fact-checking, and integrate feedback loops to refine the responses.
Happy to connect :)
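As a minimal end-to-end sketch of steps 1–4, here is a toy pipeline in pure Python. The bag-of-words "embedding" and in-memory list are stand-ins for a real embedding model and vector database, and the document texts are made up for illustration:

```python
from collections import Counter
from math import sqrt

def embed(text):
    """Crude stand-in for an embedding model: a bag-of-words Counter."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# 1. Document embedding & indexing (illustrative documents)
docs = [
    "Elasticsearch stores vectors for hybrid search",
    "Pinecone is a managed vector database",
    "Prompt engineering guides the LLM to use retrieved context",
]
index = [(doc, embed(doc)) for doc in docs]

# 2. Smart querying: retrieve the top-K most similar chunks
def retrieve(query, k=2):
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# 3 + 4. Context management and prompt engineering
def build_prompt(query, chunks):
    context = "\n".join(chunks)
    return (f"Use only the context below to answer.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")

query = "which vector database is managed?"
print(build_prompt(query, retrieve(query)))
```

A real pipeline would swap `embed` for a sentence-embedding model, `index` for a vector store, and send the prompt to an LLM; the retrieve-then-prompt flow stays the same.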