John Smith (John6666)

AI & ML interests

None yet


Organizations

open/ acc · Solving Real World Problems · FashionStash Group meeting · No More Copyright

John6666's activity

reacted to clem's post with 🤗 about 8 hours ago
upvoted an article about 8 hours ago
Training and Finetuning Reranker Models with Sentence Transformers v4
reacted to tomaarsen's post with 🔥 about 8 hours ago
‼️Sentence Transformers v4.0 is out! You can now train and finetune reranker models with multi-GPU training, bf16 support, loss logging, callbacks & much more. I also prove that finetuning on your domain helps much more than you might think.

1️⃣ Reranker Training Refactor
Reranker models can now be trained using an extensive trainer with a lot of powerful features:
- Multi-GPU training (Data Parallelism (DP) and Distributed Data Parallelism (DDP))
- bf16 training support
- Loss logging
- Evaluation datasets + evaluation loss
- Improved callback support + an excellent Weights & Biases integration
- Gradient checkpointing, gradient accumulation
- Model card generation
- Resuming from a training checkpoint without performance loss
- Hyperparameter Optimization
and much more!

Read my detailed blogpost to learn about the components that make up this new training approach: https://huggingface.co/blog/train-reranker
Notably, the release is fully backwards compatible: all deprecations are soft, meaning that they still work but emit a warning informing you how to upgrade.
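For orientation, here is a minimal sketch of what that trainer flow looks like; the base model, dataset name, and hyperparameters are illustrative assumptions, not taken from the release:

```python
# Hedged sketch of the v4 CrossEncoder training flow; the dataset name and
# hyperparameters below are illustrative assumptions, not from the release.
from datasets import load_dataset
from sentence_transformers.cross_encoder import (
    CrossEncoder,
    CrossEncoderTrainer,
    CrossEncoderTrainingArguments,
)
from sentence_transformers.cross_encoder.losses import BinaryCrossEntropyLoss

model = CrossEncoder("answerdotai/ModernBERT-base", num_labels=1)

# Assumed dataset with (query, passage, label) columns; substitute your own.
train_dataset = load_dataset("your-org/your-qa-pairs", split="train")

args = CrossEncoderTrainingArguments(
    output_dir="reranker-finetuned",
    num_train_epochs=1,
    per_device_train_batch_size=32,
    bf16=True,                    # bf16 support from the release notes
    logging_steps=100,            # loss logging
    gradient_checkpointing=True,  # also listed among the new features
)

trainer = CrossEncoderTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=BinaryCrossEntropyLoss(model),
)
trainer.train()
```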

2️⃣ New Reranker Losses
- 11 new losses:
  - 2 traditional losses: BinaryCrossEntropy and CrossEntropy
  - 2 distillation losses: MSE and MarginMSE
  - 2 in-batch negatives losses: MNRL (a.k.a. InfoNCE) and CMNRL
  - 5 learning-to-rank losses: Lambda, p-ListMLE, ListNet, RankNet, ListMLE
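As a rough guide to matching these losses to data shapes, a sketch; the class names mirror the list above, and the exact import path is an assumption:

```python
# Hedged sketch: choosing among the new reranker losses by data shape.
# Class names follow the post's loss list; the module path is an assumption.
from sentence_transformers.cross_encoder import CrossEncoder
from sentence_transformers.cross_encoder.losses import (
    BinaryCrossEntropyLoss,        # (query, passage, 0/1 label)
    MarginMSELoss,                 # distill score margins from a teacher
    MultipleNegativesRankingLoss,  # MNRL / InfoNCE, in-batch negatives
    LambdaLoss,                    # learning to rank over ranked lists
)

model = CrossEncoder("answerdotai/ModernBERT-base", num_labels=1)
loss = BinaryCrossEntropyLoss(model)  # swap in any of the above to match your data
```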

3️⃣ New Reranker Documentation
- New Training Overview, Loss Overview, API Reference docs
- 5 new, 1 refactored training examples docs pages
- 13 new, 6 refactored training scripts
- Migration guides (2.x -> 3.x, 3.x -> 4.x)

4️⃣ Blogpost
Alongside the release, I've written a blogpost where I finetune ModernBERT on a generic question-answer dataset. My finetunes easily outperform all general-purpose reranker models, even models 4x as big. Finetuning on your domain is definitely worth it: https://huggingface.co/blog/train-reranker

See the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/v4.0.1
reacted to Smooke's post with 👀 about 8 hours ago
Meet the Bot That Reads All the Bad News Headlines So You Don’t Have To https://hackernoon.com/meet-the-bot-that-reads-all-the-bad-news-headlines-so-you-dont-have-to

And more top https://hackernoon.com/ blogs today! Publish yours: https://hackernoon.com/p/publish

The Truth About Senior Engineering at FAANG—It’s Not What You Expect https://hackernoon.com/the-truth-about-senior-engineering-at-faangits-not-what-you-expect

Ex-ICE Agent Urges Congress to Double Down on Immigration Surveillance Tech https://hackernoon.com/ex-ice-agent-urges-congress-to-double-down-on-immigration-surveillance-tech

Too Many AIs With Too Many Terrible Names: How to Choose Your AI Model https://hackernoon.com/too-many-ais-with-too-many-terrible-names-how-to-choose-your-ai-model

Nobody Wants to Pay for Apps Anymore—Except When AI Is Involved https://hackernoon.com/nobody-wants-to-pay-for-apps-anymoreexcept-when-ai-is-involved

Nvidia GTC 2025: AI Goes Big, Robots Get Smarter, and GPUs Rule the World https://hackernoon.com/nvidia-gtc-2025-ai-goes-big-robots-get-smarter-and-gpus-rule-the-world

C++ Metaprogramming: Compilation of Calculations, from Basic Techniques to Advanced Methods https://hackernoon.com/c-metaprogramming-compilation-of-calculations-from-basic-techniques-to-advanced-methods

Has Google Made a $32 Billion Cloud Security Blunder? https://hackernoon.com/has-google-made-a-$32-billion-cloud-security-blunder

Ripple in Time: Is XRP About to Go Parabolic in 2025? https://hackernoon.com/ripple-in-time-is-xrp-about-to-go-parabolic-in-2025

Your Next Tech Job? Vibe Coding https://hackernoon.com/your-next-tech-job-vibe-coding

Ethereum Block Building: The Hidden Economy Behind Every Transaction https://hackernoon.com/ethereum-block-building-the-hidden-economy-behind-every-transaction

Building a Robust JS/TS Monorepo: Best Practices with Yarn, NX and Changesets https://hackernoon.com/building-a-robust-jsts-monorepo-best-practices-with-yarn-nx-and-changesets
reacted to kshitizkhanal7's post with 👍 about 8 hours ago
reacted to giadap's post with 🔥 about 8 hours ago
We've all become experts at clicking "I agree" without a second thought. In my latest blog post, I explore why these traditional consent models are increasingly problematic in the age of generative AI.

I found three fundamental challenges:
- Scope problem: how can you know what you're agreeing to when AI could use your data in different ways?
- Temporality problem: once an AI system learns from your data, good luck trying to make it "unlearn" it.
- Autonomy trap: the data you share today could create systems that pigeonhole you tomorrow.

Individual users shouldn't bear all the responsibility, while big tech holds all the cards. We need better approaches to level the playing field, from collective advocacy and stronger technological safeguards to establishing "data fiduciaries" with a legal duty to protect our digital interests.

Available here: https://huggingface.co/blog/giadap/beyond-consent
reacted to nyuuzyou's post with 👍 about 8 hours ago
📚 Archive of Our Own (AO3) Dataset - nyuuzyou/archiveofourown

Collection of approximately 12.6 million fanfiction works (from 63.2M processed IDs) featuring:
- Full text content from diverse fandoms across television, film, books, anime, and more
- Comprehensive metadata including warnings, relationships, characters, and tags
- Multilingual content with works in 40+ languages, though English predominates
- Rich classification data preserving author-created folksonomy and content categorization

P.S. This is the most expensive dataset I've created so far! And also, thank you all for the 100 followers on Hugging Face!
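A minimal sketch of pulling the dataset down with the datasets library; whether load_dataset resolves this repo's file layout directly is an assumption, and the schema should be inspected rather than relied on:

```python
# Hedged sketch: streaming the AO3 dataset from the Hub. Streaming avoids
# downloading ~12.6M works at once; direct load_dataset support for this
# repo's layout is an assumption.
from datasets import load_dataset

ds = load_dataset("nyuuzyou/archiveofourown", split="train", streaming=True)

for work in ds.take(3):
    # Inspect the real schema; field names are not documented in the post.
    print(work.keys())
```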