Hugging Face
Hu Zang
zanghu
AI & ML interests
None yet
Recent Activity
reacted to joaogante's post 3 days ago
New sampling strategy dropped in transformers -- Min P sampling. Are you tired of having `top_k` arbitrarily discarding high-quality continuations? Or `top_p` forgetting to exclude low-probability tokens, derailing your generation? Try out the new `min_p` flag in `generate`, fresh from a PR merged today!

Min P consists of a dynamic token filter -- as opposed to Top K, which keeps the K most likely tokens, and Top P, which keeps the most likely tokens up to a fixed cumulative probability, both static filters. Min P takes a base probability (defined in the `min_p` flag) and multiplies it by the probability of the most likely token in the distribution for the next token. All tokens less likely than the resulting value are filtered.

What happens with this strategy?
High probability token present -> aggressive filter (we don't want to miss on that high-probability case and risk derailing generation)
No high probability token present -> relaxed filter (there are many continuation possibilities that the model finds plausible)

You should set `min_p` to a low value, between 0.05 and 0.1. It behaves particularly well for creative text generation when paired up with temperature > 1.

Kudos to @kalomaze and @menhguin for creating this technique. Read their discussion in the original issue for benchmarks (https://github.com/huggingface/transformers/issues/27670).

Copy-pasteable version of the example in the image below here: https://pastebin.com/VqXNtuxd

Have fun experimenting!
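
The filtering rule described in the post can be illustrated with a small standalone sketch. This is not the transformers implementation: the function name `min_p_filter` and the list-of-probabilities input are assumptions made for illustration, and the real `generate` code operates on logits tensors rather than Python lists.

```python
def min_p_filter(probs, min_p=0.05):
    """Illustrative Min P filter (hypothetical helper, not the library code).

    Tokens whose probability is below min_p * max(probs) are zeroed out,
    and the surviving probabilities are renormalized to sum to 1.
    """
    if not 0.0 <= min_p <= 1.0:
        raise ValueError("min_p must be between 0 and 1")
    # The cutoff scales with the probability of the most likely token.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)  # never zero: the most likely token always survives
    return [p / total for p in kept]


# One dominant token: the cutoff is high, so the tail is filtered aggressively.
peaked = min_p_filter([0.90, 0.05, 0.03, 0.02], min_p=0.1)
print(sum(1 for p in peaked if p > 0))  # 1 token survives

# No dominant token: the cutoff is low, so every plausible token is kept.
flat = min_p_filter([0.30, 0.25, 0.25, 0.20], min_p=0.1)
print(sum(1 for p in flat if p > 0))  # 4 tokens survive
```

With the same `min_p`, the peaked distribution keeps a single token while the flat one keeps all four, which is the "aggressive versus relaxed" behavior the post describes. In the library itself, the post only states that the value is passed through the `min_p` flag of `generate`.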
reacted to joaogante's post 3 days ago
liked a Space 16 days ago
bigcode/bigcode-models-leaderboard
Organizations
None yet
zanghu's activity
liked 2 Spaces 16 days ago
Big Code Models Leaderboard • Running • 1.19k • Submit code models for evaluation on benchmarks
BigCodeBench Leaderboard • Running • 185 • Explore and analyze code evaluation data
liked a model about 1 month ago
openai-community/gpt2 • Text Generation • Updated Feb 19, 2024 • 17.1M • 2.62k
liked 2 datasets 2 months ago
Daoguang/Multi-SWE-bench • Viewer • Updated Sep 3, 2024 • 91 • 536 • 6
princeton-nlp/SWE-bench • Viewer • Updated 11 days ago • 21.5k • 53.3k • 99
liked a Space 9 months ago
Face Forgery Detection • Build error • 3
liked a model 12 months ago
BAAI/bge-reranker-large • Feature Extraction • Updated May 11, 2024 • 904k • 380
liked a model about 1 year ago
thenlper/gte-large-zh • Sentence Similarity • Updated Feb 5, 2024 • 53k • 104
liked a model over 1 year ago
ChanceFocus/finma-7b-full • Text Generation • Updated Sep 14, 2023 • 247 • 19
liked 4 datasets over 1 year ago
zyznull/dureader-retrieval-corpus • Viewer • Updated Jan 3, 2023 • 8.19M • 486 • 3
microsoft/LCC_java • Viewer • Updated Jun 21, 2023 • 120k • 271 • 5
google-research-datasets/mbpp • Viewer • Updated Jan 4, 2024 • 1.4k • 93.7k • 166
openai/openai_humaneval • Viewer • Updated Jan 4, 2024 • 164 • 99k • 284
liked 3 models over 1 year ago
meta-llama/Llama-2-7b • Text Generation • Updated Apr 17, 2024 • 4.27k
codellama/CodeLlama-34b-hf • Text Generation • Updated Apr 12, 2024 • 9.53k • 169
defog/sqlcoder • Text Generation • Updated Mar 1, 2024 • 2.08k • 319
liked 2 datasets over 1 year ago
OpenAssistant/oasst1 • Viewer • Updated May 2, 2023 • 88.8k • 9.86k • 1.36k
tiiuae/falcon-refinedweb • Viewer • Updated Jun 20, 2023 • 968M • 53k • 841
liked 2 models over 1 year ago
Spico/Humback-M0 • Text Generation • Updated Aug 18, 2023 • 22 • 3
Spico/Humback-Myx • Text Generation • Updated Aug 19, 2023 • 20 • 3