All HF Hub posts
Post
2367
Announcement! We have made significant progress in our efforts to replicate OpenAI O1 based on the AlphaGo Zero architecture: LLaMA-O1. We have successfully enabled the model to acquire advanced thinking skills through interaction with the search tree during the learning process without human annotations.
We plan to complete the model training and evaluation no later than the end of November and will release all data, models, and code to the community.
Past related papers:
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning (2410.02884)
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B (2406.07394)
For the linear representation format of Long CoT (OpenLongCoT), please refer to:
qq8933/OpenLongCoT-Pretrain
qq8933/OpenLongCoT-SFT
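The post does not spell out how the search tree is traversed. As a rough illustration only, here is the PUCT child-selection rule used in AlphaGo Zero-style tree search, applied to a single-agent setting where each child is a candidate reasoning step. The `Node` class, the `c_puct` value, and the step names are assumptions made for this sketch, not LLaMA-O1's actual code.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a search tree (hypothetical structure for illustration)."""
    prior: float                 # policy prior P(s, a) assigned to this step
    visit_count: int = 0         # N(s, a)
    value_sum: float = 0.0       # sum of values backed up through this node
    children: dict = field(default_factory=dict)

    def q_value(self) -> float:
        # Mean backed-up value; 0.0 for a node that was never visited.
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    # Q(s, a) + c_puct * P(s, a) * sqrt(N(s)) / (1 + N(s, a))
    exploration = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    return child.q_value() + exploration


def select_child(parent: Node, c_puct: float = 1.5):
    # Pick the (action, child) pair with the highest PUCT score.
    return max(parent.children.items(), key=lambda kv: puct_score(parent, kv[1], c_puct))


def backup(path, value: float) -> None:
    # Single-agent setting assumed: no sign flip between tree levels.
    for node in path:
        node.visit_count += 1
        node.value_sum += value


# Usage: an unvisited step with a decent prior wins over a well-explored one.
root = Node(prior=1.0, visit_count=10)
root.children["step_a"] = Node(prior=0.6, visit_count=8, value_sum=4.0)
root.children["step_b"] = Node(prior=0.4)
action, child = select_child(root)
backup([root, child], value=1.0)
```

The exploration term shrinks as a child accumulates visits, so the search gradually shifts from following the prior to following the measured values.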
ajibawa-2023
posted an update
about 12 hours ago
Post
1023
New Dataset: Software-Architecture
Link: ajibawa-2023/Software-Architecture
I am releasing a large dataset covering topics related to software architecture. It consists of around 450,000 lines of data in JSONL format.
I have included the following topics:
Architectural Frameworks
Architectural Patterns for Reliability
Architectural Patterns for Scalability
Architectural Patterns
Architectural Quality Attributes
Architectural Testing
Architectural Views
Architectural Decision-Making
Advanced Research
Cloud-Based Architectures
Component-Based Architecture
Data Architecture
Emerging Trends
Event-Driven Architecture
Evolvability and Maintainability
Microservices and Monolithic
Microservices Architecture
Security Architecture
Service-Oriented Architecture
Software Design Principles
and Many More!
This dataset is useful for LLM development, especially for anyone building LLMs focused on software development.
It is also very useful to researchers.
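Since the data is in JSONL (one JSON object per line), it can be streamed line by line without loading everything into memory. Below is a minimal sketch using only the Python standard library. The field names `topic` and `text` and the sample records are made up for illustration; the real schema of the dataset may differ.

```python
import io
import json
from collections import Counter


def iter_jsonl(stream):
    """Yield one parsed record per non-empty line of a JSONL stream."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            yield json.loads(line)
        except json.JSONDecodeError as err:
            raise ValueError(f"line {line_number} is not valid JSON") from err


def count_by_field(records, key):
    """Count records per value of `key` (records lacking the key are grouped)."""
    return Counter(rec.get(key, "<missing>") for rec in records)


# Hypothetical sample standing in for a local .jsonl file.
sample = io.StringIO(
    '{"topic": "Event-Driven Architecture", "text": "placeholder"}\n'
    '{"topic": "Data Architecture", "text": "placeholder"}\n'
    '\n'
    '{"topic": "Event-Driven Architecture", "text": "placeholder"}\n'
)
counts = count_by_field(iter_jsonl(sample), "topic")
```

For a real file, replace the `io.StringIO` sample with `open(path, encoding="utf-8")` and the key with whatever fields the dataset actually exposes.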
singhsidhukuldeep
posted an update
1 day ago
Post
1980
Good folks from @Microsoft have released an exciting breakthrough in GUI automation!
OmniParser: a game-changing approach for pure vision-based GUI agents that works across multiple platforms and applications.
Key technical innovations:
- Custom-trained interactable icon detection model using 67k screenshots from popular websites
- Specialized BLIP-v2 model fine-tuned on 7k icon-description pairs for extracting functional semantics
- Novel combination of icon detection, OCR, and semantic understanding to create structured UI representations
The results are impressive:
- Outperforms GPT-4V baseline by significant margins on the ScreenSpot benchmark
- Achieves 73% accuracy on Mind2Web without requiring HTML data
- Demonstrates a 57.7% success rate on AITW mobile tasks
What makes OmniParser special is its ability to work across platforms (mobile, desktop, web) using only screenshot data, with no HTML or view hierarchy needed. This opens up exciting possibilities for building truly universal GUI automation tools.
The team has open-sourced both the interactable region detection dataset and icon description dataset to accelerate research in this space.
Kudos to the Microsoft Research team for pushing the boundaries of what's possible with pure vision-based GUI understanding!
What are your thoughts on vision-based GUI automation?
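To make the idea of combining icon detection, OCR, and icon descriptions into a structured UI representation more concrete, here is a small sketch. It is not OmniParser's actual pipeline: the detector, OCR, and captioning outputs are hand-written placeholders, and the overlap rule (drop an icon box that mostly coincides with an OCR text box) and its threshold are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class UIElement:
    kind: str      # "text" (from OCR) or "icon" (from the icon detector)
    box: tuple     # (x1, y1, x2, y2) in screenshot pixels
    content: str   # OCR string or icon description


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def build_ui_representation(ocr_boxes, icon_boxes, overlap_threshold=0.5):
    """Merge OCR and icon detections into one ordered list of elements."""
    texts = [UIElement("text", box, text) for box, text in ocr_boxes]
    elements = list(texts)
    for box, description in icon_boxes:
        # Keep an icon only if it does not largely duplicate a text box.
        if all(iou(box, t.box) < overlap_threshold for t in texts):
            elements.append(UIElement("icon", box, description))
    # Reading order: top to bottom, then left to right.
    elements.sort(key=lambda e: (e.box[1], e.box[0]))
    return [
        {"id": i, "kind": e.kind, "box": e.box, "content": e.content}
        for i, e in enumerate(elements)
    ]


# Placeholder detections for a single screenshot.
ocr = [((10, 10, 110, 30), "Search")]
icons = [((12, 11, 108, 29), "text field"), ((200, 10, 220, 30), "settings gear")]
ui = build_ui_representation(ocr, icons)
```

A list like `ui` is the kind of structured input an agent could reason over instead of HTML or a view hierarchy.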
albertvillanova
posted an update
about 9 hours ago
Post
660
Exciting update! You can now compare multiple models side-by-side with the Hugging Face Open LLM Comparator!
open-llm-leaderboard/comparator
Dive into multi-model evaluations, pinpoint the best model for your needs, and explore insights across top open LLMs all in one place. Ready to level up your model comparison game?
Post
878
Hello, researchers! I've tried to make reading HF Daily Papers easier and built a tool that writes reviews with LLMs like Claude 3.5, GPT-4o and sometimes FLUX.
- Classification by topics
- Sorting by publication date and HF addition date
- Syncing every 2 hours
- Hosted on GitHub
- English, Russian, and Chinese
- Top by week/month (in progress)
- https://hfday.ru
Let me know what you think of it.
Post
2648
Last Week in Medical AI: Top Research Papers/Models
(October 19-26, 2024)
Medical AI Paper of the Week:
Safety principles for medical summarization using generative AI by Google
Medical LLM & Other Models:
- BioMistral-NLU: Medical Vocab Understanding
- Bilingual Multimodal LLM for Biomedical Tasks
- Metabolic-Enhanced LLMs for Clinical Analysis
- Dermatology Foundation Model
Frameworks and Methodologies:
- Back-in-Time: Medical Deepfake Detection
- Hybrid GenAI for Crystal Design
- VISAGE: Video Synthesis for Surgery
- MoRE: Multi-Modal X-Ray/ECG Pretraining
- SleepCoT: Personalized Health via CoT
Medical LLM Applications:
- ONCOPILOT: CT Model for Tumors
- LMLPA: Linguistic Personality Assessment
- GenAI for Medical Training
Medical LLMs & Benchmarks:
- LLM Evaluation Through Explanations
- Contrastive Decoding for Medical LLM Hallucination
AI in Healthcare Ethics:
- Healthcare XAI Through Storytelling
- Clinical LLM Bias Analysis
- ReflecTool: Reflection-Aware Clinical Agents
Now you can watch and listen to the latest Medical AI papers daily on our YouTube and Spotify channels as well!
- Spotify: https://podcasters.spotify.com/pod/show/medicalai/episodes/Medical-AI-Weekly-Digest-From-Deepfake-Detection-to-Clinical-LLMs-Oct-19-26--Part-1-e2q6012
- YouTube: https://youtu.be/Wt5QOv1vk2U
MonsterMMORPG
posted an update
3 days ago
Post
3420
Stability AI published their newest and most powerful model, Stable Diffusion 3.5 Large. Unlike FLUX, this is a full model, not a distilled one, and it has huge potential. I have done extensive research and am publishing all of it in this video on how to use SD 3.5 Large with the best settings. I also share how to use FLUX DEV with the best possible configuration, and I make a detailed comparison between SD 3.5 and FLUX so you can see which one wins.
https://youtu.be/-zOKhoO9a5s
62 prompts tested in all experiments to find the best sampler + scheduler for Stable Diffusion 3.5 Large, and to compare SD 3.5 Large vs FLUX DEV: https://youtu.be/-zOKhoO9a5s
FLUX Dev vs SD 3.5 Large fully compared.
SD 3.5 Large FP16 vs Scaled FP8 fully compared.
T5 XXL FP8 vs Scaled FP8 vs FP16 fully compared.
FLUX FP16 vs Scaled FP8 fully compared.
The tutorial also shows how to install SwarmUI on Windows, Massed Compute, and RunPod.
I have shown how to use FLUX and SD 3.5 Large in detail as well.
Post
3588
This is no Woodstock AI but will be fun nonetheless haha. I'll be hosting a live workshop with team members next week about the Enterprise Hugging Face hub.
1,000 spots available, first-come, first-served, with some surprises during the stream!
You can register and add to your calendar here: https://streamyard.com/watch/JS2jHsUP3NDM
Post
223
NYT leveraged AI to investigate election interference by analyzing 400+ hours of recorded meetings - that's 5M words of data!
AI spotted patterns, humans verified facts. Every AI-flagged quote was manually verified against source recordings. Really appreciate that they published their full methodology - transparency matters when using AI in journalism.
A perfect blend of tech & journalism.
The future of journalism isn't robots replacing reporters - it's AI helping humans process massive datasets more efficiently. Sometimes the most powerful tech solutions are the least flashy ones.
Read the article: https://www.nytimes.com/interactive/2024/10/28/us/politics/inside-the-movement-behind-trumps-election-lies.html?unlocked_article_code=1.Vk4.ucv9.dbHVquTQaf0G&smid=nytcore-ios-share