All HF Hub posts

KingNish
posted an update 28 minutes ago
Decoding GPT-4'o': Its Mechanisms and Creating Similar AI.

Read Full Article: https://huggingface.co/blog/KingNish/decoding-gpt-4o

๐’๐ฎ๐ฆ๐ฆ๐š๐ซ๐ฒ ๐จ๐Ÿ ๐€๐ซ๐ญ๐ข๐œ๐ฅ๐ž- ๐Ÿ“
# ๐Œ๐ž๐œ๐ก๐š๐ง๐ข๐œ๐ฌ ๐จ๐Ÿ ๐†๐๐“-๐Ÿ’โ€™๐จโ€™: GPT-4โ€™oโ€™ operates through three main components ๐Ÿ› ๏ธ

1. SuperChat: Integrates image generation and QnA (image, document, and video) for diverse interactions.
2. Voice Chat: Merges TTS and STT for real-time, human-like audio responses, focusing on natural interaction.
3. Video Chat: Utilizes zero-shot image classification to enhance user interaction with visual information.

# Methods to Create Similar AI

1. MultiModalification: Combines multiple models into a single, powerful, multifunctional AI.
2. Duct Tape Method: Uses different models or APIs for specific tasks without additional training.

The article provides an in-depth exploration of GPT-4'o', its functionalities, and methods to create similar AI models. It emphasizes the model's language support and its innovative approach to human-AI interaction.
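The "Duct Tape Method" described above can be sketched as a simple router that dispatches each request to a task-specific model. Everything below (handler names, keyword-based routing) is a hypothetical illustration, not code from the article:

```python
# Hypothetical sketch of the "Duct Tape Method": instead of training one
# monolithic model, route each request to a task-specific model or API.
# The handlers here are stand-in stubs for real model/API calls.

def image_generator(prompt: str) -> str:
    return f"[image generated for: {prompt}]"

def document_qa(prompt: str) -> str:
    return f"[answer about document: {prompt}]"

def text_chat(prompt: str) -> str:
    return f"[chat reply to: {prompt}]"

# Map coarse task labels to the specialist that handles them.
HANDLERS = {
    "image": image_generator,
    "document": document_qa,
    "chat": text_chat,
}

def classify_task(prompt: str) -> str:
    """Naive keyword router; a real system might use a small classifier."""
    lowered = prompt.lower()
    if "draw" in lowered or "image" in lowered:
        return "image"
    if "pdf" in lowered or "document" in lowered:
        return "document"
    return "chat"

def duct_tape_assistant(prompt: str) -> str:
    # No additional training: just glue existing specialists together.
    return HANDLERS[classify_task(prompt)](prompt)
```

The appeal of this approach is that each component can be swapped out independently as better models or APIs become available.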

(NOTE: Summary is AI generated)
leonardlin
posted an update about 9 hours ago
singhsidhukuldeep
posted an update about 13 hours ago
You picked an LLM for your work, but then you find out it hallucinates!

Your first thought might be to fine-tune it on more training data... but should you?

This is what @Google explores in the paper "Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?"

When LLMs undergo supervised fine-tuning on new factual knowledge not present in their initial training data, there is a risk they will "hallucinate", i.e., produce factually incorrect information.

The paper investigates how fine-tuning LLMs on new facts influences their ability to leverage pre-existing knowledge, and the extent to which they generate errors.

Technical Setup:

Approach: They introduce a method named SliCK (Sampling-based Categorization of Knowledge; don't worry too much about the acronym) to categorize knowledge into four levels (HighlyKnown, MaybeKnown, WeaklyKnown, and Unknown) based on how well the model's generated responses agree with known facts.

Experimental Setup: The study uses a controlled closed-book QA setup, adjusting the proportion of fine-tuning examples that introduce new facts versus those that do not.
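The sampling-based categorization described above could be sketched roughly as follows. This is a simplified reading of the idea, not the paper's exact definitions; the function name and exact thresholds are assumptions:

```python
def categorize_knowledge(greedy_answers, sampled_answers, gold):
    """Rough SliCK-style sketch: bucket a fact by how often the model's
    answers match the gold answer under greedy decoding vs. temperature
    sampling. (Simplified; not the paper's exact criteria.)"""
    p_greedy = sum(a == gold for a in greedy_answers) / len(greedy_answers)
    p_sample = sum(a == gold for a in sampled_answers) / len(sampled_answers)
    if p_greedy == 1.0:
        return "HighlyKnown"   # always correct when decoding greedily
    if p_greedy > 0.0:
        return "MaybeKnown"    # sometimes correct greedily
    if p_sample > 0.0:
        return "WeaklyKnown"   # only correct with sampling
    return "Unknown"           # never correct
```

Fine-tuning examples can then be filtered or weighted by bucket, which is what makes the "known vs. new knowledge" experiments possible.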

Here is the gist of the findings:

LLMs struggle to integrate new factual knowledge during fine-tuning; such examples are learned more slowly than those consistent with the model's pre-existing knowledge.

As LLMs learn from examples containing new knowledge, their propensity to hallucinate increases.

โฑ๏ธ Early stopping during training can mitigate the risks of hallucinations by minimizing exposure to unlearned new facts. ๐Ÿ›‘

Training LLMs mostly on known examples leads to better utilization of pre-existing knowledge, whereas examples introducing new knowledge increase the risk of generating incorrect information.
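The early-stopping finding can be illustrated with a generic sketch. This is ordinary patience-based early stopping on a dev metric, not the paper's exact procedure:

```python
def early_stop_epoch(dev_accuracy_by_epoch, patience=2):
    """Return the epoch to keep: the best dev-accuracy epoch, stopping
    the scan once accuracy fails to improve for `patience` epochs.
    Stopping early limits how long the model trains on still-unlearned
    new facts (the regime where hallucinations grow)."""
    best_epoch, best_acc, stale = 0, float("-inf"), 0
    for epoch, acc in enumerate(dev_accuracy_by_epoch):
        if acc > best_acc:
            best_epoch, best_acc, stale = epoch, acc, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_epoch
```

For example, a dev-accuracy curve that peaks and then degrades as the model overfits new facts would be cut off at the peak.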

Paper: Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? (2405.05904)
monsoon-nlp
posted an update about 17 hours ago
Salama1429
posted an update about 23 hours ago
Free Guide: How to Fine-Tune and Prompt Engineer LLMs

While some of the most forward-thinking companies in the world are already using LLMs, few organizations have the bandwidth, compute, or money to train foundation models in-house. It's become much more common to either fine-tune or prompt engineer existing LLMs for unique business needs. In this guide, you'll learn:

• How to choose between fine-tuning and prompting
• Popular fine-tuning strategies and their trade-offs
• Tasks where fine-tuning excels vs. ones where it doesn't
• Tips and current best practices for prompt engineering
• And a whole lot more!

Link: https://wandb.ai/site/resources/whitepapers/llm-fine-tuning
Walmart-the-bag
posted an update 1 day ago
KingNish
posted an update 1 day ago
New Updates to OpenGPT 4o
1. Live Chat (also known as video chat): very powerful and fast; it can even identify famous places and people
2. Powerful Image Generation

Test the new features and give feedback:
KingNish/OpenGPT-4o

Future Updates
1. PDF Chat
2. Human-like speech (using Parler-TTS Expresso)
3. Multilingual support for voice chat

Suggest more features that should be added.

Edit: Live Chat is now much more powerful than before.
singhsidhukuldeep
posted an update 1 day ago
How many times have you said Pandas is slow and still kept on using it?

Get ready to say Pandas can be fast, but it's expensive.

Original Limitations:

CPU-Bound Processing: Traditional pandas operations are CPU-bound and mostly single-threaded, leading to slow processing of large datasets.

Memory Constraints: Memory-intensive operations on large datasets can lead to inefficiencies and limitations.

Achievements with @nvidia RAPIDS cuDF:

GPU Acceleration: RAPIDS cuDF leverages GPU computing; users get GPU-accelerated operations without modifying existing pandas code.

Unified Workflows: Seamlessly integrates GPU and CPU operations, falling back to CPU when necessary.

Optimized Performance: By exploiting the massive parallelism of GPUs, cuDF achieves up to a 150x speedup in data processing, demonstrated through benchmarks such as the DuckDB database benchmark.

New Limitations:

GPU Availability: Requires a GPU (and not everything should need a GPU).

Library Compatibility: Still in the early stages; not all pandas functionality has been ported yet.

๐Ÿข Data Transfer Overhead: Moving data between CPU and GPU can introduce latency if not managed efficiently. As some operations still run on the CPU.

User Adoption: Pandas already had vectorization support; people just didn't use it because it was difficult to apply. We already had Dask for parallelization. It's not that solutions didn't exist.

Blog: https://developer.nvidia.com/blog/rapids-cudf-accelerates-pandas-nearly-150x-with-zero-code-changes/

For Jupyter Notebooks:

%load_ext cudf.pandas
import pandas as pd


For python scripts:

python -m cudf.pandas script.py
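Once the extension is loaded (notebook) or the module runner is used (script), unchanged pandas code runs on the GPU where supported and falls back to CPU pandas otherwise. A small illustration with made-up data (this snippet is plain pandas; cudf.pandas accelerates it transparently when active):

```python
import pandas as pd

# Illustrative data; any ordinary DataFrame works the same way.
df = pd.DataFrame({
    "store": ["a", "a", "b", "b", "b"],
    "sales": [10, 20, 5, 15, 25],
})

# Group-by aggregations are among the operations that benefit most
# from GPU acceleration -- no API changes required.
totals = df.groupby("store")["sales"].sum()
print(totals.to_dict())  # {'a': 30, 'b': 45}
```

The zero-code-change design is the whole pitch: the same script runs with or without a GPU present.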


kaisugi
posted an update 1 day ago
Stockmark-100b

Stockmark Inc. has developed and released one of Japan's largest commercial-scale large language models (LLMs), with 100 billion parameters, named "Stockmark-LLM-100b". This model significantly reduces hallucinations and provides accurate responses to complex business-related queries. Developed from scratch with a focus on Japanese business data, the model aims to be reliable in high-stakes business environments. It's open-source and available for commercial use.

Key highlights:
- The model reduces hallucinations: confidently stated but factually incorrect responses that AI models sometimes generate.
- Stockmark-LLM-100b can answer basic business questions and specialized queries in industries like manufacturing.
- The model's performance surpasses GPT-4-turbo in accuracy for business-specific queries.
- Evaluation benchmarks (VicunaQA) show high performance.
- Fast inference speed: generates 100 characters of Japanese text in 1.86 seconds.

stockmark/stockmark-100b
stockmark/stockmark-100b-instruct-v0.1

Detailed press release (in Japanese): https://stockmark.co.jp/news/20240516
leonardlin
posted an update 1 day ago
llm-jp-eval is currently one of the most widely used benchmarks for Japanese LLMs, and it accounts for half of the scoring in WandB's comprehensive Nejumi LLM Leaderboard. I was seeing some weirdness in the results I was getting and ended up in a bit of a rabbit hole. Here's my article on evaluating llm-jp-eval: https://huggingface.co/blog/leonardlin/llm-jp-eval-eval

I've set up a fork of Lightblue's Shaberi testing framework, which uses LLM-as-a-Judge style benchmarks as something probably more representative of real-world LLM strength in Japanese. Here's how the new base model ablations are looking: