All HF Hub posts

KingNish
posted an update 28 minutes ago
Decoding GPT-4'o': Its Mechanisms and Creating Similar AI.

Read Full Article: https://huggingface.co/blog/KingNish/decoding-gpt-4o

๐’๐ฎ๐ฆ๐ฆ๐š๐ซ๐ฒ ๐จ๐Ÿ ๐€๐ซ๐ญ๐ข๐œ๐ฅ๐ž- ๐Ÿ“
# ๐Œ๐ž๐œ๐ก๐š๐ง๐ข๐œ๐ฌ ๐จ๐Ÿ ๐†๐๐“-๐Ÿ’โ€™๐จโ€™: GPT-4โ€™oโ€™ operates through three main components ๐Ÿ› ๏ธ

1. SuperChat: Integrates image generation and QnA (image, document, and video) for diverse interactions.
2. Voice Chat: Merges TTS and STT for real-time, human-like audio responses, focusing on natural interaction.
3. Video Chat: Utilizes zero-shot image classification to enhance user interaction with visual information.

# Methods to Create Similar AI

1. MultiModalification: Combines multiple models into a single, powerful, multifunctional AI.
2. Duct Tape Method: Uses different models or APIs for specific tasks without additional training.

The article provides an in-depth exploration of GPT-4'o', its functionalities, and methods to create similar AI models. It emphasizes the model's language support and its innovative approach to human-AI interaction.
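The "Duct Tape Method" described above can be sketched as a simple router that dispatches each request to a task-specific model. Everything below (handler names, keyword-based routing) is a hypothetical illustration, not code from the article:

```python
# Hypothetical sketch of the "Duct Tape Method": instead of training one
# monolithic model, route each request to a task-specific model or API.
# The handlers here are stand-in stubs for real model/API calls.

def image_generator(prompt: str) -> str:
    return f"[image generated for: {prompt}]"

def document_qa(prompt: str) -> str:
    return f"[answer about document: {prompt}]"

def text_chat(prompt: str) -> str:
    return f"[chat reply to: {prompt}]"

# Map coarse task labels to the specialist that handles them.
HANDLERS = {
    "image": image_generator,
    "document": document_qa,
    "chat": text_chat,
}

def classify_task(prompt: str) -> str:
    """Naive keyword router; a real system might use a small classifier."""
    lowered = prompt.lower()
    if "draw" in lowered or "image" in lowered:
        return "image"
    if "pdf" in lowered or "document" in lowered:
        return "document"
    return "chat"

def duct_tape_assistant(prompt: str) -> str:
    # No additional training: just glue existing specialists together.
    return HANDLERS[classify_task(prompt)](prompt)
```

The appeal of this approach is that each component can be swapped out independently as better models or APIs become available.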

(NOTE: Summary is AI generated)
leonardlin
posted an update about 9 hours ago
singhsidhukuldeep
posted an update about 13 hours ago
You picked an LLM for your work, but then you find out it hallucinates!

Your first thought might be to fine-tune it on more training data... but should you?

This is what @Google explores in the paper "Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?"

When LLMs undergo supervised fine-tuning on new factual knowledge not present in their initial training data, there is a risk they will "hallucinate", i.e., produce factually incorrect information.

The paper investigates how fine-tuning LLMs on new facts influences their ability to leverage pre-existing knowledge, and the extent to which they generate errors.

Technical Setup:

Approach: They introduce a method named SliCK (Sampling-based Categorization of Knowledge; don't worry too much about the acronym) to categorize knowledge into four levels (HighlyKnown, MaybeKnown, WeaklyKnown, and Unknown) based on how well the model's generated responses agree with known facts.

Experimental Setup: The study uses a controlled closed-book QA setup, adjusting the proportion of fine-tuning examples that introduce new facts versus those that do not.
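The sampling-based categorization described above could be sketched roughly as follows. This is a simplified reading of the idea, not the paper's exact definitions; the function name and exact thresholds are assumptions:

```python
def categorize_knowledge(greedy_answers, sampled_answers, gold):
    """Rough SliCK-style sketch: bucket a fact by how often the model's
    answers match the gold answer under greedy decoding vs. temperature
    sampling. (Simplified; not the paper's exact criteria.)"""
    p_greedy = sum(a == gold for a in greedy_answers) / len(greedy_answers)
    p_sample = sum(a == gold for a in sampled_answers) / len(sampled_answers)
    if p_greedy == 1.0:
        return "HighlyKnown"   # always correct when decoding greedily
    if p_greedy > 0.0:
        return "MaybeKnown"    # sometimes correct greedily
    if p_sample > 0.0:
        return "WeaklyKnown"   # only correct with sampling
    return "Unknown"           # never correct
```

Fine-tuning examples can then be filtered or weighted by bucket, which is what makes the "known vs. new knowledge" experiments possible.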

Here is the gist of the findings:

LLMs struggle to integrate new factual knowledge during fine-tuning; such examples are learned more slowly than those consistent with the model's pre-existing knowledge.

As LLMs learn from examples containing new knowledge, their propensity to hallucinate increases.

โฑ๏ธ Early stopping during training can mitigate the risks of hallucinations by minimizing exposure to unlearned new facts. ๐Ÿ›‘

Training LLMs mostly on known examples leads to better utilization of pre-existing knowledge, whereas examples introducing new knowledge increase the risk of generating incorrect information.
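The early-stopping finding can be illustrated with a generic sketch. This is ordinary patience-based early stopping on a dev metric, not the paper's exact procedure:

```python
def early_stop_epoch(dev_accuracy_by_epoch, patience=2):
    """Return the epoch to keep: the best dev-accuracy epoch, stopping
    the scan once accuracy fails to improve for `patience` epochs.
    Stopping early limits how long the model trains on still-unlearned
    new facts (the regime where hallucinations grow)."""
    best_epoch, best_acc, stale = 0, float("-inf"), 0
    for epoch, acc in enumerate(dev_accuracy_by_epoch):
        if acc > best_acc:
            best_epoch, best_acc, stale = epoch, acc, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_epoch
```

For example, a dev-accuracy curve that peaks and then degrades as the model overfits new facts would be cut off at the peak.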

Paper: Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? (2405.05904)
monsoon-nlp
posted an update about 17 hours ago
Salama1429
posted an update about 23 hours ago
Free Guide: How to Fine-Tune and Prompt Engineer LLMs

While some of the most forward-thinking companies in the world are already using LLMs, few organizations have the bandwidth, compute, or money to train foundation models in-house. It's become much more common to either fine-tune or prompt engineer existing LLMs for unique business needs. In this guide, you'll learn:

• How to choose between fine-tuning and prompting
• Popular fine-tuning strategies and their trade-offs
• Tasks where fine-tuning excels vs. ones where it doesn't
• Tips and current best practices for prompt engineering
• And a whole lot more!

Link: https://wandb.ai/site/resources/whitepapers/llm-fine-tuning
Walmart-the-bag
posted an update 1 day ago
KingNish
posted an update 1 day ago
New Updates to OpenGPT 4o
1. Live Chat (also known as video chat): very powerful and fast; it can even identify famous places and people
2. Powerful Image Generation

Test the new features and give feedback:
KingNish/OpenGPT-4o

Future Updates
1. PDF Chat
2. Human-like speech (using Parler-TTS Expresso)
3. Multilingual support for voice chat

Suggest more features that should be added.

Edit: Live Chat is now much more powerful than before.
singhsidhukuldeep
posted an update 1 day ago
How many times have you said Pandas is slow and still kept on using it?

Get ready to say Pandas can be fast, but it's expensive.

Original Limitations:

CPU-Bound Processing: Traditional pandas operations are CPU-bound and mostly single-threaded, leading to slow processing of large datasets.

Memory Constraints: Memory-intensive operations on large datasets can lead to inefficiencies and limitations.

Achievements with @nvidia RAPIDS cuDF:

GPU Acceleration: RAPIDS cuDF leverages GPU computing; users get GPU-accelerated operations without modifying existing pandas code.

Unified Workflows: Seamlessly integrates GPU and CPU operations, falling back to CPU when necessary.

Optimized Performance: By exploiting the massive parallelism of GPUs, cuDF achieves up to a 150x speedup in data processing, demonstrated through benchmarks such as the DuckDB database benchmark.

New Limitations:

GPU Availability: Requires a GPU (and not everything should need a GPU).

Library Compatibility: Still in the early stages; not all pandas functionality has been ported yet.

๐Ÿข Data Transfer Overhead: Moving data between CPU and GPU can introduce latency if not managed efficiently. As some operations still run on the CPU.

User Adoption: Pandas already had vectorization support; people just didn't use it because it was difficult to apply. We already had Dask for parallelization. It's not that solutions didn't exist.

Blog: https://developer.nvidia.com/blog/rapids-cudf-accelerates-pandas-nearly-150x-with-zero-code-changes/

For Jupyter Notebooks:

%load_ext cudf.pandas
import pandas as pd


For python scripts:

python -m cudf.pandas script.py
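Once the extension is loaded (notebook) or the module runner is used (script), unchanged pandas code runs on the GPU where supported and falls back to CPU pandas otherwise. A small illustration with made-up data (this snippet is plain pandas; cudf.pandas accelerates it transparently when active):

```python
import pandas as pd

# Illustrative data; any ordinary DataFrame works the same way.
df = pd.DataFrame({
    "store": ["a", "a", "b", "b", "b"],
    "sales": [10, 20, 5, 15, 25],
})

# Group-by aggregations are among the operations that benefit most
# from GPU acceleration -- no API changes required.
totals = df.groupby("store")["sales"].sum()
print(totals.to_dict())  # {'a': 30, 'b': 45}
```

The zero-code-change design is the whole pitch: the same script runs with or without a GPU present.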


kaisugi
posted an update 1 day ago
Stockmark-100b

Stockmark Inc. has developed and released one of Japan's largest commercial-scale large language models (LLMs), with 100 billion parameters, named "Stockmark-LLM-100b". This model significantly reduces hallucinations and provides accurate responses to complex business-related queries. Developed from scratch with a focus on Japanese business data, the model aims to be reliable in high-stakes business environments. It's open-source and available for commercial use.

Key highlights:
- The model reduces hallucinations: confidently stated but factually incorrect responses that AI models sometimes generate.
- Stockmark-LLM-100b can answer basic business questions and specialized queries in industries like manufacturing.
- The model's performance surpasses GPT-4-turbo in accuracy for business-specific queries.
- Evaluation benchmarks (VicunaQA) show high performance.
- Fast inference speed: generates 100 characters of Japanese text in 1.86 seconds.

stockmark/stockmark-100b
stockmark/stockmark-100b-instruct-v0.1

Detailed press release (in Japanese): https://stockmark.co.jp/news/20240516
leonardlin
posted an update 1 day ago
llm-jp-eval is currently one of the most widely used benchmarks for Japanese LLMs, and it accounts for half of the scoring in WandB's comprehensive Nejumi LLM Leaderboard. I was seeing some weirdness in the results I was getting and ended up in a bit of a rabbit hole. Here's my article on evaluating llm-jp-eval: https://huggingface.co/blog/leonardlin/llm-jp-eval-eval

I've set up a fork of Lightblue's Shaberi testing framework, which uses LLM-as-a-Judge style benchmarks as something probably more representative of real-world LLM strength in Japanese. Here's how the new base model ablations are looking: