All HF Hub posts

KingNish posted an update about 2 hours ago
New updates to OpenGPT 4o
1. Live Chat (also known as video chat); currently fast but not very powerful, and it gives very short answers
2. Powerful Image Generation

Test the new features and give feedback:
KingNish/OpenGPT-4o

Future Updates
1. Better Live Chat (using a more powerful model)
2. Human-like speech (using Parler-TTS Expresso)
3. Multilingual support for voice chat

Suggest more features that should be added. 🤗
singhsidhukuldeep posted an update about 8 hours ago
How many times have you said Pandas is slow and still kept on using it? 🐼💨

Get ready to say Pandas can be fast but it's expensive 😂

🙌 Original Limitations:

💻 CPU-Bound Processing: Traditional pandas operations are CPU-bound (mostly single-threaded 😰), leading to slower processing of large datasets.

🧠 Memory Constraints: Handling large datasets in memory-intensive operations can lead to inefficiencies and limitations.

Achievements with @nvidia RAPIDS cuDF:

🚀 GPU Acceleration: RAPIDS cuDF leverages GPU computing. Users can switch to GPU-accelerated operations without modifying their existing pandas code.

🔄 Unified Workflows: Seamlessly integrates GPU and CPU operations, falling back to CPU when necessary.

📈 Optimized Performance: By exploiting the massive parallelism of GPUs, it achieves up to a 150x speedup in data processing, demonstrated through benchmarks like the DuckDB data benchmark.

😅 New Limitations:

🎮 GPU Availability: Requires a GPU (not everything should need a GPU).

🔄 Library Compatibility: Still in its early stages; not all pandas functionality has been ported yet.

🐢 Data Transfer Overhead: Moving data between CPU and GPU can introduce latency if not managed efficiently, since some operations still run on the CPU.

🤔 User Adoption: Pandas already had vectorization support; people just didn't use it because it was harder to apply. We already had Dask for parallelization. It's not that solutions didn't exist.

Blog: https://developer.nvidia.com/blog/rapids-cudf-accelerates-pandas-nearly-150x-with-zero-code-changes/

For Jupyter Notebooks:

%load_ext cudf.pandas
import pandas as pd


For Python scripts:

python -m cudf.pandas script.py
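
To show what stays unchanged, here is a minimal sketch of a script that could be launched exactly as above (the file name and column names are made up for illustration; only the cudf.pandas loader decides whether it runs on the GPU or falls back to the CPU):

# example_groupby.py (hypothetical) -- run with: python -m cudf.pandas example_groupby.py
import pandas as pd  # with cudf.pandas active, this import is transparently GPU-accelerated

df = pd.read_csv("sales.csv")            # placeholder input file
summary = (
    df.groupby("region")["revenue"]      # placeholder column names
      .agg(["sum", "mean"])
      .sort_values("sum", ascending=False)
)
print(summary.head())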


kaisugi posted an update about 8 hours ago
🚀 Stockmark-100b

Stockmark Inc. has developed and released one of Japan's largest commercial-scale large language models (LLMs), a 100-billion-parameter model named "Stockmark-LLM-100b". This model significantly reduces hallucinations and provides accurate responses to complex business-related queries. Developed from scratch with a focus on Japanese business data, the model aims to be reliable in high-stakes business environments. It is open source and available for commercial use.

Key highlights:
- The model reduces hallucinations (confident but incorrect responses that AI models sometimes generate).
- Stockmark-LLM-100b can answer basic business questions and specialized queries in industries like manufacturing.
- The model's performance surpasses GPT-4-turbo in accuracy for business-specific queries.
- Evaluation benchmarks (VicunaQA) show high performance.
- Fast inference speed, generating 100 characters of Japanese text in 1.86 seconds.

stockmark/stockmark-100b
stockmark/stockmark-100b-instruct-v0.1

Detailed press release (in Japanese): https://stockmark.co.jp/news/20240516
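
To try the instruct model, here is a minimal sketch using the standard transformers causal-LM classes (this assumes the repo follows that interface and that you have the considerable GPU memory a 100B-parameter model needs; check the model card for the expected prompt format):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stockmark/stockmark-100b-instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

# example prompt: "What is natural language processing?"
inputs = tokenizer("自然言語処理とは何ですか？", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))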
leonardlin posted an update about 9 hours ago
llm-jp-eval is currently one of the most widely used benchmarks for Japanese LLMs and makes up half of the scoring in WandB's comprehensive Nejumi LLM Leaderboard. I was seeing some weirdness in the results I was getting and ended up going down a bit of a rabbit hole. Here's my article on evaluating llm-jp-eval: https://huggingface.co/blog/leonardlin/llm-jp-eval-eval

I've set up a fork of Lightblue's Shaberi testing framework, which uses LLM-as-a-Judge style benchmarks as something probably more representative of real-world LLM strength in Japanese. Here's how the new base model ablations are looking:
Fredtt3 posted an update about 12 hours ago
mrfakename posted an update about 14 hours ago
kadirnar posted an update about 14 hours ago
Midjourney + Custom SDXL-Lightning:
not-lain posted an update about 16 hours ago
If you're a researcher or developing your own model 👀 you might need to take a look at Hugging Face's ModelHubMixin classes.
They are used to seamlessly integrate your AI model with the Hugging Face Hub and to save/load your model easily 🚀

1๏ธโƒฃ make sure you're using the appropriate library version
pip install -qU "huggingface_hub>=0.22"

2๏ธโƒฃ inherit from the appropriate class
from huggingface_hub import PyTorchModelHubMixin
from torch import nn

class MyModel(nn.Module, PyTorchModelHubMixin):
  def __init__(self, a, b):
    super().__init__()
    self.layer = nn.Linear(a, b)
  def forward(self, inputs):
    return self.layer(inputs)

3️⃣ create an instance of your model
first_model = MyModel(3, 1)

4๏ธโƒฃ push the model to the hub (or use save_pretrained method to save locally)
first_model.push_to_hub("not-lain/test")

5๏ธโƒฃ Load and initialize the model from the hub using the original class
pretrained_model = MyModel.from_pretrained("not-lain/test")
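
As a quick sanity check, a minimal sketch of running the loaded model (the input shape just matches the toy a=3, b=1 layer above):

import torch

dummy_input = torch.randn(2, 3)   # batch of 2 samples with 3 features, matching nn.Linear(3, 1)
print(pretrained_model(dummy_input))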

Salama1429 posted an update about 19 hours ago
📚 Introducing the 101 Billion Arabic Words Dataset

🌍 Exciting Milestone in Arabic Language Technology! #NLP #ArabicLLM #LanguageModels

🚀 Why It Matters:
1. 🌟 Large Language Models (LLMs) have brought transformative changes, primarily in English. It's time for Arabic to shine!
2. 🎯 This project addresses the critical challenge of bias in Arabic LLMs caused by reliance on translated datasets.

🔍 Approach:
1. 💪 Undertook a massive data mining initiative focusing exclusively on Arabic from Common Crawl WET files.
2. 🧹 Employed state-of-the-art cleaning and deduplication processes to maintain data quality and uniqueness (a rough sketch of this kind of step follows below).
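
For illustration only, a minimal sketch of exact-hash deduplication over text records (the sample records and the whitespace normalization here are assumptions, not the paper's actual pipeline):

import hashlib

def normalize(text: str) -> str:
    # collapse whitespace so trivially different copies hash identically
    return " ".join(text.split())

def deduplicate(records):
    seen = set()
    for text in records:
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            yield text

docs = ["مرحبا بالعالم", "مرحبا  بالعالم", "نص آخر"]   # the first two differ only in spacing
print(list(deduplicate(docs)))                          # keeps one copy plus the distinct text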

📈 Impact:
1. 🏆 Created the largest Arabic dataset to date with 101 billion words.
2. 📝 Enables the development of Arabic LLMs that are linguistically and culturally accurate.
3. 🌍 Sets a global benchmark for future Arabic language research.


🔗 Paper: https://lnkd.in/dGAiaygn
🔗 Dataset: https://lnkd.in/dGTMe5QV

- 🔄 Share your thoughts and let's drive the future of Arabic NLP together!

#DataScience #MachineLearning #ArtificialIntelligence #Innovation #ArabicData
akhaliq posted an update about 24 hours ago
Chameleon

Mixed-Modal Early-Fusion Foundation Models

Chameleon: Mixed-Modal Early-Fusion Foundation Models (2405.09818)

We present Chameleon, a family of early-fusion token-based mixed-modal models capable of understanding and generating images and text in any arbitrary sequence. We outline a stable training approach from inception, an alignment recipe, and an architectural parameterization tailored for the early-fusion, token-based, mixed-modal setting. The models are evaluated on a comprehensive range of tasks, including visual question answering, image captioning, text generation, image generation, and long-form mixed modal generation. Chameleon demonstrates broad and general capabilities, including state-of-the-art performance in image captioning tasks, outperforms Llama-2 in text-only tasks while being competitive with models such as Mixtral 8x7B and Gemini-Pro, and performs non-trivial image generation, all in a single model. It also matches or exceeds the performance of much larger models, including Gemini Pro and GPT-4V, according to human judgments on a new long-form mixed-modal generation evaluation, where either the prompt or outputs contain mixed sequences of both images and text. Chameleon marks a significant step forward in a unified modeling of full multimodal documents.