Csaba Kecskemeti PRO (csabakecskemeti)



Organizations

Zillow, DevQuasar, Hugging Face Party @ PyTorch Conference, Intelligent Estate, open/ acc

csabakecskemeti's activity

reacted to mitkox's post with 🤗 3 days ago
"Can it run DeepSeek V3 671B?" is the new "Can it run Doom?".

How minimal can I go with on-device AI for behemoth models? Here I'm running the DeepSeek V3 MoE on a single A6000 GPU.

Not great, not terrible, for this minimalistic setup. I love the Mixture of Experts architectures. Typically I'm running my core LLM distributed over the 4 GPUs.

Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
replied to mitkox's post 3 days ago

DeepSeek-V3-Base Q2_K

- CPU: AMD Ryzen™ Threadripper™ 3970X × 64
- Motherboard: ASUS ROG ZENITH II EXTREME ALPHA
- RAM: 256.0 GiB
- GPUs: NVIDIA GeForce RTX™ 3090, NVIDIA GeForce RTX™ 3090, NVIDIA GeForce RTX™ 4080

replied to singhsidhukuldeep's post 3 days ago

Seems it's happening: I provided ChatGPT with context containing no information about whether Berlin is the capital of Germany, yet my 'fake' source was still cited.

replied to singhsidhukuldeep's post 4 days ago
reacted to singhsidhukuldeep's post with 👀 4 days ago
Groundbreaking Research Alert: Correctness ≠ Faithfulness in RAG Systems

Fascinating new research from L3S Research Center, University of Amsterdam, and TU Delft reveals a critical insight into Retrieval Augmented Generation (RAG) systems. The study exposes that up to 57% of citations in RAG systems could be unfaithful, despite being technically correct.

>> Key Technical Insights:

Post-rationalization Problem
The researchers discovered that RAG systems often engage in "post-rationalization" - where models first generate answers from their parametric memory and then search for supporting evidence afterward. This means that while citations may be correct, they don't reflect the actual reasoning process.

Experimental Design
The team used Command-R+ (104B parameters) with 4-bit quantization on an NVIDIA A100 GPU, testing on the NaturalQuestions dataset. They employed BM25 for initial retrieval and ColBERT v2 for reranking.
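Not the authors' code, but the first-stage BM25 retrieval they describe (before ColBERT v2 reranking, which is omitted here) can be sketched from scratch like this:

```python
import math
from collections import Counter

class BM25:
    """Minimal BM25 (Okapi) scorer for first-stage retrieval."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.docs = [d.lower().split() for d in docs]
        self.k1, self.b = k1, b
        self.N = len(self.docs)
        self.avgdl = sum(len(d) for d in self.docs) / self.N
        # Document frequency: in how many docs each term appears.
        self.df = Counter(t for d in self.docs for t in set(d))

    def idf(self, term):
        n = self.df.get(term, 0)
        return math.log((self.N - n + 0.5) / (n + 0.5) + 1)

    def score(self, query, idx):
        doc = self.docs[idx]
        tf = Counter(doc)
        s = 0.0
        for t in query.lower().split():
            if t not in tf:
                continue
            num = tf[t] * (self.k1 + 1)
            den = tf[t] + self.k1 * (1 - self.b + self.b * len(doc) / self.avgdl)
            s += self.idf(t) * num / den
        return s

    def top_k(self, query, k=3):
        # Rank all documents by BM25 score; a reranker (e.g. ColBERT v2)
        # would then reorder this candidate list.
        scores = [(self.score(query, i), i) for i in range(self.N)]
        return [i for _, i in sorted(scores, reverse=True)[:k]]
```

For example, `BM25(docs).top_k("capital of germany")` returns the indices of the lexically best-matching documents, which a neural reranker would then refine.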

Attribution Framework
The research introduces a comprehensive framework for evaluating RAG systems across multiple dimensions:
- Citation Correctness: Whether cited documents support the claims
- Citation Faithfulness: Whether citations reflect actual model reasoning
- Citation Appropriateness: Relevance and meaningfulness of citations
- Citation Comprehensiveness: Coverage of key points

Under the Hood
The evaluation pipeline involves four steps:
1. Document relevance prediction
2. Citation prediction
3. Answer generation without citations
4. Answer generation with citations
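The four steps above could be orchestrated roughly as follows; `model` is a hypothetical stand-in callable (prompt in, text out) for Command-R+, not the paper's actual harness, and the prompt wording is illustrative:

```python
def evaluate_attribution(model, query, docs):
    """Run the four probing steps from the study's evaluation pipeline.

    model: a stand-in callable, model(prompt) -> str.
    docs:  the retrieved documents for this query.
    """
    results = {}
    # 1. Document relevance prediction: does the model judge each doc relevant?
    results["relevance"] = [
        model(f"Is this document relevant to '{query}'? Document: {d}")
        for d in docs
    ]
    # 2. Citation prediction: which documents would the model cite?
    results["cited"] = model(
        f"Which of the {len(docs)} documents support an answer to '{query}'?"
    )
    # 3. Answer generation WITHOUT citations (parametric memory only).
    results["answer_plain"] = model(f"Answer the question: {query}")
    # 4. Answer generation WITH citations (comparing against step 3 exposes
    #    post-rationalization: same answer, evidence attached after the fact).
    results["answer_cited"] = model(
        f"Answer with citations [1..{len(docs)}]: {query}"
    )
    return results
```

Comparing the outputs of steps 3 and 4 is what reveals post-rationalization: if the cited answer matches the citation-free answer, the evidence likely did not drive the reasoning.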

This work fundamentally challenges our understanding of RAG systems and highlights the need for more robust evaluation metrics in AI systems that claim to provide verifiable information.
replied to bartowski's post 5 days ago

I had the same hesitation but had to settle on something, so I went with '.' :D
Basically the '.' separator resembles the domain-name structure, which made sense to me.

replied to bartowski's post 5 days ago
reacted to singhsidhukuldeep's post with 👍 8 days ago
Groundbreaking Research Alert: Rethinking RAG with Cache-Augmented Generation (CAG)

Researchers from National Chengchi University and Academia Sinica have introduced a paradigm-shifting approach that challenges the conventional wisdom of Retrieval-Augmented Generation (RAG).

Instead of the traditional retrieve-then-generate pipeline, their innovative Cache-Augmented Generation (CAG) framework preloads documents and precomputes key-value caches, eliminating the need for real-time retrieval during inference.

Technical Deep Dive:
- CAG preloads external knowledge and precomputes KV caches, storing them for future use
- The system processes documents only once, regardless of subsequent query volume
- During inference, it loads the precomputed cache alongside user queries, enabling rapid response generation
- The cache reset mechanism allows efficient handling of multiple inference sessions through strategic token truncation
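A toy sketch of the cache bookkeeping described above. In a real system the preload step is a forward pass that returns transformer key-value tensors (e.g. `past_key_values`); here the cache is just a token list, so only the preload-once / reuse / truncate-to-reset lifecycle is shown:

```python
class CachedGenerator:
    """Toy sketch of Cache-Augmented Generation bookkeeping (not a real LLM)."""

    def __init__(self, doc_tokens):
        # One-time preload: documents are processed once, regardless of
        # how many queries follow.
        self.cache = list(doc_tokens)
        self.prefix_len = len(self.cache)

    def generate(self, query_tokens):
        # Inference loads the precomputed cache and appends only the query
        # (and generated) tokens -- no retrieval step.
        self.cache.extend(query_tokens)
        answer = ["<answer>"]  # placeholder for decoded output
        self.cache.extend(answer)
        return answer

    def reset(self):
        # Cache reset via token truncation: drop everything after the
        # preloaded document prefix so the next session reuses it for free.
        del self.cache[self.prefix_len:]
```

The key property is that `reset()` is a cheap truncation, so multiple inference sessions amortize the single expensive document pass.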

Performance Highlights:
- Achieved superior BERTScore metrics compared to both sparse and dense retrieval RAG systems
- Demonstrated up to 40x faster generation times compared to traditional approaches
- Particularly effective with both SQuAD and HotPotQA datasets, showing robust performance across different knowledge tasks

Why This Matters:
The approach significantly reduces system complexity, eliminates retrieval latency, and mitigates common RAG pipeline errors. As LLMs continue evolving with expanded context windows, this methodology becomes increasingly relevant for knowledge-intensive applications.
replied to their post 8 days ago
posted an update 8 days ago
posted an update 9 days ago
reacted to s-emanuilov's post with 👍👀 10 days ago
Hey HF community! 👋

Excited to share Monkt - a tool I built to solve the eternal headache of processing documents for ML/AI pipelines.

What it does: Converts PDFs, Word, PowerPoint, Excel, Web pages or raw HTML into clean Markdown or structured JSON.

Great for:
✔ LLM training dataset preparation;
✔ Knowledge base construction;
✔ Research paper processing;
✔ Technical documentation management.

It has API access for integration into ML pipelines.

Check it out at https://monkt.com/ if you want to save time on document processing infrastructure.

Looking forward to your feedback!
posted an update 11 days ago
reacted to prithivMLmods's post with ❤️ 11 days ago
Triangulum Catalogued 🔥💫

🎯Triangulum is a collection of pretrained and instruction-tuned generative models, designed for multilingual applications. These models are trained using synthetic datasets based on long chains of thought, enabling them to perform complex reasoning tasks effectively.

+ Triangulum-10B : prithivMLmods/Triangulum-10B
+ Quants : prithivMLmods/Triangulum-10B-GGUF

+ Triangulum-5B : prithivMLmods/Triangulum-5B
+ Quants : prithivMLmods/Triangulum-5B-GGUF

+ Triangulum-1B : prithivMLmods/Triangulum-1B
+ Quants : prithivMLmods/Triangulum-1B-GGUF
reacted to DamarJati's post 12 days ago
Happy New Year 2025 🤗
For the Huggingface community.
reacted to prithivMLmods's post with 🤗 12 days ago
reacted to sequelbox's post with 👍 12 days ago
posted an update 13 days ago
reacted to ginipick's post with 🔥 13 days ago
🌊 [Dokdo Membership - Next Generation AI Video Creation Platform]

✨ Transform your imagination into mesmerizing videos with Dokdo Membership, an innovative AI-powered platform that generates unique videos from text and images. Built as a streamlined SaaS boilerplate using Python Gradio for Hugging Face users, this tool offers an intuitive way to create AI-generated videos with minimal effort.

🎯 [Key Features]
- 📧 Email-based authentication system with secure login/signup
- 🎁 15 points automatically credited upon registration
- 💰 5 points deduction per video generation
- 🌏 Bilingual support (Korean/English) with automatic translation
- 🖼️ Optional first frame image upload capability
- ⭐ Automatic GiniGEN.AI watermark integration

🚀 [Technical Specifications]
1. 💫 Modern, responsive user interface with Gradio components
2. 📊 Efficient resource management through points system
3. 🎥 High-quality video generation using advanced AI models
4. 🔄 Seamless translation pipeline for multilingual support
5. ⚡ Real-time point tracking and management system
6. 🛡️ Comprehensive content moderation and filtering

📝 [How to Use]
1. ✅ Register with your email to receive 15 initial points
2. 💭 Enter your video description (supports both English and Korean)
3. 📤 Upload a reference image for the first frame (optional)
4. 🎬 Click "Generate Video" (consumes 5 points)
5. 📥 Preview and download your generated video

🔧 [Technical Implementation]
- Built with Python Gradio for seamless Hugging Face Space integration
- Implements secure user authentication and session management
- Features real-time point tracking and automated deduction system
- Includes comprehensive error handling and input validation
- Utilizes advanced AI models for video generation
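The points mechanics described above (15 credited on signup, 5 deducted per generation) could be sketched as a minimal ledger; the class and method names are illustrative, not Dokdo's actual implementation:

```python
class PointsLedger:
    """Minimal sketch of the described points system (hypothetical names)."""

    SIGNUP_BONUS = 15  # points credited automatically upon registration
    VIDEO_COST = 5     # points deducted per video generation

    def __init__(self):
        self.balances = {}

    def register(self, email):
        # Email-based signup credits the initial balance.
        self.balances[email] = self.SIGNUP_BONUS

    def charge_generation(self, email):
        # Real-time deduction; reject the request if points are insufficient.
        if self.balances.get(email, 0) < self.VIDEO_COST:
            raise ValueError("insufficient points")
        self.balances[email] -= self.VIDEO_COST
        return self.balances[email]
```

Under these numbers a new user can generate exactly three videos (15 / 5) before needing to acquire more points.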

📮 Need additional points for more creations? Contact us at ginipicks@gmail.com for point acquisition options through public contributions or paid services.

ginigen/Dokdo-membership