Pretam Ray

Pretam
AI & ML interests

NLP

Recent Activity

liked a Space 24 days ago
hf-accelerate/model-memory-usage
published a model about 2 months ago
Pretam/t5-small-finetuned-xsum

Organizations

Sanskrit Computational Linguistics
Sanskrit Knowledge Accessor
National Language Translation Mission

Pretam's activity

published a model about 2 months ago
upvoted an article 9 months ago
Article

Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA

129
reacted to vladbogo's post with 👍 about 1 year ago
Post
A recent paper titled "ShortGPT: Layers in Large Language Models are More Redundant Than You Expect" proposes a simple and effective approach to pruning Large Language Models (LLMs) by removing redundant layers.

Key points:
* Discovers significant redundancy across layers in LLMs, with some layers playing a negligible role for the final performance.
* Defines a new metric called Block Influence (BI) to quantify the importance of each layer in an LLM.
* Removes layers with low BI scores, achieving up to a 25% reduction in parameters and computation while maintaining 92% of the LLM's performance.
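
To make the pruning idea above concrete, here is a minimal NumPy sketch. It assumes BI is computed as one minus the average cosine similarity between a layer's input and output hidden states, which is my reading of the paper and not something stated in this post. The function names `block_influence` and `layers_to_prune` and the toy data are hypothetical.

```python
import numpy as np


def block_influence(hidden_in, hidden_out, eps=1e-8):
    """Assumed BI: 1 - mean cosine similarity between a layer's input and output.

    Both arrays have shape (num_tokens, hidden_dim).
    """
    dot = np.sum(hidden_in * hidden_out, axis=-1)
    norms = np.linalg.norm(hidden_in, axis=-1) * np.linalg.norm(hidden_out, axis=-1)
    cos = dot / np.maximum(norms, eps)  # eps guards against zero vectors
    return float(1.0 - cos.mean())


def layers_to_prune(hidden_states, num_to_remove):
    """Return indices of the lowest-BI layers, plus all BI scores.

    hidden_states[i] is the input of layer i and hidden_states[i + 1] its output.
    """
    scores = [
        block_influence(hidden_states[i], hidden_states[i + 1])
        for i in range(len(hidden_states) - 1)
    ]
    order = np.argsort(scores)  # ascending: least influential layers first
    return sorted(int(i) for i in order[:num_to_remove]), scores


# Toy data: layer 1 only rescales its input, so it barely changes direction.
rng = np.random.default_rng(0)
h0 = rng.normal(size=(5, 8))
h1 = h0 + rng.normal(size=(5, 8))
h2 = 2.0 * h1
pruned, scores = layers_to_prune([h0, h1, h2], num_to_remove=1)
print(pruned, scores)
```

With a real model, the hidden states would come from a forward pass over calibration text; the sketch only covers the scoring and selection step.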

Congrats to the authors for their work!

Paper: ShortGPT: Layers in Large Language Models are More Redundant Than You Expect (2403.03853)

reacted to SkalskiP's post with ❤️ about 1 year ago
Post
YOLO-World: Real-Time, Zero-Shot Object Detection 🔥 🔥 🔥

YOLO-World was designed to solve a limitation of existing zero-shot object detection models: speed. Whereas other state-of-the-art models use Transformers, a powerful but typically slower architecture, YOLO-World uses the faster CNN-based YOLO architecture.

YOLO-World provides three models: small with 13M (re-parametrized 77M), medium with 29M (re-parametrized 92M), and large with 48M (re-parametrized 110M) parameters.

The YOLO-World team benchmarked the models on the LVIS dataset and measured their performance on a V100 without any acceleration mechanisms such as quantization or TensorRT.

According to the paper, YOLO-World reached 35.4 AP with 52.0 FPS for the L version and 26.2 AP with 74.1 FPS for the S version. While the V100 is a powerful GPU, achieving such high FPS on any device is impressive.
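
For a rough sense of what those numbers mean per frame, the quoted FPS figures can be converted to latency. This is a back-of-the-envelope sketch that assumes frames are processed one at a time, so latency is simply the inverse of FPS; the variable names are illustrative.

```python
# (AP, FPS) on LVIS as quoted in the post above
results = {"L": (35.4, 52.0), "S": (26.2, 74.1)}


def latency_ms(fps):
    """Per-frame latency in milliseconds, assuming one frame at a time."""
    return 1000.0 / fps


latencies = {name: latency_ms(fps) for name, (_, fps) in results.items()}
speedup_s_over_l = results["S"][1] / results["L"][1]  # how much faster S runs
ap_gap = results["L"][0] - results["S"][0]            # accuracy given up for it

for name, ms in latencies.items():
    print(f"{name}: {ms:.1f} ms per frame")
print(f"S is {speedup_s_over_l:.2f}x faster than L, at {ap_gap:.1f} AP lower")
```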

- 🔗 YOLO-World arXiv paper: https://lnkd.in/ddRBKCCX
- 🔗 my YOLO-World technical report: https://blog.roboflow.com/what-is-yolo-world
- 🤗 YOLO-World space: SkalskiP/YOLO-World
New activity in Pretam/ramayana over 1 year ago