Peter Tanski

pdtgct

AI & ML interests

Machine Learning, Artificial Intelligence

Recent Activity

upvoted a paper 13 days ago

Scaling Laws for Native Multimodal Models

liked a model 15 days ago

nvidia/Llama-3_3-Nemotron-Super-49B-v1

upvoted an article about 1 month ago

Open R1: How to use OlympicCoder locally for coding?

pdtgct's activity

upvoted a paper 13 days ago

Scaling Laws for Native Multimodal Models

Paper • 2504.07951 • Published 14 days ago • 27

upvoted an article about 1 month ago

Article

Open R1: How to use OlympicCoder locally for coding?

Mar 20

• 56

upvoted 2 papers 4 months ago

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 84

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

Paper • 2407.21787 • Published Jul 31, 2024 • 13

upvoted a paper 6 months ago

LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning

Paper • 2410.02884 • Published Oct 3, 2024 • 55

upvoted 2 articles 7 months ago

Article

Llama can now see and run on your device - welcome Llama 3.2

Sep 25, 2024

• 188

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Sep 18, 2024

• 236

upvoted a paper 8 months ago

MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding

Paper • 2408.11049 • Published Aug 20, 2024 • 13

upvoted a paper 10 months ago

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Paper • 2406.08464 • Published Jun 12, 2024 • 70

upvoted a collection 11 months ago

Sparse Foundational Llama 2 Models

Collection

Sparse pre-trained and fine-tuned Llama models made by Neural Magic + Cerebras • 27 items • Updated 8 days ago • 9

upvoted a collection about 1 year ago

A little guide to building Large Language Models in 2024

Collection

Resources mentioned by @thomwolf in https://x.com/Thom_Wolf/status/1773340316835131757 • 19 items • Updated Apr 1, 2024 • 15

upvoted 3 papers about 1 year ago

RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation

Paper • 2403.05313 • Published Mar 8, 2024 • 9

Simple and Scalable Strategies to Continually Pre-train Large Language Models

Paper • 2403.08763 • Published Mar 13, 2024 • 52

Stealing Part of a Production Language Model

Paper • 2403.06634 • Published Mar 11, 2024 • 92

upvoted a paper over 1 year ago

YaRN: Efficient Context Window Extension of Large Language Models

Paper • 2309.00071 • Published Aug 31, 2023 • 68

upvoted a paper almost 2 years ago

Full Parameter Fine-tuning for Large Language Models with Limited Resources

Paper • 2306.09782 • Published Jun 16, 2023 • 30