Our 22 open-source Gemstone models for scaling laws range from 50M to 2B parameters, spanning 11 widths from 256 to 3072 and 18 depths from 3 to 80.
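
The Gemstone repository names on this page (for example `tomg-group-umd/Gemstone-3072x12_lr_ablation`) appear to encode each model's shape. The sketch below assumes the pattern is `Gemstone-<width>x<depth>`, which is inferred from the width and depth ranges quoted above rather than confirmed by the authors; the helper `parse_gemstone_name` and the `GemstoneShape` type are hypothetical names introduced only for illustration.

```python
import re
from typing import NamedTuple, Optional


class GemstoneShape(NamedTuple):
    width: int
    depth: int


# Assumption: names follow "Gemstone-<width>x<depth>", optionally with a suffix
# such as "_lr_ablation". This convention is inferred, not documented here.
_NAME_RE = re.compile(r"Gemstone-(\d+)x(\d+)")


def parse_gemstone_name(repo_id: str) -> Optional[GemstoneShape]:
    """Return the (width, depth) encoded in a repo id, or None if it does not match."""
    match = _NAME_RE.search(repo_id)
    if match is None:
        return None
    return GemstoneShape(width=int(match.group(1)), depth=int(match.group(2)))


# A few repository ids copied from the listing on this page.
repo_ids = [
    "tomg-group-umd/Gemstone-3072x12_lr_ablation",
    "tomg-group-umd/Gemstone-256x23_lr_ablation",
    "tomg-group-umd/huginn-0125",  # not a Gemstone model, so it parses to None
]

for repo_id in repo_ids:
    print(repo_id, parse_gemstone_name(repo_id))
```

Returning `None` for non-matching names keeps the helper usable on a mixed list of repositories, such as the full model listing of the organization.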

Tom Goldstein's Lab at University of Maryland, College Park
university
AI & ML interests: AI security & privacy, algorithmic bias, foundations of ML
Collections (7)
These are checkpoints for recurrent LLMs developed to scale test-time compute by recurring in latent space.
- tomg-group-umd/huginn-0125 • Text Generation • 9.87k • 224
- Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach • Paper • 2502.05171 • 113
- tomg-group-umd/huginn_swa_100_10_avg_0.9_merge • Text Generation • 579
- tomg-group-umd/step-00010752-recurrence_full_512_0 • Text Generation • 184
spaces (4)
models (87)

- tomg-group-umd/huginn-0125 • Text Generation • 9.87k • 224
- tomg-group-umd/Gemstone-3072x12_lr_ablation • Text Generation • 8
- tomg-group-umd/Gemstone-256x23_lr_ablation • Text Generation • 4
- tomg-group-umd/Gemstone-1024x28_lr_ablation • Text Generation • 3
- tomg-group-umd/Gemstone-384x13_lr_ablation • Text Generation • 15
- tomg-group-umd/Gemstone-2048x27_lr_ablation • Text Generation • 5
- tomg-group-umd/Gemstone-512x16_lr_ablation • Text Generation • 4
- tomg-group-umd/Gemstone-384x36_lr_ablation • Text Generation • 3
- tomg-group-umd/Gemstone-1792x18_lr_ablation • Text Generation • 7
- tomg-group-umd/Gemstone-2560x8_lr_ablation • Text Generation • 7
datasets (18)
- tomg-group-umd/alpaca_cleaned_dataset_short • 32 • 58 • 1
- tomg-group-umd/riftV1 • 478k • 37
- tomg-group-umd/fictional_qa_11-08-24_refolded_1-10-25_txt • 417k • 5
- tomg-group-umd/fictional_qa_11-08-24_refolded_1-10-25 • 444k • 24
- tomg-group-umd/rift • 714k • 8
- tomg-group-umd/fictional_qa_11-08-24_txt • 42k • 9 • 1
- tomg-group-umd/fictional_qa_11-08-24 • 46.4k • 20 • 1
- tomg-group-umd/fictional_qa_10-28-24 • 14.6k • 17
- tomg-group-umd/fictional_qa_mix_train_2024-09-15 • 797 • 23
- tomg-group-umd/cinepile • 608k • 245 • 79