lakelee committed on
Commit a3860fa · verified · 1 Parent(s): d6d1e21

Model save

README.md CHANGED
@@ -1,7 +1,5 @@
  ---
  library_name: transformers
- license: apache-2.0
- base_model: state-spaces/mamba2-2.7b
  tags:
  - generated_from_trainer
  model-index:
@@ -14,7 +12,7 @@ should probably proofread and complete it, then remove this comment. -->

  # video-ma2mba-3.1b-clip

- This model is a fine-tuned version of [state-spaces/mamba2-2.7b](https://huggingface.co/state-spaces/mamba2-2.7b) on an unknown dataset.
+ This model was trained from scratch on an unknown dataset.

  ## Model description

@@ -34,12 +32,12 @@ More information needed

  The following hyperparameters were used during training:
  - learning_rate: 4e-05
- - train_batch_size: 1
+ - train_batch_size: 2
  - eval_batch_size: 1
  - seed: 42
  - distributed_type: multi-GPU
  - num_devices: 8
- - gradient_accumulation_steps: 4
+ - gradient_accumulation_steps: 2
  - total_train_batch_size: 32
  - total_eval_batch_size: 8
  - optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-08
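Note that this hyperparameter change is batch-size neutral: the total_train_batch_size stays at 32 because the commit halves gradient_accumulation_steps while doubling train_batch_size. A minimal sketch of the arithmetic, assuming the usual relation (per-device batch size × gradient accumulation steps × number of devices):

```python
def total_train_batch_size(per_device: int, grad_accum: int, num_devices: int) -> int:
    # Effective batch size seen by the optimizer per update step.
    return per_device * grad_accum * num_devices

# Before this commit: train_batch_size=1, gradient_accumulation_steps=4, 8 GPUs
old = total_train_batch_size(1, 4, 8)
# After this commit:  train_batch_size=2, gradient_accumulation_steps=2, 8 GPUs
new = total_train_batch_size(2, 2, 8)
print(old, new)  # 32 32
```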
config.json CHANGED
@@ -1,7 +1,7 @@
  {
- "_name_or_path": "state-spaces/mamba2-2.7b",
+ "_name_or_path": "/scratch/hpc162a02/cvpr/huggingface/hub/LongMamba_V4/LongMamba-clip-Stage12-2.7b/",
  "add_faster_video": false,
- "add_time_instruction": false,
+ "add_time_instruction": true,
  "architectures": [
  "LlavaMambaForCausalLM"
  ],
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4d4430a5c514e4e12ad7e7d126078ba0aaa5cd03ed0831c4887246a74388b13d
+ oid sha256:acc0705fc2ac3c33aa4d1af3ad13667e3bdb482bbbe48e8df38745bbd1f0d652
  size 4976871936
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d872b1d158bf9575e1f4f36c17ad2897735eb33c9034040aae49c03885a44ea4
+ oid sha256:ea22ada0487ee9431f99c15bbcc7bd60724c4ee284eba0d95d88e4e88b652e87
  size 1310647288
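The safetensors entries in this commit are git-lfs pointer files: per the git-lfs spec, the `oid sha256:` value is the SHA-256 digest of the actual file contents. A sketch for verifying a downloaded shard against its pointer (the local path is a hypothetical example):

```python
import hashlib

def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256; the hex digest should match
    the `oid sha256:` value in the corresponding git-lfs pointer."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Usage (hypothetical local file):
# file_sha256("model-00001-of-00002.safetensors")
# should equal the new oid shown in the diff for that shard.
```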
trainer_state.json CHANGED
The diff for this file is too large to render. See raw diff
 
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:f114b32d5ce15e02a341bb1da63b3b56400986747a04240d59a528038414a0bb
+ oid sha256:b42741fc6c76aa102024a039ed24ccea6045392bd6e89ed86c4c392ab92f759b
  size 7096