rahul7star committed on
Commit d0d32f2 · verified · 1 Parent(s): 8c6a005

Chatterbox fine-tuned model + logs

  1. training.log +107 -19
training.log CHANGED
@@ -1,7 +1,7 @@
1
 
2
  /usr/local/lib/python3.13/site-packages/perth/perth_net/__init__.py:1: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
3
  from pkg_resources import resource_filename
4
- 02/06/2026 06:17:26 - INFO - __main__ - Training/evaluation parameters CustomTrainingArguments(
5
  accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False},
6
  adam_beta1=0.9,
7
  adam_beta2=0.999,
@@ -113,36 +113,124 @@ warmup_ratio=None,
113
  warmup_steps=1.0,
114
  weight_decay=0.0,
115
  )
116
- 02/06/2026 06:17:26 - INFO - __main__ - Model parameters ModelArguments(model_name_or_path='ResembleAI/chatterbox', local_model_dir=None, cache_dir=None, freeze_voice_encoder=True, freeze_s3gen=True)
117
- 02/06/2026 06:17:26 - INFO - __main__ - Data parameters DataArguments(language='hi', dataset_dir=None, metadata_file=None, dataset_name='rahul7star/hindi-speech-dataset', dataset_config_name=None, train_split_name='train', eval_split_name='validation', text_column_name='text', audio_column_name='audio', max_text_len=256, max_speech_len=800, audio_prompt_duration_s=3.0, eval_split_size=0.0002, preprocessing_num_workers=None, ignore_verifications=False)
118
- 02/06/2026 06:17:26 - INFO - __main__ - Loading ChatterboxTTS model...
119
- 02/06/2026 06:17:26 - INFO - __main__ - Loading model from Hugging Face Hub: ResembleAI/chatterbox
120
  /usr/local/lib/python3.13/site-packages/huggingface_hub/utils/_validators.py:202: UserWarning: The `local_dir_use_symlinks` argument is deprecated and ignored in `hf_hub_download`. Downloading to a local directory does not use symlinks anymore.
121
  warnings.warn(
122
- 02/06/2026 06:17:26 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/ve.safetensors "HTTP/1.1 302 Found"
123
- 02/06/2026 06:17:26 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/t3_mtl23ls_v2.safetensors "HTTP/1.1 302 Found"
124
- 02/06/2026 06:17:26 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/s3gen.safetensors "HTTP/1.1 302 Found"
125
- 02/06/2026 06:17:26 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/mtl_tokenizer.json "HTTP/1.1 307 Temporary Redirect"
126
- 02/06/2026 06:17:26 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/mtl_tokenizer.json "HTTP/1.1 200 OK"
127
- 02/06/2026 06:17:26 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/conds.pt "HTTP/1.1 302 Found"
128
- 02/06/2026 06:17:26 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/models/ResembleAI/chatterbox/revision/main "HTTP/1.1 200 OK"
129
 
130
 
131
  Downloading (incomplete total...): 0.00B [00:00, ?B/s]
132
 
133
- Fetching 6 files: 0%| | 0/6 [00:00<?, ?it/s]
134
- Fetching 6 files: 100%|██████████| 6/6 [00:00<00:00, 16299.11it/s]
135
 
 
136
 
137
- Download complete: : 0.00B [00:00, ?B/s] Traceback (most recent call last):
138
  File "/app/chatterbox-multilingual-finetuning/src/finetune_t3.py", line 849, in <module>
139
  main()
140
  ~~~~^^
141
  File "/app/chatterbox-multilingual-finetuning/src/finetune_t3.py", line 616, in main
142
  chatterbox_model = ChatterboxMultilingualTTS.from_pretrained(device="cpu")
143
- File "/app/chatterbox-multilingual-finetuning/src/chatterbox/mtl_tts.py", line 188, in from_pretrained
144
  return cls.from_local(ckpt_dir, device)
145
  ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
146
- TypeError: ChatterboxMultilingualTTS.from_local() takes 2 positional arguments but 3 were given
147
-
148
- Download complete: : 0.00B [00:00, ?B/s]
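Note on the removed traceback above: "takes 2 positional arguments but 3 were given" is the message Python reports when a classmethod that accepts one argument besides `cls` is called with two. The sketch below reproduces that message with a hypothetical `Loader` class. The real signature of `ChatterboxMultilingualTTS.from_local` is not shown in this log, so the class names, parameters, and the suggested fix are assumptions for illustration only.

```python
class Loader:
    """Hypothetical stand-in; not the real ChatterboxMultilingualTTS."""

    @classmethod
    def from_local(cls, ckpt_dir):
        # Accepts cls plus one argument, so a second one is rejected.
        return cls()


class FixedLoader:
    """Same stand-in, with from_local widened to accept a device."""

    def __init__(self, device="cpu"):
        self.device = device

    @classmethod
    def from_local(cls, ckpt_dir, device="cpu"):
        # Assumed fix: the callee takes the extra argument the caller passes.
        return cls(device=device)


def call_with_extra_arg():
    """Call from_local the way the traceback does and capture the error."""
    try:
        Loader.from_local("/tmp/ckpt", "cpu")
    except TypeError as exc:
        return str(exc)
    return None


message = call_with_extra_arg()
fixed = FixedLoader.from_local("/tmp/ckpt", "cpu")
print(message)
print(fixed.device)
```

The count in the message includes the implicit `cls`, which is why two explicit arguments are reported as three.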
 
1
 
2
  /usr/local/lib/python3.13/site-packages/perth/perth_net/__init__.py:1: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
3
  from pkg_resources import resource_filename
4
+ 02/06/2026 06:26:37 - INFO - __main__ - Training/evaluation parameters CustomTrainingArguments(
5
  accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None, 'use_configured_state': False},
6
  adam_beta1=0.9,
7
  adam_beta2=0.999,
 
113
  warmup_steps=1.0,
114
  weight_decay=0.0,
115
  )
116
+ 02/06/2026 06:26:37 - INFO - __main__ - Model parameters ModelArguments(model_name_or_path='ResembleAI/chatterbox', local_model_dir=None, cache_dir=None, freeze_voice_encoder=True, freeze_s3gen=True)
117
+ 02/06/2026 06:26:37 - INFO - __main__ - Data parameters DataArguments(language='hi', dataset_dir=None, metadata_file=None, dataset_name='rahul7star/hindi-speech-dataset', dataset_config_name=None, train_split_name='train', eval_split_name='validation', text_column_name='text', audio_column_name='audio', max_text_len=256, max_speech_len=800, audio_prompt_duration_s=3.0, eval_split_size=0.0002, preprocessing_num_workers=None, ignore_verifications=False)
118
+ 02/06/2026 06:26:37 - INFO - __main__ - Loading ChatterboxTTS model...
119
+ 02/06/2026 06:26:37 - INFO - __main__ - Loading model from Hugging Face Hub: ResembleAI/chatterbox
120
  /usr/local/lib/python3.13/site-packages/huggingface_hub/utils/_validators.py:202: UserWarning: The `local_dir_use_symlinks` argument is deprecated and ignored in `hf_hub_download`. Downloading to a local directory does not use symlinks anymore.
121
  warnings.warn(
122
+ 02/06/2026 06:26:37 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/ve.safetensors "HTTP/1.1 302 Found"
123
+ 02/06/2026 06:26:38 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/models/ResembleAI/chatterbox/xet-read-token/05e904af2b5c7f8e482687a9d7336c5c824467d9 "HTTP/1.1 200 OK"
124
+
125
+
126
+ ve.safetensors: 0%| | 0.00/5.70M [00:00<?, ?B/s]
127
+ ve.safetensors: 100%|██████████| 5.70M/5.70M [00:00<00:00, 13.7MB/s]
128
+ 02/06/2026 06:26:38 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/t3_mtl23ls_v2.safetensors "HTTP/1.1 302 Found"
129
+
130
+
131
+ t3_mtl23ls_v2.safetensors: 0%| | 0.00/2.14G [00:00<?, ?B/s]
132
+
133
+ t3_mtl23ls_v2.safetensors: 0%| | 7.60M/2.14G [00:01<06:02, 5.89MB/s]
134
+
135
+ t3_mtl23ls_v2.safetensors: 4%|▎ | 78.7M/2.14G [00:07<03:17, 10.5MB/s]
136
+
137
+ t3_mtl23ls_v2.safetensors: 37%|███▋ | 793M/2.14G [00:08<00:10, 125MB/s]
138
+
139
+ t3_mtl23ls_v2.safetensors: 93%|█████████▎| 2.00G/2.14G [00:09<00:00, 329MB/s]
140
+ t3_mtl23ls_v2.safetensors: 100%|██████████| 2.14G/2.14G [00:09<00:00, 217MB/s]
141
+ 02/06/2026 06:26:48 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/s3gen.safetensors "HTTP/1.1 302 Found"
142
+
143
+
144
+ s3gen.safetensors: 0%| | 0.00/1.06G [00:00<?, ?B/s]
145
+
146
+ s3gen.safetensors: 5%|▍ | 50.7M/1.06G [00:02<00:42, 23.4MB/s]
147
+
148
+ s3gen.safetensors: 87%|████████▋ | 922M/1.06G [00:03<00:00, 357MB/s]
149
+ s3gen.safetensors: 100%|██████████| 1.06G/1.06G [00:03<00:00, 319MB/s]
150
+ 02/06/2026 06:26:51 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/mtl_tokenizer.json "HTTP/1.1 307 Temporary Redirect"
151
+ 02/06/2026 06:26:51 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/mtl_tokenizer.json "HTTP/1.1 200 OK"
152
+ 02/06/2026 06:26:51 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/mtl_tokenizer.json "HTTP/1.1 200 OK"
153
+
154
+
155
+ mtl_tokenizer.json: 0%| | 0.00/68.1k [00:00<?, ?B/s]
156
+ mtl_tokenizer.json: 100%|██████████| 68.1k/68.1k [00:00<00:00, 29.1MB/s]
157
+ 02/06/2026 06:26:51 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/conds.pt "HTTP/1.1 302 Found"
158
+
159
+
160
+ conds.pt: 0%| | 0.00/107k [00:00<?, ?B/s]
161
+ conds.pt: 100%|██████████| 107k/107k [00:00<00:00, 895kB/s]
162
+ 02/06/2026 06:26:51 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/models/ResembleAI/chatterbox/revision/main "HTTP/1.1 200 OK"
163
 
164
 
165
  Downloading (incomplete total...): 0.00B [00:00, ?B/s]
166
 
167
+ Fetching 6 files: 0%| | 0/6 [00:00<?, ?it/s]02/06/2026 06:26:51 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/05e904af2b5c7f8e482687a9d7336c5c824467d9/Cangjie5_TC.json "HTTP/1.1 307 Temporary Redirect"
168
+ 02/06/2026 06:26:51 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/05e904af2b5c7f8e482687a9d7336c5c824467d9/ve.pt "HTTP/1.1 302 Found"
169
+ 02/06/2026 06:26:51 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/05e904af2b5c7f8e482687a9d7336c5c824467d9/conds.pt "HTTP/1.1 302 Found"
170
+ 02/06/2026 06:26:51 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/Cangjie5_TC.json "HTTP/1.1 200 OK"
171
+ 02/06/2026 06:26:51 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/05e904af2b5c7f8e482687a9d7336c5c824467d9/s3gen.pt "HTTP/1.1 302 Found"
172
+ 02/06/2026 06:26:51 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/05e904af2b5c7f8e482687a9d7336c5c824467d9/grapheme_mtl_merged_expanded_v1.json "HTTP/1.1 307 Temporary Redirect"
173
+
174
+
175
+ Downloading (incomplete total...): 0%| | 0.00/1.06G [00:00<?, ?B/s]
176
+
177
+ Downloading (incomplete total...): 0%| | 0.00/1.06G [00:00<?, ?B/s]
178
 
179
+ Downloading (incomplete total...): 0%| | 0.00/1.06G [00:00<?, ?B/s]02/06/2026 06:26:52 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/05e904af2b5c7f8e482687a9d7336c5c824467d9/t3_mtl23ls_v2.safetensors "HTTP/1.1 302 Found"
180
 
181
+
182
+ Downloading (incomplete total...): 0%| | 0.00/3.21G [00:00<?, ?B/s]02/06/2026 06:26:52 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/grapheme_mtl_merged_expanded_v1.json "HTTP/1.1 200 OK"
183
+ 02/06/2026 06:26:52 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/Cangjie5_TC.json "HTTP/1.1 200 OK"
184
+
185
+
186
+ Downloading (incomplete total...): 0%| | 0.00/3.21G [00:00<?, ?B/s]02/06/2026 06:26:52 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/grapheme_mtl_merged_expanded_v1.json "HTTP/1.1 200 OK"
187
+
188
+
189
+ Downloading (incomplete total...): 0%| | 0.00/3.21G [00:00<?, ?B/s]
190
+
191
+ Downloading (incomplete total...): 0%| | 15.4M/3.21G [00:01<05:21, 9.94MB/s]
192
+
193
+ Downloading (incomplete total...): 2%|▏ | 66.5M/3.21G [00:07<06:19, 8.28MB/s]
194
+
195
+ Downloading (incomplete total...): 4%|▍ | 138M/3.21G [00:09<03:11, 16.1MB/s] 
196
+
197
+ Downloading (incomplete total...): 22%|██▏ | 690M/3.21G [00:10<00:23, 105MB/s]
198
+
199
+ Downloading (incomplete total...): 61%|██████ | 1.94G/3.21G [00:11<00:03, 325MB/s]
200
+
201
+ Fetching 6 files: 67%|██████▋ | 4/6 [00:12<00:06, 3.20s/it]
202
+ Fetching 6 files: 100%|██████████| 6/6 [00:12<00:00, 2.13s/it]
203
+
204
+
205
+ Download complete: 100%|██████████| 3.21G/3.21G [00:12<00:00, 325MB/s]
206
+ Download complete: 100%|██████████| 3.21G/3.21G [00:25<00:00, 128MB/s]
207
+ /usr/local/lib/python3.13/site-packages/diffusers/models/lora.py:393: FutureWarning: `LoRACompatibleLinear` is deprecated and will be removed in version 1.0.0. Use of `LoRACompatibleLinear` is deprecated. Please switch to PEFT backend by installing PEFT: `pip install peft`.
208
+ deprecate("LoRACompatibleLinear", "1.0.0", deprecation_message)
209
+ 02/06/2026 06:27:18 - INFO - root - input frame rate=25
210
+ 02/06/2026 06:27:23 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/ResembleAI/chatterbox/resolve/main/Cangjie5_TC.json "HTTP/1.1 307 Temporary Redirect"
211
+ 02/06/2026 06:27:23 - INFO - httpx - HTTP Request: HEAD https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/Cangjie5_TC.json "HTTP/1.1 200 OK"
212
+ 02/06/2026 06:27:23 - INFO - httpx - HTTP Request: GET https://huggingface.co/api/resolve-cache/models/ResembleAI/chatterbox/05e904af2b5c7f8e482687a9d7336c5c824467d9/Cangjie5_TC.json "HTTP/1.1 200 OK"
213
+
214
+
215
+ Cangjie5_TC.json: 0%| | 0.00/1.92M [00:00<?, ?B/s]
216
+ Cangjie5_TC.json: 100%|██████████| 1.92M/1.92M [00:00<00:00, 128MB/s]
217
+ Downloading: "https://github.com/explosion/spacy-pkuseg/releases/download/v0.0.26/spacy_ontonotes.zip" to /root/.pkuseg/spacy_ontonotes.zip
218
+
219
+
220
+ 0%| | 0/34567143 [00:00<?, ?it/s]
221
+ 100%|██████████| 34567143/34567143 [00:00<00:00, 88357656.22it/s]
222
+ Traceback (most recent call last):
223
  File "/app/chatterbox-multilingual-finetuning/src/finetune_t3.py", line 849, in <module>
224
  main()
225
  ~~~~^^
226
  File "/app/chatterbox-multilingual-finetuning/src/finetune_t3.py", line 616, in main
227
  chatterbox_model = ChatterboxMultilingualTTS.from_pretrained(device="cpu")
228
+ File "/app/chatterbox-multilingual-finetuning/src/chatterbox/mtl_tts.py", line 192, in from_pretrained
229
  return cls.from_local(ckpt_dir, device)
230
  ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
231
+ File "/app/chatterbox-multilingual-finetuning/src/chatterbox/mtl_tts.py", line 170, in from_local
232
+ conds = Conditionals.load(conds_path).to(DEVICE)
233
+ File "/app/chatterbox-multilingual-finetuning/src/chatterbox/mtl_tts.py", line 96, in to
234
+ self.t3 = self.t3.to(device)
235
+ ~~~~~~~~~~^^^^^^^^
236
+ TypeError: T3Cond.to() takes 1 positional argument but 2 were given
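
Note on the final traceback: "takes 1 positional argument but 2 were given" on a call like `self.t3.to(device)` is consistent with a `to` method whose parameters after `self` are keyword-only, although the log does not show the definition of `T3Cond.to`. The sketch below is a minimal reproduction under that assumption. `KeywordOnlyCond` and its field are hypothetical stand-ins, not the real `T3Cond`.

```python
from dataclasses import dataclass


@dataclass
class KeywordOnlyCond:
    """Hypothetical stand-in for a conditioning container like T3Cond."""

    value: float = 0.0

    def to(self, *, device=None, dtype=None):
        # The bare * makes device and dtype keyword-only (assumed signature).
        # A real implementation would move tensors; this one returns self.
        return self


cond = KeywordOnlyCond()

# Positional call, as in the traceback: rejected by the keyword-only signature.
try:
    cond.to("cpu")
    positional_error = None
except TypeError as exc:
    positional_error = str(exc)

# Keyword call: accepted.
moved = cond.to(device="cpu")

print(positional_error)
```

If the real method is keyword-only, changing the call site to `self.t3.to(device=device)` would be the smallest fix. That should be checked against the actual source before applying it.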