merve HF staff committed on
Commit
8f3e22e
1 Parent(s): 7fb3550

Add FT tutorial link

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -35,7 +35,7 @@ SmolVLM is a compact open multimodal model that accepts arbitrary sequences of i
 
  SmolVLM can be used for inference on multimodal (image + text) tasks where the input comprises text queries along with one or more images. Text and images can be interleaved arbitrarily, enabling tasks like image captioning, visual question answering, and storytelling based on visual content. The model does not support image generation.
 
- To fine-tune SmolVLM on a specific task, you can follow the fine-tuning tutorial.
+ To fine-tune SmolVLM on a specific task, you can follow the [fine-tuning tutorial](https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb).
  <!-- todo: add link to fine-tuning tutorial -->
 
  ### Technical Summary