0xZohar committed
Commit edbbf00 · verified · 1 parent: a7b14bb

Fix: Remove use_safetensors=True for CLIP model loading


Root cause (research findings):
- openai/clip-vit-base-patch32's main branch only ships pytorch_model.bin (see the verification sketch below)
- model.safetensors exists in PR #21 (revision d15b5f2) but is NOT merged to main
- The code forced use_safetensors=True → file not found → runtime failure
- The working version (commit b9fd613) did NOT set this flag
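
As a sanity check, the weight files present on each revision can be listed with huggingface_hub. A minimal sketch (the repo id comes from the links below; everything else is illustrative):

    # List the weight files present on the main branch of the Hub repo.
    from huggingface_hub import list_repo_files

    files = list_repo_files("openai/clip-vit-base-patch32")
    print(sorted(f for f in files if f.endswith((".bin", ".safetensors"))))
    # Per the findings above, main is expected to show pytorch_model.bin only;
    # listing with revision="d15b5f2" would be expected to include model.safetensors.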

Solution:
- Remove use_safetensors=True from CLIPModel.from_pretrained (line 144)
- Remove use_safetensors=True from CLIPProcessor.from_pretrained (line 155)
- Let transformers auto-detect the available weight file (pytorch_model.bin); the corrected call is sketched after this list
- Add documentation comments explaining why pytorch_model.bin is used
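
For reference, a self-contained sketch of the corrected loading path. HF_CACHE_DIR, torch_dtype, and target_device mirror the names in code/clip_retrieval.py; the values assigned here are illustrative assumptions:

    import torch
    from transformers import CLIPModel, CLIPProcessor

    HF_CACHE_DIR = "/data/hf_cache"  # assumption: build-time cache location
    target_device = "cuda" if torch.cuda.is_available() else "cpu"
    torch_dtype = torch.float16 if target_device == "cuda" else torch.float32

    # No use_safetensors=True: let transformers pick up pytorch_model.bin.
    model = CLIPModel.from_pretrained(
        "openai/clip-vit-base-patch32",
        cache_dir=HF_CACHE_DIR,
        torch_dtype=torch_dtype,
        local_files_only=True,  # use the weights preloaded during the build
    ).to(target_device)

    processor = CLIPProcessor.from_pretrained(
        "openai/clip-vit-base-patch32",
        cache_dir=HF_CACHE_DIR,
        local_files_only=True,
    )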

Security considerations (CVE-2025-32434):
- Using the official OpenAI model (high trust, low risk)
- local_files_only=True prevents fetching a maliciously replaced file at runtime
- The model is preloaded during the build in a controlled environment (preload sketch below)
- Same approach as the working version b9fd613
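
One way to do the build-time preload, as a sketch; the cache path is an assumption and must match the HF_CACHE_DIR used at runtime:

    # Download the full repo snapshot into the runtime cache during the image
    # build, so from_pretrained(..., local_files_only=True) works offline.
    from huggingface_hub import snapshot_download

    snapshot_download(
        repo_id="openai/clip-vit-base-patch32",
        cache_dir="/data/hf_cache",
    )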

Official documentation:
- CLIP model files: https://huggingface.co/openai/clip-vit-base-patch32/tree/main
- Safetensors PR: https://huggingface.co/openai/clip-vit-base-patch32/discussions/21
- HF Spaces preload: https://huggingface.co/docs/hub/spaces-config-reference

Changes:
- code/clip_retrieval.py lines 141-157: Removed use_safetensors=True flags
- Added documentation comments explaining the choice

Expected result:
✅ CLIP model loads from preloaded pytorch_model.bin
✅ Text-to-LEGO functionality works correctly (quick smoke test below)
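
A quick smoke test for both points, assuming model and processor were created as in the sketch above:

    # Embed a text query; success implies weights loaded from pytorch_model.bin.
    inputs = processor(text=["a red lego brick"], return_tensors="pt",
                       padding=True).to(target_device)
    with torch.no_grad():
        text_features = model.get_text_features(**inputs)
    print(text_features.shape)  # torch.Size([1, 512]) for ViT-B/32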

Files changed (1):
  1. code/clip_retrieval.py (+5, -2)
code/clip_retrieval.py CHANGED
@@ -141,7 +141,10 @@ class CLIPRetriever:
         model = CLIPModel.from_pretrained(
             self.model_name,
             cache_dir=HF_CACHE_DIR,
-            use_safetensors=True,  # Force safetensors to bypass CVE-2025-32434
+            # NOTE: Not using use_safetensors=True because openai/clip-vit-base-patch32
+            # only has pytorch_model.bin in the main branch (model.safetensors exists in
+            # revision d15b5f2 but is not merged). pytorch_model.bin is safe here: this is
+            # the official OpenAI model, and local_files_only=True prevents malicious replacements.
             torch_dtype=torch_dtype,
             local_files_only=True  # Use pre-downloaded model from build
         ).to(target_device)
@@ -149,7 +152,7 @@
         processor = CLIPProcessor.from_pretrained(
             self.model_name,
             cache_dir=HF_CACHE_DIR,
-            use_safetensors=True,  # Force safetensors to bypass CVE-2025-32434
+            # Processor doesn't have weight files; use_safetensors is not applicable
             local_files_only=True  # Use pre-downloaded model from build
         )