yassineafr committed on
Commit
c0bbf8c
1 Parent(s): 21d1e2a

Enabling PEFT for jais-13b


When trying to fine-tune jais-13b using QLoRA, I encountered this error:
![Screenshot from 2024-05-16 14-23-08.png](https://cdn-uploads.huggingface.co/production/uploads/64ad9abc80f308a395e8b9c6/GlOZbAHHV33NI4l01OSKs.png)
The error indicates that `hidden_states` is a leaf variable (a tensor that is not the result of an operation and has `requires_grad=True`), so PyTorch does not allow in-place operations on it, such as this one:
hidden_states *= torch.tensor(float(self.embeddings_scale), dtype=hidden_states.dtype, device=hidden_states.device )
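
The failure mode can be reproduced outside the model. The following is a minimal sketch, not the JAIS code itself: the helper names `scale_in_place` and `scale_out_of_place` and the tensor shape are hypothetical, and a plain leaf tensor with `requires_grad=True` stands in for what `hidden_states` presumably becomes under PEFT. It shows that the in-place multiply raises a `RuntimeError`, while the out-of-place form returns a new tensor and lets gradients flow.

```python
import torch


def scale_in_place(hidden_states: torch.Tensor, scale: float) -> torch.Tensor:
    # Mirrors the original pattern: modifies the input tensor itself.
    hidden_states *= torch.tensor(
        scale, dtype=hidden_states.dtype, device=hidden_states.device
    )
    return hidden_states


def scale_out_of_place(hidden_states: torch.Tensor, scale: float) -> torch.Tensor:
    # Mirrors the patched pattern: builds a new tensor, input is untouched.
    factor = torch.tensor(
        scale, dtype=hidden_states.dtype, device=hidden_states.device
    )
    return hidden_states * factor


# Hypothetical stand-in for hidden_states: a leaf tensor that requires grad.
leaf = torch.ones(2, 3, requires_grad=True)

try:
    scale_in_place(leaf, 2.0)
    in_place_failed = False
except RuntimeError:
    # PyTorch rejects in-place ops on leaf tensors that require grad.
    in_place_failed = True

scaled = scale_out_of_place(leaf, 2.0)
scaled.sum().backward()  # gradients reach the leaf through the new tensor
```

After this runs, `leaf.is_leaf` is still true, `scaled` is a non-leaf result with a `grad_fn`, and `leaf.grad` holds the scale factor in every position.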

Files changed (1)
  1. modeling_jais.py +7 -4
modeling_jais.py CHANGED
@@ -866,10 +866,13 @@ class JAISModel(JAISPreTrainedModel):
             hidden_states = inputs_embeds + position_embeds
         else:
             hidden_states = inputs_embeds
-        hidden_states *= torch.tensor(
-            float(self.embeddings_scale), dtype=hidden_states.dtype, device=hidden_states.device
-        )
-
+
+        # hidden_states *= torch.tensor(
+        #     float(self.embeddings_scale), dtype=hidden_states.dtype, device=hidden_states.device
+        # )
+        aux_hidden = torch.tensor(float(self.embeddings_scale), dtype=hidden_states.dtype, device=hidden_states.device)
+        hidden_states = hidden_states * aux_hidden
+
         if token_type_ids is not None:
             token_type_embeds = self.wte(token_type_ids)
             hidden_states = hidden_states + token_type_embeds