Error while running it with Hugging Face Transformers

#3
by zcpp - opened

Error type 1 - installing transformers from the code in your forked GitHub repository

/CPMBee-fork-transformer/transformers/src/transformers/models/cpmbee/modeling_cpmbee.py:572 in forward

  569         self.inv_freq = inv_freq.to(config.torch_dtype)
  570
  571     def forward(self, x: torch.Tensor, x_pos: torch.Tensor):
❱ 572         inv_freq = self.inv_freq.to(device=x.device, dtype=self.dtype)
  573
  574         x_pos = x_pos * self.distance_scale
  575         freqs = x_pos[..., None].to(self.dtype) * inv_freq[None, :]  # (..., dim/2)

RuntimeError: CUDA error: device-side assert triggered

Error type 2 - using the code in this Hugging Face repo with trust_remote_code=True

modeling_cpmbee.py:787 in forward

  784                 + segment_rel_offset[:, :, None],
  785                 ~(
  786                     (sample_ids[:, :, None] == sample_ids[:, None, :])
❱ 787                     & (span[:, None, :] == span[:, :, None])
  788                 ),  # not in the same span or sample
  789                 0,  # avoid torch.gather overflow
  790             ).view(batch, seqlen * seqlen)

TypeError: 'NoneType' object is not subscriptable

OpenBMB org

You should not use the forked GitHub code; please follow the example in the model card.
For error type 2, please use model.generate(). If you call model.forward() directly, you should first preprocess the data with tokenizer.prepare_for_finetune().
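The recommended path can be sketched roughly as below. This is an illustration, not the official model-card snippet: the checkpoint id, the structured-dict input format, and the exact generate() call pattern are assumptions based on common CPM-Bee usage and may differ from the actual model card.

```python
def run_inference():
    # Imports live inside the function so the sketch can be defined
    # without transformers installed or the checkpoint downloaded.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Hypothetical repo id -- substitute the one given in the model card.
    model_name = "openbmb/cpm-bee-10b"

    # trust_remote_code=True loads the CpmBee classes shipped with this
    # repo, rather than code from a forked transformers install
    # (which led to error type 1 above).
    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

    # CPM-Bee takes structured dict input; "<ans>" marks the slot to fill.
    data = {"input": "NLP is", "<ans>": ""}
    return model.generate(data, tokenizer)
```

Calling run_inference() requires the checkpoint to be reachable. For the fine-tuning path, run your batch through tokenizer.prepare_for_finetune() before calling model.forward(), as noted above.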
