Maybe some little bugs
#14 · opened by prudant
Hi there! I'm testing your model in a Spanish RAG application and it is by far the best Spanish model I have ever tried; using your model to retrieve and rerank works like a charm. Very nice JOB!
In my tests I have found some strange behaviors:
- the embedding values change slightly when I switch from batch_size = 1 to batch_size > 1
- with batch_size > 1, calling compute_score first works fine, but a subsequent call to encode hangs the process in a deadlock (it only happens in that sequence of calls; a minimal repro is sketched after the patch below)
- I debugged the model code and found that the DataLoader used in the encode method causes the deadlock. I don't know much PyTorch, so I looked at the loop in compute_score: that method does not use a DataLoader, it calls the tokenizer directly. I adapted the encode code (based on the compute_score method):
from:
```python
# original loop in encode: iterates batches from a torch DataLoader
all_dense_embeddings, all_lexical_weights, all_colbert_vec = [], [], []
for batch_data in tqdm(data_loader, desc='encoding', mininterval=10):
    batch_data = batch_data.to(self.device)
    output = self.model(batch_data,
                        return_dense=return_dense,
                        return_sparse=return_sparse,
                        return_colbert=return_colbert_vecs)
```
to:
```python
# patched loop: slice the sentences into batches and call the tokenizer
# directly, as compute_score does, instead of going through a DataLoader
all_dense_embeddings, all_lexical_weights, all_colbert_vec = [], [], []
for start_index in tqdm(range(0, len(sentences), batch_size),
                        desc="Encoding",
                        # the original `< 0` never disabled the bar;
                        # 256 is just a plausible small-input threshold
                        disable=len(sentences) < 256):
    sentences_batch = sentences[start_index:start_index + batch_size]
    queries_inputs = self.tokenizer(sentences_batch,
                                    max_length=max_length,
                                    padding=True,
                                    return_token_type_ids=False,
                                    truncation=True,
                                    return_tensors='pt').to(self.device)
    output = self.model(queries_inputs,
                        return_dense=return_dense,
                        return_sparse=return_sparse,
                        return_colbert=return_colbert_vecs)
```
And the deadlock problem is gone.
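For anyone who wants to reproduce both behaviors, here is a minimal sketch. It assumes the FlagEmbedding `BGEM3FlagModel` API; the model name, sentences, and batch sizes are just example values. It first runs `compute_score` and then `encode` with `batch_size > 1` (the sequence that used to hang), and then measures the small numeric drift against `batch_size = 1`:

```python
# Minimal repro sketch, assuming the FlagEmbedding BGEM3FlagModel API;
# the model name, sentences, and batch sizes are example values.
import numpy as np
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel('BAAI/bge-m3', use_fp16=True)
sentences = ["hola mundo", "¿cómo estás hoy?", "esto es una prueba"] * 4

# 1) compute_score first, then encode with batch_size > 1:
#    before the patch, this exact sequence hung in encode's DataLoader.
model.compute_score([("hola mundo", "hola a todos")])
batched = model.encode(sentences, batch_size=4)['dense_vecs']

# 2) encode again with batch_size = 1 and quantify the drift:
#    padding each batch to a common length changes the float ops slightly,
#    so a tiny difference here is expected, not a correctness bug.
single = model.encode(sentences, batch_size=1)['dense_vecs']
print("max abs diff:", np.abs(batched - single).max())
```

A plausible (unverified here) explanation for the hang is the DataLoader starting worker processes after CUDA has already been initialized by `compute_score`; the patched loop sidesteps worker processes entirely by tokenizing inline.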
I hope my findings help you improve your great JOB!
Best regards
Thanks for sharing!
Thanks for your findings! We have fixed this issue.
Thanks to you :)