MukeshSharma (Mukesh Sharma) – Community Activity

#66 opened about 1 year ago by

New activity in bigcode/starcoder over 1 year ago

#49 opened over 1 year ago by

#55 opened over 1 year ago by

New activity in EleutherAI/gpt-j-6b over 1 year ago

#22 opened almost 2 years ago by

#26 opened over 1 year ago by

#4 opened over 1 year ago by

New activity in bigscience/bloom almost 2 years ago

#201 opened almost 2 years ago by

New activity in EleutherAI/gpt-j-6b almost 2 years ago

#14 opened almost 2 years ago by

New activity in togethercomputer/GPT-JT-6B-v1 about 2 years ago

#16 opened about 2 years ago by

#8 opened about 2 years ago by

#15 opened about 2 years ago by

New activity in EleutherAI/gpt-j-6b about 2 years ago

#9 opened about 2 years ago by

New activity in hivemind/gpt-j-6B-8bit over 2 years ago

#11 opened over 2 years ago by

#5 opened over 2 years ago by

#3 opened over 2 years ago by

#4 opened over 2 years ago by

New activity in EleutherAI/gpt-j-6b over 2 years ago

#2 opened over 2 years ago by

#2 opened over 2 years ago by


Failed to import transformers.models.mixtral.modeling_mixtral because of the following error (look up to see its traceback): libcudart.so.12: cannot open shared object file: No such file or directory

How to stop the prediction once the model has generated a sufficient solution for the asked prompt?

Solved

What changes exactly need to be made to use the Horovod library for distributed parallel training on multiple servers?

Can we fine-tune this model on our own dataset, as is done with other models hosted on Hugging Face?

Trying to convert LlaMa weights to HF and running out of RAM, but don't want to buy more RAM?

Is the 14-programming-language dataset uploaded to Hugging Face? Is there any other option to download the data?

How can we add the ability to remember the conversation?

Does the GPT-JT-6B-v1 model have the ability to handle follow-up questions like ChatGPT?

Training code

What is the fine-tuning process of GPT-JT-6B-v1? Any docs available?

GPTJForCausalLM hogs memory - inference only

hivemind/gpt-j-6B-8bit: How can I use multiple GPUs? I tried using accelerate and also torch.nn.DataParallel(), but nothing works

When will the error get resolved? Can't load tokenizer using from_pretrained, please update its configuration

Error at the moment of training

bitsandbytes-cuda111==0.26.0 not found

EleutherAI / gpt-j-6B
