Pre-training
Is it possible to pre-train this model on new data?
Yes. See the GitHub repo associated with this model: https://github.com/databrickslabs/dolly — you can simply supply a different dataset in the same format to fine-tune the base model differently.
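For reference, the fine-tuning data follows the databricks-dolly-15k record format. A minimal sketch of one record (the field names come from that dataset; the values below are made up for illustration):

```python
# One training record in the databricks-dolly-15k style. Field names match that
# dataset; the example values are invented purely for illustration.
record = {
    "instruction": "What is the capital of France?",
    "context": "",  # optional supporting passage; empty for open-ended questions
    "response": "The capital of France is Paris.",
    "category": "open_qa",
}
```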
Should we set the task type to "classification" if we just want to supply the data as pre-training data?
No, this is not a classification model. It is a text-generation model. See the repo for the full training script and more information.
Is databricks/dolly-v2-12b multilingual?
@ammontenegrod No, it is based on the Pythia model, which is pretrained on English text from The Pile.
You might find that it knows some non-English tokens and works a little, thanks to snippets of non-English text in the training data, but generally no.
Can we use this model for the 'question-answering' task? For example:

```python
import torch
from langchain.llms import HuggingFacePipeline

# Helpers from the Dolly repo (training/generate.py)
from training.generate import InstructionTextGenerationPipeline, load_model_tokenizer_for_generate

model, tokenizer = load_model_tokenizer_for_generate(input_model="databricks/dolly-v2-3b")
llm = HuggingFacePipeline(
    pipeline=InstructionTextGenerationPipeline(
        # Return the full text, because this is what the HuggingFacePipeline expects.
        model=model, tokenizer=tokenizer, return_full_text=True, task="question-answering",
        torch_dtype=torch.bfloat16, max_new_tokens=512, top_p=0.95, top_k=50,
    ),
)
```

This model uses AutoModelForCausalLM. Can we use AutoModelForQuestionAnswering when retraining the model?
(Please use new threads for new questions)
"question-answering" as a pipeline task means extractive QA, and no, this is not that type of model. You can use it to answer questions, but not in the sense of that task.
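If the goal is simply to answer questions, a minimal sketch is to phrase the question as an instruction to the text-generation pipeline from the model card (the question here is illustrative):

```python
import torch
from transformers import pipeline

# Load the instruct pipeline as shown on the model card; trust_remote_code pulls
# in the model repo's InstructionTextGenerationPipeline.
generate_text = pipeline(
    model="databricks/dolly-v2-3b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

# The model generates an answer from the instruction rather than extracting a
# span from a provided context, which is what the "question-answering" task does.
res = generate_text("What is the capital of France?")
print(res[0]["generated_text"])
```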