Testing notes and recommendations
Could you push safetensors weights to the repos, instead of (or in addition to) the PyTorch .bin files? It greatly speeds up downloads for users. You just need to add safe_serialization=True when you push to the Hub.
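For example (a minimal sketch; the repo id here is illustrative and it assumes you are logged in to the Hub):

from transformers import AutoModelForCausalLM

# Reload the existing checkpoint.
model = AutoModelForCausalLM.from_pretrained("Yukang/LongAlpaca-13B")

# Either save locally as safetensors...
model.save_pretrained("LongAlpaca-13B-safetensors", safe_serialization=True)
# ...or push directly back to the Hub.
model.push_to_hub("Yukang/LongAlpaca-13B", safe_serialization=True)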
There seems to be an issue with the tokenizer in this repo; it doesn't load cleanly:
ValueError: Couldn't instantiate the backend tokenizer from one of:
(1) a `tokenizers` library serialization file,
(2) a slow tokenizer instance to convert or
(3) an equivalent slow tokenizer class to instantiate and convert.
You need to have sentencepiece installed to convert a slow tokenizer to a fast one.
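A possible workaround sketch, following the error message's own hint about sentencepiece (the repo id is a placeholder):

# pip install sentencepiece
from transformers import AutoTokenizer

# With sentencepiece installed, the slow (SentencePiece) tokenizer can be
# loaded directly, sidestepping the fast-tokenizer conversion entirely.
tokenizer = AutoTokenizer.from_pretrained("Yukang/LongAlpaca-13B", use_fast=False)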
- The model is much, much better than the previous SFT model for conversation. However, I tested the 13B model for passkey retrieval and it fails at 2k, 4k, and 8k lengths. The base Llama model succeeds at 2k, so it looks like performance there has degraded:
Length of input is 1915
[INST] we are here live in passkey-u89dsnakj8 Omaha Nebraska good morning everybody I'm Becky quick along with Mike santoli and in just 30 min...
...I guess we'll dig into that a little deeper today and we've got some numbers that you've been going through be very interested to hear the color on th
Respond with the passkey contained within the above text. [/INST]
Berkshire Hathaway's first quarter earnings were released moments ago, and the results are quite intriguing. The company reported earnings of $35.8 billion, a 13% increas...
The response is sensible; it just doesn't answer the request for the passkey.
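For context, the test prompts were built roughly like this (a hypothetical sketch of a standard passkey-retrieval test, not the exact script; the filler text, key format, and length heuristic are illustrative):

import random

def make_passkey_prompt(approx_words: int):
    """Bury a random passkey inside filler text of roughly the desired length."""
    passkey = f"passkey-{random.getrandbits(32):08x}"
    sentence = "The grass is green. The sky is blue. The sun is yellow. "
    filler = sentence * max(1, approx_words // 12)  # ~12 words per repeat
    cut = random.randint(0, len(filler))
    context = filler[:cut] + f" Remember this: {passkey} is the passkey. " + filler[cut:]
    prompt = f"[INST] {context}\nRespond with the passkey contained within the above text. [/INST]"
    return prompt, passkey

# e.g. a roughly 2k-word context:
prompt, expected = make_passkey_prompt(2000)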
Hi,
Thanks for your messages.
Thanks for the reminder. I will do so tomorrow.
It is strange. I can load this model smoothly on my machine. Would you please share your environment details, e.g. the output of
conda list
? I will test it in your environment.

This is an interesting observation. During SFT, we previously trained the models on long-context data only, and the resulting model showed a serious degradation on short QA. After that, we mixed the training data with short QA from Alpaca. The resulting model obtains better performance on short QA and is also good at long QA, but it is then not as good at retrieval. This is somewhat of a trade-off.
Models without SFT are good at retrieval, because they are trained consistently on long-context data, e.g. 32768 tokens. For example, our 7B 32k model: https://huggingface.co/Yukang/Llama-2-7b-longlora-32k-ft
Regards,
Yukang Chen
Hi,
I have uploaded the safetensors files.
Regards,
Yukang Chen
Thanks Yukang for the safetensors. I saved a lot of time loading today. BTW, it may be better to push bf16 weights, since I don't know if anyone uses float32 and it is twice the size.
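Something like this would do it (a sketch; the repo id is illustrative and it assumes write access to the repo):

import torch
from transformers import AutoModelForCausalLM

# Load the checkpoint cast down to bfloat16, then re-push as safetensors;
# this roughly halves the download size versus float32.
model = AutoModelForCausalLM.from_pretrained("Yukang/LongAlpaca-13B", torch_dtype=torch.bfloat16)
model.push_to_hub("Yukang/LongAlpaca-13B", safe_serialization=True)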
Regarding point 2 (the tokenizer issue), here is 'pip list':
Package Version
--------------------------------- -------------
accelerate 0.24.0.dev0
aiohttp 3.8.6
aiosignal 1.3.1
anyio 3.7.1
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
arrow 1.2.3
asttokens 2.2.1
async-lru 2.0.3
async-timeout 4.0.3
attrs 23.1.0
Babel 2.12.1
backcall 0.2.0
beautifulsoup4 4.12.2
bitsandbytes 0.41.1
bleach 6.0.0
blinker 1.4
certifi 2022.12.7
cffi 1.15.1
charset-normalizer 2.1.1
cmake 3.25.0
comm 0.1.3
cryptography 3.4.8
datasets 2.14.5
dbus-python 1.2.18
debugpy 1.6.7
decorator 5.1.1
defusedxml 0.7.1
dill 0.3.7
distro 1.7.0
einops 0.7.0
exceptiongroup 1.1.2
executing 1.2.0
fastjsonschema 2.17.1
filelock 3.9.0
fqdn 1.5.1
frozenlist 1.4.0
fsspec 2023.6.0
httplib2 0.20.2
huggingface-hub 0.17.3
idna 3.4
importlib-metadata 4.6.4
ipykernel 6.24.0
ipython 8.14.0
ipython-genutils 0.2.0
ipywidgets 8.0.7
isoduration 20.11.0
jedi 0.18.2
jeepney 0.7.1
Jinja2 3.1.2
json5 0.9.14
jsonpointer 2.4
jsonschema 4.18.0
jsonschema-specifications 2023.6.1
jupyter-archive 3.3.4
jupyter_client 8.3.0
jupyter-contrib-core 0.4.2
jupyter-contrib-nbextensions 0.7.0
jupyter_core 5.3.1
jupyter-events 0.6.3
jupyter-highlight-selected-word 0.2.0
jupyter-lsp 2.2.0
jupyter-nbextensions-configurator 0.6.3
jupyter_server 2.7.0
jupyter_server_terminals 0.4.4
jupyterlab 4.0.2
jupyterlab-pygments 0.2.2
jupyterlab_server 2.23.0
jupyterlab-widgets 3.0.8
keyring 23.5.0
launchpadlib 1.10.16
lazr.restfulclient 0.14.4
lazr.uri 1.0.6
lit 15.0.7
lxml 4.9.3
MarkupSafe 2.1.2
matplotlib-inline 0.1.6
mistune 3.0.1
more-itertools 8.10.0
mpmath 1.2.1
multidict 6.0.4
multiprocess 0.70.15
nbclassic 1.0.0
nbclient 0.8.0
nbconvert 7.6.0
nbformat 5.9.1
nest-asyncio 1.5.6
networkx 3.0
notebook 6.5.4
notebook_shim 0.2.3
numpy 1.24.1
nvidia-cublas-cu11 11.10.3.66
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu11 8.5.0.96
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu11 10.9.0.58
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu11 10.2.10.91
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu11 11.7.4.91
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu11 2.14.3
nvidia-nccl-cu12 2.18.1
nvidia-nvjitlink-cu12 12.2.140
nvidia-nvtx-cu11 11.7.91
nvidia-nvtx-cu12 12.1.105
oauthlib 3.2.0
overrides 7.3.1
packaging 23.1
pandas 2.1.1
pandocfilters 1.5.0
parso 0.8.3
peft 0.6.0.dev0
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.3.0
pip 23.2.1
platformdirs 3.8.1
prometheus-client 0.17.0
prompt-toolkit 3.0.39
psutil 5.9.5
ptyprocess 0.7.0
pure-eval 0.2.2
pyarrow 13.0.0
pycparser 2.21
Pygments 2.15.1
PyGObject 3.42.1
PyJWT 2.3.0
pyparsing 2.4.7
python-apt 2.4.0+ubuntu1
python-dateutil 2.8.2
python-json-logger 2.0.7
pytz 2023.3.post1
PyYAML 6.0
pyzmq 25.1.0
referencing 0.29.1
regex 2023.10.3
requests 2.28.1
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rpds-py 0.8.10
safetensors 0.4.0
scipy 1.11.3
SecretStorage 3.3.1
Send2Trash 1.8.2
setuptools 68.0.0
six 1.16.0
sniffio 1.3.0
soupsieve 2.4.1
stack-data 0.6.2
sympy 1.11.1
terminado 0.17.1
tinycss2 1.2.1
tokenizers 0.14.1
tomli 2.0.1
torch 2.0.1
torchaudio 2.0.2+cu118
torchvision 0.15.2+cu118
tornado 6.3.2
tqdm 4.66.1
traitlets 5.9.0
transformers 4.35.0.dev0
triton 2.0.0
typing_extensions 4.4.0
tzdata 2023.3
uri-template 1.3.0
urllib3 1.26.13
wadllib 1.3.6
wcwidth 0.2.6
webcolors 1.13
webencodings 0.5.1
websocket-client 1.6.1
wheel 0.40.0
widgetsnbextension 4.0.8
xformers 0.0.22
xxhash 3.4.1
yarl 1.9.2
zipp 1.0.0
I'm using a Jupyter notebook.
Hi,
We have updated our LongAlpaca models from Alpaca prompting to Llama 2 prompting, which is consistent with their pre-trained models. Please refer to the updated models and the inference code below, which use the Llama 2 prompting.
In addition, we have updated our requirements list. We have tested it, and it should work for both training and inference.
https://github.com/dvlab-research/LongLoRA/blob/main/requirements.txt
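For a fresh environment, that is (assuming git and pip are available):

git clone https://github.com/dvlab-research/LongLoRA.git
cd LongLoRA
pip install -r requirements.txt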
Regards,
Yukang Chen
Thanks Yukang.
Does this mean that all data used to train the LongAlpaca models was generated with Llama 2 and thus carries a Llama 2-style license (i.e. it is no longer restricted by having used OpenAI conversations)?
Thanks
Hi,
The training data is not generated by Llama 2; it was collected by us. We just use the Llama 2 prompting format:
"<s>[INST] <<SYS>>\n"
"You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\n\n"
"If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.\n"
"<</SYS>> \n\n {instruction} [/INST]"
Regards,
Yukang Chen
Thanks! That helps clarify.
I guess the Alpaca data is covered by: "The dataset is CC BY NC 4.0 (allowing only non-commercial use) and models trained using the dataset should not be used outside of research purposes."
So that means this model still wouldn't be available for commercial use?
Yes, I think so.