### Code to consider including: [flan-alpaca](https://github.com/declare-lab/flan-alpaca)
[text-generation-webui](https://github.com/oobabooga/text-generation-webui)
[minimal-llama](https://github.com/zphang/minimal-llama/)
[finetune GPT-NeoX](https://nn.labml.ai/neox/samples/finetune.html)
[GPTQ-for_LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa/compare/cuda...Digitous:GPTQ-for-GPT-NeoX:main)
[OpenChatKit on multi-GPU](https://github.com/togethercomputer/OpenChatKit/issues/20)
[Non-Causal LLM](https://huggingface.co/docs/transformers/main/en/model_doc/gptj#transformers.GPTJForSequenceClassification)
[OpenChatKit_Offload](https://github.com/togethercomputer/OpenChatKit/commit/148b5745a57a6059231178c41859ecb09164c157)
[Flan-alpaca](https://github.com/declare-lab/flan-alpaca/blob/main/training.py)
### Some open source models: [GPT-NeoXT-Chat-Base-20B](https://huggingface.co/togethercomputer/GPT-NeoXT-Chat-Base-20B/tree/main)
[GPT-NeoX](https://huggingface.co/docs/transformers/model_doc/gpt_neox)
[GPT-NeoX-20B](https://huggingface.co/EleutherAI/gpt-neox-20b)
[Pythia-6.9B](https://huggingface.co/EleutherAI/pythia-6.9b)
[Pythia-12B](https://huggingface.co/EleutherAI/neox-ckpt-pythia-12b)
[Flan-T5-XXL](https://huggingface.co/google/flan-t5-xxl)
[GPT-JT-Moderation-6B](https://huggingface.co/togethercomputer/GPT-JT-Moderation-6B)
[OIG safety models](https://laion.ai/blog/oig-dataset/#safety-models)
[BigScience-mT0](https://huggingface.co/mT0)
[BigScience-XP3](https://huggingface.co/datasets/bigscience/xP3)
[BigScience-Bloomz](https://huggingface.co/bigscience/bloomz)
### Some Creative Commons models that would be interesting to use: [Galactica-120B](https://huggingface.co/facebook/galactica-120b)
[LLaMa-smallint-pt](https://huggingface.co/decapoda-research/llama-smallint-pt)
[LLaMa-65b-4bit](https://huggingface.co/maderix/llama-65b-4bit/tree/main)
### Papers/Repos [Self-improve](https://arxiv.org/abs/2210.11610)
[Coding](https://arxiv.org/abs/2303.17491)
[self-reflection](https://arxiv.org/abs/2303.11366)
[RLHF](https://arxiv.org/abs/2204.05862)
[DERA](https://arxiv.org/abs/2303.17071)
[HAI Index Report 2023](https://aiindex.stanford.edu/report/)
[LLaMa](https://arxiv.org/abs/2302.13971)
[GLM-130B](https://github.com/THUDM/GLM-130B)
[RWKV RNN](https://github.com/BlinkDL/RWKV-LM)
[Toolformer](https://arxiv.org/abs/2302.04761)
[GPTQ](https://github.com/qwopqwop200/GPTQ-for-LLaMa)
[Retro](https://www.deepmind.com/publications/improving-language-models-by-retrieving-from-trillions-of-tokens)
[Clinical_outperforms](https://arxiv.org/abs/2302.08091)
[Chain-Of-Thought](https://github.com/amazon-science/mm-cot)
[scaling law1](https://arxiv.org/abs/2203.15556)
[Big-bench](https://github.com/google/BIG-bench)
[Natural-Instructions](https://github.com/allenai/natural-instructions)
### Other projects: [StackLLaMa](https://huggingface.co/blog/stackllama)
[Alpaca-CoT](https://github.com/PhoebusSi/alpaca-CoT)
[ColossalAIChat](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Chat)
[EasyLM](https://github.com/young-geng/EasyLM.git)
[Koala](https://bair.berkeley.edu/blog/2023/04/03/koala/)
[Vicuna](https://vicuna.lmsys.org/)
[Flan-Alpaca](https://github.com/declare-lab/flan-alpaca)
[FastChat](https://chat.lmsys.org/)
[alpaca-lora](https://github.com/h2oai/alpaca-lora)
[alpaca.http](https://github.com/Nuked88/alpaca.http)
[chatgpt-retrieval-plugin](https://github.com/openai/chatgpt-retrieval-plugin)
[subtl.ai docs search on private docs](https://www.subtl.ai/)
[gretel](https://gretel.ai/)
[alpaca_lora_4bit](https://github.com/johnsmith0031/alpaca_lora_4bit)
[alpaca_lora_4bit_readme](https://github.com/s4rduk4r/alpaca_lora_4bit_readme)
[code alpaca](https://github.com/sahil280114/codealpaca)
[serge](https://github.com/nsarrazin/serge)
[BlinkDL](https://huggingface.co/spaces/BlinkDL/ChatRWKV-gradio)
[RWKV-LM](https://github.com/BlinkDL/RWKV-LM)
[MosaicML](https://github.com/mosaicml/examples#large-language-models-llms)
[OpenAI Plugins](https://openai.com/blog/chatgpt-plugins)
[GPT3.5-Turbo-PGVector](https://github.com/gannonh/gpt3.5-turbo-pgvector)
[LLaMa-Adapter](https://github.com/ZrrSkywalker/LLaMA-Adapter)
[llama-index](https://github.com/jerryjliu/llama_index)
[minimal-llama](https://github.com/zphang/minimal-llama/)
[llama.cpp](https://github.com/ggerganov/llama.cpp)
[ggml](https://github.com/ggerganov/ggml)
[mmap](https://justine.lol/mmap/)
[llama.cpp more](https://til.simonwillison.net/llms/llama-7b-m2)
[TargetedSummarization](https://github.com/helliun/targetedSummarization)
[OpenFlamingo](https://laion.ai/blog/open-flamingo/)
[Auto-GPT](https://github.com/Torantulino/Auto-GPT)
### Apache2/etc. Data [OIG 43M instructions](https://laion.ai/blog/oig-dataset/) [direct HF link](https://huggingface.co/datasets/laion/OIG)
[More on OIG](https://laion.ai/blog/oig-dataset/)
[DataSet Viewer](https://huggingface.co/datasets/viewer/?dataset=squad)
[Anthropic RLHF](https://huggingface.co/datasets/Anthropic/hh-rlhf)
[WebGPT_Comparisons](https://huggingface.co/datasets/openai/webgpt_comparisons)
[Self_instruct](https://github.com/yizhongw/self-instruct)
[20BChatModelData](https://github.com/togethercomputer/OpenDataHub)
### Apache2/MIT/BSD-3 Summarization Data [xsum for Summarization](https://huggingface.co/datasets/xsum)
[Apache2 Summarization](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:apache-2.0&sort=downloads)
[MIT summarization](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:mit&sort=downloads)
[BSD-3 summarization](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:bsd-3-clause&sort=downloads)
[OpenRail](https://huggingface.co/datasets?task_categories=task_categories:summarization&license=license:openrail&sort=downloads)
[Summarize_from_feedback](https://huggingface.co/datasets/openai/summarize_from_feedback)
### Ambiguous License Data [GPT-4-LLM](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM)
[GPT4All](https://huggingface.co/datasets/nomic-ai/gpt4all_prompt_generations)
[LinkGPT4](https://github.com/lm-sys/FastChat/issues/90#issuecomment-1493250773)
[ShareGPT52K](https://huggingface.co/datasets/RyokoAI/ShareGPT52K)
[ShareGPT_Vicuna](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered)
[ChatLogs](https://chatlogs.net/)
[Alpaca-CoT](https://github.com/PhoebusSi/alpaca-CoT)
[LaMini-LM](https://github.com/mbzuai-nlp/LaMini-LM)
### Non-commercial Data [GPT-3 based Alpaca Cleaned](https://github.com/gururise/AlpacaDataCleaned)
### Prompt ENGR [Prompt/P-tuning](https://github.com/huggingface/peft)
[Prompt/P-tuning NeMo/NVIDIA](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/nemo_megatron/prompt_learning.html)
[Info](https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/)
[Info2](https://github.com/dair-ai/Prompt-Engineering-Guide)
[Prompt-Tuning](https://arxiv.org/abs/2104.08691)
[P-tuning v2](https://arxiv.org/abs/2110.07602)
[babyagi](https://github.com/yoheinakajima/babyagi/blob/main/babyagi.py#L97-L134)
[APE](https://www.promptingguide.ai/techniques/ape)
### Validation [Bleu/Rouge/Meteor/Bert-Score](https://arize.com/blog-course/generative-ai-metrics-bleu-score/)
### Generate Hyperparameters [how-to-generate](https://huggingface.co/blog/how-to-generate)
[Notes_on_Transformers Chpt5](https://christianjmills.com/posts/transformers-book-notes/chapter-5/index.html)
[Notes_on_Transformers_Chpt10](https://christianjmills.com/posts/transformers-book-notes/chapter-10/index.html)
### Embeddings [OpenAI Expensive?](https://medium.com/@nils_reimers/openai-gpt-3-text-embeddings-really-a-new-state-of-the-art-in-dense-text-embeddings-6571fe3ec9d9)
[Leaderboard](https://huggingface.co/spaces/mteb/leaderboard)
### Commercial products [OpenAI](https://platform.openai.com/docs/guides/fine-tuning/advanced-usage)
[OpenAI Tokenizer](https://platform.openai.com/tokenizer)
[OpenAI Playground](https://platform.openai.com/playground)
[OpenAI Chat](https://chat.openai.com/chat?)
[OpenAI GPT-4 Chat](https://chat.openai.com/chat?model=gpt-4)
[cohere](https://cohere.io/)
[coherefinetune](https://docs.cohere.ai/reference/finetune)
[DocsBotAI](https://docsbot.ai/)
[Perplexity](https://www.perplexity.ai/)
[VoiceFlow](https://www.voiceflow.com/)
[NLPCloud](https://nlpcloud.com/effectively-using-gpt-j-gpt-neo-gpt-3-alternatives-few-shot-learning.html)
### Multinode inference [FasterTransformer](https://github.com/triton-inference-server/fastertransformer_backend#multi-node-inference)
[Kubernetes Triton](https://developer.nvidia.com/blog/deploying-nvidia-triton-at-scale-with-mig-and-kubernetes/)
### Faster inference [text-generation-inference](https://github.com/huggingface/text-generation-inference)
[Optimum](https://github.com/huggingface/optimum)
### Semi-Open source Semi-Commercial products [OpenAssistant](https://open-assistant.io/)
[OpenAssistant Repo](https://github.com/LAION-AI/Open-Assistant)
[OpenChatKit](https://github.com/togethercomputer/OpenChatKit)
[OpenChatKit2](https://github.com/togethercomputer/OpenDataHub)
[OpenChatKit3](https://www.together.xyz/blog/openchatkit)
[OpenChatKit4](https://github.com/togethercomputer/OpenChatKit/blob/main/training/README.md#arguments)
[OpenChatKitPreview](https://api.together.xyz/open-chat?preview=1)
[langchain](https://python.langchain.com/en/latest/)
[langchain+pinecone](https://www.youtube.com/watch?v=nMniwlGyX-c)
### Q/A docs [HUMATA](https://www.humata.ai/)
[OSSChat](https://osschat.io/)
[NeuralSearchCohere](https://txt.cohere.com/embedding-archives-wikipedia/)
[ue5](https://github.com/bublint/ue5-llama-lora)
### AutoGPT type projects [AgentGPT](https://github.com/reworkd/AgentGPT)
[Self-DEBUG](https://arxiv.org/abs/2304.05128)
[BabyAGI](https://github.com/yoheinakajima/babyagi/)
[AutoPR](https://github.com/irgolic/AutoPR)
### Cloud fine-tune [AWS](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-fine-tune.html)
[AWS2](https://aws.amazon.com/blogs/machine-learning/training-large-language-models-on-amazon-sagemaker-best-practices/)
### Chatbots: [GPT4ALL Chat](https://github.com/nomic-ai/gpt4all-chat)
[GPT4ALL](https://github.com/nomic-ai/gpt4all)
[OASST](https://open-assistant.io/chat)
[FastChat](https://github.com/lm-sys/FastChat)
[Dolly](https://huggingface.co/spaces/HuggingFaceH4/databricks-dolly)
[HF Instructions](https://huggingface.co/spaces/HuggingFaceH4/instruction-model-outputs-filtered)
[DeepSpeed Chat](https://github.com/microsoft/DeepSpeedExamples/tree/master/applications/DeepSpeed-Chat)
[LoraChat](https://github.com/bupticybee/FastLoRAChat)
[Tabby](https://github.com/TabbyML/tabby)
[TalkToModel](https://github.com/dylan-slack/TalkToModel)
[You.com](https://you.com/)
### LangChain or Agent related [Gradio Tools](https://github.com/freddyaboulton/gradio-tools)
[LLM Agents](https://blog.langchain.dev/gradio-llm-agents/)
[Meta Prompt](https://github.com/mbchang/meta-prompt)
[HF Agents](https://huggingface.co/docs/transformers/transformers_agents)
[HF Agents Collab](https://colab.research.google.com/drive/1c7MHD-T1forUPGcC_jlwsIptOzpG3hSj)
[Einstein GPT](https://www.salesforce.com/products/einstein/overview/?d=cta-body-promo-8)
[SMOL-AI](https://github.com/smol-ai/developer)
[Pandas-AI](https://github.com/gventuri/pandas-ai/)
### Summaries [LLMs](https://github.com/Mooler0410/LLMsPracticalGuide)
### Deployment [MLC-LLM](https://github.com/mlc-ai/mlc-llm)
### Evaluations [LMSYS (check for latest blog)](https://lmsys.org/blog/2023-05-25-leaderboard/)
[LMSYS Chatbot Arena](https://chat.lmsys.org/?arena)
[LMSYS Add model](https://github.com/lm-sys/FastChat/blob/main/docs/arena.md#how-to-add-a-new-model)
[NLL](https://blog.gopenai.com/lmflow-benchmark-an-automatic-evaluation-framework-for-open-source-llms-ef5c6f142418)
[HackAPrompt](https://www.aicrowd.com/challenges/hackaprompt-2023/leaderboards)