bling-1.4b-0.1 / README.md
license: apache-2.0

Model Card for BLING-1b-0.1

BLING-1b-0.1 is the first model release in the BLING ("Best Little Instruction-following No-GPU-required") model series.

BLING models are custom instruct-following, GPT decoder-based models (~1B-2.7B parameters) designed to run effectively on a laptop. BLING models are currently built on top of Pythia (GPT-NeoX architecture) base models and other Apache 2.0-licensed GPT-compatible models, with a primary focus on 'little' models in the 1B, 1.3-1.4B, and 2.7B parameter range. (Note: in our testing, we have seen relatively limited success with instruct-following models below 1B parameters.)

BLING models are fine-tuned on distilled, high-quality custom instruct datasets, targeted at a specific subset of instruct tasks, with the objective of providing good-quality instruct-following capability that can be loaded and run locally on a laptop, entirely without a GPU server.
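
As a rough, back-of-the-envelope illustration of why this parameter range fits on a laptop, the sketch below counts parameters and estimates memory footprint. The repository id is an assumption taken from the model name in this card, and the model is instantiated from its config with random weights purely to count parameters without downloading the full checkpoint.

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Assumed repository id, based on the model name in this card; any GPT-NeoX
# BLING checkpoint can be inspected the same way.
model_id = "llmware/bling-1b-0.1"

config = AutoConfig.from_pretrained(model_id)      # fetches only the small config file
model = AutoModelForCausalLM.from_config(config)   # random weights; used only to count parameters

n_params = sum(p.numel() for p in model.parameters())
print(f"parameters:             {n_params / 1e9:.2f}B")
print(f"approx. fp32 footprint: {n_params * 4 / 1e9:.1f} GB")  # 4 bytes per parameter
print(f"approx. fp16 footprint: {n_params * 2 / 1e9:.1f} GB")  # 2 bytes per parameter
```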

Model Details

Model Description

  • Developed by: llmware
  • Shared by [optional]: Darren Oberst
  • Model type: GPTNeoX instruct-trained decoder
  • Language(s) (NLP): English
  • License: Apache 2.0
  • Finetuned from model [optional]: EleutherAI/Pythia-1b-deduped

Model Sources [optional]

  • Repository: [More Information Needed]
  • Paper [optional]: [More Information Needed]
  • Demo [optional]: [More Information Needed]

Uses

The intended use of BLING models is two-fold:

  1. Provide high-quality instruct-following models that can run on a laptop for local testing. We have found them extremely useful when building a proof-of-concept, or when working with sensitive enterprise data that must be closely guarded, especially in RAG use cases.

  2. Push the state of the art for smaller Instruct-following models in the 1B - 7B range.

Direct Use

BLING is designed for enterprise automation use cases, especially in knowledge-intensive industries such as financial services and legal and regulatory services. BLING is intended to be an experimental series of little instruct models targeted at specific RAG automation tasks involving complex information sources. Rather than trying to be "all things to all people," BLING models focus on a narrower set of instructions more suitable to a ~1B parameter GPT model.

BLING is ideal for rapid prototyping and testing, and for running an end-to-end workflow locally on a laptop without having to send sensitive information over an Internet-based API.
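
To make "RAG automation task" concrete, the sketch below shows the kind of closed-context prompt BLING is aimed at: a source passage extracted from a document, followed by a narrow question to be answered only from that passage. The passage and question are illustrative placeholders, and the <human>/<bot> wrapper is an assumption about the prompt format rather than something confirmed by this card.

```python
# Hypothetical RAG-style prompt construction: the model is asked to answer
# strictly from a supplied source passage, not from open-ended chat.
context_passage = (
    "The Services Agreement has a term of 24 months, commencing on June 1, 2023, "
    "and may be terminated by either party with 30 days written notice."
)
question = "What is the notice period for termination?"

# Assumed prompt wrapper; confirm against the format the checkpoint was trained on.
prompt = f"<human>: {context_passage}\n{question}\n<bot>:"
print(prompt)
```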


Downstream Use [optional]

[More Information Needed]

Out-of-Scope Use

  1. BLING is not designed for 'chat-bot' or 'consumer-oriented' applications.

  2. BLING is not optimal for most production applications, other than simple and highly specific use cases.


Bias, Risks, and Limitations

BLING has not been designed for end-consumer-oriented applications, and there has not been any focus in training on safeguards to mitigate potential bias and safety risks. We would strongly discourage any use of BLING for any 'chatbot' use case.


Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]
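
Pending the official snippet, the minimal sketch below loads the model with the Hugging Face transformers library and runs a single closed-context question on CPU. The repository id is taken from the model name in this card, and the <human>/<bot> prompt wrapper and generation settings are illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id, based on the model name in this card.
model_id = "llmware/bling-1b-0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)  # CPU-friendly
model.eval()

# Assumed prompt wrapper: pair a source passage with a narrow question.
context = "The invoice total is $12,500, due within 45 days of receipt."
question = "When is the invoice due?"
prompt = f"<human>: {context}\n{question}\n<bot>:"

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=100,
        do_sample=False,                      # deterministic output for fact-based answers
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens (the answer), not the prompt.
answer = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(answer)
```

On a typical laptop CPU this runs in seconds for short answers; no GPU or external API call is involved.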

Training Details

Training Data

[More Information Needed]

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

  • Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: [More Information Needed]
  • Hours used: [More Information Needed]
  • Cloud Provider: [More Information Needed]
  • Compute Region: [More Information Needed]
  • Carbon Emitted: [More Information Needed]

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]