mairindubh committed on Jun 3

Commit

6c39993

•

1 Parent(s): c3af1bf

Update README.md

README.md +86 -4

@@ -1,10 +1,92 @@
 ---
 title: README
-emoji: 🐨
-colorFrom: indigo
-colorTo: pink
 sdk: static
 pinned: false
 ---
-Edit this `README.md` markdown file to author your organization card.

 ---
 title: README
+emoji: 🐶
+colorFrom: purple
+colorTo: indigo
 sdk: static
 pinned: false
 ---
+### InstructLab
+**Project Name**: InstructLab
+**Description**:
+[InstructLab](https://instructlab.ai) (based on [the Large-scale Alignment for ChatBots technique](https://arxiv.org/abs/2403.01081))
+is an innovative open-source initiative led by Red Hat and IBM.
+The project aims to enhance the capabilities of Large Language Models
+(LLMs) through a community-driven approach that leverages a novel
+taxonomy-based curation process and synthetic data generation. InstructLab
+provides tools for users to engage with and improve LLMs, contributing skills
+and knowledge to the project’s taxonomy repository.
+**Key Features**:
+- **ilab Command-Line Interface (CLI)**: Allows users to interact
+with, train, and fine-tune LLMs using custom taxonomy data. The CLI
+supports various platforms including macOS, Fedora Linux, and Windows.
+- **Synthetic Data Generation**: Enhances LLM training through the
+creation of synthetic datasets.
+- **Taxonomy Repository**: A structured repository where users can
+submit and manage their contributions of skills and knowledge.
+**Core Components**:
+1. **ilab CLI Tool**: Facilitates model interaction, training, and
+data generation.
+2. **Taxonomy Tree**: Organizes skills and knowledge contributions for
+model tuning.
+3. **Community Collaboration**: Encourages open-source contributions,
+including new features, bug fixes, and documentation improvements.
+**Granite and Merlinite Models**:
+- **Merlinite**: Merlinite is instruct-tuned from the Mistral model,
+providing overall better accuracy than Mistral. It is continuously
+improved using user-submitted data from the taxonomy repository,
+incorporating both skills and knowledge.
+- **Granite**: [Granite](https://huggingface.co/ibm-granite/granite-7b-base)
+is a base model developed from scratch by IBM Research, trained on 2 trillion
+tokens. The datasets the model was trained on are openly cited in [its
+Hugging Face model card](https://huggingface.co/ibm-granite/granite-7b-base).
+**Installation and Usage**:
+- Detailed instructions for setting up the `ilab` CLI tool on various
+operating systems are available in the [InstructLab repository](https://github.com/instructlab/instructlab).
+Key steps include installing necessary dependencies, creating a virtual
+environment, and initializing the `ilab` tool.
+- The CLI supports commands for chatting with models, generating
+synthetic data, downloading pre-trained models, and training models
+with user-generated data.
+**Community and Contribution**:
+- InstructLab welcomes contributions from the open-source community.
+Users can submit pull requests to the taxonomy repository, participate
+in discussions, and contribute to ongoing development.
+- The project maintains [a comprehensive guide for contributors](https://github.com/instructlab/community),
+outlining best practices and governance.
+**Getting Started**:
+1. **Install ilab CLI**: Follow the installation instructions specific
+to your operating system.
+2. **Initialize ilab**: Set up the local environment and clone the
+taxonomy repository.
+3. **Contribute**: Create and submit new skills and knowledge to improve LLMs.
+**Repository Links**:
+- [InstructLab Main Repository](https://github.com/instructlab/instructlab)
+- [Taxonomy Repository](https://github.com/instructlab/taxonomy)
+- [Community Repository](https://github.com/instructlab/community)
+**Contact and Support**:
+- Join the InstructLab community on
+[Slack](https://instruct-lab.slack.com) for support and
+collaboration.
+- Refer to the [documentation](https://github.com/instructlab/instructlab)
+for detailed guides and troubleshooting tips.
+**Licenses**:
+- InstructLab is released under the Apache-2.0 license.
+For more details and to get involved, visit the [InstructLab GitHub
+page](https://github.com/instructlab).
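
The README describes the taxonomy as a tree of skill and knowledge contributions. As a rough illustration of what browsing that tree could look like, here is a minimal Python sketch. It assumes a local clone of the taxonomy repository and assumes that each contribution is stored as a file named `qna.yaml` in its own directory; the function name `list_contributions` is hypothetical and is not part of the `ilab` CLI.

```python
from pathlib import Path


def list_contributions(taxonomy_root):
    """Return taxonomy-relative directories that contain a qna.yaml file.

    Assumption: each skill or knowledge contribution is a file named
    qna.yaml inside a directory of the cloned taxonomy repository.
    """
    root = Path(taxonomy_root)
    # rglob walks the whole tree; sorting keeps the result stable.
    return sorted(
        path.parent.relative_to(root).as_posix()
        for path in root.rglob("qna.yaml")
    )
```

Calling `list_contributions("taxonomy")`, where `taxonomy` is a hypothetical path to the clone, would return one entry per contribution directory, which can help a contributor see where a new skill or knowledge file might fit.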