|
--- |
|
license: apache-2.0 |
|
metrics: |
|
- accuracy |
|
base_model: |
|
- meta-llama/Llama-3.1-8B-Instruct |
|
--- |
|
|
|
# Model Card for Llama8b-NNetNav-WA |
|
|
|
<!-- Provide a quick summary of what the model is/does. [Optional] --> |
|
LLama8b-NNetNav-WA is a [LLama-3.1-8B]() model that is instruct-tuned with [NNetNav]() data collected via unsupervised exploration on WebArena websites, with a larger [LLama-3.1-70B]() model. |
|
|
|
Most details about this model along with details can be found in our paper: [NNetNav: Unsupervised Learning of Browser Agents Through Environment Interaction in the Wild](https://arxiv.org/abs/2410.02907). |
|
|
|
|
|
 |
|
|
|
## Table of Contents |
|
|
|
- [Model Card for Llama8b-NNetNav-WA](#model-card-for--model_id-) |
|
- [Table of Contents](#table-of-contents) |
|
- [Model Details](#model-details) |
|
- [Model Description](#model-description) |
|
- [Uses](#uses) |
|
- [Bias, Risks, and Limitations](#bias-risks-and-limitations) |
|
- [Training Details](#training-details) |
|
- [Training Data](#training-data) |
|
- [Training Procedure](#training-procedure) |
|
- [Environmental Impact](#environmental-impact) |
|
- [Technical Specifications [optional]](#technical-specifications-optional) |
|
- [Model Architecture and Objective](#model-architecture-and-objective) |
|
- [Compute Infrastructure](#compute-infrastructure) |
|
- [Hardware](#hardware) |
|
- [Software](#software) |
|
- [Citation](#citation) |
|
- [Model Card Authors [optional]](#model-card-authors-optional) |
|
- [Model Card Contact](#model-card-contact) |
|
- [How to Get Started with the Model](#how-to-get-started-with-the-model) |
|
|
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
<!-- Provide a longer summary of what this model is/does. --> |
|
|
|
|
|
## Uses |
|
|
|
|
|
|
|
## Bias, Risks, and Limitations |
|
|
|
<!-- This section is meant to convey both technical and sociotechnical limitations. --> |
|
|
|
## How to Get Started with the Model |
|
|
|
```python |
|
|
|
|
|
``` |
|
|
|
## Training Details |
|
|
|
### Training Data |
|
|
|
<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. --> |
|
|
|
This model was trained on the [NNetnav-WA]() corpus. |
|
|
|
|
|
### Training Procedure |
|
|
|
This model was trained for 2 epochs (roughly 4k gradient steps) with a batch size of 128, and a maximum sequence length of 20000. |
|
|
|
### Environmental Impact |
|
|
|
- **Hardware Type:** 4 H100 GPUs (80G) |
|
- **Hours used:** Roughly 2 days. |
|
- **Cloud Provider:** Stanford compute. |
|
- **Compute Region:** Stanford energy grid. |
|
|
|
### Model Architecture and Objective |
|
|
|
|
|
### Compute Infrastructure |
|
|
|
This model was trained on a slurm cluster. |
|
|
|
### Hardware |
|
|
|
This model was trained on 4 H100s. |
|
|
|
### Software |
|
|
|
This model was fine-tuned with [Open-Instruct](https://github.com/allenai/open-instruct/tree/main) |
|
|
|
## Citation |
|
|
|
**BibTeX:** |
|
|
|
``` |
|
|
|
``` |
|
|
|
|
|
## Model Card Authors [optional] |
|
|
|
<!-- This section provides another layer of transparency and accountability. Whose views is this model card representing? How many voices were included in its construction? Etc. --> |
|
|
|
Shikhar Murty |
|
|
|
## Model Card Contact |
|
|
|
smurty@cs.stanford.edu |
|
shikhar.murty@gmail.com |