stanfordnlp
/

llama8b-nnetnav-wa

Model card Files Files and versions Community

llama8b-nnetnav-wa / README.md

smurty's picture

Update README.md

f99e404 verified 3 months ago

|

3.13 kB

	---
	license: apache-2.0
	metrics:
	- accuracy
	base_model:
	- meta-llama/Llama-3.1-8B-Instruct
	---

	# Model Card for Llama8b-NNetNav-WA

	<!-- Provide a quick summary of what the model is/does. [Optional] -->
	LLama8b-NNetNav-WA is a [LLama-3.1-8B]() model that is instruct-tuned with [NNetNav]() data collected via unsupervised exploration on WebArena websites, with a larger [LLama-3.1-70B]() model.

	Most details about this model along with details can be found in our paper: [NNetNav: Unsupervised Learning of Browser Agents Through Environment Interaction in the Wild](https://arxiv.org/abs/2410.02907).


	![show an example trajectory from NNetNav-WA](TODO)

	## Table of Contents

	- [Model Card for Llama8b-NNetNav-WA](#model-card-for--model_id-)
	- [Table of Contents](#table-of-contents)
	- [Model Details](#model-details)
	- [Model Description](#model-description)
	- [Uses](#uses)
	- [Bias, Risks, and Limitations](#bias-risks-and-limitations)
	- [Training Details](#training-details)
	- [Training Data](#training-data)
	- [Training Procedure](#training-procedure)
	- [Environmental Impact](#environmental-impact)
	- [Technical Specifications [optional]](#technical-specifications-optional)
	- [Model Architecture and Objective](#model-architecture-and-objective)
	- [Compute Infrastructure](#compute-infrastructure)
	- [Hardware](#hardware)
	- [Software](#software)
	- [Citation](#citation)
	- [Model Card Authors [optional]](#model-card-authors-optional)
	- [Model Card Contact](#model-card-contact)
	- [How to Get Started with the Model](#how-to-get-started-with-the-model)


	## Model Details

	### Model Description

	<!-- Provide a longer summary of what this model is/does. -->


	## Uses



	## Bias, Risks, and Limitations

	<!-- This section is meant to convey both technical and sociotechnical limitations. -->

	## How to Get Started with the Model

	```python


	```

	## Training Details

	### Training Data

	<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

	This model was trained on the [NNetnav-WA]() corpus.


	### Training Procedure

	This model was trained for 2 epochs (roughly 4k gradient steps) with a batch size of 128, and a maximum sequence length of 20000.

	### Environmental Impact

	- Hardware Type: 4 H100 GPUs (80G)
	- Hours used: Roughly 2 days.
	- Cloud Provider: Stanford compute.
	- Compute Region: Stanford energy grid.

	### Model Architecture and Objective


	### Compute Infrastructure

	This model was trained on a slurm cluster.

	### Hardware

	This model was trained on 4 H100s.

	### Software

	This model was fine-tuned with [Open-Instruct](https://github.com/allenai/open-instruct/tree/main)

	## Citation

	BibTeX:

	```

	```


	## Model Card Authors [optional]

	<!-- This section provides another layer of transparency and accountability. Whose views is this model card representing? How many voices were included in its construction? Etc. -->

	Shikhar Murty

	## Model Card Contact

	smurty@cs.stanford.edu
	shikhar.murty@gmail.com