some1nostr
/

Ostrich-70B

Text Generation

Inference Endpoints

Model card Files Files and versions Community

Ostrich-70B / README.md

some1nostr's picture

Update README.md

42bc602 verified 7 months ago

|

1.52 kB



	---
	license: apache-2.0
	---

	![Ostrich-70B](https://primal.b-cdn.net/media-cache?s=o&a=1&u=https%3A%2F%2Fm.primal.net%2FHyFP.png)

	# Model Card for Ostrich


	- Trained with some of the Nostr notes
	- Trained a bit about bitcoin
	- Aligned a bit in these domains:
	- Health
	- Permaculture
	- Phytochemicals
	- Alternative medicine
	- Herbs
	- Nutrition

	Read more about it here:
	https://habla.news/a/naddr1qvzqqqr4gupzp8lvwt2hnw42wu40nec7vw949ys4wgdvums0svs8yhktl8mhlpd3qqxnzde3xsunjwfkxcunwv3jvtnjyc

	## Model Details


	- Finetuned from model: https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct


	## Uses

	Ask any question, compared to other models this may know more about Nostr and Bitcoin.
	You can use llama.cpp to chat with it.
	You can also use llama-cpp-python package to use it in a Python script.

	Llama3 chat template can be used. <\|begin_of_text\|><\|start_header_id\|> ...

	Use repeat penalty of 1.05 or more to avoid repetitions.


	## Warning

	Users (both direct and downstream) should be aware of the risks, biases and limitations of the model.
	The trainer, developer or uploader of this model does not assume any liability. Use it at your own risk.


	## Training Details

	### Training Data

	Nostr related info from web and nostr itself, bitcoin related info. Info on health domain.
	Information that aligns well with humanity is preferred.

	### Training Procedure

	LLaMa-Factory is used to train on 2x3090! fsdp_qlora is the technique.

	The Nostr training took ~30 hours for a dataset of about 20MB.