isemmanuelolowe/BerKANT_171M

	---
	language:
	- en
	license: mit
	tags:
	- Kolmogorov-Arnold Network
	- BerKANT
	- KAN
	---
	# BerKANT (training)

	A BERT implementation in which most of the `torch.nn.Linear` layers have been replaced with `KANLinear`.
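
To make the layer swap concrete, here is a minimal sketch of the general idea, not the actual BerKANT code. `ToyKANLinear` is a hypothetical, simplified stand-in for `KANLinear` (a SiLU term plus Gaussian bumps on each input-output edge), and `swap_linear_layers`, `TinyBlock`, and the `skip_names` choice are illustrative assumptions. Which layers this model really leaves as `torch.nn.Linear` is not stated here.

```python
import torch
import torch.nn as nn


class ToyKANLinear(nn.Module):
    """Hypothetical stand-in for KANLinear: every input-output edge gets its own
    learnable 1-D function, built from a SiLU term plus Gaussian bumps."""

    def __init__(self, in_features, out_features, num_basis=8):
        super().__init__()
        self.in_features = in_features
        self.out_features = out_features
        # Fixed, evenly spaced centers for the Gaussian basis functions.
        self.register_buffer("centers", torch.linspace(-2.0, 2.0, num_basis))
        self.base_weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.basis_weight = nn.Parameter(
            torch.randn(out_features, in_features * num_basis) * 0.02
        )

    def forward(self, x):
        base = nn.functional.silu(x) @ self.base_weight.T
        # Shape (..., in_features, num_basis): each basis function applied to each input.
        bumps = torch.exp(-((x.unsqueeze(-1) - self.centers) ** 2))
        return base + bumps.flatten(-2) @ self.basis_weight.T


def swap_linear_layers(module, skip_names=("classifier",)):
    """Recursively replace nn.Linear children with ToyKANLinear, in place.
    Layers whose attribute name is in skip_names are left untouched.
    Returns the number of layers replaced."""
    replaced = 0
    for name, child in list(module.named_children()):
        if isinstance(child, nn.Linear) and name not in skip_names:
            setattr(module, name, ToyKANLinear(child.in_features, child.out_features))
            replaced += 1
        else:
            replaced += swap_linear_layers(child, skip_names)
    return replaced


class TinyBlock(nn.Module):
    """Toy feed-forward block used only to demonstrate the swap."""

    def __init__(self, hidden=16):
        super().__init__()
        self.dense = nn.Linear(hidden, 4 * hidden)
        self.act = nn.GELU()
        self.output = nn.Linear(4 * hidden, hidden)
        self.classifier = nn.Linear(hidden, 2)

    def forward(self, x):
        return self.classifier(self.output(self.act(self.dense(x))))


block = TinyBlock()
num_replaced = swap_linear_layers(block)
out = block(torch.randn(3, 5, 16))
```

In this toy, `dense` and `output` become KAN-style layers while `classifier` stays a plain linear layer, mirroring the "most of" wording above.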

	Currently pretraining on [JackBAI/bert_pretrain_datasets](https://huggingface.co/datasets/JackBAI/bert_pretrain_datasets) on an RTX 4090. Training should be done 5 days from 13/05/2024. Until then :)


	```python
	from transformers import AutoModelForMaskedLM, AutoConfig

	# Define the model path
	model_path = "isemmanuelolowe/BerKANT_171M"

	# Load the configuration
	config = AutoConfig.from_pretrained(model_path)

	# Load the model with that configuration. trust_remote_code=True lets
	# transformers run the custom modeling code shipped in the model repo.
	model = AutoModelForMaskedLM.from_pretrained(model_path, config=config, trust_remote_code=True)
	```
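
As a quick sanity check after loading, one option is to count the model's parameters. The sketch below assumes the "171M" in the model name refers to the parameter count, which is not confirmed here. `count_parameters` is a hypothetical helper, demonstrated on a small `torch.nn.Linear` layer so it runs without downloading anything.

```python
import torch.nn as nn


def count_parameters(module, trainable_only=False):
    """Return the total number of parameter elements in a module."""
    return sum(
        p.numel()
        for p in module.parameters()
        if p.requires_grad or not trainable_only
    )


# Demo on a small layer: 10 * 5 weights + 5 biases.
demo_layer = nn.Linear(10, 5)
demo_count = count_parameters(demo_layer)
```

With the model loaded as above, `count_parameters(model)` gives the figure to compare against the name.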