---
language:
- en
license: mit
tags:
- Kolmogorov-Arnold Network
- BerKANT
- KAN
---
# BerKANT (training)
A BERT implementation where most of the `torch.nn.Linear` layers have been replaced with `KANLinear`.
Currently pretraining on [JackBAI/bert_pretrain_datasets](https://huggingface.co/datasets/JackBAI/bert_pretrain_datasets) on an RTX 4090. Should be done in about 5 days from 13/05/2024. Until then :)
```python
from transformers import AutoModelForMaskedLM, AutoConfig
# Define the model path
model_path = "isemmanuelolowe/BerKANT_171M"
# Load the configuration
config = AutoConfig.from_pretrained(model_path)
# Load the model with the correct configuration
model = AutoModelForMaskedLM.from_pretrained(model_path, config=config, trust_remote_code=True)
```
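
To give a rough idea of what swapping `torch.nn.Linear` for a KAN-style layer looks like, here is a minimal sketch. It is not the code used to build BerKANT: `ToyKANLinear` and `replace_linear` are hypothetical names made up for this example, and the toy layer uses a fixed Gaussian basis as a simplification rather than whatever the real `KANLinear` does. The `skip` argument is only there to show how some linear layers could be left untouched; which layers BerKANT actually keeps is not stated here.

```python
import torch
from torch import nn


class ToyKANLinear(nn.Module):
    """Hypothetical stand-in for a KAN-style layer (NOT BerKANT's KANLinear).

    Each input feature is expanded on a fixed grid of Gaussian bumps, and the
    output mixes those basis activations with a SiLU "base" branch, so every
    input-output edge gets its own learned 1-D function.
    """

    def __init__(self, in_features, out_features, num_basis=8):
        super().__init__()
        # Evenly spaced basis centers on [-2, 2] (an arbitrary choice).
        self.register_buffer("centers", torch.linspace(-2.0, 2.0, num_basis))
        self.base = nn.Linear(in_features, out_features)
        self.mix = nn.Linear(in_features * num_basis, out_features, bias=False)

    def forward(self, x):
        # (..., in_features) -> (..., in_features, num_basis)
        basis = torch.exp(-((x.unsqueeze(-1) - self.centers) ** 2))
        return self.base(nn.functional.silu(x)) + self.mix(basis.flatten(-2))


def replace_linear(module, skip=()):
    """Recursively swap nn.Linear children for ToyKANLinear, in place.

    Children whose attribute name is in `skip` are left as they are.
    Returns the number of layers replaced.
    """
    replaced = 0
    for name, child in list(module.named_children()):
        if isinstance(child, ToyKANLinear):
            continue  # already converted, do not touch its inner layers
        if isinstance(child, nn.Linear) and name not in skip:
            setattr(module, name, ToyKANLinear(child.in_features, child.out_features))
            replaced += 1
        else:
            replaced += replace_linear(child, skip)
    return replaced


# Usage on a small feed-forward block (shapes chosen only for illustration).
block = nn.Sequential(nn.Linear(16, 32), nn.GELU(), nn.Linear(32, 16))
n_replaced = replace_linear(block)
out = block(torch.randn(4, 16))
```

The swap keeps input and output shapes unchanged, so the rest of the network does not need to know which kind of layer it is calling.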