---
language:
- en
license: mit
tags:
- Kolmogorov-Arnold Network
- Bert
- KAN
---

# BerKANT (training)

A BERT implementation in which most of the `torch.nn.Linear` layers have been replaced with `KANLinear`.

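To illustrate the idea of swapping linear layers for KAN-style layers, here is a minimal sketch. It is not the code used in this repository: `ToyKANLinear` and `replace_linear` are hypothetical names, and the layer uses a simple Gaussian basis as an assumed stand-in for the actual `KANLinear` implementation. The `skip` argument shows one way to leave some linear layers untouched.

```python
import torch
from torch import nn


class ToyKANLinear(nn.Module):
    """Hypothetical stand-in for KANLinear (not the layer used by BerKANT).

    Each input-output edge applies its own learnable univariate function,
    modelled here as a weighted sum of fixed Gaussian bumps.
    """

    def __init__(self, in_features, out_features, num_basis=8, grid_range=(-2.0, 2.0)):
        super().__init__()
        self.in_features = in_features
        self.out_features = out_features
        # Fixed, evenly spaced centres of the Gaussian basis functions.
        self.register_buffer(
            "centres", torch.linspace(grid_range[0], grid_range[1], num_basis)
        )
        self.width = (grid_range[1] - grid_range[0]) / (num_basis - 1)
        # One learnable coefficient per (output, input, basis function) triple.
        self.coeffs = nn.Parameter(
            torch.randn(out_features, in_features, num_basis) * 0.1
        )

    def forward(self, x):
        # x: (..., in_features) -> basis: (..., in_features, num_basis)
        basis = torch.exp(-(((x.unsqueeze(-1) - self.centres) / self.width) ** 2))
        # Sum over inputs and basis functions for every output unit.
        return torch.einsum("...ib,oib->...o", basis, self.coeffs)


def replace_linear(module, skip=()):
    """Recursively swap nn.Linear children for ToyKANLinear, in place.

    Children whose attribute name appears in `skip` are left as nn.Linear.
    """
    for name, child in list(module.named_children()):
        if isinstance(child, nn.Linear) and name not in skip:
            setattr(module, name, ToyKANLinear(child.in_features, child.out_features))
        else:
            replace_linear(child, skip)
    return module


# Usage: convert a small feed-forward block and run a forward pass.
block = nn.Sequential(nn.Linear(16, 32), nn.GELU(), nn.Linear(32, 16))
replace_linear(block)
out = block(torch.randn(4, 10, 16))  # (batch, sequence, hidden)
```

Because the replacement keeps the same `in_features` and `out_features`, the surrounding architecture does not need to change. The parameter count per layer grows by roughly a factor of `num_basis` in this sketch.
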
Currently pretraining on [JackBAI/bert_pretrain_datasets](https://huggingface.co/datasets/JackBAI/bert_pretrain_datasets) on an RTX 4090. Training should be done in 5 days from 13/05/2024. Until then :)

```python
from transformers import AutoModelForMaskedLM, AutoConfig

# Define the model path
model_path = "isemmanuelolowe/BerKANT_171M"

# Load the configuration
config = AutoConfig.from_pretrained(model_path)

# Load the model with the correct configuration
model = AutoModelForMaskedLM.from_pretrained(model_path, config=config, trust_remote_code=True)
```