metadata
language:
  - en
license: mit
tags:
  - Kolmogorov-Arnold Network
  - Bert
  - KAN

BerKANT (training)

A BERT implementation in which most of the torch.nn.Linear layers have been replaced with KANLinear.
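To make the difference between the two layer types concrete, here is a minimal, dependency-free sketch. It is not this repository's code: the README does not say which KANLinear implementation is used, so the sketch assumes the common Kolmogorov-Arnold formulation in which every input-output edge carries its own learnable univariate function (a SiLU base term plus a spline term). For brevity it uses degree-1 "hat" basis functions where real implementations typically use higher-order B-splines, and all names (`hat_basis`, `linear_forward`, `kan_forward`, `base_weight`, `spline_coef`) are hypothetical.

```python
import math


def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def hat_basis(x, grid):
    """Degree-1 B-spline (hat) basis values at x on a uniform, sorted grid."""
    h = grid[1] - grid[0]
    return [max(0.0, 1.0 - abs(x - g) / h) for g in grid]


def linear_forward(x, weight, bias):
    """Ordinary linear layer: y_j = sum_i weight[j][i] * x_i + bias[j]."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def kan_forward(x, base_weight, spline_coef, grid):
    """KAN-style layer (assumed formulation): y_j = sum_i phi_ji(x_i), with
    phi_ji(x) = base_weight[j][i] * silu(x) + sum_k spline_coef[j][i][k] * B_k(x).
    """
    out = []
    for j in range(len(base_weight)):
        total = 0.0
        for i, xi in enumerate(x):
            basis = hat_basis(xi, grid)
            spline = sum(c * b for c, b in zip(spline_coef[j][i], basis))
            total += base_weight[j][i] * silu(xi) + spline
        out.append(total)
    return out


# Toy example: 2 inputs, 1 output, 3 grid points.
grid = [-1.0, 0.0, 1.0]
x = [0.0, 1.0]
base_weight = [[1.0, 0.0]]
spline_coef = [[[0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]]
print(kan_forward(x, base_weight, spline_coef, grid))  # [5.0]
```

A linear layer learns one scalar per edge, while the KAN-style layer learns a small set of coefficients per edge that together define a curve, which is why swapping the layers changes the parameter count for the same hidden sizes.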

Currently pretraining on JackBAI/bert_pretrain_datasets on an RTX 4090. Training should be done 5 days from 13/05/2024. Until then :)

from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_path = 'isemmanuelolowe/BerKANT_171M'
# trust_remote_code=True allows transformers to run the custom model code shipped in the repo
model = AutoModelForSequenceClassification.from_pretrained(model_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path)