ceyda posted an update Dec 28, 2023
A mandatory hello world as a first post 🤗

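from tokenizers import Tokenizer

# Load the pretrained BERT (uncased) tokenizer from the Hugging Face Hub
tokenizer = Tokenizer.from_pretrained("bert-base-uncased")
# Encode a sentence and inspect the resulting tokens
tokenizer.encode("Hello world~ Hello 2024").tokens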

['[CLS]', 'hello', 'world', '~', 'hello', '202', '##4', '[SEP]']
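
For anyone wondering why "2024" comes out as '202' and '##4': BERT's tokenizer uses WordPiece, which splits each word greedily into the longest pieces it can find in its vocabulary and marks continuation pieces with "##". Below is a rough sketch of that idea in plain Python. The `wordpiece` helper and `toy_vocab` are made up for illustration (the real vocabulary has roughly 30k entries), the lowercasing and punctuation splitting are done by hand here, and the [CLS]/[SEP] tokens added by the real tokenizer are left out.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of a single, already lowercased word."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink from the right
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # continuation pieces carry the ## prefix
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            # No piece fits at this position: the whole word becomes unknown
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary, just enough to cover the example sentence
toy_vocab = {"hello", "world", "~", "202", "##4"}

# Pre-tokenized by hand: lowercased and split on whitespace/punctuation
words = ["hello", "world", "~", "hello", "2024"]
tokens = [piece for word in words for piece in wordpiece(word, toy_vocab)]
print(tokens)
```

With this toy vocabulary, "2024" is not a known piece, so the matcher falls back to "202" and then picks up "##4" for the rest, which mirrors the split in the output above.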