fasttext-jp-embedding

This model is experimental.

Pretrained fastText word vectors for Japanese.

Usage

Google Colaboratory Example

! apt install aptitude swig > /dev/null 
! aptitude install mecab libmecab-dev mecab-ipadic-utf8 git make curl xz-utils file -y > /dev/null 
! pip install transformers torch mecab-python3 torchtyping > /dev/null 
! ln -s /etc/mecabrc /usr/local/etc/mecabrc
from transformers import pipeline
import pandas as pd
import numpy as np 

text = "海賊王におれはなる"

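# Named `pipe` so the imported `pipeline` function is not shadowed if this cell is re-run.
pipe = pipeline("feature-extraction", model="paulhindemith/fasttext-jp-embedding", revision="2022.11.13", trust_remote_code=True)
# One column per token; each column holds that token's embedding.
pd.DataFrame(np.array(pipe(text)).T, columns=pipe.tokenizer.tokenize(text))
# Restrict the output to verbs (動詞), nouns (名詞), and adjectives (形容詞).
pipe.tokenizer.target_hinshi = ["動詞", "名詞", "形容詞"]
pd.DataFrame(np.array(pipe(text)).T, columns=pipe.tokenizer.tokenize(text))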
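
A common next step with per-token embeddings is to compare tokens by cosine similarity. The following is a minimal sketch and not part of this model's code: `cosine_similarity_matrix` is a hypothetical helper, and the small `vectors` array is made-up data standing in for the feature-extraction output, which is assumed here to have shape (n_tokens, dim) as the usage example suggests.

```python
import numpy as np


def cosine_similarity_matrix(vectors: np.ndarray) -> np.ndarray:
    """Return pairwise cosine similarities for row vectors of shape (n_tokens, dim)."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    # Clip the norms so an all-zero vector does not cause a division by zero.
    unit = vectors / np.clip(norms, 1e-12, None)
    return unit @ unit.T


# Made-up 3-dimensional vectors for illustration only; real fastText vectors
# have many more dimensions and would come from the pipeline output.
tokens = ["海賊", "王", "なる"]
vectors = np.array([
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 2.0],
])

similarity = cosine_similarity_matrix(vectors)
for token, row in zip(tokens, similarity):
    print(token, np.round(row, 3))
```

To try this on real embeddings, replace `vectors` with the array built from the pipeline output and `tokens` with the tokenizer output for the same text.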

License

This model uses the following pretrained vectors.
Name: fastText
Credit: https://fasttext.cc/
License: Creative Commons Attribution-Share-Alike License 3.0
Link: https://dl.fbaipublicfiles.com/fasttext/vectors-wiki/wiki.ja.vec
