Introduction to the Lexical Analysis Library Functions
1. Word segmentation: segment
method options — Chinese: jieba_ac, jieba_all, hanlp, thulac, snownlp, ltp; English: spacy, nltk, split
2. Stemming: stem
method options: porter, lancester, snowball
3. Lemmatization: lemmatize_text
method options: spacy, nltk
4. Part-of-speech tagging: tagging
method options — Chinese: jieba, thulac, hanlp, npir, snownlp; English: nltk, spacy
5. Named entity recognition: named_entity_recognition
method options — Chinese: LTP (Nh person, Ni organization, Ns place name), Hanlp, spacy_ch; English: spacy_en, nltk
6. Stopword removal: remove_stopword
7. Word frequency counting: count_word_frequency
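Conceptually, the word-frequency step (item 7) tallies tokens after segmentation. A minimal sketch with the standard library, assuming segmentation has already produced a token list (the function name below is illustrative, not this library's API):

```python
from collections import Counter

def count_word_frequency(tokens):
    # Tally how often each token appears in the segmented text
    return Counter(tokens)

# English segmentation via the simple 'split' method listed above
tokens = "this is a test this is only a test".split()
freq = count_word_frequency(tokens)
print(freq.most_common(3))
```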
Fill in function with the desired feature and method with the corresponding method parameter listed above.
The required libraries must be installed in advance; they are listed in the require file.
In addition, run python -m spacy download zh_core_web_sm and python -m spacy download en_core_web_sm to install zh_core_web_sm==3.7.0 and en_core_web_sm==3.7.1.
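Assuming the dependency list is a pip-style requirements file (the filename requirements.txt below is an assumption; use whatever the repository's require file is actually named), setup looks like:

```shell
# Install the third-party libraries used by the lexical analysis functions
# (requirements.txt is an assumed filename)
pip install -r requirements.txt
# Install the spaCy models pinned by this project
python -m spacy download zh_core_web_sm
python -m spacy download en_core_web_sm
```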
Quick Start
A usage example for the lexical analysis library functions:
```python
from huggingface_hub import hf_hub_download
import importlib.util

def nlp(content, function, method):
    # Replace with your Hugging Face username and repository name
    repo_id = "epetery/my-new-model"
    filename = "divide_corpus.py"
    stopwords_filename = "stopwords-master/baidu_stopwords.txt"
    # Download the files from the Hub
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    stopwords_file_path = hf_hub_download(repo_id=repo_id, filename=stopwords_filename)
    # Import the downloaded file as a module
    spec = importlib.util.spec_from_file_location("divide_corpus", file_path)
    divide_corpus = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(divide_corpus)
    divide_corpus.STOPWORDS_FILE_PATH = stopwords_file_path
    # Use the class and methods defined in the module
    text_divider = getattr(divide_corpus, "NLP_Class")(content)
    if function != 'count_word_frequency':
        divided_text = getattr(text_divider, function)(method=method)
    else:
        # Word frequency counting requires segmented text as input
        seg_text = getattr(text_divider, 'segment')(method=method)
        freq_counter = getattr(divide_corpus, "NLP_Class")(seg_text)
        divided_text = freq_counter.count_word_frequency()
    return divided_text

# Call the wrapper function
text = "This is a test text."
divided_text = nlp(text, 'remove_stopword', 'nltk')
print(divided_text)
```
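The dynamic-import pattern used inside nlp() — load a .py file by path, execute it as a module, then set attributes on it — can be exercised on its own with the standard library. This sketch writes a throwaway module to a temporary file instead of downloading divide_corpus.py:

```python
import importlib.util
import os
import tempfile

# Write a stand-in module to disk, mimicking the downloaded divide_corpus.py
source = "def greet(name):\n    return 'hello ' + name\n"
with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as f:
    f.write(source)
    module_path = f.name

# Same three-step load as in nlp(): spec -> module -> exec
spec = importlib.util.spec_from_file_location("demo_module", module_path)
demo_module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(demo_module)

# Attributes can be set on the loaded module object from outside,
# which is how nlp() injects STOPWORDS_FILE_PATH
demo_module.STOPWORDS_FILE_PATH = "/path/to/stopwords.txt"

print(demo_module.greet("world"))  # hello world
os.remove(module_path)
```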