Spaces: Sleeping

AlanTsai-0329 committed
Commit 92c0979 • 1 Parent(s): da121d0

Upload 25 files
Browse files
- .gitattributes +1 -0
- Home.py +12 -0
- pages/1_Dashboard.py +122 -0
- pages/Control/Controls.py +0 -0
- pages/Model/Load_Model.py +134 -0
- pages/Model/__pycache__/Load_Model.cpython-39.pyc +0 -0
- pages/model_param/board_classification_model/config.json +47 -0
- pages/model_param/board_classification_model/pytorch_model.bin +3 -0
- pages/model_param/board_classification_model/special_tokens_map.json +7 -0
- pages/model_param/board_classification_model/tokenizer.json +0 -0
- pages/model_param/board_classification_model/tokenizer_config.json +13 -0
- pages/model_param/board_classification_model/training_args.bin +3 -0
- pages/model_param/board_classification_model/vocab.txt +0 -0
- pages/model_param/sentiment_analysis_model/config.json +39 -0
- pages/model_param/sentiment_analysis_model/pytorch_model.bin +3 -0
- pages/model_param/sentiment_analysis_model/special_tokens_map.json +7 -0
- pages/model_param/sentiment_analysis_model/tokenizer.json +0 -0
- pages/model_param/sentiment_analysis_model/tokenizer_config.json +15 -0
- pages/model_param/sentiment_analysis_model/vocab.txt +0 -0
- pages/model_param/summarization_model/config.json +36 -0
- pages/model_param/summarization_model/generation_config.json +11 -0
- pages/model_param/summarization_model/pytorch_model.bin +3 -0
- pages/model_param/summarization_model/special_tokens_map.json +5 -0
- pages/model_param/summarization_model/spiece.model +3 -0
- pages/model_param/summarization_model/tokenizer.json +3 -0
- pages/model_param/summarization_model/tokenizer_config.json +11 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 app/pages/model_param/summarization_model/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+pages/model_param/summarization_model/tokenizer.json filter=lfs diff=lfs merge=lfs -text
Home.py
ADDED
@@ -0,0 +1,12 @@
import streamlit as st

st.header("ćęę«å°é”ćęåē¤¦å·„")
st.subheader("PTT ēé¢åęęēØ")
st.header("ēµå”åå®")
st.markdown("""
- 111AB8005 å¼µäŗ姵
- 111AB8017 ęē¶å©
- 111AB8023 é³ē궵
- 111AB8026 č”å°å®
""")
pages/1_Dashboard.py
ADDED
@@ -0,0 +1,122 @@
import streamlit as st
from pages.Model import Load_Model
import warnings
warnings.filterwarnings("ignore")

st.set_page_config(
    page_title="PTT ēé¢ęēØ App",
    page_icon="š§",
    layout="wide",
    initial_sidebar_state="expanded",
    menu_items={
'About': """ēµå”åå®ļ¼å¼µäŗ姵ćęē¶å©ćé³ē궵ćč”å°å®""",
    }
)


sample_text = [
    'Select',
'[åå¦] ęéŗ¼ę²ęäŗŗåØčØč«å·„ę„4.0äŗļ¼ å
©å¹“åéøčēęå å·„ę„4.0ć大ęøęćAIčØč«ēę²øę²øęę ęéŗ¼ä»å¹“éøčēęå 儽åę²ęč½å°ä»»ä½å·„ę„4.0ēę¶ęÆļ¼ ęå¦åļ¼ å£čč¦ęåƦčøēę¹ę³å§ ē¾åØåč·³åŗä¾äŗ ę„ęå
ę¬ä¾åØé«éå°±ęå» äŗå§',
'[åå¦] ęę²ęč·ÆäøēŖē¶č®å¾äŗ®ēå
«å¦ļ¼ å°éÆ家éčæēå··å åę¬č·Æéč¦ŗå¾é½ęęē 大ę¦å°±ęÆé£ēØ®ę“č·Æę10ēēåŖéäøå
©ēēēØåŗ¦ äøéååē¶éęēŖē¶č¦ŗå¾ę“č·Æč®å¾č¶
äŗ® åä¾ęęēēé½ęéäŗ č·Æå¤Ŗäŗ®åč儽åęé»äøå¤Ŗēæę
£ ęę²ęäŗŗä¹ē¼ē¾ēę“»åØéēč·ÆēŖē¶č®å¾å¾äŗ® ęå¦åļ¼ ',
'Re: [čØč«] ę“č
¦å¤Ŗęå!äøåå¤äŗ¤å®å
¬ē¶č¬éÆę·å²å¦!!!! čÆååę²ē« ē¬¬äŗē« å®å
Øēäŗę ē¬¬23ę¢ č³ä»ęŖę¹ åęäøč¼ēä»ē¶ęÆäøčÆę°å Republic of china CHAPTER V: THE SECURITY COUNCIL COMPOSITION Article 23 The Security Council shall consist of fifteen Members of the United Nations. The Republic of China France the Union of Soviet Socialist Republics the United Kingdom of Great Britain and Northern Ireland and the United States of America shall be permanent members of the Security Council. The General Assembly shall elect ten other Members of the United Nations to be nonpermanent members of the Security Council due regard being specially paid in the first instance to the contribution of Members of the United Nations to the maintenance of international peace and security and to the other purposes of the Organization and also to equitable geographical distribution. å¾å„ęŖå¾äøčÆę°åéåŗčÆåå(1971) äøē¾ę·äŗ¤ä»„å¾ (1979) čå
±ä»ē¶ę²ęéēØä»åå½±éæåå»äæ®ę£čÆååę²ē« ļ¼ę¹ęä»åę³č¦ēēę¬ éå»čÆååę²ē« ä¹äøęÆę²ęäæ®ę£čę”é ä½ęÆå°éåä»åč¦å¦ē¼äøéćčäøåŗēē¬¬23ę¢ äøč¼äøčÆę°åēęå ä»åå»å§ēµę²ęč¾¦ę³äæå
¶ę¹å ęÆäøę³éęÆč¾¦äøå°ļ¼ : äøå
±ę“č
¦å¤§ęå : å°å¼ååēē¾åé»č¦ę°č : å
±åŖå¤äŗ¤å®ē«ē¶äøē„éå
±ē¢é»Øę²ęéäŗę° : éå»ŗč° : äøē¾åå°äŗę°ęęēŗåå¹³ę£ē¾©äø¦č©ä½ę°ēéäæ : éęå°é£čéćé³ē“å¾·ćå²čæŖåØå°č».... : äøåęÆē¬¬äøååØčÆååę²ē« äøē°½å... : č²ę»“ éäŗäøé½ęÆäøčÆę°åćåę°é»Øēę·å²åļ¼ : ęę²ęå¦ļ¼ : č£ååå
§å ±å°ļ¼ : č”č±ęēēč¦å„½å„½ēęäøäøäøčÆę°å ä½ēŗäøååęēäøä¾č³ē¢ēē¹¼ęæč
å
¶åƦäø¦äø容ę äøč¦č¦ŗå¾å„½åę²äŗäøčÆę°åä»ååčµ·äŗä¾ę“č¼é¬ ęēēę³ęÆčÆååę²ē« åę ęRepublic of China ē“ę„ę¹ęChinaćåŖęÆäøååä½ äø¦äøęÆä»éŗ¼ę¶å¾ę¹ēåé”ļ¼ę³åäøä¹ē“ę„å« Franceļ¼ ę“å¤ēå ē“ åÆč½éåØčå¾ęæę²»åę© ę³ēµ±å°ä½ē“ē å°±éęØ£ć',
'[éč] č¼å¤ä¹å¾éęęęåęø¬č©¦åļ¼ å°±ęÆå 仄å¾č¼å¤åé件äŗé½éč¦ę©åé
å åÆęÆē¾åØēę
ę³ä¹å¾éč®ę©åå»åä¹å¤ŖęŖäŗå§ čäøäŗ¬é½äøå½¹ä¹č®č¼å¤äøåēååéåč”ę é£ä¹å¾č¼å¤éęęęåęø¬č©¦åļ¼ ',
'Re: [åå¦] 儹č·ęčŖŖéčµ°äøåŗä¾ęÆä»éŗ¼ęęļ¼ å äŗŗčē°ļ¼éč£”ęÆę©ę°å®¢å®¶äŗŗć : : åč·äøååę”ē儳ēåē½äŗ : ę仄ēŗęåäŗę儽ę : ēµę儹č·ęčŖŖ儹éčµ°äøåŗä¾äøäøꮵ : ęéęę©ęåļ¼ : ę±č§£ éēØ®äŗę
ę ¹ęęēäŗŗēē¶é© äøęÆå„¹čµ°äøåŗä¾ļ¼ęÆäøę³čµ°åŗä¾ć ēę³å¾ē¾ę»æļ¼ē¾åƦå¾éŖØęć č¦äøč¦å°±č©¦äø試ęē„é 試é½äø試ē“ę„ęēµä½ ēļ¼å°±ęÆęä½ ē¶ē½ē”ć ä»å¤©ęēęęÆå½±ē č”čØŖåęä½ęÆē¶ęåēę
侶 äŗ¤å¾éå幓ćåęå幓 儳ēåØč¢å¹åé¢ē“ę„čŖŖäøę³č·ē·ēē¶ęå å¹¹åæ«ē¬ę» ē·ēé½åæ«ååŗä¾äŗļ¼éč¦č¢«č¬éēØ®č©±ć å¦ęäøęÆé£åå½±ēč¢«åŖäŗļ¼ęč©²ęÆč¢«å“ēļ¼ ęęč²¼äøä¾č®å¤§å®¶ēē äøčæ°å½±ēå°č±”äøęÆ WebTVasis Taiwan ē ēä»åęäøęåę¾åŗä¾ ēµč«ļ¼ å°é儹ļ¼ę¾äøäøåć åę£ä½ åäøęå»čŖę®ŗ ę³čŖę®ŗéå£pęæēäøäø SOP čØå¾å¹«čŖå·±ęŗå儽å¾äŗļ¼å„åå°å„äŗŗć é½å¹¾ę²äŗ čŖå·±äøč¦ē¶ē½ē”就儽äŗ åŖ天éęäøč”čØŖč¢«å
«å¦é
øę°å“ éēØ®äŗę
ęÆå āäŗŗāčē°ē ē©“ē©“ęäŗ¤ ä½č
hancel (hancel) ēęæ Gossiping ęØé” Re: [åå¦] 儳ååŗåäŗ¤ęäø幓 åęēę©ēå¤å°ļ¼ ęé Tue Jul 24 02:42:26 2018',
'Re: [éč] č„æęÆęæę儳ēč¦å
é¤ē·ē č±Ŗä¹
åŖęēč„æęÆēå©0.0 éäø幓å
§č±Ŗåäøꬔé½åŖę ęÆčŖŖå© äŗ¤å¾éēč„æęÆē ä¹å¾åÆę0.0 č·äŗ¤å¾ę»äŗ¤åč»é«äøęØ£ ēå„äŗŗē¼ę§ęę åęJå ē©ŗčåŖ0.0 ē¾åØē“ę„ēęē®åÆ¦ę³ ęÆč¼ę¹ä¾æ0.0 ',
'[åƦę³] ččēē
éå·„ęæ2 ēēē³»ļæ½ļæ½é¢äøęę²ęę¹åä»éŗ¼ YT: åå„ē¶²ååØē°½åęŖ éæē©ŗåƦę³å° äø»č¦åƦę³PSåå°éęę©éę²ļ¼å¶ē¾ē©ē©PCéę²ļ¼ęÆä¾å¤§ę¦98.9:1:0.1 åƦę³ęéēŗęÆę„21:00~24:00 ęéęåå¾1äøå°ęļ¼ē¢ŗå®ä¼ęÆęåäø天ēµęęåē„ ē®ęØęÆēØå“ęBOSS Twitch : Youtube : å·“åå°å± : ēæę
£åÆ«é·ęļ¼ęęē« äæååØé UtsuhoReiuzi :č½éč³ēęæ PlayStation 12/03 20:07',
'[é»ē¹] åøåƶä½ę»æäŗŗļ¼ä»£č”Øåøåƶäøč²“ļ¼ ęÆē²čŖŖļ¼ęå«ē¤¾å®
ę»æē§äŗļ¼ä»£č”Øęå«ē¤¾å®
äøč²“ åøåƶä¹å¾ę¶ęćä½å¾ę»æļ¼ęÆäøęÆ代č”Øåøåƶä¹äøč²“å¢ļ¼ ę³äøå°åęå„ééŗ¼å§å±ļ¼åŖč½č·å„äŗŗę äøč²“åä½ę»æäŗŗēåøåƶ ',
'[ę°č] éæäøéØé·ē«ęÆę©å¦¹é«ęļ¼éå„éØå£é»ęčØč
éæäøéØé·ē«ęÆę©å¦¹é«ęéå„éØå£å°±ä¾é»ęčØč
ē«ę
ęę®äøåæčØč
ęäøēø½ęÆę²ē©©ēęę®å®é³ęäøļ¼ å
¶åƦęÆåčŖæē®å¤§ē·å©ļ¼éęÆåę©å¦¹é«ę ļ¼ćčŖē±čæ½ę°čćę”čØŖåéå°čØŖé³ęäøļ¼é¤äŗåęē«ę
ļ¼ä¹ęéé³ęäøč¼å°åØé”é ååŗē¾ ēäøé¢ ę¦ę¼¢čŗēļ¼ę°åå ēē
ęÆē
ļ¼COVID19ļ¼ē«ę
ēē¼ä»„ä¾ļ¼é³ęäøäøåŗ¦é£ēŗäø»ęé¾ē¾å “čØč
ęļ¼č¢«ę„åŖ形容ēŗéµäŗŗéØé·ļ¼ęŗ«åćēę§äøäø失äŗŗęéę·ēē¼čØļ¼äøåŖē©©ē©©ę§å¶ē«ę
ļ¼ä¹č®ä»åē²ē”ęøć 夫妻ēøčå°é¤ēä¹é é³ęäøåēåæ話 å½±ēäøļ¼é³ęäøé¢å°čØč
ēę©å¦¹éå„čé©ļ¼äøä½äøäøę„ęļ¼éå±ē¾ę„ęŗčŖåµé²ē«ę© 妹éå„ļ¼ äøå
ę©ååč¶³äøå¾åč¶
å¼·ļ¼č®čØč
č½å°é½ē“å¼åę»ęäŗ é³ęäøä¹åØå°čØŖäøéé²ä»ēé¤ēä¹éćęęēå°ē£å°åć夫妻ę¾éēµęļ¼ä»„åē«ę
éå¾ę ę³åēäŗ åæ«é»éå½±ēļ¼ēé³ęäøåäŗ«é²ē«ä»„å¤ēēåæ話ć éæäøēēč¶
åÆę é£ęŖęęčØč
é»ę ęø¾čŗ«äøäøé½ę£ē¼åŗęēēé
å å¦ęå幓č¼é»ę©é»åŗé åÆč½å°±ęÆę°é²é»Øēåę¾¤ē“ęعęęÆéåę¦äŗ ',
'Re: [å§åŖ] äø ęę³ éå°±ęÆęč·å¤§č”ēč·é¢äŗ 大č”éē¬å”å” ęåŖč½č®å”å”ē½µęęä½ åŖ½ '
]


@st.cache_resource
def load_all_model():
    classify_model = Load_Model.Bert_Classify_Model()
    classify_model.load_model()

    senti_model = Load_Model.Sentiment_Model()
    senti_model.load_model()

    summarize_model = Load_Model.Summarization_Model()
    summarize_model.load_model()
    return classify_model, senti_model, summarize_model


def text_area_widget():
    input_content = st.selectbox(
        "ä½æēØēÆä¾ęē« ",
        sample_text,
        index=0
    )

    if input_content == "Select":
        input_content = st.text_area(
label="č«č¼øå
„ęē« ..."
        )
    else:
        input_content = st.text_area(
label="č«č¼øå
„ęē« ...",
            value=input_content
        )
    return input_content


def make_result_button(classify_model, senti_model, summarize_model, input_content):
    if st.button("ē¢åŗęę"):
        if input_content == "":
st.error('č«å
č¼øå
„ęØę³åęēęē« ęē“ę„éøęēÆä¾ęē« ļ¼', icon="šØ")
        else:
            with st.spinner('Wait for it...'):
                run_analysis(classify_model, senti_model, summarize_model, input_content)
            st.balloons()


def run_analysis(classify, senti, summarize, input):
tab_summarize, tab_classify, tab_sentiment = st.tabs(["ęē« ęč¦", "ēé¢é ęø¬ęę", "ę
ē·åę"])

    with tab_summarize:
        st.subheader("ęē« ęč¦")
        st.caption('仄äøēŗęØēęē« ęč¦ć')
        st.write(summarize.run_summarize(input))

    with tab_classify:
        st.subheader("ēé¢é ęø¬ęę")
        st.caption('仄äøēŗēé¢é ęø¬ēę©ēć')
        classify_result = classify.predict(input).round(4)
st.write(f"ęØēęē« ęęåÆč½ęÆ {classify_result.iloc[0, 0]}ļ¼åÆč½ę§ēŗ {classify_result['ę©ē'].max()*100:2f} %")
        with st.expander("ę„ēęęēé¢é ęø¬ę©ē"):
            st.dataframe(
                data=classify_result,
                use_container_width=True
            )

    with tab_sentiment:
st.subheader("ę
ē·åęęę")
st.caption('仄äøēŗęØēęē« ę
ē·')
        senti_result = senti.run_sentiment(input)[0]
st.write(f"ęØēęē« ę
ē·ēŗ {senti_result['label']}ļ¼åęøēŗ {senti_result['score']}")


def main():
    # first init model
    classify_model, senti_model, summarize_model = load_all_model()

    # page design
    head_section = st.container()
    ana_section = st.container()
    output_section = st.container()

    with head_section:
        st.title("Dashboard")
        st.divider()

    with ana_section:
        input_content = text_area_widget()
        st.divider()

    with output_section:
        make_result_button(classify_model, senti_model, summarize_model, input_content)


if __name__ == '__main__':
    main()
pages/Control/Controls.py
ADDED
File without changes
pages/Model/Load_Model.py
ADDED
@@ -0,0 +1,134 @@
import re
import accelerate
import numpy as np
import pandas as pd
import torch.nn.functional as F
from pathlib import Path
from transformers import AutoModelForSequenceClassification, BertTokenizerFast, pipeline

accelerator = accelerate.Accelerator(cpu=True)


class LoadException(Exception):
    ...


class LoadModelException(Exception):
    ...


class LoadTokenizerException(Exception):
    ...


class DIR:
    MODEL_DIR = Path("pages/model_param")
    CLASSIFIER_MODEL_DIR = Path(f"{MODEL_DIR}/board_classification_model")
    SENTIMENT_MODEL_DIR = Path(f"{MODEL_DIR}/sentiment_analysis_model")
    SUMMARIZATION_MODEL_DIR = Path(f"{MODEL_DIR}/summarization_model")


class Bert_Classify_Model:
    """ALBERT-based classifier that predicts which PTT board a post belongs to."""

    def __init__(self):
        self.tokenizer_loaded = False
        self.model_loaded = False

    def load_model(self):
        try:
            self.tokenizer = BertTokenizerFast.from_pretrained(
                pretrained_model_name_or_path=DIR.CLASSIFIER_MODEL_DIR,
                local_files_only=True
            )
            self.tokenizer_loaded = True
        except Exception as exc:
            raise LoadTokenizerException("Tokenizer not loaded.") from exc

        try:
            self.model = AutoModelForSequenceClassification.from_pretrained(
                pretrained_model_name_or_path=DIR.CLASSIFIER_MODEL_DIR,
                local_files_only=True,
                num_labels=4
            )
            self.model_loaded = True
        except Exception as exc:
            raise LoadModelException("Model not loaded.") from exc

    @staticmethod
    def __make_output(outputs):
        # Map label indices to board names and return the probabilities sorted descending.
        id2label = {
            "0": "C_Chat",
            "1": "Gossiping",
            "2": "HatePolitics",
            "3": "Marginalman"
        }

        pred_prob = F.softmax(outputs.logits, dim=-1)
        pred_prob_df = (
            pd.DataFrame({
                "ēé¢": id2label.values(),
                "ę©ē": pred_prob[0, :].detach().numpy()
            })
            .sort_values(by="ę©ē", ascending=False)
        )
        return pred_prob_df

    def predict(self, text):
        if not (self.tokenizer_loaded and self.model_loaded):
            raise LoadException("Not loaded.")

        token_text = self.tokenizer(
            text,
            padding=True,
            truncation=True,
            return_tensors='pt'
        )

        outputs = self.model(**token_text)
        result = self.__make_output(outputs)
        return result


class Sentiment_Model:
    """Binary (Negative/Positive) sentiment-analysis pipeline."""

    def __init__(self):
        self.model_loaded = False

    def load_model(self):
        try:
            self.model = pipeline(
                "sentiment-analysis",
                DIR.SENTIMENT_MODEL_DIR,
            )
            self.model_loaded = True
        except Exception as exc:
            raise LoadModelException("Model not loaded.") from exc

    def run_sentiment(self, text):
        if not self.model_loaded:
            raise LoadModelException("model not loaded.")
        outputs = self.model(text)
        return outputs


class Summarization_Model:
    """mT5-based summarization pipeline."""

    def __init__(self):
        self.model_loaded = False

    def load_model(self):
        try:
            self.model = pipeline(
                "summarization",
                DIR.SUMMARIZATION_MODEL_DIR
            )
            self.model_loaded = True
        except Exception as exc:
            raise LoadModelException("Model not loaded.") from exc

    @staticmethod
    def __make_output(outputs):
        return outputs[0]["summary_text"]

    def run_summarize(self, text):
        if not self.model_loaded:
            raise LoadModelException("model not loaded.")
        outputs = self.model(text, max_length=1024)
        result = self.__make_output(outputs)
        return result
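Usage note (not part of the commit): a minimal sketch of how the three wrapper classes above are driven, mirroring load_all_model() and run_analysis() in pages/1_Dashboard.py; the post variable is an illustrative placeholder.

from pages.Model import Load_Model

classifier = Load_Model.Bert_Classify_Model()
classifier.load_model()
sentiment = Load_Model.Sentiment_Model()
sentiment.load_model()
summarizer = Load_Model.Summarization_Model()
summarizer.load_model()

post = "..."  # any PTT post body (placeholder)
print(classifier.predict(post))          # DataFrame of per-board probabilities, sorted descending
print(sentiment.run_sentiment(post)[0])  # {'label': 'Negative'/'Positive', 'score': ...}
print(summarizer.run_summarize(post))    # summary string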
pages/Model/__pycache__/Load_Model.cpython-39.pyc
ADDED
Binary file (4.67 kB).
pages/model_param/board_classification_model/config.json
ADDED
@@ -0,0 +1,47 @@
{
  "_name_or_path": "ckiplab/albert-tiny-chinese",
  "architectures": [
    "AlbertForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.0,
  "bos_token_id": 101,
  "classifier_dropout_prob": 0.1,
  "down_scale_factor": 1,
  "embedding_size": 128,
  "eos_token_id": 102,
  "gap_size": 0,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.0,
  "hidden_size": 312,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2",
    "3": "LABEL_3"
  },
  "initializer_range": 0.02,
  "inner_group_num": 1,
  "intermediate_size": 1248,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1,
    "LABEL_2": 2,
    "LABEL_3": 3
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "albert",
  "net_structure_type": 0,
  "num_attention_heads": 12,
  "num_hidden_groups": 1,
  "num_hidden_layers": 4,
  "num_memory_blocks": 0,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "problem_type": "single_label_classification",
  "tokenizer_class": "BertTokenizerFast",
  "torch_dtype": "float32",
  "transformers_version": "4.28.0",
  "type_vocab_size": 2,
  "vocab_size": 21128
}
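Note that id2label/label2id above still hold the generic LABEL_0..LABEL_3 placeholders, which is why Bert_Classify_Model.__make_output in pages/Model/Load_Model.py re-maps the indices to board names by hand. An alternative sketch (not what this commit does) that writes the board names into config.json once, so any loader reports them directly:

from transformers import AutoModelForSequenceClassification

id2label = {0: "C_Chat", 1: "Gossiping", 2: "HatePolitics", 3: "Marginalman"}
model = AutoModelForSequenceClassification.from_pretrained(
    "pages/model_param/board_classification_model",
    id2label=id2label,
    label2id={name: idx for idx, name in id2label.items()},
)
model.save_pretrained("pages/model_param/board_classification_model")  # rewrites config.json with the real labels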
pages/model_param/board_classification_model/pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6a0cb8aa2000d5211faebaf9d40c945f51dece7fdceef3435349b698217157f4
size 16340421
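The *.bin weights in this commit are stored as Git LFS pointer files like the three lines above, not as the tensors themselves. A small sketch, assuming the actual file has been pulled locally (matches_lfs_pointer is an illustrative helper, not part of the repo), for checking a download against the pointer's oid and size:

import hashlib
from pathlib import Path

def matches_lfs_pointer(path, expected_oid, expected_size):
    # Compare a local file against the sha256 and byte size recorded in its LFS pointer.
    data = Path(path).read_bytes()
    return len(data) == expected_size and hashlib.sha256(data).hexdigest() == expected_oid

print(matches_lfs_pointer(
    "pages/model_param/board_classification_model/pytorch_model.bin",
    "6a0cb8aa2000d5211faebaf9d40c945f51dece7fdceef3435349b698217157f4",
    16340421,
))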
pages/model_param/board_classification_model/special_tokens_map.json
ADDED
@@ -0,0 +1,7 @@
{
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
pages/model_param/board_classification_model/tokenizer.json
ADDED
The diff for this file is too large to render.
pages/model_param/board_classification_model/tokenizer_config.json
ADDED
@@ -0,0 +1,13 @@
{
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_lower_case": false,
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
pages/model_param/board_classification_model/training_args.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:546fa189039a1bdaf4bbc81c76bbfffec575c533bb90613c836d519d7ec2e832
size 3579
pages/model_param/board_classification_model/vocab.txt
ADDED
The diff for this file is too large to render.
pages/model_param/sentiment_analysis_model/config.json
ADDED
@@ -0,0 +1,39 @@
{
  "_name_or_path": "IDEA-CCNL/Erlangshen-Roberta-110M-Sentiment",
  "architectures": [
    "BertForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "classifier_dropout": null,
  "directionality": "bidi",
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "Negative",
    "1": "Positive"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": null,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "output_past": true,
  "pad_token_id": 1,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.28.0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 21128
}
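The id2label mapping above is what makes the sentiment pipeline return the "Negative"/"Positive" label strings that pages/1_Dashboard.py displays. A minimal sketch (files assumed to be present locally; the input string is a placeholder) of the output shape that Sentiment_Model.run_sentiment consumes:

from transformers import pipeline

senti = pipeline("sentiment-analysis", "pages/model_param/sentiment_analysis_model")
result = senti("...")  # any PTT post body (placeholder)
print(result[0]["label"], result[0]["score"])  # e.g. Positive 0.98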
pages/model_param/sentiment_analysis_model/pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:615d7fe9040cbb298050a3a60797ced537736ef1dcf386199fa8b419098112d7
size 409146741
pages/model_param/sentiment_analysis_model/special_tokens_map.json
ADDED
@@ -0,0 +1,7 @@
{
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
pages/model_param/sentiment_analysis_model/tokenizer.json
ADDED
The diff for this file is too large to render.
pages/model_param/sentiment_analysis_model/tokenizer_config.json
ADDED
@@ -0,0 +1,15 @@
{
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "mask_token": "[MASK]",
  "model_max_length": 1000000000000000019884624838656,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
pages/model_param/sentiment_analysis_model/vocab.txt
ADDED
The diff for this file is too large to render.
pages/model_param/summarization_model/config.json
ADDED
@@ -0,0 +1,36 @@
{
  "_name_or_path": "csebuetnlp/mT5_multilingual_XLSum",
  "architectures": [
    "MT5ForConditionalGeneration"
  ],
  "d_ff": 2048,
  "d_kv": 64,
  "d_model": 768,
  "decoder_start_token_id": 0,
  "dense_act_fn": "gelu_new",
  "dropout_rate": 0.1,
  "eos_token_id": 1,
  "feed_forward_proj": "gated-gelu",
  "initializer_factor": 1.0,
  "is_encoder_decoder": true,
  "is_gated_act": true,
  "layer_norm_epsilon": 1e-06,
  "length_penalty": 0.6,
  "max_length": 84,
  "model_type": "mt5",
  "no_repeat_ngram_size": 2,
  "num_beams": 4,
  "num_decoder_layers": 12,
  "num_heads": 12,
  "num_layers": 12,
  "output_past": true,
  "pad_token_id": 0,
  "relative_attention_max_distance": 128,
  "relative_attention_num_buckets": 32,
  "tie_word_embeddings": false,
  "tokenizer_class": "T5Tokenizer",
  "torch_dtype": "float32",
  "transformers_version": "4.29.2",
  "use_cache": true,
  "vocab_size": 250112
}
pages/model_param/summarization_model/generation_config.json
ADDED
@@ -0,0 +1,11 @@
{
  "_from_model_config": true,
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "length_penalty": 0.6,
  "max_length": 84,
  "no_repeat_ngram_size": 2,
  "num_beams": 4,
  "pad_token_id": 0,
  "transformers_version": "4.29.2"
}
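These generation defaults (beam search with num_beams=4, no_repeat_ngram_size=2, max_length=84, length_penalty=0.6) are picked up automatically by the summarization pipeline; keyword arguments at call time override them, which is how Summarization_Model.run_summarize raises max_length to 1024. A hedged sketch (local paths assumed, long_text is a placeholder):

from transformers import pipeline

summarizer = pipeline("summarization", "pages/model_param/summarization_model")
long_text = "..."  # any article body (placeholder)
print(summarizer(long_text)[0]["summary_text"])                                # uses the defaults above
print(summarizer(long_text, max_length=256, num_beams=2)[0]["summary_text"])   # per-call override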
pages/model_param/summarization_model/pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:73285124713d581b135f95f92f49e70f24f6fa04f93ccf3bf8d6ed68d2f42a8c
size 2329698485
pages/model_param/summarization_model/special_tokens_map.json
ADDED
@@ -0,0 +1,5 @@
{
  "eos_token": "</s>",
  "pad_token": "<pad>",
  "unk_token": "<unk>"
}
pages/model_param/summarization_model/spiece.model
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ef78f86560d809067d12bac6c09f19a462cb3af3f54d2b8acbba26e1433125d6
size 4309802
pages/model_param/summarization_model/tokenizer.json
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:93c3578052e1605d8332eb961bc08d72e246071974e4cc54aa6991826b802aa5
size 16330369
pages/model_param/summarization_model/tokenizer_config.json
ADDED
@@ -0,0 +1,11 @@
{
  "additional_special_tokens": null,
  "clean_up_tokenization_spaces": true,
  "eos_token": "</s>",
  "extra_ids": 0,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": "<pad>",
  "sp_model_kwargs": {},
  "tokenizer_class": "T5Tokenizer",
  "unk_token": "<unk>"
}