fyang0507 committed on
Commit 7a6abb6
1 Parent(s): b5f7613

add resources

resources/embedding_model_scores.csv ADDED
@@ -0,0 +1,11 @@
+ Model name,Provider,Model size (#params),Model size (disk),Downloads past month,Highlights,Time Load/Inference (online compute),Mean difference paired & unpaired Q & Docs
+ intfloat/multilingual-e5-large,Microsoft,560M,2.2G,93K,24 layers and the embedding size is 1024,5.0s/1920s,0.062
+ intfloat/multilingual-e5-base,Microsoft,278M,1.1G,42K,12 layers and the embedding size is 768,3.4s/531s,0.063
+ sentence-transformers/LaBSE,Google,,1.9G,88K,the embedding size is 768,5.7s/620s,0.19
+ maidalun1020/bce-embedding-base_v1,NetEase-Youdao,279M,1.1G,111K,optimized for RAG,3.0s/495s,0.23
+ BAAI/bge-large-zh-v1.5,Beijing Academy of Artificial Intelligence,326M,1.3G,22K,,1.6s/1730s,0.26
+ uer/sbert-base-chinese-nli,Tencent,,409M,8K,12 layers and the embedding size is 768,0.6s/1350s,0.22
+ sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2,Sentence Transformer,,449M,38K,384 embedding size,1.4s/392s,0.25
+ sentence-transformers/distiluse-base-multilingual-cased-v1,Sentence Transformer,,539M,31K,768 embedding size,1.3s/163s,0.28
+ sentence-transformers/distiluse-base-multilingual-cased-v2,Sentence Transformer,,539M,43K,768 embedding size,1.2s/164s,0.25
+ sentence-transformers/paraphrase-multilingual-mpnet-base-v2,Sentence Transformer,,1.1G,24K,768 embedding size,2.7s/463s,0.21
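The `Time Load/Inference (online compute)` column packs two durations into one cell (for example `5.0s/1920s`), which makes the models hard to sort by speed. The following is a minimal sketch of how the cell could be split into seconds and the models ranked by inference time. The helper names `parse_times` and `rank_by_inference` are hypothetical, and only three rows of the file are embedded to keep the example short.

```python
import csv
import io

# Three rows copied from resources/embedding_model_scores.csv,
# trimmed to the two columns this sketch needs.
ROWS = """Model name,Time Load/Inference (online compute)
intfloat/multilingual-e5-large,5.0s/1920s
maidalun1020/bce-embedding-base_v1,3.0s/495s
sentence-transformers/distiluse-base-multilingual-cased-v1,1.3s/163s
"""

TIME_COLUMN = "Time Load/Inference (online compute)"


def parse_times(cell: str) -> tuple[float, float]:
    """Split a 'load/inference' cell such as '5.0s/1920s' into seconds."""
    load, inference = cell.split("/")
    return float(load.rstrip("s")), float(inference.rstrip("s"))


def rank_by_inference(csv_text: str) -> list[tuple[str, float, float]]:
    """Return (model, load_s, inference_s) tuples, fastest inference first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    parsed = [(row["Model name"], *parse_times(row[TIME_COLUMN])) for row in reader]
    return sorted(parsed, key=lambda item: item[2])


for name, load_s, infer_s in rank_by_inference(ROWS):
    print(f"{name}: load {load_s:.1f}s, inference {infer_s:.0f}s")
```

Sorting on the parsed inference time rather than the raw string avoids lexicographic surprises such as `"1920s"` sorting before `"495s"`.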
resources/flowchart.png ADDED
resources/langsmith_walkthrough.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:463ca06f3e980b75c325e5cc93656a7cc7885906ce1a1eb95e6e6c230fae5ce2
+ size 9444302
resources/link-to-flowchart.txt ADDED
@@ -0,0 +1 @@
+ https://lucid.app/lucidchart/286fb57c-e78a-4723-857b-cbd92252af8a/edit
resources/llm_scores.csv ADDED
@@ -0,0 +1,94 @@
+ ,baichuan-inc/Baichuan-7B,hfl/chinese-alpaca-2-7b,Qwen/Qwen-7B-Chat,Qwen/Qwen1.5-7B-Chat,HuggingFaceH4/zephyr-7b-beta,01-ai/Yi-6B-Chat,BAAI/AquilaChat2-7B-16K
+ "Overall (lower is better)",17,15,11,10,14,24,Generates nonsense
+ Ready to use,"2
+
+ Some special tokens returned but easy to clean up","1
+
+ No special tokens in responses","2
+
+ Some special tokens returned but easy to clean up","1
+
+ Robust even without system prompt","1
+
+ No special tokens in responses","3
+
+ Returns much unrelated text, so post-processing is required",NA
+ Instruction following - general,"2
+
+ Has trouble distinguishing poem types","2
+
+ Has trouble distinguishing poem types","1
+
+ Perfectly follows","2
+
+ Almost follows, except for one case with a Chinese-only instruction","2
+
+ Perfectly follows if ignoring language requirements on poems","5
+
+ Occasionally does not answer the question at all",NA
+ Instruction following - language,"3
+
+ Always answers in Chinese","3
+
+ Always answers in Chinese","1
+
+ Perfectly distinguishes output language requirements","2
+
+ Almost follows, except for one case with a Chinese-only instruction","2
+
+ Has trouble citing Chinese poems","1
+
+ Perfectly distinguishes output language requirements",NA
+ Helpfulness and creativity,"1
+
+ Answers questions with helpful context","1
+
+ Answers questions with helpful context","2
+
+ Very concise, sometimes too concise","1
+
+ Answers questions with helpful context","1
+
+ Answers questions with helpful context","3
+
+ Too verbose",NA
+ Fact,"2
+
+ Wrong answers for poem citation","2
+
+ Wrong answers for poem citation","1
+
+ No obvious mistakes","1
+
+ No obvious mistakes","2
+
+ Wrong answers for poem citation","3
+
+ Wrong answers for poem citation and country territory",NA
+ Reasoning,"2
+
+ Self-consistent in reasoning but factually wrong","2
+
+ Self-consistent in reasoning but factually wrong","2
+
+ Self-consistent in reasoning but factually wrong","1
+
+ Self-consistent and factually correct","2
+
+ Self-consistent in reasoning but factually wrong","2
+
+ Self-consistent in reasoning but factually wrong",NA
+ Coding,"3
+
+ Valid code but wrong formatting or explanation","3
+
+ Valid code but wrong formatting or explanation","1
+
+ Perfect code with explanation","1
+
+ Perfect code with explanation","1
+
+ Perfect code with explanation","4
+
+ Nonsense",NA
+ Inference Speed,2,1,1,1,3,3,3
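In this file each quoted cell carries a numeric score on its first line and a free-text comment below it, and the `Overall` row appears to be the plain sum of the eight per-category scores. The sketch below, offered under that assumption, shows how a multi-line cell can be read with Python's `csv` module and how the totals can be checked. The names `leading_score`, `CATEGORY_SCORES`, and `REPORTED_OVERALL` are hypothetical; the numbers are copied from the rows above, and BAAI/AquilaChat2-7B-16K is left out because its category cells are `NA`.

```python
import csv
import io

# One CSV record from llm_scores.csv; the quoted cells contain embedded newlines.
EXCERPT = (
    'Ready to use,"2\n\nSome special tokens returned but easy to clean up",'
    '"1\n\nNo special tokens in responses"\n'
)


def leading_score(cell: str) -> int:
    """Return the integer on the first line of a 'score + comment' cell."""
    return int(cell.split("\n", 1)[0])


row = next(csv.reader(io.StringIO(EXCERPT)))
print([leading_score(cell) for cell in row[1:]])

# Scores in row order: ready to use, instruction following (general),
# instruction following (language), helpfulness, fact, reasoning,
# coding, inference speed.
CATEGORY_SCORES = {
    "baichuan-inc/Baichuan-7B": [2, 2, 3, 1, 2, 2, 3, 2],
    "hfl/chinese-alpaca-2-7b": [1, 2, 3, 1, 2, 2, 3, 1],
    "Qwen/Qwen-7B-Chat": [2, 1, 1, 2, 1, 2, 1, 1],
    "Qwen/Qwen1.5-7B-Chat": [1, 2, 2, 1, 1, 1, 1, 1],
    "HuggingFaceH4/zephyr-7b-beta": [1, 2, 2, 1, 2, 2, 1, 3],
    "01-ai/Yi-6B-Chat": [3, 5, 1, 3, 3, 2, 4, 3],
}

# Values from the "Overall" row of the file.
REPORTED_OVERALL = {
    "baichuan-inc/Baichuan-7B": 17,
    "hfl/chinese-alpaca-2-7b": 15,
    "Qwen/Qwen-7B-Chat": 11,
    "Qwen/Qwen1.5-7B-Chat": 10,
    "HuggingFaceH4/zephyr-7b-beta": 14,
    "01-ai/Yi-6B-Chat": 24,
}

for model, scores in CATEGORY_SCORES.items():
    total = sum(scores)
    status = "matches" if total == REPORTED_OVERALL[model] else "differs from"
    print(f"{model}: sum {total} {status} reported {REPORTED_OVERALL[model]}")

# Lower is better, so the best model has the smallest total.
best = min(CATEGORY_SCORES, key=lambda m: sum(CATEGORY_SCORES[m]))
print("Lowest overall:", best)
```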
resources/search-limit-chinese.png ADDED
resources/test_llm_standalone_chat.txt ADDED
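@@ -0,0 +1,21 @@
+ Who are you?
+
+ Who is Li Bai?
+
+ Please name three poems written by Li Bai.
+
+ Please recite the second poem in full.
+
+ Did Li Bai and Du Fu know each other? Please show your reasoning and state your conclusion.
+
+ Forget the previous conversation. Tell me, is Shakespeare's work actually called 哈姆雷特 or 哈姆莱特?
+
+ Please write a classical-style poem on the theme of Shakespeare; it must be a seven-character quatrain (qiyan jueju).
+
+ Please write a modern poem on the theme of Shakespeare, in no more than 150 characters.
+
+ who created you?
+
+ Name the top 3 countries in the world based on how big they are.
+
+ I want to find the day of the week for the current date. Please write code in Python to fulfill such requirement.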
resources/vectordatabase_evaluation.csv ADDED
@@ -0,0 +1,6 @@
+ ,Redis,OpenSearch,Chroma,FAISS,Qdrant,Supabase,Pinecone
+ Offline/Local mode,Y,Y,Y,Y,Y,N,N
+ Serverless,N,N (requires docker),Y,Y,Y,N,Y
+ Offload to on-disk storage,Y,Y,Y,Y,N (can’t reload),Y,N
+ Support self-query,Y,Y,Y,N,Y,Y,Y
+ Support fuzzy match,"CONTAIN, LIKE","CONTAIN, LIKE",N,N,LIKE,LIKE,IN
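A feature matrix like this one is easiest to use when it can be filtered by hard requirements. The sketch below is one possible way to do that, assuming any cell that starts with `Y` counts as support and everything else (including annotated `N` cells) does not. The helpers `load_features` and `shortlist` are hypothetical names, and the table is embedded as a string so the example runs on its own.

```python
import csv
import io

# Contents of resources/vectordatabase_evaluation.csv.
TABLE = """,Redis,OpenSearch,Chroma,FAISS,Qdrant,Supabase,Pinecone
Offline/Local mode,Y,Y,Y,Y,Y,N,N
Serverless,N,N (requires docker),Y,Y,Y,N,Y
Offload to on-disk storage,Y,Y,Y,Y,N (can’t reload),Y,N
Support self-query,Y,Y,Y,N,Y,Y,Y
Support fuzzy match,"CONTAIN, LIKE","CONTAIN, LIKE",N,N,LIKE,LIKE,IN
"""


def load_features(csv_text: str) -> dict[str, dict[str, str]]:
    """Transpose the matrix into {store: {feature: raw cell value}}."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    stores = rows[0][1:]
    features: dict[str, dict[str, str]] = {store: {} for store in stores}
    for feature, *values in rows[1:]:
        for store, value in zip(stores, values):
            features[store][feature] = value
    return features


def shortlist(features: dict[str, dict[str, str]], required: list[str]) -> list[str]:
    """Keep stores whose cell starts with 'Y' for every required feature."""
    return [
        store
        for store, cells in features.items()
        if all(cells[name].startswith("Y") for name in required)
    ]


features = load_features(TABLE)
print(shortlist(features, ["Offline/Local mode", "Support self-query"]))
```

The `Support fuzzy match` row lists operator names rather than `Y`/`N`, so it would need its own comparison (for example, checking that the cell is not `N`) instead of the `startswith("Y")` rule used here.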