lukestanley committed on
Commit
968cab3
1 Parent(s): 21ce4d4

Doc: Idea for speed improvements and intermediate results display, grouping future directions

Files changed (1):
  1. README.md +18 -12
README.md CHANGED
@@ -36,18 +36,24 @@ You can try out the ChillTranslator via the HuggingFace Space demo at [https://h
  - **Self-hostable, serverless, or APIs**: running DIY could save costs, avoid needing to sign up to APIs, and avoid the risk of toxic content causing API access to be revoked. We use llama-cpp-python with Mixtral, with a HTTP server option, and a fast "serverless" backend using RunPod currently.
 
  ## Possible future directions 🌟
- - **Integration**: example showing use as Python module, HTTP API, for use from other tools, browser extensions.
- - **Speed** improvements.
- - Generating rephrasings in parallel.
- - Use Jigsaw dataset to find spicy comments, making a dataset for training a translation transformer, maybe like Google's T5 to run faster than Mixtral could.
- - Split text into sentences e.g: with “pysbd” for parallel processing of translations.
- - Try using a 'Detoxify' scoring model instead of the current "spicy" score method.
- - Use natural language similarity techniques to compare possible rephrasing fidelity faster.
- - Enabling easy experimenting with online hosted LLM APIs
- - Making setup on different platforms easier
- - **Quality** improvements.
- - Collecting a dataset of spicy comments and their rephrasings.
- - Feedback loop: users could score rephrasings, or suggest their own.
+ 
+ **Speed:**
+ - Generating rephrasings in parallel (see the first sketch below the diff).
+ - Show intermediate results to the user while waiting for the final result.
+ - Split text into sentences, e.g. with “pysbd”, for parallel processing of translations (also covered by the first sketch).
+ 
+ **Speed and quality:**
+ - Use the Jigsaw dataset to find spicy comments, building a training dataset for a translation transformer, perhaps Google's T5, which could run faster than Mixtral.
+ - Try using a 'Detoxify' scoring model instead of the current "spicy" score method (second sketch below the diff).
+ - Use natural language similarity techniques to compare possible rephrasing fidelity faster (third sketch below the diff).
+ - Collecting a dataset of spicy comments and their rephrasings.
+ - Feedback loop: users could score rephrasings, or suggest their own.
+ 
+ **Distribution:**
+ - A better example showing use as a Python module and HTTP API, from other tools and browser extensions.
+ - Enabling easy experimentation with online hosted LLM APIs.
+ - Making setup on different platforms easier.
+ 
 
  ## Getting started 🚀
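
The first two "Speed" bullets could combine into something like the minimal sketch below: pysbd splits the input into sentences, and a thread pool rephrases them concurrently. The `pysbd.Segmenter` call is the library's documented API; `rephrase` is a hypothetical stand-in for whatever backend ChillTranslator actually calls (the llama-cpp-python HTTP server, the RunPod worker, etc.).

```python
from concurrent.futures import ThreadPoolExecutor

import pysbd


def rephrase(sentence: str) -> str:
    """Hypothetical stand-in for the real rephrasing backend
    (e.g. the llama-cpp-python HTTP server or the RunPod worker)."""
    raise NotImplementedError


def chill_in_parallel(text: str, max_workers: int = 4) -> str:
    # Split the input into sentences with pysbd's rule-based segmenter.
    segmenter = pysbd.Segmenter(language="en", clean=False)
    sentences = segmenter.segment(text)
    # Rephrase each sentence concurrently; map() preserves input order,
    # so the output reads in the same order as the original text.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        rephrased = list(pool.map(rephrase, sentences))
    return " ".join(rephrased)
```

Because `pool.map` yields results as they complete in order, intermediate sentences could also be streamed to the user before the whole text is done, which is the second bullet's idea.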
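The 'Detoxify' bullet might look roughly like this. The `detoxify` package and its `predict` API are real; treating the `"toxicity"` field as a drop-in replacement for the current "spicy" score is an assumption.

```python
from detoxify import Detoxify

# Load once; "original" is the model trained on the Jigsaw toxicity data.
model = Detoxify("original")


def spicy_score(text: str) -> float:
    # predict() returns a dict of scores in [0, 1] (e.g. "toxicity",
    # "insult", "threat"); the overall "toxicity" score is used here as
    # a stand-in for the current "spicy" metric.
    return model.predict(text)["toxicity"]


print(spicy_score("This is an utterly idiotic take."))
```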
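The fidelity-comparison bullet, sketched with sentence-transformers cosine similarity. The model name is just a common lightweight default, and the 0.7 cut-off is an illustrative assumption, not a tuned value.

```python
from sentence_transformers import SentenceTransformer, util

# Small, fast general-purpose embedding model (an assumed default).
model = SentenceTransformer("all-MiniLM-L6-v2")


def fidelity(original: str, rephrased: str) -> float:
    # Embed both texts and compare with cosine similarity: values near
    # 1.0 mean the rephrasing kept the original meaning; lower means drift.
    embeddings = model.encode([original, rephrased], convert_to_tensor=True)
    return util.cos_sim(embeddings[0], embeddings[1]).item()


# Illustrative usage: reject candidates that drift too far in meaning.
if fidelity("Your idea is moronic.", "I think this idea needs more work.") < 0.7:
    print("Rephrasing changed the meaning too much; try another candidate.")
```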