freemt committed on
Commit
efedef5
1 Parent(s): 09239dd

Update article in __main__ and docs

docs/build/doctrees/environment.pickle CHANGED
Binary files a/docs/build/doctrees/environment.pickle and b/docs/build/doctrees/environment.pickle differ
 
docs/build/doctrees/examples.doctree CHANGED
Binary files a/docs/build/doctrees/examples.doctree and b/docs/build/doctrees/examples.doctree differ
 
docs/build/doctrees/index.doctree CHANGED
Binary files a/docs/build/doctrees/index.doctree and b/docs/build/doctrees/index.doctree differ
 
docs/build/doctrees/intro.doctree CHANGED
Binary files a/docs/build/doctrees/intro.doctree and b/docs/build/doctrees/intro.doctree differ
 
docs/build/doctrees/userguide-zh.doctree ADDED
Binary file (10.8 kB). View file
 
docs/build/doctrees/userguide.doctree ADDED
Binary file (9.98 kB). View file
 
docs/build/html/_sources/examples.rst.txt CHANGED
@@ -1,10 +1,10 @@
1
  Examples
2
  =============
3
 
 
 
4
  Installation/Usage:
5
  *******************
6
  As the package has not been published on PyPI yet, it CANNOT be installed using pip.
7
 
8
- For now, the suggested method is to download the zipped package or use the online version at `https://huggingface.co/spaces/mikeee/radiobee-aligner/ <https://huggingface.co/spaces/mikeee/radiobee-aligner/>`_
9
-
10
-
 
1
  Examples
2
  =============
3
 
4
+ ``radiobee`` has built-in examples. Just click one of the rows in the ``Examples`` table and click ``Submit`` for a test run.
5
+
6
  Installation/Usage:
7
  *******************
8
  As the package has not been published on PyPI yet, it CANNOT be installed using pip.
9
 
10
+ For now, the suggested method is to download the zipped package or use the online version at `https://huggingface.co/spaces/mikeee/radiobee-aligner/ <https://huggingface.co/spaces/mikeee/radiobee-aligner/>`_
 
 
docs/build/html/_sources/index.rst.txt CHANGED
@@ -11,8 +11,10 @@ Welcome to radiobee's documentation!
11
  :caption: Contents:
12
 
13
  intro
14
- radiobee
 
15
  examples
 
16
 
17
  Indices and tables
18
  ==================
 
11
  :caption: Contents:
12
 
13
  intro
14
+ userguide
15
+ userguide-zh
16
  examples
17
+ radiobee
18
 
19
  Indices and tables
20
  ==================
docs/build/html/_sources/intro.rst.txt CHANGED
@@ -1,16 +1,16 @@
1
  Introduction
2
  ============
3
 
4
- ``radiobee`` (``radiobee aligner``) is a powerful dualtext aligner.
5
 
6
- The aim here was
7
 
8
  The current implementation has been developed in Python 3 and ``gradio``.
9
 
10
  Motivation
11
  **********
12
 
13
- Aligned texts (paragraph-to-paragraph or sentence-to-sentence) can be used machine learning (e.g. machine translation), CAT (tmx, translation terms etc.) and education (dual-language ebook), etc.
14
 
15
  Limitations
16
  ***********
 
1
  Introduction
2
  ============
3
 
4
+ ``radiobee`` (or ``radiobee aligner`` in full) is a powerful dualtext aligner.
5
 
6
+ The aim here was to provide an interface to align two texts.
7
 
8
  The current implementation has been developed in Python 3 and ``gradio``.
9
 
10
  Motivation
11
  **********
12
 
13
+ Aligned texts (paragraph-to-paragraph or sentence-to-sentence) can be used in machine learning (e.g. machine translation), CAT (tmx, translation terms etc.) and education (dual-language ebook), etc.
14
 
15
  Limitations
16
  ***********
docs/build/html/_sources/userguide-zh.rst.txt ADDED
@@ -0,0 +1,15 @@
1
+ 使用说明
2
+ ----------
3
+
4
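+ - ``radiobee aligner`` is a sibling of ``bumblebee aligner``. To learn more about these alignment tools, please join QQ group ``316287378``.
5
+
6
+ - ``radiobee`` currently supports only Chinese-English and English-Chinese alignment.
7
+ - ``radiobee`` currently accepts only plain-text file uploads (txt, md, csv, etc.). Formats such as ``docx``, ``pdf``, ``srt`` and ``html`` may be supported later.
8
+ - Click "Clear" before uploading files a second time.
9
+ - ``tf_type`` ``idf_type`` ``dl_type`` ``norm``: Normally these parameters can be left alone.
10
+ - Suggested values for ``esp`` and ``min_samples`` -- ``esp`` (minimum epsilon): 8-12, ``min_samples``: 4-8.
11
+
12
+ - A larger ``esp`` or a smaller ``min_samples`` yields more aligned pairs but also **false positives** (wrong pairs). Conversely, a smaller ``esp`` or a larger ``min_samples`` may miss some 'good' pairs.
13
+
14
+ - If the image is too small, right-click it and copy the image link to open it on its own in the browser, or right-click to save it and open the saved file in an image viewer.
15
+ - ``Flag``: If a run fails, you can click ``Flag`` to save the relevant parameters for inspection or to notify the developer.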
docs/build/html/_sources/userguide.rst.txt ADDED
@@ -0,0 +1,14 @@
1
+ How to use
2
+ ----------
3
+
4
+ - ``radiobee aligner`` is a sibling of ``bumblebee aligner``. To learn more about these aligners, please join QQ group ``316287378``.
5
+
6
+ - Uploaded files should be in plain-text format (txt, md, csv, etc.). ``docx``, ``pdf``, ``srt``, ``html``, etc. may be supported later on.
7
+ - When uploading files for a subsequent run, click "Clear" first.
8
+ - ``tf_type``, ``idf_type``, ``dl_type``, ``norm``: Normally there is no need to touch these unless you know what you are doing.
9
+ - Suggested ``esp`` and ``min_samples`` values -- ``esp`` (minimum epsilon): 8-12, ``min_samples``: 4-8.
10
+
11
+ - Larger ``esp`` or smaller ``min_samples`` will result in more aligned pairs but also more **false positives** (pairs falsely identified as candidates). On the other hand, smaller ``esp`` or larger ``min_samples`` values tend to miss 'good' pairs.
12
+
13
+ - For a better look at the image, right-click it, select copy-image-address, and open the copied address in a new browser tab.
14
+ - ``Flag``: Should something go wrong, you can click ``Flag`` to save the output and inform the developer.
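
The trade-off described for ``esp`` and ``min_samples`` can be illustrated with a small sketch. The parameter names resemble the ``eps`` and ``min_samples`` parameters of density-based clustering (DBSCAN), but this commit does not confirm that ``radiobee`` uses it, so treat that as an assumption. The ``dbscan`` and ``kept_pairs`` helpers and the toy ``points`` below are hypothetical and are not part of the ``radiobee`` package; the values are chosen for the toy data, not for the app's suggested ranges.

```python
from math import dist


def dbscan(points, eps, min_samples):
    """Plain DBSCAN: return one label per point, -1 meaning noise."""
    labels = [None] * len(points)
    # Neighbourhood of each point within radius eps (the point itself included).
    neighbours = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps]
        for p in points
    ]
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # core point: keep expanding
    return labels


def kept_pairs(points, eps, min_samples):
    """Count the candidate pairs that survive clustering (are not noise)."""
    return sum(label != -1 for label in dbscan(points, eps, min_samples))


# Toy candidate pairs (index in text 1, index in text 2):
# a clean diagonal of 20 plausible pairs plus 3 stray, implausible ones.
points = [(k, k) for k in range(20)] + [(3, 15), (4, 16), (16, 2)]

print(kept_pairs(points, eps=3, min_samples=4))   # strict setting
print(kept_pairs(points, eps=3, min_samples=2))   # smaller min_samples
print(kept_pairs(points, eps=10, min_samples=4))  # larger eps
```

On this toy data the strict setting keeps only the 20 diagonal pairs, while lowering ``min_samples`` or raising ``eps`` also admits stray pairs, which mirrors the false-positive behaviour described in the list above.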
docs/build/html/examples.html CHANGED
@@ -18,7 +18,8 @@
18
  <script src="_static/js/theme.js"></script>
19
  <link rel="index" title="Index" href="genindex.html" />
20
  <link rel="search" title="Search" href="search.html" />
21
- <link rel="prev" title="radiobee package" href="radiobee.html" />
 
22
  </head>
23
 
24
  <body class="wy-body-for-nav">
@@ -39,11 +40,12 @@
39
  <p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
40
  <ul class="current">
41
  <li class="toctree-l1"><a class="reference internal" href="intro.html">Introduction</a></li>
42
- <li class="toctree-l1"><a class="reference internal" href="radiobee.html">radiobee package</a></li>
43
  <li class="toctree-l1 current"><a class="current reference internal" href="#">Examples</a><ul>
44
  <li class="toctree-l2"><a class="reference internal" href="#installation-usage">Installation/Usage:</a></li>
45
  </ul>
46
  </li>
 
47
  </ul>
48
 
49
  </div>
@@ -72,10 +74,11 @@
72
 
73
  <section id="examples">
74
  <h1>Examples<a class="headerlink" href="#examples" title="Permalink to this headline"></a></h1>
 
75
  <section id="installation-usage">
76
  <h2>Installation/Usage:<a class="headerlink" href="#installation-usage" title="Permalink to this headline"></a></h2>
77
  <p>As the package has not been published on PyPI yet, it CANNOT be installed using pip.</p>
78
- <p>For now, the suggested method is to download the zipped package or use the online version at <a class="reference external" href="https://huggingface.co/spaces/mikeee/radiobee-aligner/">https://huggingface.co/spaces/mikeee/radiobee-aligner/</a></p>
79
  </section>
80
  </section>
81
 
@@ -83,7 +86,8 @@
83
  </div>
84
  </div>
85
  <footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
86
- <a href="radiobee.html" class="btn btn-neutral float-left" title="radiobee package" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
 
87
  </div>
88
 
89
  <hr/>
 
18
  <script src="_static/js/theme.js"></script>
19
  <link rel="index" title="Index" href="genindex.html" />
20
  <link rel="search" title="Search" href="search.html" />
21
+ <link rel="next" title="radiobee package" href="radiobee.html" />
22
+ <link rel="prev" title="How to use" href="userguide.html" />
23
  </head>
24
 
25
  <body class="wy-body-for-nav">
 
40
  <p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
41
  <ul class="current">
42
  <li class="toctree-l1"><a class="reference internal" href="intro.html">Introduction</a></li>
43
+ <li class="toctree-l1"><a class="reference internal" href="userguide.html">How to use</a></li>
44
  <li class="toctree-l1 current"><a class="current reference internal" href="#">Examples</a><ul>
45
  <li class="toctree-l2"><a class="reference internal" href="#installation-usage">Installation/Usage:</a></li>
46
  </ul>
47
  </li>
48
+ <li class="toctree-l1"><a class="reference internal" href="radiobee.html">radiobee package</a></li>
49
  </ul>
50
 
51
  </div>
 
74
 
75
  <section id="examples">
76
  <h1>Examples<a class="headerlink" href="#examples" title="Permalink to this headline"></a></h1>
77
+ <p><code class="docutils literal notranslate"><span class="pre">radiobee</span></code> has built-in examples. Just click one of the rows in the <code class="docutils literal notranslate"><span class="pre">Examples</span></code> table and click <code class="docutils literal notranslate"><span class="pre">Submit</span></code> for a test run.</p>
78
  <section id="installation-usage">
79
  <h2>Installation/Usage:<a class="headerlink" href="#installation-usage" title="Permalink to this headline"></a></h2>
80
  <p>As the package has not been published on PyPI yet, it CANNOT be installed using pip.</p>
81
+ <p>For now, the suggested method is to download the zipped package or use the online version at <a class="reference external" href="https://huggingface.co/spaces/mikeee/radiobee-aligner/">https://huggingface.co/spaces/mikeee/radiobee-aligner/</a></p>
82
  </section>
83
  </section>
84
 
 
86
  </div>
87
  </div>
88
  <footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
89
+ <a href="userguide.html" class="btn btn-neutral float-left" title="How to use" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
90
+ <a href="radiobee.html" class="btn btn-neutral float-right" title="radiobee package" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
91
  </div>
92
 
93
  <hr/>
docs/build/html/genindex.html CHANGED
@@ -37,8 +37,10 @@
37
  <p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
38
  <ul>
39
  <li class="toctree-l1"><a class="reference internal" href="intro.html">Introduction</a></li>
40
- <li class="toctree-l1"><a class="reference internal" href="radiobee.html">radiobee package</a></li>
 
41
  <li class="toctree-l1"><a class="reference internal" href="examples.html">Examples</a></li>
 
42
  </ul>
43
 
44
  </div>
 
37
  <p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
38
  <ul>
39
  <li class="toctree-l1"><a class="reference internal" href="intro.html">Introduction</a></li>
40
+ <li class="toctree-l1"><a class="reference internal" href="userguide.html">How to use</a></li>
41
+ <li class="toctree-l1"><a class="reference internal" href="userguide-zh.html">使用说明</a></li>
42
  <li class="toctree-l1"><a class="reference internal" href="examples.html">Examples</a></li>
43
+ <li class="toctree-l1"><a class="reference internal" href="radiobee.html">radiobee package</a></li>
44
  </ul>
45
 
46
  </div>
docs/build/html/index.html CHANGED
@@ -39,8 +39,10 @@
39
  <p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
40
  <ul>
41
  <li class="toctree-l1"><a class="reference internal" href="intro.html">Introduction</a></li>
42
- <li class="toctree-l1"><a class="reference internal" href="radiobee.html">radiobee package</a></li>
 
43
  <li class="toctree-l1"><a class="reference internal" href="examples.html">Examples</a></li>
 
44
  </ul>
45
 
46
  </div>
@@ -77,6 +79,12 @@
77
  <li class="toctree-l2"><a class="reference internal" href="intro.html#limitations">Limitations</a></li>
78
  </ul>
79
  </li>
 
 
 
 
 
 
80
  <li class="toctree-l1"><a class="reference internal" href="radiobee.html">radiobee package</a><ul>
81
  <li class="toctree-l2"><a class="reference internal" href="radiobee.html#submodules">Submodules</a></li>
82
  <li class="toctree-l2"><a class="reference internal" href="radiobee.html#radiobee-align-sents-module">radiobee.align_sents module</a></li>
@@ -109,10 +117,6 @@
109
  <li class="toctree-l2"><a class="reference internal" href="radiobee.html#module-contents">Module contents</a></li>
110
  </ul>
111
  </li>
112
- <li class="toctree-l1"><a class="reference internal" href="examples.html">Examples</a><ul>
113
- <li class="toctree-l2"><a class="reference internal" href="examples.html#installation-usage">Installation/Usage:</a></li>
114
- </ul>
115
- </li>
116
  </ul>
117
  </div>
118
  </section>
 
39
  <p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
40
  <ul>
41
  <li class="toctree-l1"><a class="reference internal" href="intro.html">Introduction</a></li>
42
+ <li class="toctree-l1"><a class="reference internal" href="userguide.html">How to use</a></li>
43
+ <li class="toctree-l1"><a class="reference internal" href="userguide-zh.html">使用说明</a></li>
44
  <li class="toctree-l1"><a class="reference internal" href="examples.html">Examples</a></li>
45
+ <li class="toctree-l1"><a class="reference internal" href="radiobee.html">radiobee package</a></li>
46
  </ul>
47
 
48
  </div>
 
79
  <li class="toctree-l2"><a class="reference internal" href="intro.html#limitations">Limitations</a></li>
80
  </ul>
81
  </li>
82
+ <li class="toctree-l1"><a class="reference internal" href="userguide.html">How to use</a></li>
83
+ <li class="toctree-l1"><a class="reference internal" href="userguide-zh.html">使用说明</a></li>
84
+ <li class="toctree-l1"><a class="reference internal" href="examples.html">Examples</a><ul>
85
+ <li class="toctree-l2"><a class="reference internal" href="examples.html#installation-usage">Installation/Usage:</a></li>
86
+ </ul>
87
+ </li>
88
  <li class="toctree-l1"><a class="reference internal" href="radiobee.html">radiobee package</a><ul>
89
  <li class="toctree-l2"><a class="reference internal" href="radiobee.html#submodules">Submodules</a></li>
90
  <li class="toctree-l2"><a class="reference internal" href="radiobee.html#radiobee-align-sents-module">radiobee.align_sents module</a></li>
 
117
  <li class="toctree-l2"><a class="reference internal" href="radiobee.html#module-contents">Module contents</a></li>
118
  </ul>
119
  </li>
 
 
 
 
120
  </ul>
121
  </div>
122
  </section>
docs/build/html/intro.html CHANGED
@@ -18,7 +18,7 @@
18
  <script src="_static/js/theme.js"></script>
19
  <link rel="index" title="Index" href="genindex.html" />
20
  <link rel="search" title="Search" href="search.html" />
21
- <link rel="next" title="radiobee package" href="radiobee.html" />
22
  <link rel="prev" title="Welcome to radiobee’s documentation!" href="index.html" />
23
  </head>
24
 
@@ -44,8 +44,9 @@
44
  <li class="toctree-l2"><a class="reference internal" href="#limitations">Limitations</a></li>
45
  </ul>
46
  </li>
47
- <li class="toctree-l1"><a class="reference internal" href="radiobee.html">radiobee package</a></li>
48
  <li class="toctree-l1"><a class="reference internal" href="examples.html">Examples</a></li>
 
49
  </ul>
50
 
51
  </div>
@@ -74,12 +75,12 @@
74
 
75
  <section id="introduction">
76
  <h1>Introduction<a class="headerlink" href="#introduction" title="Permalink to this headline"></a></h1>
77
- <p><code class="docutils literal notranslate"><span class="pre">radiobee</span></code> (<code class="docutils literal notranslate"><span class="pre">radiobee</span> <span class="pre">aligner</span></code>) is a powerful dualtext aligner.</p>
78
- <p>The aim here was</p>
79
  <p>The current implementation has been developed in Python 3 and <code class="docutils literal notranslate"><span class="pre">gradio</span></code>.</p>
80
  <section id="motivation">
81
  <h2>Motivation<a class="headerlink" href="#motivation" title="Permalink to this headline"></a></h2>
82
- <p>Aligned texts (paragraph-to-paragraph or sentence-to-sentence) can be used machine learning (e.g. machine translation), CAT (tmx, translation terms etc.) and education (dual-language ebook), etc.</p>
83
  </section>
84
  <section id="limitations">
85
  <h2>Limitations<a class="headerlink" href="#limitations" title="Permalink to this headline"></a></h2>
@@ -92,7 +93,7 @@
92
  </div>
93
  <footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
94
  <a href="index.html" class="btn btn-neutral float-left" title="Welcome to radiobee’s documentation!" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
95
- <a href="radiobee.html" class="btn btn-neutral float-right" title="radiobee package" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
96
  </div>
97
 
98
  <hr/>
 
18
  <script src="_static/js/theme.js"></script>
19
  <link rel="index" title="Index" href="genindex.html" />
20
  <link rel="search" title="Search" href="search.html" />
21
+ <link rel="next" title="How to use" href="userguide.html" />
22
  <link rel="prev" title="Welcome to radiobee’s documentation!" href="index.html" />
23
  </head>
24
 
 
44
  <li class="toctree-l2"><a class="reference internal" href="#limitations">Limitations</a></li>
45
  </ul>
46
  </li>
47
+ <li class="toctree-l1"><a class="reference internal" href="userguide.html">How to use</a></li>
48
  <li class="toctree-l1"><a class="reference internal" href="examples.html">Examples</a></li>
49
+ <li class="toctree-l1"><a class="reference internal" href="radiobee.html">radiobee package</a></li>
50
  </ul>
51
 
52
  </div>
 
75
 
76
  <section id="introduction">
77
  <h1>Introduction<a class="headerlink" href="#introduction" title="Permalink to this headline"></a></h1>
78
+ <p><code class="docutils literal notranslate"><span class="pre">radiobee</span></code> (or <code class="docutils literal notranslate"><span class="pre">radiobee</span> <span class="pre">aligner</span></code> in full) is a powerful dualtext aligner.</p>
79
+ <p>The aim here was to provide an interface to align two texts.</p>
80
  <p>The current implementation has been developed in Python 3 and <code class="docutils literal notranslate"><span class="pre">gradio</span></code>.</p>
81
  <section id="motivation">
82
  <h2>Motivation<a class="headerlink" href="#motivation" title="Permalink to this headline"></a></h2>
83
+ <p>Aligned texts (paragraph-to-paragraph or sentence-to-sentence) can be used in machine learning (e.g. machine translation), CAT (tmx, translation terms etc.) and education (dual-language ebook), etc.</p>
84
  </section>
85
  <section id="limitations">
86
  <h2>Limitations<a class="headerlink" href="#limitations" title="Permalink to this headline"></a></h2>
 
93
  </div>
94
  <footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
95
  <a href="index.html" class="btn btn-neutral float-left" title="Welcome to radiobee’s documentation!" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
96
+ <a href="userguide.html" class="btn btn-neutral float-right" title="How to use" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
97
  </div>
98
 
99
  <hr/>
docs/build/html/objects.inv CHANGED
Binary files a/docs/build/html/objects.inv and b/docs/build/html/objects.inv differ
 
docs/build/html/search.html CHANGED
@@ -40,8 +40,10 @@
40
  <p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
41
  <ul>
42
  <li class="toctree-l1"><a class="reference internal" href="intro.html">Introduction</a></li>
43
- <li class="toctree-l1"><a class="reference internal" href="radiobee.html">radiobee package</a></li>
 
44
  <li class="toctree-l1"><a class="reference internal" href="examples.html">Examples</a></li>
 
45
  </ul>
46
 
47
  </div>
 
40
  <p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
41
  <ul>
42
  <li class="toctree-l1"><a class="reference internal" href="intro.html">Introduction</a></li>
43
+ <li class="toctree-l1"><a class="reference internal" href="userguide.html">How to use</a></li>
44
+ <li class="toctree-l1"><a class="reference internal" href="userguide-zh.html">使用说明</a></li>
45
  <li class="toctree-l1"><a class="reference internal" href="examples.html">Examples</a></li>
46
+ <li class="toctree-l1"><a class="reference internal" href="radiobee.html">radiobee package</a></li>
47
  </ul>
48
 
49
  </div>
docs/build/html/searchindex.js CHANGED
@@ -1 +1 @@
1
- Search.setIndex({docnames:["examples","index","intro","modules","radiobee"],envversion:{"sphinx.domains.c":2,"sphinx.domains.changeset":1,"sphinx.domains.citation":1,"sphinx.domains.cpp":4,"sphinx.domains.index":1,"sphinx.domains.javascript":2,"sphinx.domains.math":2,"sphinx.domains.python":3,"sphinx.domains.rst":2,"sphinx.domains.std":2,sphinx:56},filenames:["examples.rst","index.rst","intro.rst","modules.rst","radiobee.rst"],objects:{},objnames:{},objtypes:{},terms:{"3":2,As:0,For:0,If:2,The:2,ad:2,aim:2,align:[0,2],align_s:[1,3],align_text:[1,3],although:2,amend_avec:[1,3],app:[1,3],ar:2,been:[0,2],can:2,cannot:0,cat:2,cmat2tset:[1,3],co:0,contact:2,content:3,current:2,de:2,develop:2,docterm_scor:[1,3],download:0,dual:2,dualtext:2,e:2,ebook:2,educ:2,en2zh:[1,3],en2zh_token:[1,3],en:2,etc:2,exampl:[1,2],file2text:[1,3],files2df:[1,3],further:2,g:2,gen_aset:[1,3],gen_eps_minsampl:[1,3],gen_model:[1,3],gen_pset:[1,3],gen_row_align:[1,3],gradio:2,ha:[0,2],help:2,here:2,http:0,huggingfac:0,implement:2,index:1,insert_spac:[1,3],instal:1,interpolate_pset:[1,3],introduct:1,ja:2,languag:2,learn:2,limit:1,lists2cmat:[1,3],loadtext:[1,3],machin:2,mdx_e2c:[1,3],method:0,mikee:0,modul:[1,3],motiv:1,now:0,onli:2,onlin:0,packag:[0,1,3],page:1,pair:2,paragraph:2,particular:2,permit:2,pip:0,plot_cmat:[1,3],plot_df:[1,3],power:2,process_upload:[1,3],publish:0,pypi:0,python:2,radiobe:[0,2],ru:2,search:1,seg_text:[1,3],sentenc:2,shuffle_s:[1,3],smatrix:[1,3],space:0,submodul:[1,3],suggest:0,support:2,term:2,text:2,time:2,tmx:2,translat:2,trim_df:[1,3],us:[0,2],usag:1,version:0,wa:2,welcom:2,when:2,willing:2,yet:0,you:2,zh:2,zip:0},titles:["Examples","Welcome to radiobee\u2019s documentation!","Introduction","radiobee","radiobee 
package"],titleterms:{align_s:4,align_text:4,amend_avec:4,app:4,cmat2tset:4,content:[1,4],docterm_scor:4,document:1,en2zh:4,en2zh_token:4,exampl:0,file2text:4,files2df:4,gen_aset:4,gen_eps_minsampl:4,gen_model:4,gen_pset:4,gen_row_align:4,indic:1,insert_spac:4,instal:0,interpolate_pset:4,introduct:2,limit:2,lists2cmat:4,loadtext:4,mdx_e2c:4,modul:4,motiv:2,packag:4,plot_cmat:4,plot_df:4,process_upload:4,radiobe:[1,3,4],s:1,seg_text:4,shuffle_s:4,smatrix:4,submodul:4,tabl:1,trim_df:4,usag:0,welcom:1}})
 
1
+ Search.setIndex({docnames:["examples","index","intro","modules","radiobee","userguide","userguide-zh"],envversion:{"sphinx.domains.c":2,"sphinx.domains.changeset":1,"sphinx.domains.citation":1,"sphinx.domains.cpp":4,"sphinx.domains.index":1,"sphinx.domains.javascript":2,"sphinx.domains.math":2,"sphinx.domains.python":3,"sphinx.domains.rst":2,"sphinx.domains.std":2,sphinx:56},filenames:["examples.rst","index.rst","intro.rst","modules.rst","radiobee.rst","userguide.rst","userguide-zh.rst"],objects:{},objnames:{},objtypes:{},terms:{"12":[5,6],"3":2,"316287378":[5,6],"4":[5,6],"8":[5,6],"\u4e00\u822c\u65e0\u9700\u7406\u4f1a\u8fd9\u4e9b\u53c2\u6570":6,"\u4e86\u89e3\u8fd9\u4e9b\u5bf9\u9f50\u5de5\u5177":6,"\u4f18\u8d28\u5bf9":6,"\u4f7f\u7528\u8bf4\u660e":1,"\u53e6\u4e00\u65b9\u9762":6,"\u53ef\u4ee5\u4ee5\u540e\u4f1a\u652f\u6301":6,"\u53ef\u4ee5\u53f3\u51fb\u62f7\u51fa\u56fe\u7684\u94fe\u63a5\u7528\u6d4f\u89c8\u5668\u72ec\u7acb\u8bbf\u95ee\u62f7\u51fa\u6765\u7684\u94fe\u63a5\u6216\u53f3\u51fb\u5b58\u76d8\u518d\u7528\u770b\u56fe\u7a0b\u5e8f\u6253\u5f00\u5b58\u76d8\u7684\u56fe\u6587\u4ef6":6,"\u548c":6,"\u5acc\u56fe\u592a\u5c0f\u7684\u8bdd":6,"\u5b58\u4e0b\u6709\u5173\u53c2\u6570\u67e5\u770b\u6216\u901a\u77e5\u5f00\u53d1\u8005":6,"\u662f":6,"\u7684\u5b6a\u751f\u5144\u5f1f":6,"\u7684\u5efa\u8bae\u503c":6,"\u76ee\u524d\u4ec5\u652f\u6301\u4e2d\u82f1":6,"\u76ee\u524d\u4ec5\u652f\u6301\u7eaf\u6587\u672c\u6587\u4ef6\u4e0a\u8f7d":6,"\u7b2c\u4e8c\u6b21\u4e0a\u8f7d\u6587\u4ef6\u524d\u8bf7\u70b9\u51fb":6,"\u7b49":6,"\u7b49\u683c\u5f0f":6,"\u82f1\u4e2d\u5bf9\u9f50":6,"\u8bbe\u5927\u4e9b\u5219\u53ef\u80fd\u4f1a\u9519\u5931\u4e00\u4e9b":6,"\u8bbe\u5927\u4e9b\u6216":6,"\u8bbe\u5c0f\u4e9b\u53ef\u4ee5\u5f97\u5230\u66f4\u591a\u7684\u5bf9\u9f50\u5bf9\u4f46\u4e5f\u4f1a":6,"\u8bbe\u5c0f\u4e9b\u6216":6,"\u8bef\u62a5\u5bf9":6,"\u8bf7\u52a0\u5165qq\u7fa4":6,"\u8fd0\u884c\u51fa\u9519\u662f\u53ef\u4ee5\u70b9\u51fb":6,"\u9519\u8bef\u5bf9":6,"do":5,"new":5,As:0,For:0,If:[2,5],On:5,The:2,To:5,about:5,
ad:2,address:5,aim:2,align:[0,2,5,6],align_s:[1,3],align_text:[1,3],also:5,although:2,amend_avec:[1,3],an:2,app:[1,3],ar:[2,5],been:[0,2],better:5,browser:5,built:0,bumblebe:[5,6],can:[2,5],candid:5,cannot:0,cat:2,clear:[5,6],click:[0,5],cmat2tset:[1,3],co:0,contact:2,content:3,copi:5,csv:[5,6],current:2,de:2,develop:[2,5],dl_type:[5,6],docterm_scor:[1,3],docx:[5,6],download:0,dual:2,dualtext:2,e:2,ebook:2,educ:2,en2zh:[1,3],en2zh_token:[1,3],en:2,epsilon:[5,6],esp:[5,6],etc:[2,5],exampl:[1,2],fals:5,file2text:[1,3],file:5,files2df:[1,3],first:5,flag:[5,6],format:5,full:2,further:2,g:2,gen_aset:[1,3],gen_eps_minsampl:[1,3],gen_model:[1,3],gen_pset:[1,3],gen_row_align:[1,3],go:5,good:5,gradio:2,group:5,ha:[0,2],hand:5,have:5,help:2,here:2,how:1,html:[5,6],http:0,huggingfac:0,identifi:5,idf_typ:[5,6],imag:5,implement:2,index:1,inform:5,insert_spac:[1,3],instal:1,interfac:2,interpolate_pset:[1,3],introduct:1,ja:2,join:5,just:0,know:5,languag:2,larger:5,later:5,learn:2,limit:1,lists2cmat:[1,3],loadtext:[1,3],look:5,machin:2,mai:5,md:[5,6],mdx_e2c:[1,3],method:0,mikee:0,min_sampl:[5,6],minimum:[5,6],miss:5,modul:[1,3],more:5,motiv:1,need:5,norm:[5,6],normal:5,now:0,one:0,onli:2,onlin:0,open:5,other:5,output:5,packag:[0,1,3],page:1,pair:[2,5],paragraph:2,particular:2,pdf:[5,6],permit:2,pip:0,pleas:5,plot_cmat:[1,3],plot_df:[1,3],posit:5,power:2,process_upload:[1,3],provid:2,publish:0,pure:5,pypi:0,python:2,qq:5,radiobe:[0,2,5,6],result:5,right:5,row:0,ru:2,save:5,search:1,seg_text:[1,3],select:5,sentenc:2,should:5,shuffle_s:[1,3],sibl:5,smaller:5,smatrix:[1,3],someth:5,space:0,srt:[5,6],submit:[0,5],submodul:[1,3],subsequ:5,suggest:[0,5],support:[2,5],tab:5,tabl:0,tend:5,term:2,testrun:0,text:[2,5],tf_type:[5,6],time:2,tmx:2,touch:5,translat:2,trim_df:[1,3],two:2,txt:[5,6],unless:5,upload:5,us:[0,1,2],usag:1,valu:5,version:0,wa:2,welcom:2,what:5,when:[2,5],willing:2,wrong:5,yet:0,you:[2,5],zh:2,zip:0},titles:["Examples","Welcome to radiobee\u2019s 
documentation!","Introduction","radiobee","radiobee package","How to use","\u4f7f\u7528\u8bf4\u660e"],titleterms:{"\u4f7f\u7528\u8bf4\u660e":6,align_s:4,align_text:4,amend_avec:4,app:4,cmat2tset:4,content:[1,4],docterm_scor:4,document:1,en2zh:4,en2zh_token:4,exampl:0,file2text:4,files2df:4,gen_aset:4,gen_eps_minsampl:4,gen_model:4,gen_pset:4,gen_row_align:4,how:5,indic:1,insert_spac:4,instal:0,interpolate_pset:4,introduct:2,limit:2,lists2cmat:4,loadtext:4,mdx_e2c:4,modul:4,motiv:2,packag:4,plot_cmat:4,plot_df:4,process_upload:4,radiobe:[1,3,4],s:1,seg_text:4,shuffle_s:4,smatrix:4,submodul:4,tabl:1,trim_df:4,us:5,usag:0,welcom:1}})
docs/build/html/userguide-zh.html ADDED
@@ -0,0 +1,122 @@
1
+ <!DOCTYPE html>
2
+ <html class="writer-html5" lang="en" >
3
+ <head>
4
+ <meta charset="utf-8" /><meta name="generator" content="Docutils 0.17.1: http://docutils.sourceforge.net/" />
5
+
6
+ <meta name="viewport" content="width=device-width, initial-scale=1.0" />
7
+ <title>使用说明 &mdash; radiobee 0.1.0beta2 documentation</title>
8
+ <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
9
+ <link rel="stylesheet" href="_static/css/theme.css" type="text/css" />
10
+ <!--[if lt IE 9]>
11
+ <script src="_static/js/html5shiv.min.js"></script>
12
+ <![endif]-->
13
+
14
+ <script data-url_root="./" id="documentation_options" src="_static/documentation_options.js"></script>
15
+ <script src="_static/jquery.js"></script>
16
+ <script src="_static/underscore.js"></script>
17
+ <script src="_static/doctools.js"></script>
18
+ <script src="_static/js/theme.js"></script>
19
+ <link rel="index" title="Index" href="genindex.html" />
20
+ <link rel="search" title="Search" href="search.html" />
21
+ <link rel="next" title="Examples" href="examples.html" />
22
+ <link rel="prev" title="How to use" href="userguide.html" />
23
+ </head>
24
+
25
+ <body class="wy-body-for-nav">
26
+ <div class="wy-grid-for-nav">
27
+ <nav data-toggle="wy-nav-shift" class="wy-nav-side">
28
+ <div class="wy-side-scroll">
29
+ <div class="wy-side-nav-search" >
30
+ <a href="index.html" class="icon icon-home"> radiobee
31
+ </a>
32
+ <div role="search">
33
+ <form id="rtd-search-form" class="wy-form" action="search.html" method="get">
34
+ <input type="text" name="q" placeholder="Search docs" />
35
+ <input type="hidden" name="check_keywords" value="yes" />
36
+ <input type="hidden" name="area" value="default" />
37
+ </form>
38
+ </div>
39
+ </div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
40
+ <p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
41
+ <ul class="current">
42
+ <li class="toctree-l1"><a class="reference internal" href="intro.html">Introduction</a></li>
43
+ <li class="toctree-l1"><a class="reference internal" href="userguide.html">How to use</a></li>
44
+ <li class="toctree-l1 current"><a class="current reference internal" href="#">使用说明</a></li>
45
+ <li class="toctree-l1"><a class="reference internal" href="examples.html">Examples</a></li>
46
+ <li class="toctree-l1"><a class="reference internal" href="radiobee.html">radiobee package</a></li>
47
+ </ul>
48
+
49
+ </div>
50
+ </div>
51
+ </nav>
52
+
53
+ <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" >
54
+ <i data-toggle="wy-nav-top" class="fa fa-bars"></i>
55
+ <a href="index.html">radiobee</a>
56
+ </nav>
57
+
58
+ <div class="wy-nav-content">
59
+ <div class="rst-content">
60
+ <div role="navigation" aria-label="Page navigation">
61
+ <ul class="wy-breadcrumbs">
62
+ <li><a href="index.html" class="icon icon-home"></a> &raquo;</li>
63
+ <li>使用说明</li>
64
+ <li class="wy-breadcrumbs-aside">
65
+ <a href="_sources/userguide-zh.rst.txt" rel="nofollow"> View page source</a>
66
+ </li>
67
+ </ul>
68
+ <hr/>
69
+ </div>
70
+ <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
71
+ <div itemprop="articleBody">
72
+
73
+ <section id="id1">
74
+ <h1>使用说明<a class="headerlink" href="#id1" title="Permalink to this headline"></a></h1>
75
+ <ul class="simple">
76
+ <li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span> <span class="pre">aligner``是``bumblebee</span></code> aligner`的孪生兄弟。请加入qq群`316287378`了解这些对齐工具.</p></li>
77
+ <li><p><a href="#id2"><span class="problematic" id="id3">``</span></a>radiobee``目前仅支持中英、英中对齐。</p></li>
78
+ <li><p><code class="docutils literal notranslate"><span class="pre">radiobee``目前仅支持纯文本文件上载</span> <span class="pre">(txt,</span> <span class="pre">md,</span> <span class="pre">csv</span> <span class="pre">等)。</span> <span class="pre">可以以后会支持``docx</span></code>, <code class="docutils literal notranslate"><span class="pre">pdf</span></code>, <code class="docutils literal notranslate"><span class="pre">srt</span></code>, <a href="#id4"><span class="problematic" id="id5">``</span></a>html``等格式。</p></li>
79
+ <li><p>第二次上载文件前请点击”Clear”。</p></li>
80
+ <li><p><code class="docutils literal notranslate"><span class="pre">tf_type</span></code> <code class="docutils literal notranslate"><span class="pre">idf_type</span></code> <code class="docutils literal notranslate"><span class="pre">dl_type</span></code> <code class="docutils literal notranslate"><span class="pre">norm</span></code>: 一般无需理会这些参数。</p></li>
81
+ <li><p><code class="docutils literal notranslate"><span class="pre">esp</span></code> 和 <code class="docutils literal notranslate"><span class="pre">min_samples</span></code> 的建议值 – <code class="docutils literal notranslate"><span class="pre">esp</span></code> (minimum epsilon): 8-12, <code class="docutils literal notranslate"><span class="pre">min_samples</span></code>: 4-8.</p>
82
+ <ul>
83
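+ <li><p><code class="docutils literal notranslate"><span class="pre">esp</span></code> 设大些或 <code class="docutils literal notranslate"><span class="pre">min_samples</span></code> 设小些可以得到更多的对齐对,但也会带来更多 <strong>误报对</strong> (错误对)。另一方面,<code class="docutils literal notranslate"><span class="pre">esp</span></code> 设小些或 <code class="docutils literal notranslate"><span class="pre">min_samples</span></code> 设大些则可能会错失一些‘优质对’。</p></li>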
84
+ </ul>
85
+ </li>
86
+ <li><p>嫌图太小的话,可以右击拷出图的链接用浏览器独立访问拷出来的链接或右击存盘再用看图程序打开存盘的图文件。</p></li>
87
+ <li><p><code class="docutils literal notranslate"><span class="pre">Flag</span></code>: 运行出错时可以点击 <code class="docutils literal notranslate"><span class="pre">Flag</span></code> 存下有关参数查看或通知开发者。</p></li>
88
+ </ul>
89
+ </section>
90
+
91
+
92
+ </div>
93
+ </div>
94
+ <footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
95
+ <a href="userguide.html" class="btn btn-neutral float-left" title="How to use" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
96
+ <a href="examples.html" class="btn btn-neutral float-right" title="Examples" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
97
+ </div>
98
+
99
+ <hr/>
100
+
101
+ <div role="contentinfo">
102
+ <p>&#169; Copyright 2022, mu.</p>
103
+ </div>
104
+
105
+ Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
106
+ <a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
107
+ provided by <a href="https://readthedocs.org">Read the Docs</a>.
108
+
109
+
110
+ </footer>
111
+ </div>
112
+ </div>
113
+ </section>
114
+ </div>
115
+ <script>
116
+ jQuery(function () {
117
+ SphinxRtdTheme.Navigation.enable(true);
118
+ });
119
+ </script>
120
+
121
+ </body>
122
+ </html>
docs/build/html/userguide.html ADDED
@@ -0,0 +1,124 @@
1
+ <!DOCTYPE html>
2
+ <html class="writer-html5" lang="en" >
3
+ <head>
4
+ <meta charset="utf-8" /><meta name="generator" content="Docutils 0.17.1: http://docutils.sourceforge.net/" />
5
+
6
+ <meta name="viewport" content="width=device-width, initial-scale=1.0" />
7
+ <title>How to use &mdash; radiobee 0.1.0beta2 documentation</title>
8
+ <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
9
+ <link rel="stylesheet" href="_static/css/theme.css" type="text/css" />
10
+ <!--[if lt IE 9]>
11
+ <script src="_static/js/html5shiv.min.js"></script>
12
+ <![endif]-->
13
+
14
+ <script data-url_root="./" id="documentation_options" src="_static/documentation_options.js"></script>
15
+ <script src="_static/jquery.js"></script>
16
+ <script src="_static/underscore.js"></script>
17
+ <script src="_static/doctools.js"></script>
18
+ <script src="_static/js/theme.js"></script>
19
+ <link rel="index" title="Index" href="genindex.html" />
20
+ <link rel="search" title="Search" href="search.html" />
21
+ <link rel="next" title="使用说明" href="userguide-zh.html" />
22
+ <link rel="prev" title="Introduction" href="intro.html" />
23
+ </head>
24
+
25
+ <body class="wy-body-for-nav">
26
+ <div class="wy-grid-for-nav">
27
+ <nav data-toggle="wy-nav-shift" class="wy-nav-side">
28
+ <div class="wy-side-scroll">
29
+ <div class="wy-side-nav-search" >
30
+ <a href="index.html" class="icon icon-home"> radiobee
31
+ </a>
32
+ <div role="search">
33
+ <form id="rtd-search-form" class="wy-form" action="search.html" method="get">
34
+ <input type="text" name="q" placeholder="Search docs" />
35
+ <input type="hidden" name="check_keywords" value="yes" />
36
+ <input type="hidden" name="area" value="default" />
37
+ </form>
38
+ </div>
39
+ </div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
40
+ <p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
41
+ <ul class="current">
42
+ <li class="toctree-l1"><a class="reference internal" href="intro.html">Introduction</a></li>
43
+ <li class="toctree-l1 current"><a class="current reference internal" href="#">How to use</a></li>
44
+ <li class="toctree-l1"><a class="reference internal" href="userguide-zh.html">使用说明</a></li>
45
+ <li class="toctree-l1"><a class="reference internal" href="examples.html">Examples</a></li>
46
+ <li class="toctree-l1"><a class="reference internal" href="radiobee.html">radiobee package</a></li>
47
+ </ul>
48
+
49
+ </div>
50
+ </div>
51
+ </nav>
52
+
53
+ <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" >
54
+ <i data-toggle="wy-nav-top" class="fa fa-bars"></i>
55
+ <a href="index.html">radiobee</a>
56
+ </nav>
57
+
58
+ <div class="wy-nav-content">
59
+ <div class="rst-content">
60
+ <div role="navigation" aria-label="Page navigation">
61
+ <ul class="wy-breadcrumbs">
62
+ <li><a href="index.html" class="icon icon-home"></a> &raquo;</li>
63
+ <li>How to use</li>
64
+ <li class="wy-breadcrumbs-aside">
65
+ <a href="_sources/userguide.rst.txt" rel="nofollow"> View page source</a>
66
+ </li>
67
+ </ul>
68
+ <hr/>
69
+ </div>
70
+ <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
71
+ <div itemprop="articleBody">
72
+
73
+ <section id="how-to-use">
74
+ <h1>How to use<a class="headerlink" href="#how-to-use" title="Permalink to this headline"></a></h1>
75
+ <ul class="simple">
76
+ <li><p><code class="docutils literal notranslate"><span class="pre">radiobee</span> <span class="pre">aligner</span></code> is a sibling of <cite>bumblebee aligner</cite>. To learn more about these aligners, please join QQ group <cite>316287378</cite>.</p></li>
77
+ <li><p>Uploaded files should be in plain-text format (txt, md, csv, etc.). <code class="docutils literal notranslate"><span class="pre">docx</span></code>, <code class="docutils literal notranslate"><span class="pre">pdf</span></code>, <code class="docutils literal notranslate"><span class="pre">srt</span></code>, <code class="docutils literal notranslate"><span class="pre">html</span></code>, etc. may be supported later on.</p></li>
78
+ <li><p>Click “Clear” before uploading files for a subsequent submission.</p></li>
79
+ <li><p><code class="docutils literal notranslate"><span class="pre">tf_type</span></code> <code class="docutils literal notranslate"><span class="pre">idf_type</span></code> <code class="docutils literal notranslate"><span class="pre">dl_type</span></code> <code class="docutils literal notranslate"><span class="pre">norm</span></code>: Normally there is no need to touch these unless you know what you are doing.</p></li>
80
+ <li><p>Suggested <code class="docutils literal notranslate"><span class="pre">esp</span></code> and <code class="docutils literal notranslate"><span class="pre">min_samples</span></code> values – <code class="docutils literal notranslate"><span class="pre">esp</span></code> (minimum epsilon): 8-12, <code class="docutils literal notranslate"><span class="pre">min_samples</span></code>: 4-8.</p></li>
81
+ </ul>
82
+ <blockquote>
83
+ <div><ul class="simple">
84
+ <li><p>Larger <code class="docutils literal notranslate"><span class="pre">esp</span></code> or smaller <code class="docutils literal notranslate"><span class="pre">min_samples</span></code> will result in more aligned pairs but also more <strong>false positives</strong> (pairs falsely identified as candidates). On the other hand, smaller <code class="docutils literal notranslate"><span class="pre">esp</span></code> or larger <code class="docutils literal notranslate"><span class="pre">min_samples</span></code> values tend to miss ‘good’ pairs.</p></li>
85
+ </ul>
86
+ </div></blockquote>
87
+ <ul class="simple">
88
+ <li><p>If you need to have a better look at the image, you can right-click on the image and select copy-image-address and open a new tab in the browser with the copied image address.</p></li>
89
+ <li><p><code class="docutils literal notranslate"><span class="pre">Flag</span></code>: Should something go wrong, you can click Flag to save the output and inform the developer.</p></li>
90
+ </ul>
91
+ </section>
92
+
93
+
94
+ </div>
95
+ </div>
96
+ <footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
97
+ <a href="intro.html" class="btn btn-neutral float-left" title="Introduction" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
98
+ <a href="userguide-zh.html" class="btn btn-neutral float-right" title="使用说明" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
99
+ </div>
100
+
101
+ <hr/>
102
+
103
+ <div role="contentinfo">
104
+ <p>&#169; Copyright 2022, mu.</p>
105
+ </div>
106
+
107
+ Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
108
+ <a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
109
+ provided by <a href="https://readthedocs.org">Read the Docs</a>.
110
+
111
+
112
+ </footer>
113
+ </div>
114
+ </div>
115
+ </section>
116
+ </div>
117
+ <script>
118
+ jQuery(function () {
119
+ SphinxRtdTheme.Navigation.enable(true);
120
+ });
121
+ </script>
122
+
123
+ </body>
124
+ </html>
docs/run-make.bat ADDED
@@ -0,0 +1 @@
 
 
1
+ make clean && make html
docs/source/conf.py CHANGED
@@ -55,4 +55,4 @@ html_theme = 'sphinx_rtd_theme'
55
  # Add any paths that contain custom static files (such as style sheets) here,
56
  # relative to this directory. They are copied after the builtin static files,
57
  # so a file named "default.css" will overwrite the builtin "default.css".
58
- html_static_path = ['_static']
 
55
  # Add any paths that contain custom static files (such as style sheets) here,
56
  # relative to this directory. They are copied after the builtin static files,
57
  # so a file named "default.css" will overwrite the builtin "default.css".
58
+ html_static_path = ['_static']
docs/source/examples.rst CHANGED
@@ -1,10 +1,10 @@
1
  Examples
2
  =============
3
 
 
 
4
  Installation/Usage:
5
  *******************
6
  As the package has not been published on PyPI yet, it CANNOT be installed using pip.
7
 
8
- For now, the suggested method is to download the zipped package or use the online version at `https://huggingface.co/spaces/mikeee/radiobee-aligner/ <https://huggingface.co/spaces/mikeee/radiobee-aligner/>`_
9
-
10
-
 
1
  Examples
2
  =============
3
 
4
+ ``radiobee`` has built-in examples. Click one of the rows in the ``Examples`` table, then click ``Submit`` for a test run.
5
+
6
  Installation/Usage:
7
  *******************
8
  As the package has not been published on PyPI yet, it CANNOT be installed using pip.
9
 
10
+ For now, the suggested method is to download the zipped package or use the online version at `https://huggingface.co/spaces/mikeee/radiobee-aligner/ <https://huggingface.co/spaces/mikeee/radiobee-aligner/>`_
 
 
docs/source/index.rst CHANGED
@@ -11,8 +11,10 @@ Welcome to radiobee's documentation!
11
  :caption: Contents:
12
 
13
  intro
14
- radiobee
 
15
  examples
 
16
 
17
  Indices and tables
18
  ==================
 
11
  :caption: Contents:
12
 
13
  intro
14
+ userguide
15
+ userguide-zh
16
  examples
17
+ radiobee
18
 
19
  Indices and tables
20
  ==================
docs/source/intro.rst CHANGED
@@ -1,16 +1,16 @@
1
  Introduction
2
  ============
3
 
4
- ``radiobee`` (``radiobee aligner``) is a powerful dualtext aligner.
5
 
6
- The aim here was
7
 
8
  The current implementation has been developed in Python 3 and ``gradio``.
9
 
10
  Motivation
11
  **********
12
 
13
- Aligned texts (paragraph-to-paragraph or sentence-to-sentence) can be used machine learning (e.g. machine translation), CAT (tmx, translation terms etc.) and education (dual-language ebook), etc.
14
 
15
  Limitations
16
  ***********
 
1
  Introduction
2
  ============
3
 
4
+ ``radiobee`` (or ``radiobee aligner`` in full) is a powerful dualtext aligner.
5
 
6
+ The aim here was to provide an interface for aligning two texts.
7
 
8
  The current implementation has been developed in Python 3 and ``gradio``.
9
 
10
  Motivation
11
  **********
12
 
13
+ Aligned texts (paragraph-to-paragraph or sentence-to-sentence) can be used in machine learning (e.g. machine translation), CAT (tmx, translation terms etc.) and education (dual-language ebook), etc.
14
 
15
  Limitations
16
  ***********
docs/source/userguide-zh.rst ADDED
@@ -0,0 +1,15 @@
1
+ 使用说明
2
+ ----------
3
+
4
+ - ``radiobee aligner`` 是 ``bumblebee aligner`` 的孪生兄弟。请加入 qq 群 ``316287378`` 了解这些对齐工具。
5
+
6
+ - ``radiobee`` 目前仅支持中英、英中对齐。
7
+ - ``radiobee`` 目前仅支持纯文本文件上载 (txt, md, csv 等)。以后可能会支持 ``docx``, ``pdf``, ``srt``, ``html`` 等格式。
8
+ - 第二次上载文件前请点击"Clear"。
9
+ - ``tf_type`` ``idf_type`` ``dl_type`` ``norm``: 一般无需理会这些参数。
10
+ - ``esp`` 和 ``min_samples`` 的建议值 -- ``esp`` (minimum epsilon): 8-12, ``min_samples``: 4-8.
11
+
12
+ - ``esp`` 设大些或 ``min_samples`` 设小些可以得到更多的对齐对,但也会带来更多 **误报对** (错误对)。另一方面,``esp`` 设小些或 ``min_samples`` 设大些则可能会错失一些‘优质对’。
13
+
14
+ - 嫌图太小的话,可以右击拷出图的链接用浏览器独立访问拷出来的链接或右击存盘再用看图程序打开存盘的图文件。
15
+ - ``Flag``: 运行出错时可以点击 ``Flag`` 存下有关参数查看或通知开发者。
docs/source/userguide.rst ADDED
@@ -0,0 +1,14 @@
1
+ How to use
2
+ ----------
3
+
4
+ - ``radiobee aligner`` is a sibling of `bumblebee aligner`. To learn more about these aligners, please join QQ group `316287378`.
5
+
6
+ - Uploaded files should be in plain-text format (txt, md, csv, etc.). ``docx``, ``pdf``, ``srt``, ``html``, etc. may be supported later on.
7
+ - Click "Clear" before uploading files for a subsequent submission.
8
+ - ``tf_type`` ``idf_type`` ``dl_type`` ``norm``: Normally there is no need to touch these unless you know what you are doing.
9
+ - Suggested ``esp`` and ``min_samples`` values -- ``esp`` (minimum epsilon): 8-12, ``min_samples``: 4-8.
10
+
11
+ - Larger ``esp`` or smaller ``min_samples`` will result in more aligned pairs but also more **false positives** (pairs falsely identified as candidates). On the other hand, smaller ``esp`` or larger ``min_samples`` values tend to miss 'good' pairs.
12
+
13
+ - If you need to have a better look at the image, you can right-click on the image and select copy-image-address and open a new tab in the browser with the copied image address.
14
+ - ``Flag``: Should something go wrong, you can click Flag to save the output and inform the developer.
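Note on the ``esp`` / ``min_samples`` trade-off described in the guide above: the names suggest DBSCAN-style density clustering, but that is an assumption, since this diff does not show how radiobee uses them. Under that assumption, a minimal stdlib-only sketch (the helper `count_core_points` and the toy data are hypothetical, not part of radiobee) shows why a larger radius or a smaller sample threshold admits more points:

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def count_core_points(points: Sequence[Point], eps: float, min_samples: int) -> int:
    """Count DBSCAN-style core points.

    A point counts as a core point when at least ``min_samples`` points
    (itself included) lie within distance ``eps`` of it.
    """
    core = 0
    for p in points:
        neighbours = sum(1 for q in points if dist(p, q) <= eps)
        if neighbours >= min_samples:
            core += 1
    return core


# Toy data: ten points along a diagonal (a plausible "aligned" band)
# plus three stray points that should ideally be rejected.
diagonal: List[Point] = [(float(i * 3), float(i * 3)) for i in range(10)]
strays: List[Point] = [(2.0, 25.0), (20.0, 1.0), (28.0, 10.0)]
points = diagonal + strays

# Loose settings (larger radius, fewer required neighbours) can never
# yield fewer core points than strict settings on the same data.
loose = count_core_points(points, eps=12, min_samples=4)
strict = count_core_points(points, eps=8, min_samples=8)
print(f"loose: {loose} core points, strict: {strict} core points")
```

The extra points accepted under the loose settings are where false positives can creep in, while the strict settings risk dropping genuine pairs.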
gradio_queue.db CHANGED
Binary files a/gradio_queue.db and b/gradio_queue.db differ
 
radiobee/__main__.py CHANGED
@@ -97,8 +97,8 @@ if __name__ == "__main__":
97
  # """
98
  import logzero
99
 
100
- # debug = True
101
- debug = False
102
  if debug:
103
  logzero.loglevel(10)
104
  logger.debug(" debug ")
@@ -115,7 +115,7 @@ if __name__ == "__main__":
115
  inputs = [
116
  gr.inputs.File(label="file 1"),
117
  # gr.inputs.File(file_count="multiple", label="file 2", optional=True),
118
- gr.inputs.File(label="file 2", optional=True),
119
  ]
120
 
121
  # modi 1
@@ -231,6 +231,16 @@ if __name__ == "__main__":
231
  10,
232
  4,
233
  ],
 
 
 
 
 
 
 
 
 
 
234
  ]
235
  outputs = ["dataframe", "plot"]
236
  outputs = ["plot"]
@@ -297,7 +307,16 @@ if __name__ == "__main__":
297
  # bypass if file1 or file2 is str input
298
  # if not (isinstance(file1, str) or isinstance(file2, str)):
299
  text1 = file2text(file1)
300
- text2 = file2text(file2)
 
 
 
 
 
 
 
 
 
301
  lang1, _ = fastlid(text1)
302
  lang2, _ = fastlid(text2)
303
 
@@ -485,6 +504,7 @@ if __name__ == "__main__":
485
  else:
486
  raise SystemExit(f"Tried {numb} times to no avail, giving up...")
487
 
 
488
  article = dedent(
489
  """
490
  ## NB
@@ -499,8 +519,16 @@ if __name__ == "__main__":
499
  'good' pairs.
500
  * If you need to have a better look at the image, you can right-click on the image and select copy-image-address and open a new tab in the browser with the copied image address.
501
  * `Flag`: Should something go wrong, you can click Flag to save the output and inform the developer.
502
- """
503
- )
 
 
 
 
 
 
 
 
504
  css_image = ".output_image, .input_image {height: 40rem !important; width: 100% !important;}"
505
  # css = ".output_image, .input_image {height: 20rem !important; width: 100% !important;}"
506
  css_input_file = (
@@ -530,9 +558,7 @@ if __name__ == "__main__":
530
  # theme="darkgrass",
531
  theme="grass",
532
  layout="vertical", # horizontal unaligned
533
- # height=150, # 500
534
- width=900, # 900
535
- allow_flagging=True,
536
  flagging_options=[
537
  "fatal",
538
  "bug",
@@ -551,6 +577,8 @@ if __name__ == "__main__":
551
  server_port=server_port,
552
  # show_tips=True,
553
  enable_queue=True,
 
 
554
  )
555
 
556
  _ = """
 
97
  # """
98
  import logzero
99
 
100
+ debug = True
101
+ # debug = False
102
  if debug:
103
  logzero.loglevel(10)
104
  logger.debug(" debug ")
 
115
  inputs = [
116
  gr.inputs.File(label="file 1"),
117
  # gr.inputs.File(file_count="multiple", label="file 2", optional=True),
118
+ gr.inputs.File(label="file 2 (if empty, radiobee will attempt to split file 1 into two)", optional=True),
119
  ]
120
 
121
  # modi 1
 
231
  10,
232
  4,
233
  ],
234
+ [
235
+ "data/test-dual.txt",
236
+ None,
237
+ "linear",
238
+ "None",
239
+ "None",
240
+ "None",
241
+ 10,
242
+ 6,
243
+ ],
244
  ]
245
  outputs = ["dataframe", "plot"]
246
  outputs = ["plot"]
 
307
  # bypass if file1 or file2 is str input
308
  # if not (isinstance(file1, str) or isinstance(file2, str)):
309
  text1 = file2text(file1)
310
+
311
+ if file2 is None:
312
+ logger.debug("file2 is None")
313
+ text2 = ""
314
+
315
+ # split text1 to text1 and text2
316
+
317
+ else:
318
+ logger.debug("file2.name: %s", file2.name)
319
+ text2 = file2text(file2)
320
  lang1, _ = fastlid(text1)
321
  lang2, _ = fastlid(text2)
322
 
 
504
  else:
505
  raise SystemExit(f"Tried {numb} times to no avail, giving up...")
506
 
507
+ # moved to userguide.rst in docs
508
  article = dedent(
509
  """
510
  ## NB
 
519
  'good' pairs.
520
  * If you need to have a better look at the image, you can right-click on the image and select copy-image-address and open a new tab in the browser with the copied image address.
521
  * `Flag`: Should something go wrong, you can click Flag to save the output and inform the developer.
522
+ """
523
+ ).strip()
524
+ article = dedent(
525
+ """
526
+ [https://radiobee.readthedocs.io/](https://radiobee.readthedocs.io/)
527
+
528
+ [中文使用说明](https://radiobee.readthedocs.io/en/latest/userguide-zh.html#)
529
+ """
530
+ ).strip()
531
+
532
  css_image = ".output_image, .input_image {height: 40rem !important; width: 100% !important;}"
533
  # css = ".output_image, .input_image {height: 20rem !important; width: 100% !important;}"
534
  css_input_file = (
 
558
  # theme="darkgrass",
559
  theme="grass",
560
  layout="vertical", # horizontal unaligned
561
+ allow_flagging="auto",
 
 
562
  flagging_options=[
563
  "fatal",
564
  "bug",
 
577
  server_port=server_port,
578
  # show_tips=True,
579
  enable_queue=True,
580
+ # height=150, # 500
581
+ # width=900, # 900
582
  )
583
 
584
  _ = """
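Note on the `file2 is None` branch above: the comment `# split text1 to text1 and text2` is left unimplemented in this commit, and `text2` stays empty. One possible way to fill it in, assuming the single file mixes English and Chinese lines, is a per-line script count. The helper `split_dual_text` below is hypothetical and is not radiobee's method:

```python
import re
from typing import Tuple

_CJK = re.compile(r"[\u4e00-\u9fff]")
_LATIN = re.compile(r"[A-Za-z]")


def split_dual_text(text: str) -> Tuple[str, str]:
    """Split a mixed text into (mostly-Latin lines, mostly-CJK lines).

    Hypothetical helper: each non-blank line goes to the side whose
    script has more characters in that line; ties go to the Latin side.
    """
    latin_lines, cjk_lines = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        n_cjk = len(_CJK.findall(line))
        n_latin = len(_LATIN.findall(line))
        (cjk_lines if n_cjk > n_latin else latin_lines).append(line)
    return "\n".join(latin_lines), "\n".join(cjk_lines)


sample = (
    "The word comes from a 1992 novel.\n"
    "該詞出自1992年的科幻小說《雪崩》。\n"
    "\n"
    "It is called Metaverse.\n"
    "它被稱為元宇宙。"
)
text1, text2 = split_dual_text(sample)
```

A line-level heuristic like this misclassifies lines that quote many foreign words, so a real implementation would probably lean on the language detector already imported in `__main__.py`.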
radiobee/gen_vector.py CHANGED
@@ -1,24 +1,28 @@
1
  """gen tokens for english or chinese text for a given model."""
2
  # pylint: disable=
3
 
4
- from typing import List
5
 
6
  from textacy.representations import Vectorizer
7
  from radiobee.insert_spaces import insert_spaces
8
  # from radiobee.gen_model import gen_model
9
 
10
 
11
- def gen_vector(text: str, model: Vectorizer) -> List[float]:
12
  """Gen vector for a give model.
13
 
14
  Args:
15
  text: string of Chinese chars or English words.
16
-
17
  filename = r"data\test-dual.txt"
18
  text = loadtext(filename)
19
  list1, list2 = zip(*text2lists(text))
20
  model = gen_model(list1)
21
  """
22
- vec = insert_spaces(text).split()
 
 
 
23
 
24
- return model.transform(vec)
 
 
1
  """gen tokens for english or chinese text for a given model."""
2
  # pylint: disable=
3
 
4
+ from typing import List, Union
5
 
6
  from textacy.representations import Vectorizer
7
  from radiobee.insert_spaces import insert_spaces
8
  # from radiobee.gen_model import gen_model
9
 
10
 
11
+ def gen_vector(text: Union[str, List[str]], model: Vectorizer) -> List[float]:
12
  """Gen vector for a give model.
13
 
14
  Args:
15
  text: string of Chinese chars or English words.
16
+
17
  filename = r"data\test-dual.txt"
18
  text = loadtext(filename)
19
  list1, list2 = zip(*text2lists(text))
20
  model = gen_model(list1)
21
  """
22
+ if isinstance(text, str):
23
+ vec = insert_spaces(text).split()
24
+
25
+ return model.transform(vec)
26
 
27
+ # already same tokens as used to gen_model
28
+ return model.transform(text)
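Note on the new `Union[str, List[str]]` signature above: a raw string is tokenized first, while a list is treated as already tokenized. The sketch below isolates that dispatch without the `textacy` model. `insert_spaces_sketch` is a guessed stand-in for `radiobee.insert_spaces` (assumed here to pad each Chinese character with spaces), and `to_tokens` is a hypothetical name:

```python
import re
from typing import List, Union


def insert_spaces_sketch(text: str) -> str:
    """Stand-in for radiobee.insert_spaces: pad each CJK character with spaces."""
    return re.sub(r"([\u4e00-\u9fff])", r" \1 ", text)


def to_tokens(text: Union[str, List[str]]) -> List[str]:
    """Return tokens: split a raw string, pass a token list through unchanged."""
    if isinstance(text, str):
        return insert_spaces_sketch(text).split()
    # Already tokenized, e.g. the same tokens that were used to build the model.
    return list(text)


print(to_tokens("元宇宙 Metaverse"))
print(to_tokens(["already", "tokenized"]))
```

Keeping the tokenization in one place like this makes it easier to guarantee that the tokens passed to `model.transform` match the ones used when the model was fitted.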
radiobee/text2lists.py CHANGED
@@ -38,7 +38,7 @@ def text2lists(text: Union[Iterable[str], str]) -> List[Tuple[str, str]]:
38
 
39
  # find offset
40
 
41
- left = []
42
- right = []
43
 
44
  return [("", "")]
 
38
 
39
  # find offset
40
 
41
+ left = [] # noqa
42
+ right = [] # noqa
43
 
44
  return [("", "")]
run-radiobee.bat CHANGED
@@ -1,4 +1,5 @@
1
  REM nodemon -V -w radiobee -x "sleep 3 && python -m radiobee"
2
  REM nodemon -V -w radiobee -x python -m radiobee
3
  REM nodemon -V -w radiobee -x py -3.8 -m radiobee
4
- nodemon -V -w radiobee -x "run-p pyright flake8 && py -3.8 -m radiobee"
 
 
1
  REM nodemon -V -w radiobee -x "sleep 3 && python -m radiobee"
2
  REM nodemon -V -w radiobee -x python -m radiobee
3
  REM nodemon -V -w radiobee -x py -3.8 -m radiobee
4
+ REM nodemon -V -w radiobee -x "run-p pyright flake8 && py -3.8 -m radiobee"
5
+ nodemon -V -w radiobee -x "run-p pyright && py -3.8 -m radiobee"
tests/test_seg_text.py CHANGED
@@ -26,7 +26,7 @@ def test_seg_text_blanks(test_input, expected):
26
  assert seg_text(test_input) == expected
27
 
28
 
29
- def test_seg_text_semicolon ():
30
  """Test semicolon."""
31
  text = """ “元宇宙”,英文為“Metaverse”。該詞出自1992年;的科幻小說《雪崩》。 """
32
  assert len(seg_text(text)) == 2
@@ -36,7 +36,7 @@ def test_seg_text_semicolon ():
36
  assert len(seg_text(text, 'en')) == 1
37
 
38
 
39
- def test_seg_text_semicolon_extra ():
40
  """Test semicolon."""
41
  extra = "[;;]"
42
  text = """ “元宇宙”,英文為“Metaverse”。該詞出自1992年;的科幻小說《雪崩》。 """
 
26
  assert seg_text(test_input) == expected
27
 
28
 
29
+ def test_seg_text_semicolon():
30
  """Test semicolon."""
31
  text = """ “元宇宙”,英文為“Metaverse”。該詞出自1992年;的科幻小說《雪崩》。 """
32
  assert len(seg_text(text)) == 2
 
36
  assert len(seg_text(text, 'en')) == 1
37
 
38
 
39
+ def test_seg_text_semicolon_extra():
40
  """Test semicolon."""
41
  extra = "[;;]"
42
  text = """ “元宇宙”,英文為“Metaverse”。該詞出自1992年;的科幻小說《雪崩》。 """
tests/test_text2lists.py CHANGED
@@ -6,4 +6,4 @@ from radiobee.loadtext import loadtext
6
  def test_text2lists():
7
  """Test text2lists data\test-dual.txt."""
8
  filename = r"data\test-dual.txt"
9
- text = loadtext(filename)
 
6
  def test_text2lists():
7
  """Test text2lists data\test-dual.txt."""
8
  filename = r"data\test-dual.txt"
9
+ text = loadtext(filename) # noqa