pjox committed on
Commit
03838b2
1 Parent(s): a991788

Update README.md

Files changed (1)
  1. README.md +42 -0
README.md CHANGED
@@ -1,3 +1,45 @@
  ---
+ language: fr
+ tags:
+ - Early Modern French
+ - Historical
+ - POS
+ - flair
  license: apache-2.0
+ datasets:
+ - freemlpm
+ library_name: flair
+ pipeline_tag: token-classification
  ---
+
+ <a href="https://portizs.eu/publication/2022/lrec/dalembert/">
+ <img width="300px" src="https://portizs.eu/publication/2022/lrec/dalembert/featured_hu18bf34d40cdc71c744bdd15e48ff0b23_61788_720x2500_fit_q100_h2_lanczos_3.webp">
+ </a>
+
+ # D'AlemBERT-POS model
+
+ This model is a fine-tuned version of [D'AlemBERT](https://huggingface.co/pjox/dalembert), trained on the [FreEMLPM corpus](https://doi.org/10.5281/zenodo.6481300) for part-of-speech tagging of Early Modern French. It was
+ introduced in [this paper](https://aclanthology.org/2022.lrec-1.359/).
+
+ ### BibTeX entry and citation info
+
+ ```bibtex
+ @inproceedings{gabay-etal-2022-freem,
+     title = "From {F}re{EM} to D{'}{A}lem{BERT}: a Large Corpus and a Language Model for Early {M}odern {F}rench",
+     author = "Gabay, Simon and
+       Ortiz Suarez, Pedro and
+       Bartz, Alexandre and
+       Chagu{\'e}, Alix and
+       Bawden, Rachel and
+       Gambette, Philippe and
+       Sagot, Beno{\^\i}t",
+     booktitle = "Proceedings of the Thirteenth Language Resources and Evaluation Conference",
+     month = jun,
+     year = "2022",
+     address = "Marseille, France",
+     publisher = "European Language Resources Association",
+     url = "https://aclanthology.org/2022.lrec-1.359",
+     pages = "3367--3374",
+     abstract = "Language models for historical states of language are becoming increasingly important to allow the optimal digitisation and analysis of old textual sources. Because these historical states are at the same time more complex to process and more scarce in the corpora available, this paper presents recent efforts to overcome this difficult situation. These efforts include producing a corpus, creating the model, and evaluating it with an NLP task currently used by scholars in other ongoing projects.",
+ }
+ ```
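
Reviewer note (not part of the commit): the metadata added here declares `library_name: flair` and `pipeline_tag: token-classification`, so the tagger is presumably meant to be loaded through Flair's `SequenceTagger`. The following is a minimal sketch under that assumption, not a snippet from the README. The repository ID `pjox/dalembert-pos` is a guessed placeholder that the commit does not state, and the helper name `tag_sentence` is hypothetical.

```python
# Hypothetical placeholder: the commit does not state the Hub ID of this model.
MODEL_ID = "pjox/dalembert-pos"


def tag_sentence(text, model_id=MODEL_ID):
    """Return (token, tag) pairs for one sentence of Early Modern French."""
    # Imported lazily so that defining the helper does not itself require flair.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    # Loads the tagger; a Hub ID triggers a download on first use.
    tagger = SequenceTagger.load(model_id)

    sentence = Sentence(text)
    tagger.predict(sentence)

    # tagger.label_type names the annotation layer the model predicts,
    # so the tag name does not have to be hard-coded here.
    return [
        (token.text, token.get_label(tagger.label_type).value)
        for token in sentence
    ]
```

With `flair` installed and the correct repository ID substituted, calling `tag_sentence("Le roy est mort.")` should return one `(token, tag)` pair per token. The tag inventory depends on the FreEMLPM annotation scheme and is not documented in this commit.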