Full-text search
+ 1,000 results
nlpai-lab / databricks-dolly-15k-ko
README.md
dataset
3 matches
tags:
task_categories:question-answering, task_categories:summarization, size_categories:10K<n<100K, language:ko, license:cc-by-sa-3.0, arxiv:2203.02155, region:us
the DeepL API
Note: In some cases, multilingual data was converted to monolingual data during batch translation into Korean using the API.
Below is databricks-dolly-15k's README.
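The note above describes batch-translating an instruction dataset into Korean with the DeepL API, where originally multilingual rows come back monolingual because every field is rendered in the target language. A minimal sketch of such a batch step, assuming hypothetical field names and a stub in place of the real API call (the `deepl` package's actual entry point is `Translator.translate_text`):

```python
from typing import Callable, Dict, List

def translate_records(
    records: List[Dict[str, str]],
    translate: Callable[[List[str]], List[str]],
    fields: tuple = ("instruction", "context", "response"),
) -> List[Dict[str, str]]:
    """Send every text field of every record through one batch translate call.
    Field names are illustrative, not the dataset's actual schema."""
    texts = [rec[f] for rec in records for f in fields]
    # In production this would be e.g.
    # deepl.Translator(auth_key).translate_text(texts, target_lang="KO")
    translated = iter(translate(texts))
    return [{f: next(translated) for f in fields} for _ in records]

def fake_deepl(texts: List[str]) -> List[str]:
    """Stub standing in for the DeepL API."""
    return [f"ko:{t}" for t in texts]

rows = [{"instruction": "Hi", "context": "", "response": "Hello"}]
print(translate_records(rows, fake_deepl))
```

Because all fields pass through a single target-language call, any non-Korean source text is flattened into Korean, which is exactly the monolingual-conversion effect the note warns about.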
nlpai-lab / openassistant-guanaco-ko
README.md
dataset
3 matches
tags:
task_categories:text-generation, task_categories:question-answering, task_categories:summarization, size_categories:1K<n<10K, language:ko, license:apache-2.0, croissant, region:us
the DeepL API
Note: In some cases, multilingual data was converted to monolingual data during batch translation into Korean using the API.
Below is Guanaco's README.
thomasavare / italian-dataset-deepl
README.md
dataset
2 matches
turing-motors / LLaVA-v1.5-Instruct-620K-JA
README.md
dataset
4 matches
tags:
task_categories:visual-question-answering, task_categories:question-answering, size_categories:100K<n<1M, language:ja, license:cc-by-nc-4.0, region:us
…using the DeepL API and is aimed at serving similar purposes in the context of the Japanese language.
**Resources for More Information:**
For information on the original dataset: [LLaVA](https://llava-vl.github.io/)
turing-motors / LLaVA-Pretrain-JA
README.md
dataset
4 matches
tags:
task_categories:visual-question-answering, task_categories:question-answering, size_categories:100K<n<1M, language:ja, license:other, croissant, region:us
…using the DeepL API and is aimed at serving similar purposes in the context of the Japanese language.
**Resources for More Information:**
For information on the original dataset: [LLaVA](https://llava-vl.github.io/)
taeshahn / ko-lima
README.md
dataset
4 matches
tags:
size_categories:1K<n<10K, language:ko, license:cc-by-nc-sa-4.0, lima, kolima, korean, instruction, croissant, arxiv:2305.11206, region:us
…the [DeepL API](https://www.deepl.com/docs-api) was used for translation, with costs supported by SK Inc.'s Tech Collaborative Lab. During translation, text inside code blocks or between special characters marking equations was kept in the original language. A total of 1,330 examples are available, consisting of 1,030 in the `train` split and 300 in the `test` split. The same translated sentences are currently provided in two formats, `plain` and `vicuna`.
If you have any questions about the dataset, please reach out via [email](mailto:taes.hahn@gmail.com)! 🥰
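The ko-lima excerpt describes keeping code blocks and math spans in the original language while translating the surrounding prose. One common way to do that (a sketch, not the dataset's actual pipeline; the placeholder scheme and the `translate` stub are assumptions) is to mask protected spans before translation and restore them afterwards:

```python
import re
from typing import Callable

# Protect fenced code blocks and $...$ math spans from translation.
PROTECTED = re.compile(r"```.*?```|\$.*?\$", re.DOTALL)

def translate_preserving(text: str, translate: Callable[[str], str]) -> str:
    """Mask protected spans with numbered placeholders, translate the rest,
    then splice the original spans back in."""
    spans = []

    def mask(m: re.Match) -> str:
        spans.append(m.group(0))
        return f"<<{len(spans) - 1}>>"

    masked = PROTECTED.sub(mask, text)
    translated = translate(masked)  # the real pipeline would call DeepL here
    for i, span in enumerate(spans):
        translated = translated.replace(f"<<{i}>>", span)
    return translated

def fake_translate(s: str) -> str:
    """Stub translator: 'translates' only the word Hello."""
    return s.replace("Hello", "안녕")

print(translate_preserving("Hello $a+b$ and ```print('Hello')```", fake_translate))
```

This relies on the translator passing the `<<n>>` placeholders through untouched, which in practice is something to verify per API (DeepL, for instance, offers tag handling for similar purposes).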
turing-motors / LLaVA-Instruct-150K-JA
README.md
dataset
4 matches
tags:
task_categories:visual-question-answering, task_categories:question-answering, size_categories:100K<n<1M, language:ja, license:cc-by-nc-4.0, region:us
…using the DeepL API and is aimed at serving similar purposes in the context of the Japanese language.
**Resources for More Information:**
For information on the original dataset: [LLaVA Visual Instruct 150K](https://llava-vl.github.io/)
32erwedwsedfcw3fc322 / Srt-Translator
README.md
model
5 matches
praveensonu / alpaca_it_6k
README.md
dataset
4 matches
dangbert / alpaca-cleaned-nl
README.md
dataset
2 matches
hidenoriyamano37 / SrtUtilTools
app.py
space
18 matches
Bingsu / ko_alpaca_data
README.md
dataset
3 matches
tags:
task_categories:text-generation, size_categories:10K<n<100K, language:ko, license:cc-by-nc-4.0, croissant, region:us
the DeepL API, except for 'output', which we did not translate because it is the output of OpenAI's `text-davinci-003` model.
2. Generate output data
Then, using the instruction and input, generate output data via the OpenAI ChatGPT API (gpt-3.5-turbo).
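The generation step above (producing `output` from `instruction` and `input` via gpt-3.5-turbo) can be sketched as follows. The prompt template and `build_messages` helper are assumptions for illustration, not the dataset's exact prompt; the commented-out call shows the OpenAI Python SDK v1 chat-completions API without executing it:

```python
from typing import Dict, List

def build_messages(instruction: str, input_text: str = "") -> List[Dict[str, str]]:
    """Assemble a chat request from an Alpaca-style instruction/input pair.
    The system prompt and layout here are illustrative assumptions."""
    user = instruction if not input_text else f"{instruction}\n\nInput:\n{input_text}"
    return [
        {"role": "system", "content": "Answer the instruction in Korean."},
        {"role": "user", "content": user},
    ]

messages = build_messages("Summarize the text.", "Seoul is the capital of Korea.")

# The actual generation call would look like:
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
# output = resp.choices[0].message.content
```

Each generated `output` then fills the field that was deliberately left untranslated from the original `text-davinci-003` responses.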
cawacci / chatwithdocuments2
app.py
space
18 matches
csujeong / kullm-v2.1
README.md
dataset
2 matches
SINAI / RefutES
README.md
dataset
2 matches
tags:
language:es, license:cc-by-nc-sa-4.0, counter-narrative, counterspeech, region:us
the DeepL API. All translations were reviewed by our annotators, and pairs with erroneous translations were edited. The counternarrative (CN) associated with each hate-speech message (HS) is generated by the GPT-4 model using a few-shot prompting strategy: the model was prompted with a task description and 8 examples of HS-CN pairs (one for each target). In addition, the counternarratives generated by GPT-4 have been evaluated by human experts using different metrics:
- Offensiveness:
- 0 (not sure)
- 1 (not offensive)
RASMUS / Whisper-youtube-crosslingual-subtitles
app.py
space
15 matches
LEADERRAILUSE / Whisper-youtube-crosslingual-subtitles
app.py
space
15 matches
shadow / Whisper-youtube-crosslingual-subtitles
app.py
space
14 matches