Full-text search
+ 1,000 results
nlpai-lab / databricks-dolly-15k-ko
README.md
dataset
3 matches
tags:
task_categories:question-answering, task_categories:summarization, size_categories:10K<n<100K, language:ko, license:cc-by-sa-3.0, arxiv:2203.02155, region:us
the DeepL API
Note: In some cases, multilingual data was converted to monolingual data during batch translation into Korean using the API.
Below is databricks-dolly-15k's README.
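The note above describes batch-translating an instruction dataset into Korean with the DeepL API, where originally multilingual rows come back monolingual because every field is rendered in the target language. A minimal sketch of such a batch step, assuming hypothetical field names and a stub in place of the real API call (the `deepl` package's actual entry point is `Translator.translate_text`):

```python
from typing import Callable, Dict, List

def translate_records(
    records: List[Dict[str, str]],
    translate: Callable[[List[str]], List[str]],
    fields: tuple = ("instruction", "context", "response"),
) -> List[Dict[str, str]]:
    """Send every text field of every record through one batch translate call.
    Field names are illustrative, not the dataset's actual schema."""
    texts = [rec[f] for rec in records for f in fields]
    # In production this would be e.g.
    # deepl.Translator(auth_key).translate_text(texts, target_lang="KO")
    translated = iter(translate(texts))
    return [{f: next(translated) for f in fields} for _ in records]

def fake_deepl(texts: List[str]) -> List[str]:
    """Stub standing in for the DeepL API."""
    return [f"ko:{t}" for t in texts]

rows = [{"instruction": "Hi", "context": "", "response": "Hello"}]
print(translate_records(rows, fake_deepl))
```

Because all fields pass through a single target-language call, any non-Korean source text is flattened into Korean, which is exactly the monolingual-conversion effect the note warns about.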
nlpai-lab / openassistant-guanaco-ko
README.md
dataset
3 matches
tags:
task_categories:text-generation, task_categories:question-answering, task_categories:summarization, size_categories:1K<n<10K, language:ko, license:apache-2.0, croissant, region:us
the DeepL API
Note: In some cases, multilingual data was converted to monolingual data during batch translation into Korean using the API.
Below is Guanaco's README.
thomasavare / italian-dataset-deepl
README.md
dataset
2 matches
turing-motors / LLaVA-v1.5-Instruct-620K-JA
README.md
dataset
4 matches
tags:
task_categories:visual-question-answering, task_categories:question-answering, size_categories:100K<n<1M, language:ja, license:cc-by-nc-4.0, region:us
…using the DeepL API and is aimed at serving similar purposes in the context of the Japanese language.
**Resources for More Information:**
For information on the original dataset: [LLaVA](https://llava-vl.github.io/)
turing-motors / LLaVA-Pretrain-JA
README.md
dataset
4 matches
tags:
task_categories:visual-question-answering, task_categories:question-answering, size_categories:100K<n<1M, language:ja, license:other, croissant, region:us
…using the DeepL API and is aimed at serving similar purposes in the context of the Japanese language.
**Resources for More Information:**
For information on the original dataset: [LLaVA](https://llava-vl.github.io/)
taeshahn / ko-lima
README.md
dataset
4 matches
tags:
size_categories:1K<n<10K, language:ko, license:cc-by-nc-sa-4.0, lima, kolima, korean, instruction, croissant, arxiv:2305.11206, region:us
…the [DeepL API](https://www.deepl.com/docs-api) was used for translation, with costs supported by SK Inc.'s Tech Collaborative Lab. During translation, text inside code blocks or between special characters marking equations was kept in the original language. A total of 1,330 examples are available, consisting of 1,030 in the `train` split and 300 in the `test` split. The same translated sentences are currently provided in two formats, `plain` and `vicuna`.
If you have any questions about the dataset, please reach out via [email](mailto:taes.hahn@gmail.com)! 🥰
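The ko-lima excerpt describes keeping code blocks and math spans in the original language while translating the surrounding prose. One common way to do that (a sketch, not the dataset's actual pipeline; the placeholder scheme and the `translate` stub are assumptions) is to mask protected spans before translation and restore them afterwards:

```python
import re
from typing import Callable

# Protect fenced code blocks and $...$ math spans from translation.
PROTECTED = re.compile(r"```.*?```|\$.*?\$", re.DOTALL)

def translate_preserving(text: str, translate: Callable[[str], str]) -> str:
    """Mask protected spans with numbered placeholders, translate the rest,
    then splice the original spans back in."""
    spans = []

    def mask(m: re.Match) -> str:
        spans.append(m.group(0))
        return f"<<{len(spans) - 1}>>"

    masked = PROTECTED.sub(mask, text)
    translated = translate(masked)  # the real pipeline would call DeepL here
    for i, span in enumerate(spans):
        translated = translated.replace(f"<<{i}>>", span)
    return translated

def fake_translate(s: str) -> str:
    """Stub translator: 'translates' only the word Hello."""
    return s.replace("Hello", "안녕")

print(translate_preserving("Hello $a+b$ and ```print('Hello')```", fake_translate))
```

This relies on the translator passing the `<<n>>` placeholders through untouched, which in practice is something to verify per API (DeepL, for instance, offers tag handling for similar purposes).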
turing-motors / LLaVA-Instruct-150K-JA
README.md
dataset
4 matches
tags:
task_categories:visual-question-answering, task_categories:question-answering, size_categories:100K<n<1M, language:ja, license:cc-by-nc-4.0, region:us
…using the DeepL API and is aimed at serving similar purposes in the context of the Japanese language.
**Resources for More Information:**
For information on the original dataset: [LLaVA Visual Instruct 150K](https://llava-vl.github.io/)
32erwedwsedfcw3fc322 / Srt-Translator
README.md
model
5 matches
praveensonu / alpaca_it_6k
README.md
dataset
4 matches
dangbert / alpaca-cleaned-nl
README.md
dataset
2 matches
hidenoriyamano37 / SrtUtilTools
app.py
space
18 matches
Bingsu / ko_alpaca_data
README.md
dataset
3 matches
tags:
task_categories:text-generation, size_categories:10K<n<100K, language:ko, license:cc-by-nc-4.0, croissant, region:us
the DeepL API, except for 'output', which we did not translate because it is the output of OpenAI's `text-davinci-003` model.
2. Generate output data
Then, using the instruction and input, generate output data via the OpenAI ChatGPT API (gpt-3.5-turbo).
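The generation step above (producing `output` from `instruction` and `input` via gpt-3.5-turbo) can be sketched as follows. The prompt template and `build_messages` helper are assumptions for illustration, not the dataset's exact prompt; the commented-out call shows the OpenAI Python SDK v1 chat-completions API without executing it:

```python
from typing import Dict, List

def build_messages(instruction: str, input_text: str = "") -> List[Dict[str, str]]:
    """Assemble a chat request from an Alpaca-style instruction/input pair.
    The system prompt and layout here are illustrative assumptions."""
    user = instruction if not input_text else f"{instruction}\n\nInput:\n{input_text}"
    return [
        {"role": "system", "content": "Answer the instruction in Korean."},
        {"role": "user", "content": user},
    ]

messages = build_messages("Summarize the text.", "Seoul is the capital of Korea.")

# The actual generation call would look like:
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
# output = resp.choices[0].message.content
```

Each generated `output` then fills the field that was deliberately left untranslated from the original `text-davinci-003` responses.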
cawacci / chatwithdocuments2
app.py
space
18 matches
csujeong / kullm-v2.1
README.md
dataset
2 matches
SINAI / RefutES
README.md
dataset
2 matches
tags:
language:es, license:cc-by-nc-sa-4.0, counter-narrative, counterspeech, region:us
the DeepL API. All translations were reviewed by our annotators, and pairs with erroneous translations were edited. The counternarrative (CN) associated with each hate-speech message (HS) is generated by the GPT-4 model using a few-shot prompting strategy: the model was prompted with a task description and 8 examples of HS-CN pairs (one for each target). In addition, the counternarratives generated by GPT-4 have been evaluated by human experts using different metrics:
- Offensiveness:
- 0 (not sure)
- 1 (not offensive)
RASMUS / Whisper-youtube-crosslingual-subtitles
app.py
space
15 matches
LEADERRAILUSE / Whisper-youtube-crosslingual-subtitles
app.py
space
15 matches
shadow / Whisper-youtube-crosslingual-subtitles
app.py
space
14 matches