---
extra_gated_heading: Access aimped/nlp-health-translation-base-es-en on Hugging Face
extra_gated_description: >-
  This is a form to enable access to this model on Hugging Face after you have
  been granted access from the Aimped. Please visit the [Aimped
  website](https://aimped.ai/) to Sign Up and accept our Terms of Use and
  Privacy Policy before submitting this form. Requests will be processed in 1-2
  days.
extra_gated_prompt: >-
  **Your Hugging Face account email address MUST match the email you provide on
  the Aimped website or your request will not be approved.**
extra_gated_button_content: Submit
extra_gated_fields:
  I agree to share my name, email address, and username with Aimped and confirm that I have already been granted download access on the Aimped website: checkbox
license: cc-by-nc-4.0
language:
  - en
  - es
metrics:
  - bleu
pipeline_tag: translation
widget:
  - text: >-
      Es importante comprender cómo la pandemia de COVID-19 ha afectado a la
      innovación incremental y su protección mediante derechos de propiedad
      industrial, con el fin de obtener información valiosa para desarrollar
      políticas públicas y estrategias empresariales.
  - text: >-
      El objetivo de este estudio fue analizar las innovaciones incrementales
      como respuesta a la pandemia que han sido protegidas por derechos de
      propiedad industrial, y examinar si la pandemia de la COVID-19 había
      tenido un efecto positivo o negativo en la innovación incremental,
      fomentándola o inhibiéndola.
tags:
  - medical
  - translation
  - medical translation
datasets:
  - aimped/medical-translation-test-set
---
# Description of the Model
Paper: LLMs-in-the-loop Part-1: Expert Small AI Models for Bio-Medical Text Translation
The Medical Translation AI model is a specialized language model trained to translate medical documents accurately from Spanish to English. Its primary objective is to give healthcare professionals, researchers, and others in the medical field a reliable tool for the precise translation of a wide range of medical documents.
The model was developed on the Helsinki/MarianMT neural translation architecture and required more than two days of intensive training on an A100 (24G RAM) GPU. To create a high-quality training corpus, we combined publicly available and proprietary datasets, enriched them with carefully curated text collected from online sources, and added clinical and discharge reports from diverse healthcare institutions to increase the dataset's depth and diversity. This curation process is central to the model's ability to generate accurate translations tailored to the medical domain and to meet the standards our users expect.
The model can translate a wide array of healthcare-related documents, including medical reports, patient records, medication instructions, research manuscripts, clinical trial documents, and more. It lets users obtain translations efficiently and dependably, streamlining the often complex task of translation within the medical field.
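As a rough illustration of how such translations might be obtained, the sketch below loads the model through the Hugging Face `transformers` translation pipeline. It assumes `transformers` is installed and that you are already authenticated with an account that has been granted access, since the repository is gated. The helper names `load_translator` and `translate` are hypothetical and chosen only for this example; the repository ID comes from this model card's metadata.

```python
from typing import Callable, Dict, List

# Repository ID taken from this model card's metadata.
MODEL_ID = "aimped/nlp-health-translation-base-es-en"


def load_translator(model_id: str = MODEL_ID):
    """Build a translation pipeline (illustrative helper, not a library API).

    Assumption: you have logged in beforehand (for example with
    `huggingface-cli login`), because access to the repository is gated.
    """
    # Imported lazily so that `translate` below has no hard dependency.
    from transformers import pipeline

    return pipeline("translation", model=model_id)


def translate(
    texts: List[str],
    translator: Callable[[List[str]], List[Dict[str, str]]],
) -> List[str]:
    """Translate Spanish strings and return only the translated text."""
    outputs = translator(texts)
    # The translation pipeline returns one dict per input,
    # with the result stored under the "translation_text" key.
    return [item["translation_text"] for item in outputs]
```

A typical call would be `translator = load_translator()` followed by `translate(["El paciente presenta fiebre alta."], translator)`. The first call downloads the model weights, so it needs network access.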
The model outperforms leading translation systems such as Google, Helsinki-Opus/MarianMT, and DeepL when evaluated on our curated proprietary test data set.
| Model | ROUGE | BLEU | METEOR | BERT |
| --- | --- | --- | --- | --- |
| Aimped | 0.89 | 0.70 | 0.88 | 0.98 |
| Google | 0.82 | 0.56 | 0.81 | 0.94 |
| DeepL | 0.81 | 0.55 | 0.80 | 0.94 |
| Opus/MarianMT | 0.79 | 0.52 | 0.78 | 0.93 |
Text Format Requirements: The text to be translated must adhere to a structured and grammatically correct format, including proper paragraph and sentence structures. Spelling errors or formatting issues, such as line breaks occurring before the completion of a sentence, will not be automatically corrected.
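Because line breaks inside a sentence are not corrected automatically, it may help to normalize whitespace before submitting text. The following is a minimal sketch, not part of the model or any library: the function name `normalize_for_translation` is hypothetical, and it assumes that blank lines mark paragraph boundaries while single line breaks inside a paragraph are layout artifacts. It does not fix spelling errors.

```python
import re


def normalize_for_translation(text: str) -> str:
    """Rejoin lines broken mid-sentence while keeping paragraph breaks.

    Assumption: paragraphs are separated by at least one blank line, and any
    other line break inside a paragraph is a formatting artifact.
    """
    # Split on blank lines (optionally containing stray whitespace).
    paragraphs = re.split(r"\n\s*\n", text.strip())
    cleaned = []
    for paragraph in paragraphs:
        # Collapse every run of whitespace, including single newlines,
        # into one space so each paragraph becomes a single line.
        flat = re.sub(r"\s+", " ", paragraph).strip()
        if flat:
            cleaned.append(flat)
    # Separate the cleaned paragraphs with one blank line.
    return "\n\n".join(cleaned)
```

Each cleaned paragraph can then be passed to the model as one input string.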
Training data: Public and in-house datasets.
Test data: Public and in-house datasets, which are available here.