dylangamachefl committed
Commit b08909a · 0 Parent(s)

Initial commit for text translation app

Files changed (4):
  1. Dockerfile +22 -0
  2. README.md +142 -0
  3. app.py +175 -0
  4. requirements.txt +3 -0
Dockerfile ADDED
@@ -0,0 +1,22 @@
# Choose an appropriate Python base image
FROM python:3.9-slim

# Set the working directory in the container
WORKDIR /app

# Copy the requirements file into the container.
# Ensure this requirements.txt is in your repo and lists streamlit, requests, python-dotenv.
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code (app.py and any other needed files) into the container
COPY . .

# Expose the port Streamlit runs on (default is 8501)
EXPOSE 8501

# Run the Streamlit application.
# Ensure HUGGING_FACE_API_TOKEN is set as a secret in your HF Space settings.
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]
README.md ADDED
@@ -0,0 +1,142 @@
---
title: HF Text Translator
emoji: 🤗
colorFrom: blue
colorTo: green
sdk: docker
app_file: app.py
# With sdk: docker, sdk_version is usually unnecessary here;
# the Dockerfile controls the Python and Streamlit versions.
pinned: false
---

# 🌍 Hugging Face Text Translation Tool

An interactive web application that translates text into various languages using Hugging Face's translation models via the Inference API. This project is part of a 4-week AI project portfolio building challenge.

**Live Demo:** [Link to your Deployed App on Hugging Face Spaces]

**Project Repository:** `https://github.com/dylangamachefl/hf-text-translator`

![Screenshot of Text Translator App](translator-screenshot.png)
*(Replace `translator-screenshot.png` with the actual path/name if different, or embed the image directly by dragging it into the GitHub editor for the README.)*

## 📖 Overview

This application provides a simple, intuitive interface for users to:
1. Input text they wish to translate.
2. Select a target language from a predefined list.
3. Receive the translated text, produced by models hosted on Hugging Face.

The primary goal is to demonstrate integration with an external AI service (the Hugging Face Inference API) and to build a functional NLP application with a user-friendly UI.

## 🎯 Problem Solved

In an increasingly globalized world, language barriers hinder communication and access to information. This tool offers a quick, accessible way to translate text, helping to bridge those gaps. It shows how pre-trained AI models can be leveraged to build practical solutions for common language tasks.

## ✨ Skills Showcased

* **AI/ML Implementation:** Using pre-trained NLP models for a specific task (translation).
* **Python:** Core language for backend logic and API interaction.
* **ML Libraries (Conceptual):** Understanding the role of Hugging Face Transformers, even when used via API.
* **API Integration:** Connecting to and consuming the Hugging Face Inference API.
* **Data Handling:** Sending text data to the API and parsing JSON responses.
* **NLP (using APIs):** Practical application of Natural Language Processing for translation.
* **Web Development (UI):** Building an interactive user interface with Streamlit.
* **Environment Management:** Using `.env` files for API keys.
* **Version Control:** Git and GitHub for project management.
* **Deployment:** Deploying the application to Hugging Face Spaces.
* **Documentation:** Clear, concise project documentation (this README).

## 🛠️ How It Works

1. **User Input:** The user types or pastes the text to translate into a text area.
2. **Language Selection:** The user selects the target language from a dropdown. Each option maps to a Hugging Face translation model ID (primarily from the Helsinki-NLP group, e.g., `Helsinki-NLP/opus-mt-en-es` for English to Spanish).
3. **API Call:** When the "Translate" button is clicked:
   * The Python backend (using the `requests` library) sends a POST request to the Hugging Face Inference API endpoint for the selected model.
   * The input text is sent in the JSON payload.
   * The Hugging Face API token (loaded securely from environment variables) is included in the request headers for authentication.
4. **Processing:** Hugging Face's infrastructure runs inference on the chosen translation model.
5. **Response Handling:** The application receives the API's JSON response, which contains the translated text (typically a list-of-dict structure like `[{'translation_text': '...'}]`).
6. **Display Output:** The translated text is extracted from the response and shown in the Streamlit interface. Error handling covers API issues and unexpected responses.
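The request construction in step 3 and the response shape in step 5 can be sketched without the Streamlit wrapper. This is an illustrative sketch: `build_request` and `parse_translation` are hypothetical helper names, not functions in `app.py`, and the live call is commented out because it needs a valid token.

```python
API_URL_BASE = "https://api-inference.huggingface.co/models/"

def build_request(text: str, model_id: str, api_token: str):
    """Assemble the URL, headers, and JSON payload for one translation call."""
    url = API_URL_BASE + model_id
    headers = {"Authorization": f"Bearer {api_token}"}
    payload = {"inputs": text}
    return url, headers, payload

def parse_translation(response_json):
    """Extract the translated string from the [{'translation_text': ...}] shape."""
    if (
        isinstance(response_json, list)
        and response_json
        and "translation_text" in response_json[0]
    ):
        return response_json[0]["translation_text"]
    return None  # error dict or unexpected format

url, headers, payload = build_request(
    "Hello, world!", "Helsinki-NLP/opus-mt-en-es", "hf_xxx"  # placeholder token
)
# Live call (requires `requests` and a valid token):
# resp = requests.post(url, headers=headers, json=payload, timeout=30)
# print(parse_translation(resp.json()))

# Parsing a sample response of the documented shape:
sample = [{"translation_text": "¡Hola, mundo!"}]
print(parse_translation(sample))  # ¡Hola, mundo!
```

The same parsing logic appears in `app.py`'s button handler, wrapped in Streamlit error reporting.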
## 💻 Technologies Used

* **Programming Language:** Python 3.x
* **AI Models/API:**
  * Hugging Face Hub
  * Hugging Face Inference API (free tier)
  * Helsinki-NLP translation models (e.g., `opus-mt-*`)
* **Python Libraries:**
  * `streamlit`: builds the web application UI.
  * `requests`: makes HTTP requests to the Hugging Face API.
  * `python-dotenv`: manages environment variables (like the API token) locally.
* **Version Control:** Git & GitHub
* **Deployment:** Hugging Face Spaces
* **Development Environment:** Visual Studio Code (or your preferred IDE), Python virtual environment (`venv`)

## 🚀 Setup and Local Development

To run this project locally, follow these steps:

1. **Clone the repository:**
   ```bash
   git clone https://github.com/[Your GitHub Username]/hf-text-translator.git
   cd hf-text-translator
   ```

2. **Set up a Python virtual environment:**
   (Assuming a shared `venv` in a parent `ai-portfolio` directory, per the overall plan.)
   ```bash
   # From within the hf-text-translator directory:
   # macOS/Linux:
   source ../venv/bin/activate
   # Windows (Git Bash):
   # source ../venv/Scripts/activate
   # Windows (PowerShell):
   # ..\venv\Scripts\Activate.ps1
   # Windows (Command Prompt):
   # ..\venv\Scripts\activate.bat
   ```
   If you don't have the shared venv or prefer a dedicated one for this project:
   ```bash
   python -m venv venv
   # Activate it:
   # macOS/Linux: source venv/bin/activate
   # Windows:     venv\Scripts\activate
   ```

3. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```

4. **Set up your Hugging Face API token:**
   * Create a `.env` file in the root of your main `ai-portfolio` directory (one level above this `hf-text-translator` project).
   * Add your Hugging Face API token to the `.env` file:
     ```
     HUGGING_FACE_API_TOKEN="your_hf_api_token_here"
     ```
   * *Note: `app.py` is configured to look for `.env` in the parent directory. If your `.env` file lives elsewhere, adjust the `load_dotenv()` path in `app.py`.*

5. **Run the Streamlit application:**
   ```bash
   streamlit run app.py
   ```
   The application should open in your web browser.
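Before launching Streamlit, a quick stdlib-only snippet (illustrative, not part of the app) can confirm that the token from step 4 is actually visible to Python; it assumes you run it from the same shell or that a dotenv loader has already populated the environment:

```python
import os

token = os.getenv("HUGGING_FACE_API_TOKEN")
if token:
    # Avoid printing the secret itself; show only a masked preview
    print(f"Token loaded ({token[:4]}..., {len(token)} chars)")
else:
    print("HUGGING_FACE_API_TOKEN is not set - check your .env location")
```

If this prints the "not set" message, revisit the `.env` path note in step 4.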
## 🔮 Future Enhancements (Optional)

* **Auto-detect source language:** Automatically detect the language of the input text.
* **Support more languages:** Expand the list of target languages by adding more Helsinki-NLP models.
* **Batch translation:** Allow users to upload a file containing multiple pieces of text.
* **Improved UI/UX:** Further refine the interface for better aesthetics and usability.

## 🙏 Acknowledgements

* The Hugging Face team for their incredible models, Inference API, and Spaces platform.
* The Streamlit developers for making web app creation in Python so accessible.

---
app.py ADDED
@@ -0,0 +1,175 @@
import streamlit as st
import requests
import os
from dotenv import load_dotenv

# --- Configuration ---
# Attempt to load the .env file.
# Assumes .env lives in the parent directory of this script (e.g., ../.env).
# If app.py sits in the project root alongside .env, load_dotenv() with no
# arguments works. On Hugging Face Spaces, set secrets in the Space settings.
dotenv_path = os.path.join(os.path.dirname(__file__), "..", ".env")
if os.path.exists(dotenv_path):
    load_dotenv(dotenv_path=dotenv_path)
else:
    # Fall back to a .env in the current directory
    load_dotenv()

API_TOKEN = os.getenv("HUGGING_FACE_API_TOKEN")
API_URL_BASE = "https://api-inference.huggingface.co/models/"
HEADERS = {"Authorization": f"Bearer {API_TOKEN}"}

# Available models (user-friendly name -> model_id).
# More models: https://huggingface.co/models?pipeline_tag=translation
# (filter by source and target language).
TRANSLATION_MODELS = {
    "English to Spanish": "Helsinki-NLP/opus-mt-en-es",
    "English to French": "Helsinki-NLP/opus-mt-en-fr",
    "English to German": "Helsinki-NLP/opus-mt-en-de",
    "English to Chinese (Simplified)": "Helsinki-NLP/opus-mt-en-zh",
    "English to Japanese": "Helsinki-NLP/opus-mt-en-jap",  # Verify the exact ID on the Hub
    "Spanish to English": "Helsinki-NLP/opus-mt-es-en",
    "French to English": "Helsinki-NLP/opus-mt-fr-en",
    # Add more models/languages as desired
}


# --- Hugging Face API Call Function ---
def query_translation(text_to_translate, model_id):
    """Send a request to the Hugging Face Inference API for translation."""
    if not API_TOKEN:
        st.error(
            "Hugging Face API Token not found. Please configure it in your "
            ".env file or Space secrets."
        )
        return None

    api_url = API_URL_BASE + model_id
    payload = {"inputs": text_to_translate}

    try:
        response = requests.post(api_url, headers=HEADERS, json=payload, timeout=30)
        response.raise_for_status()  # Raise an HTTPError for 4XX/5XX responses
        return response.json()
    except requests.exceptions.HTTPError as errh:
        st.error(f"Translation API HTTP Error: {errh}")
        try:
            error_details = response.json().get("error", response.text)
        except ValueError:  # Response body is not JSON
            error_details = response.text
        st.info(f"Details: {error_details}")
        return None
    except requests.exceptions.ConnectionError as errc:
        st.error(f"Translation API Connection Error: {errc}")
        return None
    except requests.exceptions.Timeout as errt:
        st.error(f"Translation API Timeout Error: {errt}")
        return None
    except requests.exceptions.RequestException as err:
        st.error(f"Translation API Request Error: {err}")
        return None
    except ValueError:
        # Non-JSON response body (usually caught by response.json() above,
        # but kept as a safety net)
        st.error("Error: Received non-JSON response from translation API.")
        st.info(
            f"Raw Response: {response.text if 'response' in locals() else 'No response object'}"
        )
        return None


# --- Streamlit UI ---
st.set_page_config(page_title="🌍 Text Translator", layout="wide")

st.title("🌍 Text Translation Tool")
st.markdown(
    "Translate text into various languages using Hugging Face's Inference API. "
    "This app demonstrates API integration for NLP tasks."
)

# Check for the API token before rendering the rest of the UI
if not API_TOKEN:
    st.error("Hugging Face API Token not configured. The application cannot function.")
    st.markdown(
        "Please ensure your `HUGGING_FACE_API_TOKEN` is set in a `.env` file "
        "in the root of your `ai-portfolio` project, or as a secret if "
        "deploying on Hugging Face Spaces."
    )
    st.stop()  # Halt the script if the token is missing

# Layout columns: the text area takes 2/3 of the width, the selectbox 1/3
col1, col2 = st.columns([2, 1])

with col1:
    text_input = st.text_area(
        "Enter text to translate:",
        height=200,
        key="text_input_translate",
        placeholder="Type or paste your text here...",
    )

with col2:
    selected_language_name = st.selectbox(
        "Select target language:",
        options=list(TRANSLATION_MODELS.keys()),
        index=0,  # Default to the first language in the list
        key="lang_select",
    )
    model_id_to_use = TRANSLATION_MODELS[selected_language_name]
    st.caption(f"Using model: `{model_id_to_use}`")


if st.button("Translate Text", key="translate_button", type="primary"):
    if text_input:
        with st.spinner(f"Translating to {selected_language_name}... Please wait."):
            translation_result = query_translation(text_input, model_id_to_use)

        if translation_result:
            # The API returns a list containing a dictionary
            if (
                isinstance(translation_result, list)
                and len(translation_result) > 0
                and "translation_text" in translation_result[0]
            ):
                translated_text = translation_result[0]["translation_text"]
                st.subheader("📜 Translation:")
                st.success(translated_text)
            # The API may instead return a dictionary carrying an error
            elif isinstance(translation_result, dict) and translation_result.get("error"):
                # query_translation has already displayed the error
                st.warning("Translation failed. See error message above.")
            else:
                st.error("Translation failed or the API returned an unexpected format.")
                st.json(translation_result)  # Show the raw response for debugging
        # If translation_result is None, query_translation already showed an error
    else:
        st.warning("Please enter some text to translate.")

st.divider()
st.sidebar.header("ℹ️ About This App")
st.sidebar.info(
    "This tool demonstrates the use of the Hugging Face Inference API "
    "for text translation. It allows users to input text and select a target "
    "language, then displays the translated output."
    "\n\n**Key Skills Showcased:**"
    "\n- Python & Streamlit for UI"
    "\n- Hugging Face API Integration"
    "\n- Handling API responses & errors"
    "\n- Basic NLP application"
)
st.sidebar.markdown("---")
st.sidebar.markdown("Project for **AI Project Portfolio (4 Weeks)**")
requirements.txt ADDED
@@ -0,0 +1,3 @@
streamlit
requests
python-dotenv