ChandimaPrabath committed on
Commit 8725d0d · 1 Parent(s): cf54400
Files changed (7)
  1. .gitignore +10 -0
  2. README.md +301 -5
  3. app.py +377 -0
  4. hf_scrapper.py +249 -0
  5. indexer.py +32 -0
  6. requirements.txt +5 -0
  7. tvdb.py +70 -0
.gitignore ADDED
@@ -0,0 +1,10 @@
#.env
.env
# cache
tmp
# pycache
__pycache__
# stream-test.py
stream-test.py
#test
test.py
README.md CHANGED
@@ -1,12 +1,308 @@
  ---
  title: Load Balancer
- emoji: 🐨
- colorFrom: indigo
- colorTo: pink
+ emoji: 🚀
+ colorFrom: purple
+ colorTo: red
  sdk: gradio
- sdk_version: 4.39.0
+ sdk_version: 4.36.1
  app_file: app.py
  pinned: false
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
+ ## Scripts
+ ```
+ app.py -> main script that runs the Flask server
+ hf_scrapper.py -> script for interacting with Hugging Face
+ indexer.py -> script to index the repo structure
+ tvdb.py -> script to interact with TheTVDB
+ ```
+ ## Film and TV API
+
+ This API provides endpoints for accessing and managing film and TV show data, including downloading, caching, and retrieving metadata.
+
+ ## Table of Contents
+
+ - [Base URL](#base-url)
+ - [Endpoints](#endpoints)
+   - [Film Endpoints](#film-endpoints)
+   - [TV Show Endpoints](#tv-show-endpoints)
+   - [Cache Endpoints](#cache-endpoints)
+   - [Metadata Endpoints](#metadata-endpoints)
+   - [Miscellaneous Endpoints](#miscellaneous-endpoints)
+ - [Error Handling](#error-handling)
+ - [Running the Server](#running-the-server)
+
+ ## Base URL
+
+ All endpoints are accessed through the base URL:
+
+ ```text
+ http://<server-address>:7860
+ ```
+
+ Replace `<server-address>` with your server's address.
+
+ ## Endpoints
+
+ ### Film Endpoints
+
+ #### `GET /api/film`
+
+ **Description:** Starts the download of a film if it's not already cached.
+
+ **Query Parameters:**
+ - `title` (string): The title of the film.
+
+ **Responses:**
+ - `200 OK`: Download started successfully.
+   ```json
+   {
+     "status": "Download started",
+     "film_id": "film_id_here"
+   }
+   ```
+ - `400 Bad Request`: Title parameter is required.
+   ```json
+   {
+     "error": "Title parameter is required"
+   }
+   ```
+ - `404 Not Found`: Movie not found.
+
+ #### `GET /api/film/store`
+
+ **Description:** Retrieves the JSON data for the film store.
+
+ **Responses:**
+ - `200 OK`: Returns the film store JSON data.
+   ```json
+   {
+     "film_title": "cache_path_here"
+   }
+   ```
+ - `404 Not Found`: Film store JSON not found.
+
+ #### `GET /api/film/metadata`
+
+ **Description:** Retrieves metadata for a film by title.
+
+ **Query Parameters:**
+ - `title` (string): The title of the film.
+
+ **Responses:**
+ - `200 OK`: Returns the metadata JSON for the film.
+   ```json
+   {
+     "title": "Film Title",
+     "year": 2024,
+     "metadata": { ... }
+   }
+   ```
+ - `400 Bad Request`: No title provided.
+   ```json
+   {
+     "error": "No title provided"
+   }
+   ```
+ - `404 Not Found`: Metadata not found.
+
+ ### TV Show Endpoints
+
+ #### `GET /api/tv`
+
+ **Description:** Starts the download of a TV show episode if it's not already cached.
+
+ **Query Parameters:**
+ - `title` (string): The title of the TV show.
+ - `season` (string): The season number.
+ - `episode` (string): The episode number.
+
+ **Responses:**
+ - `200 OK`: Download started successfully.
+   ```json
+   {
+     "status": "Download started",
+     "episode_id": "episode_id_here"
+   }
+   ```
+ - `400 Bad Request`: Title, season, and episode parameters are required.
+   ```json
+   {
+     "error": "Title, season, and episode parameters are required"
+   }
+   ```
+ - `404 Not Found`: TV show or episode not found.
+
+ #### `GET /api/tv/store`
+
+ **Description:** Retrieves the JSON data for the TV store.
+
+ **Responses:**
+ - `200 OK`: Returns the TV store JSON data.
+   ```json
+   {
+     "show_title": {
+       "season": {
+         "episode": "cache_path_here"
+       }
+     }
+   }
+   ```
+ - `404 Not Found`: TV store JSON not found.
+
+ #### `GET /api/tv/metadata`
+
+ **Description:** Retrieves metadata for a TV show by title.
+
+ **Query Parameters:**
+ - `title` (string): The title of the TV show.
+
+ **Responses:**
+ - `200 OK`: Returns the metadata JSON for the TV show.
+   ```json
+   {
+     "title": "TV Show Title",
+     "seasons": [ ... ],
+     "metadata": { ... }
+   }
+   ```
+ - `400 Bad Request`: No title provided.
+   ```json
+   {
+     "error": "No title provided"
+   }
+   ```
+ - `404 Not Found`: Metadata not found.
+
+ ### Cache Endpoints
+
+ #### `GET /api/cache/size`
+
+ **Description:** Retrieves the total size of the cache.
+
+ **Responses:**
+ - `200 OK`: Returns the cache size in a human-readable format.
+   ```json
+   {
+     "cache_size": "10.5 MB"
+   }
+   ```
+
+ #### `POST /api/cache/clear`
+
+ **Description:** Clears the entire cache.
+
+ **Responses:**
+ - `200 OK`: Cache cleared successfully.
+   ```json
+   {
+     "status": "Cache cleared"
+   }
+   ```
+
+ ### Metadata Endpoints
+
+ #### `GET /api/filmid`
+
+ **Description:** Retrieves the film ID by title.
+
+ **Query Parameters:**
+ - `title` (string): The title of the film.
+
+ **Responses:**
+ - `200 OK`: Returns the film ID.
+   ```json
+   {
+     "film_id": "film_id_here"
+   }
+   ```
+ - `400 Bad Request`: Title parameter is required.
+   ```json
+   {
+     "error": "Title parameter is required"
+   }
+   ```
+
+ #### `GET /api/episodeid`
+
+ **Description:** Retrieves the episode ID by title, season, and episode.
+
+ **Query Parameters:**
+ - `title` (string): The title of the TV show.
+ - `season` (string): The season number.
+ - `episode` (string): The episode number.
+
+ **Responses:**
+ - `200 OK`: Returns the episode ID.
+   ```json
+   {
+     "episode_id": "episode_id_here"
+   }
+   ```
+ - `400 Bad Request`: Title, season, and episode parameters are required.
+   ```json
+   {
+     "error": "Title, season, and episode parameters are required"
+   }
+   ```
+
+ ### Miscellaneous Endpoints
+
+ #### `GET /api/film/all`
+
+ **Description:** Retrieves a list of all films.
+
+ **Responses:**
+ - `200 OK`: Returns a list of film paths.
+   ```json
+   [
+     "film_path_1",
+     "film_path_2"
+   ]
+   ```
+
+ #### `GET /api/tv/all`
+
+ **Description:** Retrieves a list of all TV shows.
+
+ **Responses:**
+ - `200 OK`: Returns a list of TV shows with their episodes.
+   ```json
+   {
+     "show_title": [
+       {
+         "season": "season_number",
+         "episode": "episode_title"
+       }
+     ]
+   }
+   ```
+
+ ## Error Handling
+
+ All endpoints return standard HTTP status codes:
+ - `200 OK` for successful requests.
+ - `400 Bad Request` for invalid requests.
+ - `404 Not Found` for missing resources.
+
+ Errors are returned in the following format:
+ ```json
+ {
+   "error": "Error message here"
+ }
+ ```
+
+ ## Running the Server
+
+ To run the server, ensure you have all required dependencies installed and use the following command:
+
+ ```bash
+ python app.py
+ ```
+
+ The server will start on `http://0.0.0.0:7860` by default.
+
+ ---
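The `/api/filmid` and `/api/episodeid` endpoints documented above return deterministic identifiers derived purely from the query parameters. As a quick standalone sketch (mirroring the helpers in `app.py`), the construction is:

```python
def get_film_id(title):
    # Film ID: the title lowercased, with spaces replaced by underscores
    return title.replace(" ", "_").lower()

def encode_episodeid(title, season, episode):
    # Episode ID: title, season, and episode joined with underscores (title kept as-is)
    return f"{title}_{season}_{episode}"

print(get_film_id("Grand Blue"))  # grand_blue
print(encode_episodeid("Grand Blue", "Season 1", "S01E01"))  # Grand Blue_Season 1_S01E01
```

Because the IDs are derived, clients can compute them locally instead of round-tripping through the API.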
app.py ADDED
@@ -0,0 +1,377 @@
from flask import Flask, jsonify, request, send_from_directory
from flask_cors import CORS
import os
import json
import threading
import urllib.parse
from hf_scrapper import download_film, download_episode, get_system_proxies, get_download_progress
from indexer import indexer
from tvdb import fetch_and_cache_json
import re

app = Flask(__name__)
CORS(app)

# Constants and Configuration
CACHE_DIR = os.getenv("CACHE_DIR")
INDEX_FILE = os.getenv("INDEX_FILE")
TOKEN = os.getenv("TOKEN")
FILM_STORE_JSON_PATH = os.path.join(CACHE_DIR, "film_store.json")
TV_STORE_JSON_PATH = os.path.join(CACHE_DIR, "tv_store.json")
REPO = os.getenv("REPO")
download_threads = {}

# Ensure CACHE_DIR exists
if not os.path.exists(CACHE_DIR):
    os.makedirs(CACHE_DIR)

for path in [FILM_STORE_JSON_PATH, TV_STORE_JSON_PATH]:
    if not os.path.exists(path):
        with open(path, 'w') as json_file:
            json.dump({}, json_file)

# Index the file structure
indexer()

# Load the file structure JSON
if not os.path.exists(INDEX_FILE):
    raise FileNotFoundError(f"{INDEX_FILE} not found. Please make sure the file exists.")

with open(INDEX_FILE, 'r') as f:
    file_structure = json.load(f)

# Function Definitions

def load_json(file_path):
    """Load JSON data from a file."""
    with open(file_path, 'r') as file:
        return json.load(file)

def find_movie_path(json_data, title):
    """Find the path of the movie in the JSON data based on the title."""
    for directory in json_data:
        if directory['type'] == 'directory' and directory['path'] == 'films':
            for sub_directory in directory['contents']:
                if sub_directory['type'] == 'directory':
                    for item in sub_directory['contents']:
                        if item['type'] == 'file' and title.lower() in item['path'].lower():
                            return item['path']
    return None

def find_tv_path(json_data, title):
    """Find the path of the TV show in the JSON data based on the title."""
    for directory in json_data:
        if directory['type'] == 'directory' and directory['path'] == 'tv':
            for sub_directory in directory['contents']:
                if sub_directory['type'] == 'directory' and title.lower() in sub_directory['path'].lower():
                    return sub_directory['path']
    return None

def get_tv_structure(json_data, title):
    """Find the directory entry of the TV show in the JSON data based on the title."""
    for directory in json_data:
        if directory['type'] == 'directory' and directory['path'] == 'tv':
            for sub_directory in directory['contents']:
                if sub_directory['type'] == 'directory' and title.lower() in sub_directory['path'].lower():
                    return sub_directory
    return None

def get_film_id(title):
    """Generate a film ID based on the title."""
    return title.replace(" ", "_").lower()

def prefetch_metadata():
    """Prefetch metadata for all items in the file structure."""
    for item in file_structure:
        if 'contents' in item:
            for sub_item in item['contents']:
                original_title = sub_item['path'].split('/')[-1]
                media_type = 'series' if item['path'].startswith('tv') else 'movie'
                title = original_title
                year = None

                # Extract year from the title if available
                match = re.search(r'\((\d{4})\)', original_title)
                if match:
                    year_str = match.group(1)
                    if year_str.isdigit() and len(year_str) == 4:
                        title = original_title[:match.start()].strip()
                        year = int(year_str)
                else:
                    parts = original_title.rsplit(' ', 1)
                    if len(parts) > 1 and parts[-1].isdigit() and len(parts[-1]) == 4:
                        title = parts[0].strip()
                        year = int(parts[-1])

                fetch_and_cache_json(original_title, title, media_type, year)

def bytes_to_human_readable(num, suffix="B"):
    for unit in ["", "K", "M", "G", "T", "P", "E", "Z"]:
        if abs(num) < 1024.0:
            return f"{num:3.1f} {unit}{suffix}"
        num /= 1024.0
    return f"{num:.1f} Y{suffix}"

def encode_episodeid(title, season, episode):
    return f"{title}_{season}_{episode}"

def get_all_tv_shows(indexed_cache):
    """Get all TV shows from the indexed cache structure JSON file."""
    tv_shows = {}
    for directory in indexed_cache:
        if directory['type'] == 'directory' and directory['path'] == 'tv':
            for sub_directory in directory['contents']:
                if sub_directory['type'] == 'directory':
                    show_title = sub_directory['path'].split('/')[-1]
                    tv_shows[show_title] = []
                    for season_directory in sub_directory['contents']:
                        if season_directory['type'] == 'directory':
                            season = season_directory['path'].split('/')[-1]
                            for episode in season_directory['contents']:
                                if episode['type'] == 'file':
                                    tv_shows[show_title].append({
                                        "season": season,
                                        "episode": episode['path'].split('/')[-1],
                                        "path": episode['path']
                                    })
    return tv_shows

def get_all_films(indexed_cache):
    """Get all films from the indexed cache structure JSON file."""
    films = []
    for directory in indexed_cache:
        if directory['type'] == 'directory' and directory['path'] == 'films':
            for sub_directory in directory['contents']:
                if sub_directory['type'] == 'directory':
                    films.append(sub_directory['path'])
    return films

def start_prefetching():
    """Start the metadata prefetching in a separate thread."""
    prefetch_metadata()

# Start prefetching metadata
thread = threading.Thread(target=start_prefetching)
thread.daemon = True
thread.start()

# API Endpoints

@app.route('/api/film', methods=['GET'])
def get_movie_api():
    """Endpoint to get the movie by title."""
    title = request.args.get('title')
    if not title:
        return jsonify({"error": "Title parameter is required"}), 400

    # Load the film store JSON
    with open(FILM_STORE_JSON_PATH, 'r') as json_file:
        film_store_data = json.load(json_file)

    # Check if the film is already cached
    if title in film_store_data:
        cache_path = film_store_data[title]
        if os.path.exists(cache_path):
            return send_from_directory(os.path.dirname(cache_path), os.path.basename(cache_path))

    movie_path = find_movie_path(file_structure, title)

    if not movie_path:
        return jsonify({"error": "Movie not found"}), 404

    cache_path = os.path.join(CACHE_DIR, movie_path)
    file_url = f"https://huggingface.co/{REPO}/resolve/main/{movie_path}"
    proxies = get_system_proxies()
    film_id = get_film_id(title)

    # Start the download in a separate thread if not already downloading
    if film_id not in download_threads or not download_threads[film_id].is_alive():
        thread = threading.Thread(target=download_film, args=(file_url, TOKEN, cache_path, proxies, film_id, title))
        download_threads[film_id] = thread
        thread.start()

    return jsonify({"status": "Download started", "film_id": film_id})

@app.route('/api/tv', methods=['GET'])
def get_tv_show_api():
    """Endpoint to get the TV show by title, season, and episode."""
    title = request.args.get('title')
    season = request.args.get('season')
    episode = request.args.get('episode')

    if not title or not season or not episode:
        return jsonify({"error": "Title, season, and episode parameters are required"}), 400

    # Load the TV store JSON
    with open(TV_STORE_JSON_PATH, 'r') as json_file:
        tv_store_data = json.load(json_file)

    # Check if the episode is already cached
    if title in tv_store_data and season in tv_store_data[title]:
        for ep in tv_store_data[title][season]:
            if episode in ep:
                cache_path = tv_store_data[title][season][ep]
                if os.path.exists(cache_path):
                    return send_from_directory(os.path.dirname(cache_path), os.path.basename(cache_path))

    tv_path = find_tv_path(file_structure, title)

    if not tv_path:
        return jsonify({"error": "TV show not found"}), 404

    episode_path = None
    for directory in file_structure:
        if directory['type'] == 'directory' and directory['path'] == 'tv':
            for sub_directory in directory['contents']:
                if sub_directory['type'] == 'directory' and title.lower() in sub_directory['path'].lower():
                    for season_dir in sub_directory['contents']:
                        if season_dir['type'] == 'directory' and season in season_dir['path']:
                            for episode_file in season_dir['contents']:
                                if episode_file['type'] == 'file' and episode in episode_file['path']:
                                    episode_path = episode_file['path']
                                    break

    if not episode_path:
        return jsonify({"error": "Episode not found"}), 404

    cache_path = os.path.join(CACHE_DIR, episode_path)
    file_url = f"https://huggingface.co/{REPO}/resolve/main/{episode_path}"
    proxies = get_system_proxies()
    episode_id = encode_episodeid(title, season, episode)

    # Start the download in a separate thread if not already downloading
    if episode_id not in download_threads or not download_threads[episode_id].is_alive():
        thread = threading.Thread(target=download_episode, args=(file_url, TOKEN, cache_path, proxies, episode_id, title))
        download_threads[episode_id] = thread
        thread.start()

    return jsonify({"status": "Download started", "episode_id": episode_id})


@app.route('/api/progress/<id>', methods=['GET'])
def get_progress_api(id):
    """Endpoint to get the download progress of a movie or TV show episode."""
    progress = get_download_progress(id)
    return jsonify({"id": id, "progress": progress})

@app.route('/api/cache/size', methods=['GET'])
def get_cache_size_api():
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(CACHE_DIR):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            total_size += os.path.getsize(fp)
    readable_size = bytes_to_human_readable(total_size)
    return jsonify({"cache_size": readable_size})

@app.route('/api/cache/clear', methods=['POST'])
def clear_cache_api():
    for dirpath, dirnames, filenames in os.walk(CACHE_DIR):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            os.remove(fp)
    return jsonify({"status": "Cache cleared"})

@app.route('/api/tv/store', methods=['GET'])
def get_tv_store_api():
    """Endpoint to get the TV store JSON."""
    if os.path.exists(TV_STORE_JSON_PATH):
        with open(TV_STORE_JSON_PATH, 'r') as json_file:
            tv_store_data = json.load(json_file)
        return jsonify(tv_store_data)
    return jsonify({}), 404

@app.route('/api/film/store', methods=['GET'])
def get_film_store_api():
    """Endpoint to get the film store JSON."""
    if os.path.exists(FILM_STORE_JSON_PATH):
        with open(FILM_STORE_JSON_PATH, 'r') as json_file:
            film_store_data = json.load(json_file)
        return jsonify(film_store_data)
    return jsonify({}), 404

#################################################
# No change needed

@app.route('/api/filmid', methods=['GET'])
def get_film_id_by_title_api():
    """Endpoint to get the film ID by providing the movie title."""
    title = request.args.get('title')
    if not title:
        return jsonify({"error": "Title parameter is required"}), 400
    film_id = get_film_id(title)
    return jsonify({"film_id": film_id})

@app.route('/api/episodeid', methods=['GET'])
def get_episode_id_api():
    """Endpoint to get the episode ID by providing the TV show title, season, and episode."""
    title = request.args.get('title')
    season = request.args.get('season')
    episode = request.args.get('episode')
    if not title or not season or not episode:
        return jsonify({"error": "Title, season, and episode parameters are required"}), 400
    episode_id = encode_episodeid(title, season, episode)
    return jsonify({"episode_id": episode_id})

@app.route('/api/film/metadata', methods=['GET'])
def get_film_metadata_api():
    """Endpoint to get the film metadata by title."""
    title = request.args.get('title')
    if not title:
        return jsonify({'error': 'No title provided'}), 400

    json_cache_path = os.path.join(CACHE_DIR, f"{urllib.parse.quote(title)}.json")

    if os.path.exists(json_cache_path):
        with open(json_cache_path, 'r') as f:
            data = json.load(f)
        return jsonify(data)

    return jsonify({'error': 'Metadata not found'}), 404

@app.route('/api/tv/metadata', methods=['GET'])
def get_tv_metadata_api():
    """Endpoint to get the TV show metadata by title."""
    title = request.args.get('title')
    if not title:
        return jsonify({'error': 'No title provided'}), 400

    json_cache_path = os.path.join(CACHE_DIR, f"{urllib.parse.quote(title)}.json")

    if os.path.exists(json_cache_path):
        with open(json_cache_path, 'r') as f:
            data = json.load(f)

        # Add the file structure to the metadata
        tv_structure_data = get_tv_structure(file_structure, title)
        if tv_structure_data:
            data['file_structure'] = tv_structure_data

        return jsonify(data)

    return jsonify({'error': 'Metadata not found'}), 404


@app.route("/api/film/all")
def get_all_films_api():
    return jsonify(get_all_films(file_structure))

@app.route("/api/tv/all")
def get_all_tvshows_api():
    return jsonify(get_all_tv_shows(file_structure))

#############################################################
# unique APIs
@app.route('/api/register/<instanceid>', methods=['POST'])
def register_instance(instanceid):
    # TODO: add instance registration logic
    return jsonify({"status": f"{instanceid} registered"})

# Routes
@app.route('/')
def index():
    return "Load Balancer is Running ..."

# Main entry point
if __name__ == "__main__":
    app.run(debug=True, host="0.0.0.0", port=7860)
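The `cache_size` figure returned by `/api/cache/size` comes from the `bytes_to_human_readable` helper above. As a standalone sketch of that conversion (copied out so it can be run on its own):

```python
def bytes_to_human_readable(num, suffix="B"):
    # Repeatedly divide by 1024, stepping through the K, M, G, ... prefixes
    # until the value drops below 1024, then format with one decimal place.
    for unit in ["", "K", "M", "G", "T", "P", "E", "Z"]:
        if abs(num) < 1024.0:
            return f"{num:3.1f} {unit}{suffix}"
        num /= 1024.0
    return f"{num:.1f} Y{suffix}"

print(bytes_to_human_readable(11010048))  # 10.5 MB
print(bytes_to_human_readable(512))       # 512.0 B
```

Note the binary (1024-based) divisor, so the "MB" here is strictly a mebibyte figure.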
hf_scrapper.py ADDED
@@ -0,0 +1,249 @@
import os
import requests
import json
import urllib.request
import time
from requests.exceptions import RequestException
from tqdm import tqdm

CACHE_DIR = os.getenv("CACHE_DIR")
CACHE_JSON_PATH = os.path.join(CACHE_DIR, "cached_films.json")

download_progress = {}

def get_system_proxies():
    """
    Retrieves the system's HTTP and HTTPS proxies.

    Returns:
        dict: A dictionary containing the proxies.
    """
    try:
        proxies = urllib.request.getproxies()
        print("System proxies:", proxies)
        return {
            "http": proxies.get("http"),
            "https": proxies.get("https")
        }
    except Exception as e:
        print(f"Error getting system proxies: {e}")
        return {}

def download_film(file_url, token, cache_path, proxies, film_id, title, chunk_size=100 * 1024 * 1024):
    """
    Downloads a file from the specified URL and saves it to the cache path.
    Tracks the download progress.

    Args:
        file_url (str): The URL of the file to download.
        token (str): The authorization token for the request.
        cache_path (str): The path to save the downloaded file.
        proxies (dict): Proxies for the request.
        film_id (str): Unique identifier for the film download.
        title (str): The title of the film.
        chunk_size (int): Size of each chunk to download.
    """
    print(f"Downloading file from URL: {file_url} to {cache_path} with proxies: {proxies}")
    headers = {'Authorization': f'Bearer {token}'}
    try:
        response = requests.get(file_url, headers=headers, proxies=proxies, stream=True)
        response.raise_for_status()

        total_size = int(response.headers.get('content-length', 0))
        download_progress[film_id] = {"total": total_size, "downloaded": 0, "status": "Downloading", "start_time": time.time()}

        os.makedirs(os.path.dirname(cache_path), exist_ok=True)
        with open(cache_path, 'wb') as file, tqdm(total=total_size, unit='B', unit_scale=True, desc=cache_path) as pbar:
            for data in response.iter_content(chunk_size=chunk_size):
                file.write(data)
                pbar.update(len(data))
                download_progress[film_id]["downloaded"] += len(data)

        print(f'File cached to {cache_path} successfully.')
        update_film_store_json(title, cache_path)
        download_progress[film_id]["status"] = "Completed"
    except RequestException as e:
        print(f"Error downloading file: {e}")
        download_progress[film_id]["status"] = "Failed"
    except IOError as e:
        print(f"Error writing file {cache_path}: {e}")
        download_progress[film_id]["status"] = "Failed"
    finally:
        if download_progress[film_id]["status"] != "Downloading":
            download_progress[film_id]["end_time"] = time.time()

def get_download_progress(id):
    """
    Gets the download progress for a specific film or episode.

    Args:
        id (str): The unique identifier for the download.

    Returns:
        dict: A dictionary containing the total size, downloaded size, progress percentage, status, and ETA.
    """
    if id in download_progress:
        total = download_progress[id]["total"]
        downloaded = download_progress[id]["downloaded"]
        status = download_progress[id].get("status", "In Progress")
        progress = (downloaded / total) * 100 if total > 0 else 0

        eta = None
        if status == "Downloading" and downloaded > 0:
            elapsed_time = time.time() - download_progress[id]["start_time"]
            estimated_total_time = elapsed_time * (total / downloaded)
            eta = estimated_total_time - elapsed_time
        elif status == "Completed":
            eta = 0

        return {"total": total, "downloaded": downloaded, "progress": progress, "status": status, "eta": eta}
    return {"total": 0, "downloaded": 0, "progress": 0, "status": "Not Found", "eta": None}

def update_film_store_json(title, cache_path):
    """
    Updates the film store JSON with the new file.

    Args:
        title (str): The title of the film.
        cache_path (str): The local path where the file is saved.
    """
    FILM_STORE_JSON_PATH = os.path.join(CACHE_DIR, "film_store.json")

    film_store_data = {}
    if os.path.exists(FILM_STORE_JSON_PATH):
        with open(FILM_STORE_JSON_PATH, 'r') as json_file:
            film_store_data = json.load(json_file)

    film_store_data[title] = cache_path

    with open(FILM_STORE_JSON_PATH, 'w') as json_file:
        json.dump(film_store_data, json_file, indent=2)
    print(f'Film store updated with {title}.')


###############################################################################
def download_episode(file_url, token, cache_path, proxies, episode_id, title, chunk_size=100 * 1024 * 1024):
    """
    Downloads a file from the specified URL and saves it to the cache path.
    Tracks the download progress.

    Args:
        file_url (str): The URL of the file to download.
        token (str): The authorization token for the request.
        cache_path (str): The path to save the downloaded file.
        proxies (dict): Proxies for the request.
        episode_id (str): Unique identifier for the episode download.
        title (str): The title of the TV show.
        chunk_size (int): Size of each chunk to download.
    """
    print(f"Downloading file from URL: {file_url} to {cache_path} with proxies: {proxies}")
    headers = {'Authorization': f'Bearer {token}'}
    try:
        response = requests.get(file_url, headers=headers, proxies=proxies, stream=True)
        response.raise_for_status()

        total_size = int(response.headers.get('content-length', 0))
        download_progress[episode_id] = {"total": total_size, "downloaded": 0, "status": "Downloading", "start_time": time.time()}

        os.makedirs(os.path.dirname(cache_path), exist_ok=True)
        with open(cache_path, 'wb') as file, tqdm(total=total_size, unit='B', unit_scale=True, desc=cache_path) as pbar:
            for data in response.iter_content(chunk_size=chunk_size):
                file.write(data)
                pbar.update(len(data))
                download_progress[episode_id]["downloaded"] += len(data)

        print(f'File cached to {cache_path} successfully.')
        update_tv_store_json(title, cache_path)
        download_progress[episode_id]["status"] = "Completed"
    except RequestException as e:
        print(f"Error downloading file: {e}")
        download_progress[episode_id]["status"] = "Failed"
    except IOError as e:
        print(f"Error writing file {cache_path}: {e}")
        download_progress[episode_id]["status"] = "Failed"
    finally:
        if download_progress[episode_id]["status"] != "Downloading":
            download_progress[episode_id]["end_time"] = time.time()


def update_tv_store_json(title, cache_path):
    """
    Updates the TV store JSON with the new file, organizing by title, season, and episode.

    Args:
        title (str): The title of the TV show.
        cache_path (str): The local path where the file is saved.
    """
    TV_STORE_JSON_PATH = os.path.join(CACHE_DIR, "tv_store.json")

    tv_store_data = {}
    if os.path.exists(TV_STORE_JSON_PATH):
        with open(TV_STORE_JSON_PATH, 'r') as json_file:
            tv_store_data = json.load(json_file)

    # Extract season and episode information from the cache_path
    season_part = os.path.basename(os.path.dirname(cache_path))  # e.g. 'Season 1'
    episode_part = os.path.basename(cache_path)  # e.g. 'Grand Blue Dreaming - S01E01 - Deep Blue HDTV-720p.mp4'

    # Create the structure if not already present
    if title not in tv_store_data:
        tv_store_data[title] = {}

    if season_part not in tv_store_data[title]:
        tv_store_data[title][season_part] = {}

    # Assuming episode_part is unique for each episode within a season
    tv_store_data[title][season_part][episode_part] = cache_path

    with open(TV_STORE_JSON_PATH, 'w') as json_file:
        json.dump(tv_store_data, json_file, indent=2)

    print(f'TV store updated with {title}, {season_part}, {episode_part}.')

###############################################################################
def get_file_structure(repo, token, path="", proxies=None):
    """
    Fetches the file structure of a specified Hugging Face repository.

    Args:
        repo (str): The name of the repository.
        token (str): The authorization token for the request.
        path (str, optional): The specific path in the repository. Defaults to "".
        proxies (dict, optional): The proxies to use for the request. Defaults to None.

    Returns:
        list: A list of file structure information.
    """
    api_url = f"https://huggingface.co/api/models/{repo}/tree/main/{path}"
    headers = {'Authorization': f'Bearer {token}'}
    print(f"Fetching file structure from URL: {api_url} with proxies: {proxies}")
    try:
        response = requests.get(api_url, headers=headers, proxies=proxies)
        response.raise_for_status()
        return response.json()
    except RequestException as e:
        print(f"Error fetching file structure: {e}")
        return []

def write_file_structure_to_json(file_structure, file_path):
    """
    Writes the file structure to a JSON file.

    Args:
        file_structure (list): The file structure data.
        file_path (str): The path where the JSON file will be saved.
    """
    try:
        with open(file_path, 'w') as json_file:
            json.dump(file_structure, json_file, indent=2)
239
+ print(f'File structure written to {file_path}')
240
+ except IOError as e:
241
+ print(f"Error writing file structure to JSON: {e}")
242
+
243
+ if __name__ == "__main__":
244
+ file_url = "https://huggingface.co/Unicone-Studio/jellyfin_media/resolve/main/films/Funky%20Monkey%202004/Funky%20Monkey%20(2004)%20Web-dl%201080p.mp4"
245
+ token = os.getenv("TOKEN")
246
+ cache_path = os.path.join(CACHE_DIR, "films/Funky Monkey 2004/Funky Monkey (2004) Web-dl 1080p.mp4")
247
+ proxies = get_system_proxies()
248
+ film_id = "funky_monkey_2004" # Unique identifier for the film download
249
+ download_film(file_url, token, cache_path, proxies=proxies, film_id=film_id)
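`update_tv_store_json` derives its keys purely from the cache path: the parent directory name becomes the season key, the file name becomes the episode key, and both are nested under the show title. A minimal sketch of that nesting logic (the helper name, title, and path below are made-up examples, not part of the repo):

```python
import os

def nest_entry(store, title, cache_path):
    # Same derivation as update_tv_store_json:
    # parent directory -> season key, file name -> episode key.
    season = os.path.basename(os.path.dirname(cache_path))
    episode = os.path.basename(cache_path)
    store.setdefault(title, {}).setdefault(season, {})[episode] = cache_path
    return store

store = nest_entry({}, "Example Show",
                   "tmp/tv/Example Show/Season 1/Example Show - S01E01.mp4")
```

`setdefault` collapses the two `if ... not in` checks in the original into one expression per level.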
indexer.py ADDED
@@ -0,0 +1,32 @@
+ import json
+ from hf_scrapper import get_system_proxies, get_file_structure, write_file_structure_to_json
+ from dotenv import load_dotenv
+ import os
+
+ load_dotenv()
+
+ def index_repository(token, repo, current_path="", proxies=None):
+     file_structure = get_file_structure(repo, token, current_path, proxies)
+     full_structure = []
+     for item in file_structure:
+         if item['type'] == 'directory':
+             sub_directory_structure = index_repository(token, repo, item['path'], proxies)
+             full_structure.append({
+                 "type": "directory",
+                 "path": item['path'],
+                 "contents": sub_directory_structure
+             })
+         else:
+             full_structure.append(item)
+     return full_structure
+
+ def indexer():
+     token = os.getenv("TOKEN")
+     repo = os.getenv("REPO")
+     output_path = os.getenv("INDEX_FILE")
+
+     proxies = get_system_proxies()
+     full_structure = index_repository(token, repo, "", proxies)
+     write_file_structure_to_json(full_structure, output_path)
+     print(f"Full file structure for repository '{repo}' has been indexed and saved to {output_path}")
+
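`index_repository` walks the repo tree depth-first: a directory entry recurses and carries its children in a `contents` list, while file entries pass through unchanged. The resulting shape can be sketched against a hypothetical in-memory tree standing in for the Hugging Face API responses (the tree contents and function name here are illustrative):

```python
# Hypothetical repo tree, keyed by path, mimicking get_file_structure responses.
FAKE_TREE = {
    "": [{"type": "directory", "path": "films"},
         {"type": "file", "path": "README.md"}],
    "films": [{"type": "file", "path": "films/movie.mp4"}],
}

def index_fake(path=""):
    result = []
    for item in FAKE_TREE.get(path, []):
        if item["type"] == "directory":
            # Recurse into the directory and nest its listing under "contents".
            result.append({"type": "directory", "path": item["path"],
                           "contents": index_fake(item["path"])})
        else:
            result.append(item)
    return result
```

With this stub, `index_fake()` yields the `films` directory with its single file nested inside, followed by the top-level `README.md` entry.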
requirements.txt ADDED
@@ -0,0 +1,5 @@
+ flask
+ Flask-Cors
+ requests
+ python-dotenv
+ tqdm
tvdb.py ADDED
@@ -0,0 +1,70 @@
+ # tvdb.py
+ import os
+ import requests
+ import urllib.parse
+ from datetime import datetime, timedelta
+ from dotenv import load_dotenv
+ import json
+ from hf_scrapper import get_system_proxies
+
+ load_dotenv()
+ THETVDB_API_KEY = os.getenv("THETVDB_API_KEY")
+ THETVDB_API_URL = os.getenv("THETVDB_API_URL")
+ CACHE_DIR = os.getenv("CACHE_DIR")
+ TOKEN_EXPIRY = None
+ THETVDB_TOKEN = None
+
+
+ proxies = get_system_proxies()
+
+ def authenticate_thetvdb():
+     global THETVDB_TOKEN, TOKEN_EXPIRY
+     auth_url = f"{THETVDB_API_URL}/login"
+     auth_data = {
+         "apikey": THETVDB_API_KEY
+     }
+     try:
+         response = requests.post(auth_url, json=auth_data, proxies=proxies)
+         response.raise_for_status()
+         response_data = response.json()
+         THETVDB_TOKEN = response_data['data']['token']
+         TOKEN_EXPIRY = datetime.now() + timedelta(days=30)
+     except requests.RequestException as e:
+         print(f"Authentication failed: {e}")
+         THETVDB_TOKEN = None
+         TOKEN_EXPIRY = None
+
+ def get_thetvdb_token():
+     global THETVDB_TOKEN, TOKEN_EXPIRY
+     if not THETVDB_TOKEN or datetime.now() >= TOKEN_EXPIRY:
+         authenticate_thetvdb()
+     return THETVDB_TOKEN
+
+ def fetch_and_cache_json(original_title, title, media_type, year=None):
+     if year:
+         search_url = f"{THETVDB_API_URL}/search?query={urllib.parse.quote(title)}&type={media_type}&year={year}"
+     else:
+         search_url = f"{THETVDB_API_URL}/search?query={urllib.parse.quote(title)}&type={media_type}"
+
+     token = get_thetvdb_token()
+     if not token:
+         print("Authentication failed")
+         return
+
+     headers = {
+         "Authorization": f"Bearer {token}",
+         "accept": "application/json",
+     }
+
+     try:
+         response = requests.get(search_url, headers=headers, proxies=proxies)
+         response.raise_for_status()
+         data = response.json()
+
+         if 'data' in data and data['data']:
+             json_cache_path = os.path.join(CACHE_DIR, f"{urllib.parse.quote(original_title)}.json")
+             with open(json_cache_path, 'w') as f:
+                 json.dump(data, f)
+
+     except requests.RequestException as e:
+         print(f"Error fetching data: {e}")
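`get_thetvdb_token` re-authenticates lazily: the `/login` request is only made when there is no cached token or the 30-day expiry has passed. The same expire-and-refresh pattern can be sketched with a stubbed login standing in for the real API call (all names below are illustrative, not part of tvdb.py):

```python
from datetime import datetime, timedelta

_token, _expiry = None, None

def fake_login():
    # Stub for the real POST to the /login endpoint.
    return "token-123", datetime.now() + timedelta(days=30)

def get_token():
    global _token, _expiry
    # Refresh only when the token is missing or expired, as in get_thetvdb_token.
    if not _token or datetime.now() >= _expiry:
        _token, _expiry = fake_login()
    return _token
```

One caveat carried over from the original: if authentication fails and leaves the token `None`, every subsequent call retries the login immediately rather than backing off.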