Spaces: Sverd/image_captioner

Sverd committed on Jan 10

Commit 1352a28 • 1 Parent(s): 9501099

upload from local pc


Files changed (12)

Dockerfile +12 -0
README.md +2 -12
aicovers_topics.csv +97 -0
caption.py +43 -0
image_processing.py +23 -0
img_upload.py +58 -0
main.py +65 -0
moderator.py +53 -0
moderator_mc.py +39 -0
requirements.txt +99 -0
test.ann +0 -0
vector_search.py +48 -0

Dockerfile ADDED

	@@ -0,0 +1,12 @@

+FROM python:3.10
+WORKDIR /app
+ADD . /app
+RUN pip install --no-cache-dir -r requirements.txt
+EXPOSE 8000
+# Run main.py when the container launches
+CMD ["python", "main.py"]

README.md CHANGED

@@ -1,12 +1,2 @@
----
-title: Image Captioner
-emoji: 🏢
-colorFrom: purple
-colorTo: indigo
-sdk: docker
-app_port: 8000
-pinned: false
-license: apache-2.0
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference


+# beam-image-captioning
+Repo for the Image Captioning task for Beam campaign

aicovers_topics.csv ADDED

	@@ -0,0 +1,97 @@

+Topic,topic_cleaned,Topic group
+- Urban skyline at sunset,Urban skyline at sunset,City/Street Views
+- Historical district cityscape,Historical district cityscape,City/Street Views
+- City under the night lights,City under the night lights,City/Street Views
+- Busy city main street,Busy city main street,City/Street Views
+- A quiet pedestrian street,A quiet pedestrian street,City/Street Views
+- Festive decorations on a city street,Festive decorations on a city street,City/Street Views
+- Calm morning in a residential street,Calm morning in a residential street,City/Street Views
+- City street under night lights,City street under night lights,City/Street Views
+- Sunrise over the city panorama,Sunrise over the city panorama,City/Street Views
+- City at twilight,City at twilight,City/Street Views
+- Majestic historical building,Majestic historical building,Architecture: Classic and Modern
+- Classic museum exterior,Classic museum exterior,Architecture: Classic and Modern
+- Colorful cathedral,Colorful cathedral,Architecture: Classic and Modern
+- Modern skyscraper tower,Modern skyscraper tower,Architecture: Classic and Modern
+- Futuristic business center,Futuristic business center,Architecture: Classic and Modern
+- Ornate grand cathedral,Ornate grand cathedral,Architecture: Classic and Modern
+- Imperial palace exterior,Imperial palace exterior,Architecture: Classic and Modern
+- Contemporary design building,Contemporary design building,Architecture: Classic and Modern
+- Modern airport exterior,Modern airport exterior,Architecture: Classic and Modern
+- Innovative technology park,Innovative technology park,Architecture: Classic and Modern
+- Tranquil lake scenery,Tranquil lake scenery,Nature and Landscapes
+- Beautiful botanic garden,Beautiful botanic garden,Nature and Landscapes
+- Majestic mountains,Majestic mountains,Nature and Landscapes
+- Winter in pine forest,Winter in pine forest,Nature and Landscapes
+- Sunrise over a field,Sunrise over a field,Nature and Landscapes
+- Traditional breakfast spread,Traditional breakfast spread,Cuisine and Dining
+- Scene in a cafe,Scene in a cafe,Cuisine and Dining
+- Instagram food shot,Instagram food shot,Cuisine and Dining
+- Local market produce,Local market produce,Cuisine and Dining
+- Exotic regional dishes,Exotic regional dishes,Cuisine and Dining
+- Ice hockey game action,Ice hockey game action,Sports Activities
+- Snowy ski tracks,Snowy ski tracks,Sports Activities
+- Football match,Football match,Sports Activities
+- Rafting on a river,Rafting on a river,Sports Activities
+- Scenic railway journey / Train scene,Scenic railway journey / Train scene,Sports Activities
+- City's annual carnival,City's annual carnival,Social Gatherings
+- Networking event,Networking event,Social Gatherings
+- Open-air concert crowd,Open-air concert crowd,Social Gatherings
+- Friends in a bar,Friends in a bar,Social Gatherings
+- Family gathering,Family gathering,Social Gatherings
+- Iconic craft making,Iconic craft making,Traditional and Folklore
+- Traditional folk dance,Traditional folk dance,Traditional and Folklore
+- Tea ceremony with a traditional kettle,Tea ceremony with a traditional kettle,Traditional and Folklore
+- New Year celebration / Fireworks,New Year celebration / Fireworks,Traditional and Folklore
+- Day out in a traditional architecture complex,Day out in a traditional architecture complex,Traditional and Folklore
+- Playful dog in a park,Playful dog in a park,Pets
+- Cat lounging in a cozy home,Cat lounging in a cozy home,Pets
+- Parakeet in a colorful cage,Parakeet in a colorful cage,Pets
+- A child feeding her hamster,A child feeding her hamster,Pets
+- Aquarium scene with exotic fishes,Aquarium scene with exotic fishes,Pets
+- Trendy street style,Trendy street style,Fashion and Lifestyle
+- High-end fashion boutique,High-end fashion boutique,Fashion and Lifestyle
+- Eclectic vintage clothing store,Eclectic vintage clothing store,Fashion and Lifestyle
+- Chic home decor,Chic home decor,Fashion and Lifestyle
+- Lively beauty salon interior,Lively beauty salon interior,Fashion and Lifestyle
+- Tranquil seaside panorama,Tranquil seaside panorama,Travel and Adventure
+- Rustic camping site,Rustic camping site,Travel and Adventure
+- Snapshot of a road trip,Snapshot of a road trip,Travel and Adventure
+- Exciting amusement park,Exciting amusement park,Travel and Adventure
+- Captivating hiking trail,Captivating hiking trail,Travel and Adventure
+- Inspiring street mural,Inspiring street mural,Art and Creativity
+- Quaint pottery studio,Quaint pottery studio,Art and Creativity
+- Gallery exhibition,Gallery exhibition,Art and Creativity
+- Creative DIY craft project,Creative DIY craft project,Art and Creativity
+- Dramatic theater scene,Dramatic theater scene,Art and Creativity
+- Modern workspace with tech gadgets,Modern workspace with tech gadgets,Technology and Gaming
+- Immersive virtual reality gaming,Immersive virtual reality gaming,Technology and Gaming
+- E-sports event,E-sports event,Technology and Gaming
+- Robots,Robots,Technology and Gaming
+- Drone flying against city skyline,Drone flying against city skyline,Technology and Gaming
+- Outdoor yoga session,Outdoor yoga session,Health and Well-being
+- Running scene,Running scene,Health and Well-being
+- Fitness class,Fitness class,Health and Well-being
+"- Group sports (football, hockey)","Group sports (football, hockey)",Health and Well-being
+- Buzzing train station,Buzzing train station,Transportation
+- Airport with airplanes,Airport with airplanes,Transportation
+- Cars in a busy city,Cars in a busy city,Transportation
+- Busy harbor with ships,Busy harbor with ships,Transportation
+- Metro ride during peak hours,Metro ride during peak hours,Transportation
+"- Kitchenware: pots, pans, cutlery","Kitchenware: pots, pans, cutlery",Home Categories
+"- Bathroom: skincare, cosmetics, bath accessories","Bathroom: skincare, cosmetics, bath accessories",Home Categories
+"- Interior: décor elements, types of furniture","Interior: décor elements, types of furniture",Home Categories
+"- New Year theme: decorations, gifts, New Year parties","New Year theme: decorations, gifts, New Year parties",Home Categories
+"- People in the frame: home comfort, family scenes, domestic life","People in the frame: home comfort, family scenes, domestic life",Home Categories
+"- Hobbies: art tools, musical instruments, hobbies","Hobbies: art tools, musical instruments, hobbies",Home Categories
+"- Appliances: house appliances, cleaning, home maintenance","Appliances: house appliances, cleaning, home maintenance",Home Categories
+"- Workstation: computers, office supplies, workstations","Workstation: computers, office supplies, workstations",Office
+"- Team moments: meetings, brainstorming, team events","Team moments: meetings, brainstorming, team events",Office
+"- Office space: office interior, space design, working atmosphere","Office space: office interior, space design, working atmosphere",Office
+"- Coffee break: coffee breaks, lunchtime, informal communication","Coffee break: coffee breaks, lunchtime, informal communication",Office
+"- Fruits and vegetables: fresh produce, farmers market, vegetarian products","Fruits and vegetables: fresh produce, farmers market, vegetarian products",Grocery Store
+"- Dairy products: dairy production, cheese, milk","Dairy products: dairy production, cheese, milk",Grocery Store
+"- Meats and seafood: meat products, fish, deli","Meats and seafood: meat products, fish, deli",Grocery Store
+"- Grains and pasta: variety of grains, cereals, pasta","Grains and pasta: variety of grains, cereals, pasta",Grocery Store
+"- Waters and other drinks: water, non-alcoholic beverages, carbonated drinks","Waters and other drinks: water, non-alcoholic beverages, carbonated drinks",Grocery Store
+"- Beauty and hygiene: personal care, cosmetic products, hygiene products","Beauty and hygiene: personal care, cosmetic products, hygiene products",Grocery Store
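Several rows quote their topic fields because the text itself contains commas, so the file needs a real CSV parser rather than a plain split on `,`. A minimal sketch of that, using three rows copied from the file and the standard `csv` module; the `group_topics` helper is hypothetical and not part of the commit:

```python
import csv
import io
from collections import defaultdict

# Three rows copied from aicovers_topics.csv, inlined so the sketch is self-contained.
SAMPLE = '''Topic,topic_cleaned,Topic group
- Robots,Robots,Technology and Gaming
- E-sports event,E-sports event,Technology and Gaming
"- Group sports (football, hockey)","Group sports (football, hockey)",Health and Well-being
'''


def group_topics(csv_text):
    """Map each 'Topic group' to its 'topic_cleaned' values (hypothetical helper)."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["Topic group"]].append(row["topic_cleaned"])
    return dict(groups)


groups = group_topics(SAMPLE)
print(groups["Health and Well-being"])
```

The quoted row comes back as a single topic string with its comma intact, which is the value `vector_search.py` would embed.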

caption.py ADDED

	@@ -0,0 +1,43 @@

+import os
+import requests
+from dotenv import load_dotenv
+load_dotenv()
+def caption_from_url(image_url):
+    """
+    Generates a caption for an image using the Azure Computer Vision API.
+    Parameters:
+    image_url (str): The URL of the image for which a caption should be generated.
+    Returns:
+    str: The generated caption for the image.
+    Raises:
+    requests.exceptions.HTTPError: If the request to the Azure API fails.
+    """
+    subscription_key = os.getenv('AZURE_SUBSCRIPTION_KEY')
+    endpoint = 'https://icmvp.cognitiveservices.azure.com/'
+    analyze_url = endpoint + "computervision/imageanalysis:analyze?api-version=2023-10-01"
+    headers = {
+        "Content-Type": "application/json",
+        'Ocp-Apim-Subscription-Key': subscription_key
+    }
+    params = {
+        'features': 'caption'
+    }
+    data = {'url': image_url}
+    response = requests.post(analyze_url, headers=headers, params=params, json=data)
+    response.raise_for_status()
+    analysis = response.json()
+    # Extract the description from the returned JSON
+    description = analysis['captionResult']['text']
+    return description
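`caption_from_url` indexes `analysis['captionResult']['text']` directly, so a response without that key raises `KeyError`. A minimal sketch of a more defensive extraction step; the `extract_caption` name and the `min_confidence` parameter are hypothetical, and the presence of a numeric `confidence` field next to `text` is an assumption about the response shape:

```python
def extract_caption(analysis, min_confidence=0.0):
    """Return the caption text, or None when it is absent or below the threshold.

    Hypothetical helper: assumes the response may carry a numeric
    'confidence' next to 'text' inside 'captionResult'.
    """
    caption = analysis.get("captionResult") or {}
    text = caption.get("text")
    if text is None:
        return None
    # A missing confidence is treated as fully confident (an assumption).
    if caption.get("confidence", 1.0) < min_confidence:
        return None
    return text


# Made-up response body for illustration only.
sample = {"captionResult": {"text": "a dog in a park", "confidence": 0.82}}
print(extract_caption(sample))
```

Returning `None` leaves the caller to decide what to show when no usable caption comes back.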

image_processing.py ADDED

	@@ -0,0 +1,23 @@

+# works with uploaded image URLs
+from moderator_mc import moderate_image  # uses moderate-content api
+from caption import caption_from_url  # generates captions
+from vector_search import topic_from_caption
+def process_image(image_url):
+    # Call the moderation function
+    moderation_result = moderate_image(image_url)
+    # The moderate-content API returns a rating index; 3 is treated as blocked
+    # if moderation_result:  #for azure
+    if moderation_result == 3:  # mc api
+        return "moderated"
+    # Otherwise, pass the URL to the captioner function
+    else:
+        image_caption = caption_from_url(image_url)
+        topic = topic_from_caption(image_caption)
+        answer = f"Caption: {image_caption}. Topic: {topic}"
+        return answer
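The control flow of `process_image` can be exercised without any network access if the three services are passed in as callables. A minimal sketch under that assumption; `process_image_with` and the stub lambdas are hypothetical and not part of the commit:

```python
def process_image_with(image_url, moderate, caption, topic_for):
    """Same control flow as process_image, with the three services passed in."""
    if moderate(image_url) == 3:  # 3 is the blocked rating used in the diff
        return "moderated"
    image_caption = caption(image_url)
    return f"Caption: {image_caption}. Topic: {topic_for(image_caption)}"


# Stub services (made up for illustration) instead of the real API clients.
blocked = process_image_with("u", lambda u: 3, lambda u: "x", lambda c: "y")
passed = process_image_with(
    "u",
    lambda u: 1,
    lambda u: "a cat on a sofa",
    lambda c: "Cat lounging in a cozy home",
)
print(blocked)
print(passed)
```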

img_upload.py ADDED

	@@ -0,0 +1,58 @@

+from azure.storage.blob import BlobServiceClient,  BlobClient, ContentSettings, generate_blob_sas, BlobSasPermissions
+from datetime import datetime, timedelta
+import os
+import dotenv
+dotenv.load_dotenv()
+storage_account_name = os.environ['AZURE_STORAGE_ACCOUNT_NAME']
+storage_account_key = os.environ['AZURE_STORAGE_KEY']
+connection_string = os.environ['AZURE_STORAGE_CONNECTION_STRING']
+container_name = os.environ['AZURE_STORAGE_CONTAINER_NAME']
+def upload_image_to_blob(image_data, image_name):
+    # Create a BlobServiceClient
+    blob_service_client = BlobServiceClient(account_url=f"https://{storage_account_name}.blob.core.windows.net",
+                                            credential=storage_account_key)
+    # Get the container client
+    container_client = blob_service_client.get_container_client(container_name)
+    # get the extension
+    # extension = os.path.splitext(image_name)[1]
+    # Get the blob client for the image
+    blob_name = image_name
+    blob_client = container_client.get_blob_client(blob_name)
+    # Determine the content type from the image name
+    content_type = "image/jpeg"  # Default to JPEG
+    if image_name.lower().endswith(".png"):
+        content_type = "image/png"
+    elif image_name.lower().endswith(".gif"):
+        content_type = "image/gif"
+    # Create the content settings with the determined content type
+    content_settings = ContentSettings(content_type=content_type)
+    # Set the content settings for the blob
+    # blob_client.set_http_headers(content_settings)
+    # Upload the image
+    blob_client.upload_blob(image_data, content_settings=content_settings)
+    # Generate a SAS token for the blob
+    sas_token = generate_blob_sas(
+        account_name=storage_account_name,
+        container_name=container_name,
+        blob_name=blob_name,
+        account_key=storage_account_key,
+        permission=BlobSasPermissions(read=True),
+        expiry=datetime.utcnow() + timedelta(hours=10)  # The SAS token will be valid for 10 hours
+    )
+    # Create a SAS URL for the blob
+    sas_url = f"https://{storage_account_name}.blob.core.windows.net/{container_name}/{blob_name}?{sas_token}"
+    return sas_url
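The content type above is chosen with a hand-written `endswith` chain that covers PNG and GIF and defaults to JPEG. One possible alternative, sketched here with the standard `mimetypes` module, keeps the same JPEG default; the `guess_image_content_type` helper is hypothetical:

```python
import mimetypes


def guess_image_content_type(image_name, default="image/jpeg"):
    """Guess a Content-Type from the file name, defaulting to JPEG as the diff does."""
    content_type, _ = mimetypes.guess_type(image_name)
    # Fall back when the extension is unknown or is not an image type.
    if content_type is None or not content_type.startswith("image/"):
        return default
    return content_type


print(guess_image_content_type("photo.png"))
print(guess_image_content_type("clip.gif"))
print(guess_image_content_type("no_extension"))
```

The result could be passed to `ContentSettings(content_type=...)` in place of the literal strings.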

main.py ADDED

	@@ -0,0 +1,65 @@

+# works with gradio file upload, not image upload
+import base64
+from fastapi import FastAPI #, UploadFile, File
+from img_upload import upload_image_to_blob
+from image_processing import process_image
+from pydantic import BaseModel, validator
+from PIL import Image
+import io
+import gradio as gr
+import uuid
+app = FastAPI()
+class FileUpload(BaseModel):
+    filename: str
+    data: str
+    # @validator('data')
+    # def validate_image(cls, data: str):
+    #     try:
+    #         image_data = base64.b64decode(data)
+    #         image = Image.open(BytesIO(image_data))
+    #         if image.format not in ['JPEG', 'PNG']:
+    #             raise ValueError('Invalid file type')
+    #         if max(image.size) > 5000:
+    #             raise ValueError('Image dimensions are too large')
+    #         if len(data) > 5000 * 5000:  # adjust this value based on your needs
+    #             raise ValueError('File size is too large')
+    #         return data
+    #     except Exception as e:
+    #         raise ValueError('Invalid image') from e
+    #
+class Response(BaseModel):
+    result: str
+@app.post("/upload", response_model=Response)
+async def create_upload_file(file: FileUpload):
+    data = base64.b64decode(file.data)
+    sas_url = upload_image_to_blob(data, file.filename)
+    result = process_image(sas_url)
+    # Wrap the string so it matches the declared response_model
+    return Response(result=result)
+async def gradio_interface(image: Image.Image):
+    # Convert PIL Image to bytes
+    img_byte_arr = io.BytesIO()
+    image.save(img_byte_arr, format="JPEG")
+    img_byte_arr = img_byte_arr.getvalue()
+    # Encode bytes to base64
+    data = base64.b64encode(img_byte_arr).decode()
+    # Generate a unique ID for the image
+    unique_id = str(uuid.uuid4())
+    response = await create_upload_file(FileUpload(filename=unique_id + ".jpg", data=data))
+    # Gradio's "text" output expects a plain string
+    return response.result
+iface = gr.Interface(fn=gradio_interface, inputs=gr.Image(type="pil"), outputs="text")
+app = gr.mount_gradio_app(app, iface, "/gradio")
+if __name__ == "__main__":
+    import uvicorn
+    uvicorn.run(app, host="0.0.0.0", port=8000, )
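The `/upload` endpoint takes a JSON body with a `filename` and the image bytes as base64 text in `data`. A minimal sketch of how a client might build that body with the standard library only; the `build_upload_payload` helper and the sample bytes are hypothetical:

```python
import base64
import json


def build_upload_payload(filename, image_bytes):
    """Build the JSON body for POST /upload: a filename plus base64 text."""
    return {"filename": filename, "data": base64.b64encode(image_bytes).decode("ascii")}


raw = b"\xff\xd8\xff\xe0 not a real image"  # placeholder bytes for illustration
payload = build_upload_payload("example.jpg", raw)
body = json.dumps(payload)

# The server side reverses the encoding with base64.b64decode(file.data).
decoded = base64.b64decode(json.loads(body)["data"])
print(decoded == raw)
```

Base64 makes the request body roughly a third larger than the raw image, which matters if a size limit is ever enforced on `data`.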

moderator.py ADDED

	@@ -0,0 +1,53 @@

+import os
+import requests
+from dotenv import load_dotenv
+load_dotenv()
+def moderate_image(image_url):
+    """
+        Uses the Microsoft Azure Content Moderator API to evaluate an image's content.
+        Args:
+        - image_url (str): The URL of the image to be moderated.
+        Returns:
+        - bool: True if the image is classified as adult or racy,
+                otherwise False.
+        """
+    subscription_key = os.getenv('AZURE_SUBSCRIPTION_KEY')
+    endpoint = "https://eastus.api.cognitive.microsoft.com"
+    moderator_url = endpoint + "/contentmoderator/moderate/v1.0/ProcessImage/Evaluate"
+    # Define the headers for the HTTP request
+    headers = {
+        "Content-Type": "application/json",
+        "Ocp-Apim-Subscription-Key": subscription_key
+    }
+    data = {
+        "DataRepresentation": 'URL',
+        'Value': image_url
+    }
+    # Send the image to the API
+    response = requests.post(moderator_url, headers=headers, json=data)
+    # Fail loudly on HTTP errors instead of hitting a KeyError below
+    response.raise_for_status()
+    # Parse the response
+    response_json = response.json()
+    # Check if the image is classified as adult or racy
+    return bool(response_json["IsImageAdultClassified"] or response_json["IsImageRacyClassified"])
+# Example usage
+#
+#
+# url = "https://www.rainforest-alliance.org/wp-content/uploads/2021/06/capybara-square-1-400x400.jpg.webp"
+# result = moderate_image(url)
+# print(result)
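The function reads `IsImageAdultClassified` and `IsImageRacyClassified` straight out of the response JSON. A minimal sketch of the same check as a pure function that tolerates a body lacking those keys; the `is_flagged` helper and the sample bodies are hypothetical:

```python
def is_flagged(response_json):
    """True when either Content Moderator flag is set; missing keys count as False."""
    return bool(
        response_json.get("IsImageAdultClassified")
        or response_json.get("IsImageRacyClassified")
    )


# Made-up response bodies for illustration only.
print(is_flagged({"IsImageAdultClassified": False, "IsImageRacyClassified": True}))
print(is_flagged({"Error": "made-up error body"}))
```

Treating missing keys as `False` lets an unparseable response pass moderation, so a stricter caller may prefer to raise instead.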

moderator_mc.py ADDED

	@@ -0,0 +1,39 @@

+import os
+# import json
+import requests
+from dotenv import load_dotenv
+load_dotenv()
+def moderate_image(image_url):
+    """
+        Rates an image with the moderatecontent.com API.
+        Args:
+        - image_url (str): URL of the image to be moderated.
+        Returns:
+        - The 'rating_index' value from the API response,
+          or None if the request does not return HTTP 200.
+        """
+    mc_key = os.getenv('MODERATE_CONTENT_KEY')
+    payload = {
+        'key': mc_key,
+        'url': image_url
+    }
+    endpoint = 'https://api.moderatecontent.com/moderate/'
+    response = requests.post(endpoint, data=payload)
+    if response.status_code == 200:
+        response_json = response.json()
+        return response_json['rating_index']
+    else:
+        print(response.status_code)
+        return None
+# Example usage
+# url = "https://www.rainforest-alliance.org/wp-content/uploads/2021/06/capybara-square-1-400x400.jpg.webp"
+# result = moderate_image(url)
+# print(result)
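`moderate_image` returns `None` when the API call does not come back with HTTP 200, and `process_image` only compares the result to `3`, so a failed moderation call lets the image through to captioning. A minimal sketch of making that decision explicit; the `should_block` helper and its `fail_closed` option are hypothetical:

```python
BLOCKED_RATING = 3  # the value image_processing.py compares against


def should_block(rating_index, fail_closed=True):
    """Decide whether to block, given moderate_image's return value.

    rating_index is None when the API call did not return HTTP 200;
    fail_closed (a hypothetical option) blocks in that case.
    """
    if rating_index is None:
        return fail_closed
    return rating_index == BLOCKED_RATING


print(should_block(3), should_block(1), should_block(None))
```

Passing `fail_closed=False` reproduces the behaviour in the diff, where an unrated image is treated as acceptable.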

requirements.txt ADDED

	@@ -0,0 +1,99 @@

+aiofiles==23.2.1
+aiohttp==3.9.1
+aiosignal==1.3.1
+altair==5.2.0
+annotated-types==0.6.0
+annoy==1.17.3
+anyio==3.7.1
+asks==3.0.0
+async-generator==1.10
+async-timeout==4.0.3
+attrs==23.1.0
+azure-core==1.29.6
+azure-identity==1.15.0
+azure-storage-blob==12.19.0
+backoff==2.2.1
+beautifulsoup4==4.12.2
+certifi==2023.11.17
+cffi==1.16.0
+charset-normalizer==3.3.2
+click==8.1.7
+cohere==4.39
+colorama==0.4.6
+contourpy==1.2.0
+cryptography==41.0.7
+cycler==0.12.1
+exceptiongroup==1.2.0
+fastapi==0.105.0
+fastavro==1.9.2
+ffmpy==0.3.1
+filelock==3.13.1
+fonttools==4.46.0
+frozenlist==1.4.1
+fsspec==2023.12.2
+gradio==4.10.0
+gradio_client==0.7.3
+h11==0.14.0
+httpcore==1.0.2
+httpx==0.25.2
+huggingface-hub==0.19.4
+idna==3.6
+importlib-metadata==6.11.0
+importlib-resources==6.1.1
+isodate==0.6.1
+Jinja2==3.1.2
+jsonschema==4.20.0
+jsonschema-specifications==2023.11.2
+kiwisolver==1.4.5
+lxml==5.0.0
+markdown-it-py==3.0.0
+MarkupSafe==2.1.3
+matplotlib==3.8.2
+mdurl==0.1.2
+msal==1.26.0
+msal-extensions==1.1.0
+multidict==6.0.4
+numpy==1.26.2
+orjson==3.9.10
+outcome==1.3.0.post0
+packaging==23.2
+pandas==2.1.4
+Pillow==10.1.0
+ply==3.11
+portalocker==2.8.2
+pycparser==2.21
+pydantic==2.5.2
+pydantic_core==2.14.5
+pydub==0.25.1
+Pygments==2.17.2
+PyJWT==2.8.0
+pyparsing==3.1.1
+python-dateutil==2.8.2
+python-dotenv==1.0.0
+python-multipart==0.0.6
+pytz==2023.3.post1
+PyYAML==6.0.1
+referencing==0.32.0
+requests==2.31.0
+rich==13.7.0
+rpds-py==0.14.1
+semantic-version==2.10.0
+shellingham==1.5.4
+six==1.16.0
+sniffio==1.3.0
+sortedcontainers==2.4.0
+soupsieve==2.5
+starlette==0.27.0
+stone==3.3.1
+tomlkit==0.12.0
+toolz==0.12.0
+tqdm==4.66.1
+trio==0.23.2
+typer==0.9.0
+typing_extensions==4.9.0
+tzdata==2023.3
+urllib3==2.1.0
+uvicorn==0.24.0.post1
+websockets==11.0.3
+yarl==1.9.4
+zipp==3.17.0

test.ann ADDED

Binary file (477 kB)

vector_search.py ADDED

	@@ -0,0 +1,48 @@

+import cohere
+from annoy import AnnoyIndex
+import numpy as np
+import dotenv
+import os
+import pandas as pd
+dotenv.load_dotenv()
+model_name = "embed-english-v3.0"
+api_key = os.environ['COHERE_API_KEY']
+input_type_embed = "search_document"
+# Set up the cohere client
+co = cohere.Client(api_key)
+# Get the dataset of topics
+topics = pd.read_csv("aicovers_topics.csv")
+# Get the embeddings
+list_embeds = co.embed(texts=list(topics['topic_cleaned']), model=model_name, input_type=input_type_embed).embeddings
+# Create the search index, pass the size of embedding
+search_index = AnnoyIndex(np.array(list_embeds).shape[1], metric='angular')
+# Add vectors to the search index
+for i in range(len(list_embeds)):
+    search_index.add_item(i, list_embeds[i])
+search_index.build(10)  # 10 trees
+search_index.save('test.ann')
+def topic_from_caption(caption):
+    """
+        Returns a topic from an uploaded list that is semantically similar to the input caption.
+        Args:
+        - caption (str): The image caption generated by MS Azure.
+        Returns:
+        - str: The extracted topic based on the provided caption.
+        """
+    input_type_query = "search_query"
+    caption_embed = co.embed(texts=[caption], model=model_name, input_type=input_type_query).embeddings  # embeds a caption
+    topic_ids, _ = search_index.get_nns_by_vector(caption_embed[0], n=1, include_distances=True)  # retrieves the nearest category
+    topic = topics.iloc[topic_ids[0]]['topic_cleaned']
+    return topic
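With only 96 topic rows, an exact scan over all embeddings is a feasible cross-check for the approximate Annoy lookup. A minimal pure-Python sketch, assuming the angular distance `sqrt(2 * (1 - cos(u, v)))` that Annoy documents for `metric='angular'`; the helper names and the toy 2-D vectors are hypothetical and stand in for real Cohere embeddings:

```python
import math


def angular_distance(u, v):
    """sqrt(2 * (1 - cos(u, v))), the distance Annoy documents for metric='angular'."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = dot / (norm_u * norm_v)
    # Clamp at zero to absorb floating-point error when cosine is just above 1.
    return math.sqrt(max(0.0, 2.0 * (1.0 - cosine)))


def nearest_topic(query, embeddings, labels):
    """Exact nearest neighbour by scanning every embedding."""
    best = min(range(len(embeddings)), key=lambda i: angular_distance(query, embeddings[i]))
    return labels[best]


# Toy 2-D vectors standing in for real embeddings (made up for illustration).
labels = ["Robots", "Football match", "Family gathering"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
print(nearest_topic([0.9, 0.1], embeddings, labels))
```

Comparing this exact result with `get_nns_by_vector` on the same vectors is one way to check whether 10 trees give adequate recall.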