metadata
title: Brightly Ai
emoji: π
colorFrom: blue
colorTo: pink
sdk: gradio
python_version: 3.9.6
sdk_version: 4.36.1
app_file: app.py
pinned: false
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
Brightly AI
AI algorithms that map words provided by food rescue organizations onto a predefined dictionary supplied by the USDA.
Overview
This script processes a list of input words, classifies them as food or non-food items, and finds the most similar words from a predefined dictionary. It uses various techniques, including fast and slow similarity searches, GPT-3 queries, and a custom pluralizer.
Running
docker build -t brightly-ai .
docker run -p 7860:7860 brightly-ai
How It Works
- Initialization:
  - Database Connection: Connects to a database to store and retrieve word mappings.
  - Similarity Models: Initializes models to quickly and accurately find similar words.
  - Pluralizer: Handles singular and plural forms of words.
- Processing Input Words:
  - Reading Input: The script reads input words, either from a file or a predefined list.
  - Handling Multiple Items: If an input contains multiple items (separated by commas or slashes), it splits them and processes each item separately.
- Mapping Words:
  - Fast Similarity Search: Quickly finds the most similar word from the dictionary.
  - Slow Similarity Search: If the fast search is inconclusive, it performs a more thorough search.
  - Reverse Mapping: Attempts to find similar words by reversing the input word order.
  - GPT-3 Query: If all else fails, queries GPT-3 for recommendations.
- Classifying as Food or Non-Food:
  - Classification: Determines if the word is a food item.
  - Confidence Score: Assigns a score based on the confidence of the classification.
- Storing Results:
  - Database Storage: Stores the results in the database for future reference.
  - CSV Export: Saves the final results to a CSV file for easy access.
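The input splitting and the fallback order described above can be sketched roughly as follows. This is an illustration, not the code in this repo: `split_items`, `reverse_words`, `map_word`, the `fast` / `slow` / `ask_llm` callables, and the `0.8` threshold are all hypothetical names and values. In particular, running the slow search on the reversed word is an assumption, since the steps above do not say which search the reverse mapping uses.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical shape of a matcher: takes a word, returns (match, score) or None.
Match = Optional[Tuple[str, float]]
Matcher = Callable[[str], Match]


def split_items(text: str) -> List[str]:
    """Split an input on commas or slashes and drop empty pieces."""
    parts = re.split(r"[,/]", text)
    return [p.strip().lower() for p in parts if p.strip()]


def reverse_words(word: str) -> str:
    """Reverse the word order, e.g. 'soup tomato' becomes 'tomato soup'."""
    return " ".join(reversed(word.split()))


def map_word(word: str, fast: Matcher, slow: Matcher, ask_llm: Matcher,
             threshold: float = 0.8) -> Match:
    """Try each step in order and return the first confident match."""
    attempts = [
        lambda: fast(word),                 # fast similarity search
        lambda: slow(word),                 # slower, more thorough search
        lambda: slow(reverse_words(word)),  # reverse mapping (assumed to use slow)
    ]
    for attempt in attempts:
        result = attempt()
        if result is not None and result[1] >= threshold:
            return result
    # Last resort: ask the language model for a recommendation.
    return ask_llm(word)
```

With stub matchers, `split_items("Soup Tomato / pens")` yields two items, and `map_word` would resolve the first one through the reversed form if only `"tomato soup"` is known to the slow search.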
TODO
- [ ] Add requirements.txt file
- [ ] Add instructions re: each file in repo
Files and their purpose
The table below lists each file and briefly describes what it does.
Filename | Description |
---|---|
run.py | The main file to run the program. You pass it an array of words; it processes each word, saves the results to a CSV file in the results folder, and stores any new mappings in the SQLite database. |
algo_fast.py | Uses a fast version of our LLM to encode word embeddings, then uses cosine similarity to determine whether two words are similar. |
algo_slow.py | A similar version of the algorithm, but it uses a larger set of embeddings from the dictionary. |
multi_food_item_detector.py | Determines if the given string of text is multiple food items, or a single food item. |
update_pickle.py | Updates the dictionary pickle file with any new words that have been added to the dictionary/additions.csv file. |
add_mappings_to_embeddings.py | This takes all the reviewed mappings in the mappings database, and adds them to the embeddings file. |
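As a rough illustration of the cosine-similarity comparison mentioned for algo_fast.py, the sketch below ranks dictionary terms against a query vector. It is a minimal example under stated assumptions: the function names `cosine_similarity` and `most_similar` are hypothetical, and the three-dimensional vectors are made up for the demo, whereas the real embeddings come from the model.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query: List[float],
                 dictionary: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the dictionary term whose embedding is closest to the query."""
    scored = ((term, cosine_similarity(query, vec))
              for term, vec in dictionary.items())
    return max(scored, key=lambda pair: pair[1])


# Toy vectors, invented for illustration only.
toy_embeddings = {
    "apple": [0.9, 0.1, 0.0],
    "bread": [0.1, 0.9, 0.1],
}
best_term, score = most_similar([0.8, 0.2, 0.0], toy_embeddings)
print(best_term, round(score, 3))
```

Here the query vector points in nearly the same direction as the `"apple"` vector, so `"apple"` is returned as the best match.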