---
title: Brightly Ai
emoji: 👁
colorFrom: blue
colorTo: pink
sdk: gradio
python_version: 3.9.6
sdk_version: 4.36.1
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Brightly AI

AI Algorithms to classify words provided by food rescue organizations into a predefined dictionary given by the USDA.

## Overview

This script processes a list of input words, classifies them as food or non-food items, and finds the most similar words from a predefined dictionary. It uses various techniques, including fast and slow similarity searches, GPT-3 queries, and a custom pluralizer.

## Running

```
docker build -t brightly-ai .
docker run -p 7860:7860 brightly-ai
```

## How It Works

1. Initialization:
   - Database Connection: Connects to a database to store and retrieve word mappings.
   - Similarity Models: Initializes models to quickly and accurately find similar words.
   - Pluralizer: Handles singular and plural forms of words.
2. Processing Input Words:
   - Reading Input: The script reads input words, either from a file or a predefined list.
   - Handling Multiple Items: If an input contains multiple items (separated by commas or slashes), it splits them and processes each item separately.
3. Mapping Words:
   - Fast Similarity Search: Quickly finds the most similar word from the dictionary.
   - Slow Similarity Search: If the fast search is inconclusive, it performs a more thorough search.
   - Reverse Mapping: Attempts to find similar words by reversing the input word order.
   - GPT-3 Query: If all else fails, queries GPT-3 for recommendations.
4. Classifying as Food or Non-Food:
   - Classification: Determines if the word is a food item.
   - Confidence Score: Assigns a score based on the confidence of the classification.
5. Storing Results:
   - Database Storage: Stores the results in the database for future reference.
   - CSV Export: Saves the final results to a CSV file for easy access.
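
The splitting and mapping steps above can be sketched roughly as follows. This is only an illustration and not the code in `run.py`: the names `split_items`, `reverse_words`, and `map_word`, the matcher callables, and the `0.8` threshold are all hypothetical, and the real multi-item detection is likely more careful than a plain split on commas and slashes.

```python
import re
from typing import Callable, List, Optional, Tuple

# A matcher takes a word and returns (dictionary_entry, score),
# or None when it has no answer. This interface is an assumption.
Matcher = Callable[[str], Optional[Tuple[str, float]]]


def split_items(text: str) -> List[str]:
    """Split an input on commas or slashes and drop empty pieces."""
    return [part.strip() for part in re.split(r"[,/]", text) if part.strip()]


def reverse_words(word: str) -> str:
    """Reverse the word order, e.g. 'juice apple' becomes 'apple juice'."""
    return " ".join(reversed(word.split()))


def map_word(
    word: str,
    fast: Matcher,
    slow: Matcher,
    llm: Matcher,
    threshold: float = 0.8,  # hypothetical cutoff for a "conclusive" match
) -> Optional[Tuple[str, float]]:
    """Try each strategy in order and return the first conclusive match."""
    # 1. Fast similarity search.
    result = fast(word)
    if result is not None and result[1] >= threshold:
        return result

    # 2. Slow, more thorough similarity search.
    result = slow(word)
    if result is not None and result[1] >= threshold:
        return result

    # 3. Reverse mapping: retry the fast search with the word order reversed.
    reversed_word = reverse_words(word)
    if reversed_word != word:
        result = fast(reversed_word)
        if result is not None and result[1] >= threshold:
            return result

    # 4. Last resort: ask the language model for a recommendation.
    return llm(word)
```

Passing the matchers in as plain callables keeps the ordering of the fallbacks separate from how each search is implemented, so the chain can be exercised with stub functions before any model is loaded.
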
# TODO

- [ ] Add requirements.txt file
- [ ] Add instructions re: each file in repo

## Files and their purpose

The table below lists each file and briefly describes what it does.

| Filename                      | Description |
| ----------------------------- | ----------- |
| run.py                        | The main file to run the program. You pass it an array of words; it processes each word, saves the results to a CSV file in the results folder, and stores any new mappings in the SQLite database. |
| algo_fast.py                  | Uses a fast version of our LLM to encode word embeddings, then uses cosine similarity to determine whether words are similar. |
| algo_slow.py                  | A similar version of the algorithm that uses a larger set of embeddings from the dictionary. |
| multi_food_item_detector.py   | Determines whether the given string of text is multiple food items or a single food item. |
| update_pickle.py              | Updates the dictionary pickle file with any new words that have been added to the dictionary/additions.csv file. |
| add_mappings_to_embeddings.py | Takes all the reviewed mappings in the mappings database and adds them to the embeddings file. |
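
For readers unfamiliar with the cosine similarity step mentioned for `algo_fast.py`, here is a minimal pure-Python sketch. It assumes embeddings are plain lists of floats keyed by dictionary entry; the function names `cosine_similarity` and `most_similar` are hypothetical, and the actual files presumably work with model-produced vectors and a numerical library rather than this hand-rolled version.

```python
import math
from typing import Dict, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(
    query: Sequence[float],
    dictionary_embeddings: Dict[str, Sequence[float]],
) -> Tuple[str, float]:
    """Return the dictionary entry whose embedding is closest to the query.

    Raises ValueError if dictionary_embeddings is empty.
    """
    scored = (
        (name, cosine_similarity(query, vector))
        for name, vector in dictionary_embeddings.items()
    )
    return max(scored, key=lambda pair: pair[1])
```

With toy two-dimensional vectors, `most_similar([0.9, 0.1], {"apple": [1.0, 0.0], "bread": [0.0, 1.0]})` picks `"apple"`, because the query points almost entirely along the first axis.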