	---
	title: Brightly Ai
	emoji: 👁
	colorFrom: blue
	colorTo: pink
	sdk: gradio
	python_version: 3.9.6
	sdk_version: 4.36.1
	app_file: app.py
	pinned: false
	---

	Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

	# Brightly AI

	AI algorithms that classify words provided by food rescue organizations against a predefined dictionary supplied by the USDA.

	## Overview

	This script processes a list of input words, classifies them as food or non-food items, and finds the most similar words from a predefined dictionary. It uses various techniques, including fast and slow similarity searches, GPT-3 queries, and a custom pluralizer.

	## Running

	```
	docker build -t brightly-ai .
	docker run -p 7860:7860 brightly-ai
	```

	## How It Works

	1. Initialization:

	- Database Connection: Connects to a database to store and retrieve word mappings.
	- Similarity Models: Initializes models to quickly and accurately find similar words.
	- Pluralizer: Handles singular and plural forms of words.

	2. Processing Input Words:

	- Reading Input: The script reads input words, either from a file or a predefined list.
	- Handling Multiple Items: If an input contains multiple items (separated by commas or slashes), it splits them and processes each item separately.
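
The splitting step can be illustrated with a minimal sketch. The function name `split_items` is hypothetical and is not taken from the repository; the only behavior assumed is the one described above (split on commas or slashes, then process each piece separately).

```python
import re


def split_items(text: str) -> list[str]:
    """Split an input string on commas or slashes and drop empty pieces.

    Hypothetical helper for illustration; the repository's own logic
    lives in multi_food_item_detector.py and may differ.
    """
    parts = re.split(r"[,/]", text)
    return [part.strip() for part in parts if part.strip()]


print(split_items("apples, bananas/oranges"))  # ['apples', 'bananas', 'oranges']
```

Each returned item would then go through the mapping steps on its own.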

	3. Mapping Words:

	- Fast Similarity Search: Quickly finds the most similar word from the dictionary.
	- Slow Similarity Search: If the fast search is inconclusive, it performs a more thorough search.
	- Reverse Mapping: Attempts to find similar words by reversing the input word order.
	- GPT-3 Query: If all else fails, queries GPT-3 for recommendations.
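
The fallback order above could be wired together roughly as follows. This is a sketch under stated assumptions: `map_word`, `reverse_words`, and the three matcher callables are hypothetical names, each matcher is assumed to return `None` when it finds nothing conclusive, and sending the reversed input through the fast search is a guess about how reverse mapping is applied.

```python
from typing import Callable, Optional

# A matcher takes an input word and returns a dictionary entry, or None.
Matcher = Callable[[str], Optional[str]]


def reverse_words(text: str) -> str:
    """Reverse the word order, e.g. 'beans black' -> 'black beans'."""
    return " ".join(reversed(text.split()))


def map_word(word: str, fast: Matcher, slow: Matcher, llm: Matcher) -> Optional[str]:
    """Try each strategy in order and return the first conclusive match."""
    attempts = (
        lambda: fast(word),                 # fast similarity search
        lambda: slow(word),                 # slower, more thorough search
        lambda: fast(reverse_words(word)),  # assumed: retry with reversed word order
        lambda: llm(word),                  # last resort: ask the language model
    )
    for attempt in attempts:
        match = attempt()
        if match is not None:
            return match
    return None


# Toy matchers standing in for the real models (illustrative data only).
toy_dictionary = {"black beans": "Beans, black"}
fast = toy_dictionary.get
slow = lambda w: None
llm = lambda w: None

print(map_word("beans black", fast, slow, llm))
```

Keeping each strategy behind the same callable signature means the expensive steps only run when the cheaper ones return nothing.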

	4. Classifying as Food or Non-Food:

	- Classification: Determines if the word is a food item.
	- Confidence Score: Assigns a score based on the confidence of the classification.

	5. Storing Results:

	- Database Storage: Stores the results in the database for future reference.
	- CSV Export: Saves the final results to a CSV file for easy access.
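
A minimal sketch of the storage step, using only the Python standard library. The table name, column names, and the `store_results` function are assumptions made for illustration; the repository's actual schema is not shown here.

```python
import csv
import sqlite3

# Hypothetical column layout for one result row.
COLUMNS = ["input_word", "dictionary_word", "is_food", "confidence"]


def store_results(rows, db_path, csv_path):
    """Persist result rows to a SQLite table and mirror them to a CSV file."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS mappings ("
                "input_word TEXT PRIMARY KEY, dictionary_word TEXT, "
                "is_food INTEGER, confidence REAL)"
            )
            # Re-running with the same input word overwrites the old mapping.
            conn.executemany(
                "INSERT OR REPLACE INTO mappings VALUES (?, ?, ?, ?)", rows
            )
    finally:
        conn.close()

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(COLUMNS)
        writer.writerows(rows)
```

A call such as `store_results([("apple", "Apples, raw", 1, 0.93)], "mappings.db", "results.csv")` would write one row to both places; the file names here are placeholders.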

	## TODO

	- [ ] Add requirements.txt file
	- [ ] Add instructions re: each file in repo

	## Files and their purpose

	The table below lists each file and briefly describes what it does.

	| Filename | Description |
	| ----------------------------- | ----------- |
	| run.py | The main file to run the program. You pass it an array of words; it processes each word, saves the results to a CSV file in the results folder, and stores any new mappings in the SQLite database. |
	| algo_fast.py | Uses a fast version of our LLM to encode word embeddings, then uses cosine similarity to determine whether they are similar. |
	| algo_slow.py | A similar version of the algorithm, but with a larger set of embeddings from the dictionary. |
	| multi_food_item_detector.py | Determines whether the given string of text is multiple food items or a single food item. |
	| update_pickle.py | Updates the dictionary pickle file with any new words that have been added to the dictionary/additions.csv file. |
	| add_mappings_to_embeddings.py | Takes all the reviewed mappings in the mappings database and adds them to the embeddings file. |
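
The cosine-similarity comparison mentioned for algo_fast.py and algo_slow.py can be sketched in plain Python. The vectors below are made-up toy values, not real model embeddings, and `most_similar` is a hypothetical helper rather than a function from the repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query, embeddings):
    """Return the (word, score) pair whose embedding is closest to the query."""
    scored = ((word, cosine_similarity(query, vec)) for word, vec in embeddings.items())
    return max(scored, key=lambda pair: pair[1])


# Toy two-dimensional "embeddings" for illustration only.
toy_embeddings = {"apple": [1.0, 0.0], "bread": [0.0, 1.0]}
word, score = most_similar([0.9, 0.1], toy_embeddings)
print(word, round(score, 3))
```

In practice the vectors would come from the embedding model, and a score threshold would decide whether the fast result counts as conclusive.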