Asankhaya Sharma committed on
Commit 9f028c2
1 Parent(s): 5cbd50f
Files changed (2)
  1. README.md +13 -174
  2. app.py +0 -135
README.md CHANGED
@@ -1,174 +1,13 @@
- # Quivr
-
- <p align="center">
-   <img src="../logo.png" alt="Quivr-logo" width="30%">
- <p align="center">
-
- <a href="https://discord.gg/HUpRgp2HG8">
-   <img src="https://img.shields.io/badge/discord-join%20chat-blue.svg" alt="Join our Discord" height="40">
- </a>
-
- Quivr is your second brain in the cloud, designed to easily store and retrieve unstructured information. It's like Obsidian but powered by generative AI.
-
- ## Features
-
- - **Store Anything**: Quivr can handle almost any type of data you throw at it. Text, images, code snippets, you name it.
- - **Generative AI**: Quivr uses advanced AI to help you generate and retrieve information.
- - **Fast and Efficient**: Designed with speed and efficiency in mind. Quivr makes sure you can access your data as quickly as possible.
- - **Secure**: Your data is stored securely in the cloud and is always under your control.
- - **Compatible Files**:
-   - **Text**
-   - **Markdown**
-   - **PDF**
-   - **Audio**
-   - **Video**
- - **Open Source**: Quivr is open source and free to use.
-
- ## Demo
-
- ### Demo with GPT3.5
- https://github.com/StanGirard/quivr/assets/19614572/80721777-2313-468f-b75e-09379f694653
-
- ### Demo with Claude 100k context
- https://github.com/StanGirard/quivr/assets/5101573/9dba918c-9032-4c8d-9eea-94336d2c8bd4
-
- ## Getting Started
-
- These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
-
- ### Prerequisites
-
- Make sure you have the following installed before continuing:
-
- - Python 3.10 or higher
- - Pip
- - Virtualenv
-
- You'll also need a [Supabase](https://supabase.com/) account for:
-
- - A new Supabase project
- - Supabase Project API key
- - Supabase Project URL
-
- ### Installing
-
- - Clone the repository
-
- ```bash
- git clone git@github.com:StanGirard/Quivr.git && cd Quivr
- ```
-
- - Create a virtual environment
-
- ```bash
- virtualenv venv
- ```
-
- - Activate the virtual environment
-
- ```bash
- source venv/bin/activate
- ```
-
- - Install the dependencies
-
- ```bash
- pip install -r requirements.txt
- ```
-
- - Copy the streamlit secrets.toml example file
-
- ```bash
- cp .streamlit/secrets.toml.example .streamlit/secrets.toml
- ```
-
- - Add your credentials to the .streamlit/secrets.toml file
-
- ```toml
- supabase_url = "SUPABASE_URL"
- supabase_service_key = "SUPABASE_SERVICE_KEY"
- openai_api_key = "OPENAI_API_KEY"
- anthropic_api_key = "ANTHROPIC_API_KEY" # Optional
- ```
-
- _Note that the `supabase_service_key` is found in your Supabase dashboard under Project Settings -> API. Use the `anon` `public` key found in the `Project API keys` section._
-
- - Run the following migration scripts on the Supabase database via the web interface (SQL Editor -> `New query`)
-
- ```sql
- -- Enable the pgvector extension to work with embedding vectors
- create extension vector;
-
- -- Create a table to store your documents
- create table documents (
-   id bigserial primary key,
-   content text, -- corresponds to Document.pageContent
-   metadata jsonb, -- corresponds to Document.metadata
-   embedding vector(1536) -- 1536 works for OpenAI embeddings, change if needed
- );
-
- CREATE FUNCTION match_documents(query_embedding vector(1536), match_count int)
-   RETURNS TABLE(
-     id bigint,
-     content text,
-     metadata jsonb,
-     -- we return matched vectors to enable maximal marginal relevance searches
-     embedding vector(1536),
-     similarity float)
-   LANGUAGE plpgsql
-   AS $$
-   # variable_conflict use_column
-   BEGIN
-     RETURN query
-     SELECT
-       id,
-       content,
-       metadata,
-       embedding,
-       1 - (documents.embedding <=> query_embedding) AS similarity
-     FROM
-       documents
-     ORDER BY
-       documents.embedding <=> query_embedding
-     LIMIT match_count;
-   END;
-   $$;
- ```
-
- and
-
- ```sql
- create table
-   stats (
-     -- A column called "time" with data type "timestamp"
-     time timestamp,
-     -- A column called "details" with data type "text"
-     chat boolean,
-     embedding boolean,
-     details text,
-     metadata jsonb,
-     -- An "integer" primary key column called "id" that is generated always as identity
-     id integer primary key generated always as identity
-   );
- ```
-
- - Run the app
-
- ```bash
- streamlit run main.py
- ```
-
- ## Built With
-
- * [NextJS](https://nextjs.org/) - The React framework used.
- * [FastAPI](https://fastapi.tiangolo.com/) - The API framework used.
- * [Supabase](https://supabase.io/) - The open source Firebase alternative.
-
- ## Contributing
-
- Open a pull request and we'll review it as soon as possible.
-
- ## Star History
-
- [![Star History Chart](https://api.star-history.com/svg?repos=StanGirard/quivr&type=Date)](https://star-history.com/#StanGirard/quivr&Date)
 
+ ---
+ title: MeraKB
+ emoji: 📚
+ colorFrom: purple
+ colorTo: red
+ sdk: streamlit
+ sdk_version: 1.27.1
+ app_file: app.py
+ pinned: false
+ license: apache-2.0
+ ---
+
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
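For context on the README content removed above: the `match_documents` SQL function ranks rows by pgvector's cosine-distance operator `<=>` and returns `1 - (embedding <=> query_embedding)` as the similarity. A minimal pure-Python sketch of that scoring and ranking (function and variable names here are illustrative, not from the repository):

```python
import math

def cosine_similarity(a, b):
    # Equivalent to 1 - (a <=> b) in pgvector, where <=> is cosine distance.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def match_documents(query_embedding, rows, match_count):
    # Mirrors the SQL: order documents by cosine distance to the query
    # (i.e. descending similarity) and keep the top match_count.
    scored = [
        (doc_id, cosine_similarity(embedding, query_embedding))
        for doc_id, embedding in rows
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:match_count]
```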
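The app.py deleted below gates its selectable model list on which API keys are configured. That logic can be lifted into a standalone helper (a sketch; the signature is illustrative, the model names come from the deleted file):

```python
def available_models(openai_api_key=None, anthropic_api_key=None):
    # llama-2 is always offered; OpenAI and Anthropic models are added
    # only when their API keys are configured, as in the deleted app.py.
    models = ["llama-2"]
    if openai_api_key:
        models += ["gpt-3.5-turbo", "gpt-4"]
    if anthropic_api_key:
        models += ["claude-v1", "claude-v1.3",
                   "claude-instant-v1-100k", "claude-instant-v1.1-100k"]
    return models
```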
app.py DELETED
@@ -1,135 +0,0 @@
- # main.py
- import os
- import tempfile
-
- import streamlit as st
- from files import file_uploader, url_uploader
- from question import chat_with_doc
- from brain import brain
- from langchain.embeddings import HuggingFaceInferenceAPIEmbeddings
- from langchain.vectorstores import SupabaseVectorStore
- from supabase import Client, create_client
- from explorer import view_document
- from stats import get_usage_today
-
- supabase_url = st.secrets.supabase_url
- supabase_key = st.secrets.supabase_service_key
- openai_api_key = st.secrets.openai_api_key
- anthropic_api_key = st.secrets.anthropic_api_key
- hf_api_key = st.secrets.hf_api_key
- supabase: Client = create_client(supabase_url, supabase_key)
- self_hosted = st.secrets.self_hosted
-
- # embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
-
- embeddings = HuggingFaceInferenceAPIEmbeddings(
-     api_key=hf_api_key,
-     model_name="BAAI/bge-large-en-v1.5"
- )
-
- vector_store = SupabaseVectorStore(supabase, embeddings, query_name='match_documents', table_name="documents")
-
- models = ["llama-2"]
-
- if openai_api_key:
-     models += ["gpt-3.5-turbo", "gpt-4"]
-
- if anthropic_api_key:
-     models += ["claude-v1", "claude-v1.3",
-                "claude-instant-v1-100k", "claude-instant-v1.1-100k"]
-
- # Set the theme
- st.set_page_config(
-     page_title="meraKB",
-     layout="wide",
-     initial_sidebar_state="expanded",
- )
-
- st.title("🧠 meraKB - Your digital brain 🧠")
- st.markdown("Store your knowledge in a vector store and chat with it.")
- if self_hosted == "false":
-     st.markdown('**📢 Note: In the public demo, access to functionality is restricted. You can only use the GPT-3.5-turbo model and upload files up to 1Mb. To use more models and upload larger files, consider self-hosting meraKB.**')
-
- st.markdown("---\n\n")
-
- st.session_state["overused"] = False
- if self_hosted == "false":
-     usage = get_usage_today(supabase)
-     if usage > st.secrets.usage_limit:
-         st.markdown(
-             f"<span style='color:red'>You have used {usage} tokens today, which is more than your daily limit of {st.secrets.usage_limit} tokens. Please come back later or consider self-hosting.</span>", unsafe_allow_html=True)
-         st.session_state["overused"] = True
-     else:
-         st.markdown(f"<span style='color:blue'>Usage today: {usage} tokens out of {st.secrets.usage_limit}</span>", unsafe_allow_html=True)
-     st.write("---")
-
- # Initialize session state variables
- if 'model' not in st.session_state:
-     st.session_state['model'] = "llama-2"
- if 'temperature' not in st.session_state:
-     st.session_state['temperature'] = 0.1
- if 'chunk_size' not in st.session_state:
-     st.session_state['chunk_size'] = 500
- if 'chunk_overlap' not in st.session_state:
-     st.session_state['chunk_overlap'] = 0
- if 'max_tokens' not in st.session_state:
-     st.session_state['max_tokens'] = 500
-
- # Create a radio button for user to choose between adding knowledge or asking a question
- user_choice = st.radio(
-     "Choose an action", ('Add Knowledge', 'Chat with your Brain', 'Forget', "Explore"))
-
- st.markdown("---\n\n")
-
- if user_choice == 'Add Knowledge':
-     # Display chunk size and overlap selection only when adding knowledge
-     st.sidebar.title("Configuration")
-     st.sidebar.markdown(
-         "Choose your chunk size and overlap for adding knowledge.")
-     st.session_state['chunk_size'] = st.sidebar.slider(
-         "Select Chunk Size", 100, 1000, st.session_state['chunk_size'], 50)
-     st.session_state['chunk_overlap'] = st.sidebar.slider(
-         "Select Chunk Overlap", 0, 100, st.session_state['chunk_overlap'], 10)
-
-     # Create two columns for the file uploader and URL uploader
-     col1, col2 = st.columns(2)
-
-     with col1:
-         file_uploader(supabase, vector_store)
-     with col2:
-         url_uploader(supabase, vector_store)
- elif user_choice == 'Chat with your Brain':
-     # Display model and temperature selection only when asking questions
-     st.sidebar.title("Configuration")
-     st.sidebar.markdown(
-         "Choose your model and temperature for asking questions.")
-     if self_hosted != "false":
-         st.session_state['model'] = st.sidebar.selectbox(
-             "Select Model", models, index=models.index(st.session_state['model']))
-     else:
-         st.sidebar.write("**Model**: gpt-3.5-turbo")
-         st.sidebar.write("**Self Host to unlock more models such as claude-v1 and GPT4**")
-         st.session_state['model'] = "gpt-3.5-turbo"
-     st.session_state['temperature'] = st.sidebar.slider(
-         "Select Temperature", 0.1, 1.0, st.session_state['temperature'], 0.1)
-     if st.secrets.self_hosted != "false":
-         st.session_state['max_tokens'] = st.sidebar.slider(
-             "Select Max Tokens", 500, 4000, st.session_state['max_tokens'], 500)
-     else:
-         st.session_state['max_tokens'] = 500
-
-     chat_with_doc(st.session_state['model'], vector_store, stats_db=supabase)
- elif user_choice == 'Forget':
-     st.sidebar.title("Configuration")
-
-     brain(supabase)
- elif user_choice == 'Explore':
-     st.sidebar.title("Configuration")
-     view_document(supabase)
-
- st.markdown("---\n\n")
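The deleted app.py seeds `st.session_state` with defaults only when a key is absent, so Streamlit reruns preserve whatever the user last selected. The pattern reduces to the following sketch (using a plain dict in place of `st.session_state`; the helper name is illustrative):

```python
DEFAULTS = {
    "model": "llama-2",
    "temperature": 0.1,
    "chunk_size": 500,
    "chunk_overlap": 0,
    "max_tokens": 500,
}

def seed_defaults(session_state):
    # Only fill in keys the user has not already set; mirrors the
    # `if 'model' not in st.session_state:` checks in the deleted app.py.
    for key, value in DEFAULTS.items():
        session_state.setdefault(key, value)
    return session_state
```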