Carlos Salgado committed on
Commit d665e88 • 1 Parent(s): 9ed2bd2

update backend files, ignore pycache

.gitignore CHANGED
@@ -3,4 +3,5 @@
  .env
  .venv
  .ipynb_checkpoints
- flake.nix
+ flake.nix
+ *__pycache__*
generate_metadata.py → backend/generate_metadata.py RENAMED
File without changes
backend/ingest.py ADDED
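@@ -0,0 +1,13 @@
+ from langchain_community.document_loaders import UnstructuredPDFLoader
+ from langchain_text_splitters import CharacterTextSplitter
+
+ def ingest_pdf(path):
+     # Assumed intent: load the PDF at `path`, then split it into chunks.
+     loader = UnstructuredPDFLoader(path)
+     documents = loader.load()
+
+     # Chunks of up to about 1000 characters, with no overlap between them.
+     text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
+     data = text_splitter.split_documents(documents)
+
+     return data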
schema.py → backend/schema.py RENAMED
File without changes