chore: increased scrape counts; disabled PDF check in scholar model a6fbfb6 eljanmahammadli committed on Sep 26, 2024
added pagination to Google search, now retrieving more sites 5650543 eljanmahammadli committed on Sep 26, 2024
#feat: added YouTube as RAG input; removed standard humanizer 744d9e3 eljanmahammadli committed on Sep 24, 2024
#perf: quality improvements to website scraping + PDF detection logic d904dd4 eljanmahammadli committed on Sep 23, 2024
#perf: added hybrid search using BM25 + semantic; minor changes to text, splitter, and retrieval hyperparameters 8b9c9ff eljanmahammadli committed on Sep 23, 2024
#bugfix: 301 response resolved by explicitly setting follow_redirects for httpx 6f4a113 eljanmahammadli committed on Sep 20, 2024
fixes to inline citation, HTML; enabled debug; added Chroma DB cache cleanup ba91632 eljanmahammadli committed on Sep 6, 2024
fixed double space in generated text + changed humanizer to batched 24a0ba5 minko186 committed on Sep 4, 2024
cleaned up output format + switched all records of text to new format c1769c1 minko186 committed on Aug 30, 2024
Added all-mpnet-base-v2, a 5x larger embedding model 95168db eljanmahammadli committed on Aug 24, 2024
removed content_string (unused) + cleaned non-printable Unicode chars + added PyMuPDF reading for PDF URLs a62cc34 minko186 committed on Aug 23, 2024
added new decoder-only LM as a humanizer + UI support e2a79fa eljanmahammadli committed on Aug 23, 2024
Added MC model to UI and removed some unnecessary code 5534eb0 eljanmahammadli committed on Aug 19, 2024
added mail as format type and plain text to structure d09cdf3 eljanmahammadli committed on Aug 19, 2024
history auto refresh + RAG search term includes topic and context 2a53cb7 minko186 committed on Aug 13, 2024
changed split logic to resolve short generated text, more search websites, and some logging 59fbf6a eljanmahammadli committed on Aug 13, 2024
merged main + multi PDFs + updated HTML cleaning + better references 43d4e83 minko186 committed on Aug 7, 2024