nicolay-r 
posted an update Sep 30, 2024
📢 Bulk-querying a remotely accessed LLM 🤖 with Chain-of-Thought (CoT) 🔗 over a massive amount of data might result in connection loss.
That may raise a Python exception 💥 and make it hard to restore the content generated so far.
To address this problem, I'm sharing a no-strings, tiny framework that uses SQLite3 to cache each query.
Such caching allows a smooth relaunch after any data loss. ☕
With that, I'm happy to share the bulk-chain project; more on it in the links below:

⭐ github: https://github.com/nicolay-r/bulk-chain
📦 PyPI: https://pypi.org/project/bulk-chain/

There are three steps to a quick start
(see them in the attachment 👇):
✅ 1. Install the library
✅ 2. Declare the CoT schema in a JSON file 📄
✅ 3. Wrap your transformer or use one of the existing adapters:
https://github.com/nicolay-r/bulk-chain/tree/master/ext

For example, here is the provider for Replicate IO service (https://replicate.com/):
https://github.com/nicolay-r/bulk-chain/blob/master/ext/replicate.py
which supports one of the largest publicly available models, LLaMA-3.1-405B:
meta-llama/Llama-3.1-405B-Instruct

I don't understand how this works, but I will give it a try.


Hi @jharshraj ,

Thanks for your interest! That's fine, let me clarify a bit more.
This script caches your CoT responses in an SQLite3 database.
The database is created right away in the same folder as your CSV data and is populated while the LLM handles the rows from the CSV.
If you run into exceptions raised by connection issues, the script will stop.
Since the cache is saved, all you have to do is relaunch the script to continue: your progress is secured in the SQLite3 database.
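
To make the relaunch idea more concrete, here is a minimal sketch in plain Python. It is only an illustration of the general technique and not the actual bulk-chain code: the table layout, the function names (`run_with_cache`, `load_cache`, `make_flaky_llm`, `demo`) and the `id` / `text` CSV columns are all assumptions made up for this example, and the "LLM" is a stand-in that simulates one connection loss.

```python
import csv
import io
import os
import sqlite3
import tempfile


def run_with_cache(rows, ask_llm, db_path):
    """Send each row to the LLM, skipping rows answered in an earlier launch."""
    con = sqlite3.connect(db_path)
    # Hypothetical cache layout: one response per row id.
    con.execute(
        "CREATE TABLE IF NOT EXISTS cache (row_id TEXT PRIMARY KEY, response TEXT)"
    )
    try:
        for row in rows:
            hit = con.execute(
                "SELECT 1 FROM cache WHERE row_id = ?", (row["id"],)
            ).fetchone()
            if hit:
                continue  # already cached, nothing to resend
            response = ask_llm(row["text"])  # may raise on connection loss
            con.execute("INSERT INTO cache VALUES (?, ?)", (row["id"], response))
            con.commit()  # persist right away so a later crash keeps this row
    finally:
        con.close()


def load_cache(db_path):
    """Return all cached responses as a {row_id: response} dict."""
    con = sqlite3.connect(db_path)
    try:
        return dict(con.execute("SELECT row_id, response FROM cache"))
    finally:
        con.close()


def make_flaky_llm(fail_on_call):
    """Stand-in for a remote LLM: upper-cases text and fails once."""
    calls = {"n": 0}

    def ask(text):
        calls["n"] += 1
        if calls["n"] == fail_on_call:
            raise ConnectionError("simulated connection loss")
        return text.upper()

    return ask, calls


def demo():
    data = io.StringIO("id,text\n1,first\n2,second\n3,third\n")
    rows = list(csv.DictReader(data))
    db_path = os.path.join(tempfile.mkdtemp(), "cache.sqlite3")
    ask, calls = make_flaky_llm(fail_on_call=2)
    try:
        run_with_cache(rows, ask, db_path)
    except ConnectionError:
        pass  # first launch stops here; row 1 is already saved
    run_with_cache(rows, ask, db_path)  # relaunch: only rows 2 and 3 are sent
    return load_cache(db_path), calls["n"]


if __name__ == "__main__":
    print(demo())
```

In this sketch the commit after every insert is what makes the relaunch cheap: the first launch fails on the second request, and the second launch skips row 1 and only sends the two rows that are still missing.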