Switched from random.choices to random.sample wherever a random distinct set is needed, since choices samples with replacement and can return the same item twice. Discovered in pricing testing. b897a48 alfraser committed on Feb 5
Added separate key for the question count, as the counts were giving weird results f3f6cf6 alfraser committed on Feb 5
Added ability to set the number of testing threads dynamically from the UI fc8884e alfraser committed on Feb 5
Modified test runner to dispatch requests in parallel, since most of the time is spent waiting on the LLM. Defaulting to 16 threads. bb7db2c alfraser committed on Feb 1
Added runner for pricing fact checks to assess the level of fact embedding in the latest model c319c31 alfraser committed on Feb 1
Added a test runner page which allows you to run a batch of tests from the UI ab87be2 alfraser committed on Jan 24
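
For reference, a minimal sketch of the two techniques mentioned in the commits above: drawing a distinct set with random.sample (random.choices draws with replacement, so it can repeat items) and dispatching I/O-bound requests over a thread pool. This is not the repository's code; pick_distinct, fake_llm_call and run_batch are hypothetical names, and the 16-thread default is simply taken from the commit message.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def pick_distinct(items, k):
    """Return k distinct items from a list (hypothetical helper).

    random.sample draws without replacement, so no item repeats.
    random.choices(items, k=k) draws with replacement and may repeat.
    """
    return random.sample(items, k)


def fake_llm_call(question):
    # Hypothetical stand-in for a slow, I/O-bound request to an LLM.
    return f"answer to {question}"


def run_batch(questions, worker, max_threads=16):
    """Run worker over all questions in parallel threads.

    Threads suit this workload because each call mostly waits on the
    network. Executor.map returns results in input order.
    """
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(worker, questions))


if __name__ == "__main__":
    questions = [f"q{i}" for i in range(100)]
    chosen = pick_distinct(questions, 10)
    for answer in run_batch(chosen, fake_llm_call):
        print(answer)
```

Note that random.sample raises ValueError if k is larger than the number of items, which random.choices does not, so callers that switch between the two may need to cap k.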