rhesis (Rhesis AI GmbH)

Open-source test generation SDK for LLM applications.

Rhesis AI provides curated and dynamically generated test sets to evaluate LLM applications under diverse conditions. These datasets help assess robustness, reliability, and compliance in real-world scenarios.

Using our datasets

Our datasets are designed to test various aspects of LLM application behavior, from reliability to safety and bias detection. To get started:

Browse the available test sets here on Hugging Face.
Select the dataset that aligns with your evaluation needs.
Load and apply the test cases to assess your application’s behavior.
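
The three steps above might look like the following in Python. This is a minimal sketch, not the Rhesis SDK: it assumes the Hugging Face `datasets` package is installed, that the chosen test set exposes a `train` split, and that each row carries its test input in a column named `prompt`. The split, the column name, and the helper names `load_test_set`, `run_test_cases`, and `demo` are illustrative assumptions; check the dataset viewer for the actual schema. The repository ID is one of the test sets listed on this page.

```python
from typing import Any, Callable, Dict, Iterable, List, Mapping


def load_test_set(repo_id: str, split: str = "train"):
    """Download one test set from the Hugging Face Hub.

    The "train" split is an assumption; check the dataset page for the real one.
    """
    # Imported lazily so run_test_cases works even without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id, split=split)


def run_test_cases(
    rows: Iterable[Mapping[str, Any]],
    app: Callable[[str], str],
    prompt_field: str = "prompt",  # assumed column name, not confirmed by the source
) -> List[Dict[str, str]]:
    """Send each test-case prompt to `app` and collect prompt/response pairs."""
    results = []
    for row in rows:
        prompt = row[prompt_field]
        results.append({"prompt": prompt, "response": app(prompt)})
    return results


def demo() -> None:
    """Load one listed test set and run it against a placeholder application."""

    def echo_app(prompt: str) -> str:
        # Stand-in for a call to your real LLM application.
        return f"(no model attached) {prompt}"

    test_set = load_test_set("rhesis/Insurance-Chatbot-Finance-Jailbreak")
    for result in run_test_cases(test_set, echo_app)[:3]:
        print(result)
```

Calling `demo()` downloads the test set and prints the first three prompt/response pairs. Replacing `echo_app` with a function that calls your own application is the only change needed to run a real evaluation, provided the column name matches.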

For more advanced testing and seamless integration, the Rhesis SDK provides tools to automate dataset handling, generate structured test cases, and streamline evaluation workflows.

Key features

Curated Test Sets – Pre-built datasets covering diverse evaluation criteria.
Dynamic Test Generation – Generate custom test sets tailored to specific use cases.
Scalability – Use datasets for one-off evaluations or integrate them into automated testing pipelines.
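
For the automated-pipeline case, one possible shape is a pass-rate gate that a CI job evaluates after running a test set. The sketch below is an assumption about how such a gate could be written, not part of the Rhesis SDK: the names `pass_rate` and `passes_gate`, the `is_acceptable` judge, and the 0.95 default threshold are all hypothetical.

```python
from typing import Any, Callable, Mapping, Sequence, Tuple


def pass_rate(
    results: Sequence[Mapping[str, Any]],
    is_acceptable: Callable[[Mapping[str, Any]], bool],
) -> float:
    """Return the fraction of results that the judge function accepts."""
    if not results:
        raise ValueError("no results to score")
    passed = sum(1 for result in results if is_acceptable(result))
    return passed / len(results)


def passes_gate(
    results: Sequence[Mapping[str, Any]],
    is_acceptable: Callable[[Mapping[str, Any]], bool],
    threshold: float = 0.95,  # hypothetical default; pick one that fits your risk tolerance
) -> Tuple[bool, float]:
    """Return (gate passed?, observed pass rate) for one evaluation run."""
    rate = pass_rate(results, is_acceptable)
    return rate >= threshold, rate
```

A pipeline step could call `passes_gate` on the collected responses and exit with a non-zero status when the first element is `False`. How `is_acceptable` decides (keyword checks, a classifier, human labels) is left open here.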

For questions or custom datasets, reach out at hello@rhesis.ai.

Example use cases

AI Financial Advisor – Evaluate the reliability and accuracy of financial guidance provided by LLM applications, ensuring sound advice for users.
AI Claim Processing – Test for and eliminate biases in LLM-supported claim decisions, ensuring fair and compliant processing of insurance claims.
AI Sales Advisor – Validate the accuracy of product recommendations, enhancing customer satisfaction and driving more successful sales.
AI Support Chatbot – Ensure that your chatbot consistently delivers helpful, accurate, and empathetic responses across various scenarios.

Disclaimer

Some test cases may contain sensitive, challenging, or potentially upsetting content. These cases are included to ensure thorough and realistic assessments. Review test cases carefully and exercise discretion when using them.

Connect with us

For more details about our testing platform, datasets, and solutions, including the Rhesis AI SDK, visit Rhesis AI.
Join our Discord community to connect with other AI engineers, discuss best practices, and stay updated on new test sets.

models

None public yet

datasets 62

rhesis/Insurance-Chatbot-Finance-Jailbreak

rhesis/European-E-commerce-Chatbot-Social-Norms-Toxic

rhesis/Insurance-Chatbot-Homeowner-Fraud-Harmful

rhesis/Insurance-Chatbot-Regulatory-Requirements-Harmless

rhesis/Insurance-Chatbot-Business-Strategy-Jailbreak

rhesis/Insurance-Chatbot-Homeowner-Fraud-Jailbreak

rhesis/Insurance-Chatbot-Health-Care-Fraud-Jailbreak

rhesis/Telecom-Chatbot-Privacy-and-Data-Protection-Harmless

rhesis/Rhesis-European-E-commerce-Chatbot-Benchmark

rhesis/Insurance-Chatbot-Cost-and-Charges-Harmless

Team members 2