title: Mock JSON Generator
emoji: π
colorFrom: red
colorTo: yellow
sdk: gradio
sdk_version: 5.20.0
app_file: app.py
pinned: false
Mock JSON Generator
Generate test data for a JSON schema.
- The user supplies a JSON schema.
- A Zero-Shot Classification model maps each schema property to a faker function.
- The mapped faker functions are called and the resulting mock JSON is returned.
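The three steps above can be sketched roughly as follows. This is a dependency-free illustration under stated assumptions, not the app's actual code: `classify_property` is a hypothetical stand-in that uses plain string similarity (`difflib`) where the app uses a Zero-Shot Classification model, and the `GENERATORS` table is a hypothetical stand-in for faker functions. The names `map_schema` and `generate` are likewise made up for this example.

```python
import difflib
import json
import random
import string

# Hypothetical stand-ins for faker functions, keyed by a candidate label.
GENERATORS = {
    "name": lambda rng: rng.choice(["Alice Smith", "Bob Jones", "Carol Diaz"]),
    "email": lambda rng: "".join(rng.choices(string.ascii_lowercase, k=8)) + "@example.com",
    "city": lambda rng: rng.choice(["Lisbon", "Osaka", "Denver"]),
    "age": lambda rng: rng.randint(18, 90),
}


def classify_property(prop_name):
    """Stand-in for the classifier: return the label most similar to the name."""
    labels = list(GENERATORS)
    # cutoff=0.0 guarantees that some label is always returned.
    return difflib.get_close_matches(prop_name.lower(), labels, n=1, cutoff=0.0)[0]


def map_schema(schema):
    """Step 2: classify each schema property once."""
    return {prop: classify_property(prop) for prop in schema["properties"]}


def generate(mapping, count, seed=None):
    """Step 3: call the mapped generators; the classifier is not involved here."""
    rng = random.Random(seed)
    return [
        {prop: GENERATORS[label](rng) for prop, label in mapping.items()}
        for _ in range(count)
    ]


# Step 1: the user-supplied JSON schema.
schema = {
    "type": "object",
    "properties": {
        "full_name": {"type": "string"},
        "email": {"type": "string"},
        "city": {"type": "string"},
        "age": {"type": "integer"},
    },
}

mapping = map_schema(schema)                     # done once per schema
records = generate(mapping, count=1000, seed=0)  # repeatable as often as needed
print(json.dumps(records[0], indent=2))
```

The point of the split is that `map_schema` is the only step that would need a model; `generate` can be called again and again with the same mapping.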
Why
Although mock data can be generated by chatting with any popular large language model (LLM), generating massive amounts of mock data for performance testing or load testing would be an inefficient use of an LLM. With this approach, once the classification model has mapped the schema properties, mock data can be generated instantly, for free, with no significant CPU load.
Check the case_study directory to see why I decided to use Zero-Shot Classification for this solution.
Tradeoffs
Using a Zero-Shot Classification model rather than a full LLM trades flexibility for scalability. A full LLM would be more flexible in understanding complex schemas and generating more nuanced mock data. However, this approach scales better: once the classification model has mapped schema properties to faker functions, you can generate virtually unlimited mock data instantly, without the computational overhead or costs of repeated LLM calls.
Optimizations
- More faker function mappings. With the current small model, more mappings to faker functions will most likely be needed.
- Custom-trained model. A targeted task like this should use a custom model.
- Custom extensions. For example, specify a list of hobbies to select from at random during mock generation.
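The custom-extensions idea in the last item might look something like the sketch below. Everything here is hypothetical: the `extensions` dictionary, the `generate_with_extensions` function, and the stand-in `generators` table are invented for illustration and are not part of the current app. The assumption is that a user-supplied list of values for a property takes precedence over whatever generator the classifier mapped it to.

```python
import random


def generate_with_extensions(mapping, generators, extensions, count, seed=None):
    """Generate records, drawing from user-supplied value lists where given."""
    rng = random.Random(seed)
    records = []
    for _ in range(count):
        record = {}
        for prop, label in mapping.items():
            if prop in extensions:
                # A custom list overrides the mapped generator.
                record[prop] = rng.choice(extensions[prop])
            else:
                record[prop] = generators[label](rng)
        records.append(record)
    return records


# Hypothetical stand-in for a faker function.
generators = {"age": lambda rng: rng.randint(18, 90)}
# Property -> label, as a classifier might have produced it.
mapping = {"age": "age", "hobby": "word"}
# User-supplied extension: hobbies to pick from at random.
extensions = {"hobby": ["chess", "climbing", "baking"]}

records = generate_with_extensions(mapping, generators, extensions, count=5, seed=1)
```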