metadata
title: Mock JSON Generator
emoji: 🃏
colorFrom: red
colorTo: yellow
sdk: gradio
sdk_version: 5.20.0
app_file: app.py
pinned: false

Mock JSON Generator

Generate test data for a JSON schema.

  1. The user inputs a JSON schema.
  2. A Zero-Shot Classification model maps each schema property to a faker function.
  3. The mapped faker functions are called and the mock JSON is returned.
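The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code in app.py: `GENERATORS`, `keyword_classifier`, `map_schema`, and `generate` are hypothetical names, the generators are standard-library stand-ins for faker functions, and the keyword matcher is a stand-in for the Zero-Shot Classification model so the sketch runs without downloading a model.

```python
import json
import random
from typing import Callable, Dict, List, Optional

# Hypothetical stand-ins for faker functions, keyed by classification label.
GENERATORS: Dict[str, Callable[[random.Random], object]] = {
    "name": lambda rng: rng.choice(["Ada Lovelace", "Alan Turing", "Grace Hopper"]),
    "email": lambda rng: f"user{rng.randint(1, 9999)}@example.com",
    "integer": lambda rng: rng.randint(0, 100),
    "word": lambda rng: rng.choice(["alpha", "beta", "gamma"]),
}


def keyword_classifier(prop_name: str, labels: List[str]) -> str:
    """Stand-in for the model: return the first label found in the property name."""
    lowered = prop_name.lower()
    for label in labels:
        if label in lowered:
            return label
    return "word"  # assumed fallback when nothing matches


def map_schema(schema: dict, classify=keyword_classifier) -> Dict[str, str]:
    """Step 2: classify each schema property once and remember its label."""
    labels = list(GENERATORS)
    return {prop: classify(prop, labels) for prop in schema.get("properties", {})}


def generate(mapping: Dict[str, str], count: int, seed: Optional[int] = None) -> List[dict]:
    """Step 3: produce records by calling the mapped generators, with no classifier involved."""
    rng = random.Random(seed)
    return [
        {prop: GENERATORS[label](rng) for prop, label in mapping.items()}
        for _ in range(count)
    ]


# Step 1: the user supplies a JSON schema.
schema = {
    "type": "object",
    "properties": {
        "full_name": {"type": "string"},
        "email_address": {"type": "string"},
        "age": {"type": "integer"},
    },
}

mapping = map_schema(schema)          # classification happens once
records = generate(mapping, count=3)  # generation can be repeated as often as needed
print(json.dumps(records, indent=2))
```

The point of the split is that `map_schema` is the only place a classifier is called. Note that the keyword stand-in cannot map `age` to anything useful and falls back to `word`; handling such properties is where a real classification model would be expected to help.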

Why

Although mock data can be generated by chatting with any popular large language model (LLM), generating massive amounts of it for performance testing or load testing would be an inefficient use of an LLM. With this approach, the classification model is only needed to map the schema properties. After that, mock data is produced by plain faker calls, with no further model inference, no per-request cost, and no significant CPU load.

Check the case_study directory to see why I decided to use Zero-Shot Classification for this solution.

Tradeoffs

Using a Zero-Shot Classification model rather than a full LLM introduces a flexibility vs. scalability tradeoff. A full LLM would provide greater flexibility in understanding complex schemas and generating more nuanced mock data. However, this approach scales better: once the classification model has mapped schema properties to faker functions, you can generate virtually unlimited mock data without the computational overhead or costs associated with repeated LLM calls.

Optimizations

  • More faker function mappings. With the current small model, more mappings to faker functions will most likely be needed.
  • Custom-trained model. A targeted task like this should use a custom model.
  • Custom extensions. For example, specify a list of hobbies to select from randomly during mock generation.
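As an illustration of the custom-extensions idea, one possible shape is a per-property override that replaces the mapped generator with a pick from a user-supplied list. This is only a sketch under assumptions: `choice_from`, `generate_records`, and the `base` generators are hypothetical and use the standard library rather than faker.

```python
import random
from typing import Callable, Dict, List, Optional

Generator = Callable[[random.Random], object]


def choice_from(values: List[str]) -> Generator:
    """Build a generator that picks uniformly from a user-supplied list."""
    if not values:
        raise ValueError("values must not be empty")
    options = list(values)  # copy so later changes to the caller's list have no effect
    return lambda rng: rng.choice(options)


def generate_records(
    base: Dict[str, Generator],
    overrides: Dict[str, Generator],
    count: int,
    seed: Optional[int] = None,
) -> List[dict]:
    """Generate records, letting user overrides win over the mapped generators."""
    generators = {**base, **overrides}
    rng = random.Random(seed)
    return [{prop: make(rng) for prop, make in generators.items()} for _ in range(count)]


HOBBIES = ["chess", "climbing", "baking"]

# Hypothetical result of the property mapping step.
base = {
    "hobby": lambda rng: "lorem",
    "age": lambda rng: rng.randint(18, 90),
}

records = generate_records(base, {"hobby": choice_from(HOBBIES)}, count=3, seed=42)
print(records)
```

Keeping overrides as ordinary generators means an extension needs no special handling in the generation loop.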