Code-Instruction Pairs Dataset

Overview

This dataset contains 10,000 synthetic instruction-response pairs for code generation, intended for LLM training.

Statistics

  • Size: 10,000 samples
  • Languages: Python, JavaScript, TypeScript, Go, Rust, Java, C++, C#, PHP, Ruby
  • Categories: Algorithm, Data Structure, API, Debugging, Refactoring, Testing
  • Difficulties: Easy, Medium, Hard

Format

JSONL with fields:

  • id: Unique identifier
  • instruction: Natural language coding task
  • response: Code solution
  • language: Programming language
  • difficulty: Difficulty level
  • category: Task category
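
As a quick sanity check on the format above, the following minimal sketch reports which of the listed fields are missing from a single JSONL line. The field names come from the list above; the helper name `missing_fields` and the example record are hypothetical and only for illustration. The card does not state the value types, so only the presence of each field is checked.

```python
import json

# Field names are taken from the list above. Value types are not specified
# by the card, so this sketch only checks that each field is present.
REQUIRED_FIELDS = ("id", "instruction", "response",
                   "language", "difficulty", "category")


def missing_fields(line):
    """Return the required field names absent from one JSONL line."""
    record = json.loads(line)
    return [name for name in REQUIRED_FIELDS if name not in record]


# Hypothetical record, made up for illustration (not taken from the dataset).
example_line = json.dumps({
    "id": "example-0",
    "instruction": "Write a function that reverses a string.",
    "response": "def reverse(s):\n    return s[::-1]",
    "language": "Python",
    "difficulty": "Easy",
    "category": "Algorithm",
})

print(missing_fields(example_line))  # [] because every field is present
```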

Usage

import json

# Each line of the JSONL file holds one JSON object (one sample).
with open('code_instruction_pairs.jsonl', encoding='utf-8') as f:
    for line in f:
        sample = json.loads(line)
        print(sample['instruction'])
        print(sample['response'])
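
To work with a subset, the samples can be filtered on the `language` and `difficulty` fields. The sketch below is one possible approach, not part of the dataset itself: `filter_samples` is a hypothetical helper, and the exact spelling and casing of the stored values (for example "Python" or "Easy") is an assumption, so the comparison ignores case.

```python
import json


def filter_samples(lines, language=None, difficulty=None):
    """Yield parsed samples that match the given language and difficulty.

    `lines` is any iterable of JSONL strings, such as an open file.
    A filter left as None is not applied.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        sample = json.loads(line)
        # Case-insensitive match, since the stored casing is an assumption.
        if (language is not None
                and str(sample.get("language", "")).lower() != language.lower()):
            continue
        if (difficulty is not None
                and str(sample.get("difficulty", "")).lower() != difficulty.lower()):
            continue
        yield sample


# Hypothetical in-memory records, made up for illustration.
demo_lines = [
    json.dumps({"id": "a", "language": "Python", "difficulty": "Easy"}),
    json.dumps({"id": "b", "language": "Go", "difficulty": "Easy"}),
    json.dumps({"id": "c", "language": "Python", "difficulty": "Hard"}),
]

easy_python = list(filter_samples(demo_lines, language="python", difficulty="easy"))
print([s["id"] for s in easy_python])  # ['a']
```

With the real file, an open file handle can be passed in place of `demo_lines`, since iterating over a file yields one line at a time.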

License

CC-BY-4.0 - Free for commercial use with attribution.
