chatan documentation

Create synthetic datasets with LLM generators and samplers.

Installation

pip install chatan

Quick Start

import chatan

# Create a generator
gen = chatan.generator("openai", "YOUR_API_KEY")

# Define a dataset schema; {topic} and {prompt} refer to other columns in the same row
ds = chatan.dataset({
    "topic": chatan.sample.choice(["Python", "JavaScript", "Rust"]),
    "prompt": gen("write a programming question about {topic}"),
    "response": gen("answer this question: {prompt}")
})

# Generate 10 rows; a progress bar is shown during generation
df = ds.generate(n=10)
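
To make the schema idea concrete, here is a minimal, library-free sketch of how a schema like the one above could be filled in row by row. It is not chatan's implementation. The helpers `choice`, `template`, `generate`, and `fake_llm` are hypothetical names invented for this illustration, and `fake_llm` stands in for a real model call so the sketch runs without an API key. It also assumes each column is listed after the columns it references.

```python
import random
import re

# Matches {column_name} placeholders inside a prompt template.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def choice(options):
    """Hypothetical sampler: picks one option for each row."""
    return lambda row: random.choice(options)


def template(text, complete):
    """Hypothetical generator column: fills placeholders, then calls `complete`."""
    def fill(row):
        prompt = PLACEHOLDER.sub(lambda m: str(row[m.group(1)]), text)
        return complete(prompt)
    return fill


def generate(schema, n):
    """Build n rows, resolving columns in the order they are declared."""
    rows = []
    for _ in range(n):
        row = {}
        for name, spec in schema.items():  # dicts preserve insertion order
            row[name] = spec(row)
        rows.append(row)
    return rows


def fake_llm(prompt):
    """Stand-in for a real LLM call (assumption: returns a string)."""
    return f"<completion for: {prompt}>"


schema = {
    "topic": choice(["Python", "JavaScript", "Rust"]),
    "prompt": template("write a programming question about {topic}", fake_llm),
    "response": template("answer this question: {prompt}", fake_llm),
}

rows = generate(schema, n=3)
```

Each row is a plain dict with the keys `topic`, `prompt`, and `response`. Because `response` is resolved after `prompt`, the generated question is embedded in the answer prompt, which mirrors the chaining in the Quick Start schema.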
