Quick Start =================================== Installation ------------ Install chatan from PyPI: .. code-block:: bash pip install chatan Basic Usage ----------- Chatan uses async/await for concurrent API calls, which speeds up dataset generation significantly. 1. **Create a generator** .. code-block:: python import chatan gen = chatan.generator("openai", "YOUR_OPENAI_API_KEY") # or for Anthropic # gen = chatan.generator("anthropic", "YOUR_ANTHROPIC_API_KEY") 2. **Define your dataset schema** .. code-block:: python ds = chatan.dataset({ "language": chatan.sample.choice(["Python", "JavaScript", "Rust"]), "prompt": gen("write a coding question about {language}"), "response": gen("answer this question: {prompt}") }) 3. **Generate data (async)** .. code-block:: python import asyncio async def main(): # Generate 100 samples with concurrent API calls df = await ds.generate(n=100) # Save to file ds.save("my_dataset.parquet") return df df = asyncio.run(main()) Basic Evaluation ---------------- You can measure quality while you generate data or after rows are produced. Inline evaluation ^^^^^^^^^^^^^^^^^ .. code-block:: python import asyncio from chatan import dataset, eval, sample async def main(): ds = dataset({ "col1": sample.choice(["a", "a", "b"]), "col2": "b", "exact_match": eval.exact_match("col1", "col2") }) df = await ds.generate(n=100) return df df = asyncio.run(main()) Aggregate evaluation ^^^^^^^^^^^^^^^^^^^^ .. code-block:: python # After generating data aggregate = ds.evaluate({ "exact_match": ds.eval.exact_match("col1", "col2"), }) print(aggregate) Next Steps ---------- - Check out :doc:`datasets_and_generators` for more complex use cases - Browse the :doc:`api` reference for all available functions