import pandas as pd

from delve import Delve

# A tiny sample of customer messages to categorize.
df = pd.DataFrame({
    "id": ["1", "2", "3"],
    "text": [
        "How do I reset my password?",
        "What are your pricing plans?",
        "I love this product!",
    ],
})

delve = Delve(use_case="Categorize customer feedback")
result = delve.run_sync(df, text_column="text", id_column="id")

# Convert results back to DataFrame
results_df = pd.DataFrame([
    {"id": doc.id, "category": doc.category, "explanation": doc.explanation}
    for doc in result.labeled_documents
])

print(results_df["category"].value_counts())
Use a predefined taxonomy when you already know your categories, want consistent labeling across runs, or need to match an existing classification system.
When the classifier is used (sample_size < total docs):
from delve import Delve

delve = Delve(sample_size=100)  # Will use classifier if > 100 docs
result = delve.run_sync("large_dataset.csv", text_column="text")

# Check if classifier was used
if "classifier_metrics" in result.metadata:
    metrics = result.metadata["classifier_metrics"]
    print("Classifier Performance:")
    print(f" Test Accuracy: {metrics['test_accuracy']:.1%}")
    print(f" Test F1 Score: {metrics['test_f1']:.3f}")
    print(f" Train Accuracy: {metrics['train_accuracy']:.1%}")
else:
    print("All documents were labeled by LLM (no classifier needed)")
# Basic CSV
delve run data.csv --text-column message

# JSON with nested path
delve run data.json --json-path "$.items[*].content"

# LangSmith project
delve run langsmith://my-project --langsmith-key $LANGSMITH_API_KEY

# Custom configuration
delve run data.csv --text-column text --sample-size 200 --use-case "Categorize support tickets"
When your data has class imbalance (some categories much more common than others), you may need to adjust parameters to ensure good classifier performance.
from delve import Delve

# Run Delve and check for imbalance issues
delve = Delve(sample_size=100, output_dir="./results")
result = delve.run_sync("data.csv", text_column="text")

# Check classifier performance
metrics = result.metadata.get("classifier_metrics", {})
print(f"Test F1: {metrics.get('test_f1', 'N/A')}")

# Check sample distribution.
# FIX: sample_dist was previously fetched but never used, so this step
# printed nothing — surface it so readers actually see the distribution.
sample_dist = result.metadata.get("sample_distribution", {})
print(f"Sample distribution: {sample_dist}")

zero_cats = result.metadata.get("zero_sample_categories", [])
if zero_cats:
    print(f"Warning: {len(zero_cats)} categories had no training examples")
    print(f" Missing: {zero_cats}")

# Check per-class performance, worst categories first
per_class = metrics.get("per_class_f1", {})
for cat, f1 in sorted(per_class.items(), key=lambda x: x[1]):
    if f1 < 0.5:
        print(f" Low F1 ({f1:.2f}): {cat}")
See the Handling Class Imbalance guide for a complete explanation of these metrics and how to tune them.
import asyncio

from delve import Delve, Verbosity

# Run taxonomy generation
delve = Delve(sample_size=200, verbosity=Verbosity.VERBOSE)
result = delve.run_sync("training_data.csv", text_column="content")

# Save the classifier for later
result.save_classifier("classifier.joblib")

# Export labeled docs for review.
# FIX: `await result.export()` is a SyntaxError at the top level of a plain
# script (`await` is only valid inside an async function); drive the
# coroutine with asyncio.run() instead.
asyncio.run(result.export())  # Creates labeled_documents.csv
# Train improved classifier from corrected labels
result = Delve.train_from_labeled(
    "corrected_labels.csv",
    text_column="content",
    label_column="category",
    taxonomy="taxonomy.json",  # Use original taxonomy
)

print(f"Improved Test F1: {result.metrics['test_f1']:.2%}")
result.save_classifier("production_classifier.joblib")
# Clone and setup
git clone https://github.com/anthropics/delve.git
cd delve
pip install -e .

# Set API keys
export ANTHROPIC_API_KEY="your-key"
export OPENAI_API_KEY="your-key"  # Required for classifier embeddings

# Run examples
cd examples
python basic_csv_example.py