Get up and running with Delve in just a few minutes.

Installation

pip install delve-taxonomy

Set API Key

Delve uses Claude for taxonomy generation. Set your Anthropic API key:
export ANTHROPIC_API_KEY="your-api-key-here"
Get your API key from the Anthropic Console.
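
Before starting a run, it can help to confirm that the key is actually visible to the process. The following is a minimal sketch using only the Python standard library; the helper name `api_key_is_set` is hypothetical and is not part of Delve.

```python
import os


def api_key_is_set(env=None):
    """Return True if ANTHROPIC_API_KEY is present and non-empty.

    `env` defaults to the current process environment; passing a dict
    makes the check easy to exercise without touching real settings.
    This helper is illustrative only and is not part of Delve.
    """
    env = os.environ if env is None else env
    return bool(env.get("ANTHROPIC_API_KEY", "").strip())
```

Calling `api_key_is_set()` in the same shell session where you ran `export` should return `True`; if it returns `False`, the variable was probably set in a different session.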

Run Your First Taxonomy Generation

Process a CSV file with text data:
delve run data.csv --text-column conversation
Results are saved to the ./results/ directory, which will contain the taxonomy, the labeled documents, and reports.
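
Since the command above depends on the name passed to --text-column, a quick header check may save a failed run. This is a sketch with the standard `csv` module, assuming a UTF-8 file whose first row is the header; `has_text_column` is a hypothetical helper, not a Delve function.

```python
import csv


def has_text_column(csv_path, column):
    """Return True if the CSV header row contains `column`.

    Assumes the first row of the file is the header and the file is
    UTF-8 encoded. Illustrative helper, not part of Delve.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        header = next(csv.reader(f), [])  # [] if the file is empty
    return column in header
```

For the example command, `has_text_column("data.csv", "conversation")` should return `True` before you run Delve.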

View Results

Check your output directory for these files:
  • taxonomy.json - Machine-readable taxonomy
  • labeled_documents.json - Documents with categories
  • labeled_data.csv - Spreadsheet format
  • report.md - Human-readable summary
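
To get a quick look at these files from Python, something like the sketch below may be enough. It assumes only the file names listed above and deliberately does not assume any schema for the JSON files; `summarize_results` is a hypothetical helper, not part of Delve.

```python
import csv
import json
from pathlib import Path


def summarize_results(results_dir="results"):
    """Summarize whichever of the documented output files are present.

    JSON files are reported by top-level type only, because their
    schema is not assumed here. The CSV is reported by header and
    data-row count. Illustrative helper, not part of Delve.
    """
    root = Path(results_dir)
    summary = {}
    for name in ("taxonomy.json", "labeled_documents.json"):
        path = root / name
        if path.exists():
            data = json.loads(path.read_text(encoding="utf-8"))
            summary[name] = type(data).__name__
    csv_path = root / "labeled_data.csv"
    if csv_path.exists():
        with csv_path.open(newline="", encoding="utf-8") as f:
            rows = list(csv.reader(f))
        summary["labeled_data.csv"] = {
            "columns": rows[0] if rows else [],
            "rows": max(len(rows) - 1, 0),  # exclude the header row
        }
    return summary
```

Running `summarize_results("results")` after a run returns a small dictionary you can print to confirm that the outputs were written.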

Common Use Cases

CSV Files

Process customer feedback, support tickets, or survey responses from CSV files.

JSON Data

Handle API responses, logs, or structured data with JSONPath support.

DataFrames

Work directly with pandas DataFrames for in-memory processing.

LangSmith

Analyze LangSmith project runs to categorize LLM interactions.

Quick Alternative: Binary Detection

If you already know the single category you’re looking for, use find_matches() for faster results:
from delve import Delve

# Find all refund-related traces in seconds, not minutes
result = Delve.find_matches(
    "data.csv",
    category={
        "name": "Refund Request",
        "description": "User asking for refund or money back",
        "keywords": ["refund", "money back", "cancel"],
    },
    text_column="content",
)

print(f"Found {result.stats['matches']} matches")
Binary detection is ~10x faster and ~5x cheaper than full taxonomy generation. Use it when you know what you’re looking for. See Binary Detection for details.

What Happens During Processing?

  1. Sampling - Delve samples your dataset (default: 100 documents)
  2. Summarization - Each document is summarized using Claude Haiku
  3. Clustering - Documents are grouped into minibatches and analyzed iteratively
  4. Taxonomy Generation - Categories are discovered and refined across batches
  5. Validation - The final taxonomy is reviewed for quality
  6. Labeling - All documents are categorized with explanations
  7. Export - Results are saved in multiple formats
For large datasets, Delve automatically samples documents to ensure efficient processing while maintaining representative coverage.
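
The sampling and minibatching steps above can be pictured with a short, generic sketch. This is not Delve's implementation: the function names, the fixed seed, and the `batch_size` value are assumptions for illustration, and only the default sample size of 100 comes from the list above.

```python
import random


def sample_documents(docs, sample_size=100, seed=0):
    """Return up to `sample_size` documents chosen uniformly at random.

    If the dataset is already small enough, it is returned unchanged.
    The seed is fixed only to make this sketch reproducible.
    """
    if len(docs) <= sample_size:
        return list(docs)
    return random.Random(seed).sample(docs, sample_size)


def minibatches(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


docs = [f"document {i}" for i in range(250)]
sample = sample_documents(docs)           # 100 of the 250 documents
batches = list(minibatches(sample, 32))   # batch_size=32 is hypothetical
print(len(sample), [len(b) for b in batches])
```

In a pipeline of this shape, each minibatch would be analyzed in turn and the categories refined across batches, as steps 3 and 4 describe.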
