Classification Strategies

Different business needs require different classification approaches. This guide explains three core strategies—when to use each, how to configure them, and what results to expect.

Understanding the Trade-offs

All classification strategies balance three factors:

Coverage: What percentage of your products get a category returned

Accuracy: How often the returned category is correct and appropriate

Specificity: Whether categories are detailed (leaf level) or general (may include parents)

No single configuration optimizes all three. Your strategy depends on which factors matter most for your use case.

Strategy 1: Maximum Coverage

Priority: Get a category for every product, even if confidence is lower

Best for:

  • Initial catalog imports where you’ll review results manually
  • High-volume operations where downstream processes need some categorization
  • Situations where having an approximate category is better than none
  • Products with limited or inconsistent descriptions

Configuration

Taxonomy: [your choice]
Custom instructions: Optional
Leaf only: OFF
Use top-ranked: ON

How It Works

With leaf-only disabled, the candidate set includes both specific and general categories. The AI can choose parents when uncertain about siblings. With top-ranked fallback enabled, even when the AI cannot decide, the system returns the highest-scored candidate.

This produces maximum coverage because you’ll get results in nearly every scenario:

  • Green indicators when the AI confidently selects a category
  • Yellow indicators when the system falls back to top-ranked scores
  • Black indicators only in rare cases where no relevant candidates exist
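
The behavior described above can be modeled as a small function. This is only a sketch of the documented outcomes, not the service's actual implementation: `resolveCategory`, the candidate shape (`name`, `score`, `isLeaf`), and the `aiPick` argument are all hypothetical names introduced for illustration.

```javascript
// Hypothetical model of the documented behavior; not the real pipeline.
// candidates: [{ name, score, isLeaf }]
// aiPick: the candidate name the AI selected, or null when it cannot decide.
function resolveCategory(candidates, aiPick, { leafOnly, useTopRanked }) {
  // "Leaf only: ON" removes parent categories from the candidate set.
  const pool = leafOnly ? candidates.filter((c) => c.isLeaf) : candidates;

  // No relevant candidates: nothing can be returned.
  if (pool.length === 0) return { status: "black", category: null };

  // The AI confidently selected a category from the pool.
  const picked = pool.find((c) => c.name === aiPick);
  if (picked) return { status: "green", category: picked.name };

  // The AI could not decide: optionally fall back to the highest-scored candidate.
  if (useTopRanked) {
    const top = pool.reduce((best, c) => (c.score > best.score ? c : best));
    return { status: "yellow", category: top.name };
  }
  return { status: "black", category: null };
}

// Example data (made up for illustration).
const candidates = [
  { name: "Footwear", score: 0.62, isLeaf: false },
  { name: "Running Shoes", score: 0.58, isLeaf: true },
  { name: "Hiking Boots", score: 0.55, isLeaf: true },
];
const maxCoverage = { leafOnly: false, useTopRanked: true };

console.log(resolveCategory(candidates, "Footwear", maxCoverage)); // AI chose a parent
console.log(resolveCategory(candidates, null, maxCoverage)); // top-ranked fallback
console.log(resolveCategory([], null, maxCoverage)); // no candidates
```

Switching `useTopRanked` to `false` in this model turns the fallback case into a black result, and switching `leafOnly` to `true` limits both the AI's choice and the fallback to leaf categories.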

What to Expect

Coverage: Expect the highest coverage of all strategies—nearly every product will receive a category.

Accuracy: Green results will be fairly accurate, but expect lower accuracy for yellow results. The top-ranked fallback mechanism prioritizes coverage over precision.

Specificity: You’ll get a mix of leaf and parent categories. The AI chooses the appropriate level based on confidence—specific when certain, general when ambiguous.

Status indicators: Expect mostly green and yellow results with very few black results. The yellow results represent products where the system fell back to top-ranked scores rather than confident AI selection.

When to Use

Use maximum coverage when:

  • You’re importing a large catalog and will review classifications afterward
  • Downstream systems require categories but can handle imperfect data
  • You prefer to manually correct wrong classifications rather than have missing ones
  • Product descriptions are inconsistent or low-quality

Limitations

  • Yellow results have lower accuracy than green results
  • Some clearly wrong classifications will appear in yellow results
  • Requires manual review and correction for production use
  • Not suitable for platforms with strict taxonomy requirements

Strategy 2: Precision First

Priority: Only return categories when highly confident they’re correct

Best for:

  • Production systems where accuracy is critical
  • Platforms with strict taxonomy requirements
  • High-value products where misclassification has consequences
  • When you prefer to manually classify uncertain products

Configuration

Taxonomy: [your choice]
Custom instructions: Recommended for edge cases
Leaf only: OFF
Use top-ranked: OFF

How It Works

With top-ranked disabled, the system only returns categories the AI actively selects. No score-based fallback exists. With leaf-only disabled, the AI can choose parent categories when siblings are ambiguous, ensuring returned results are semantically appropriate.

This produces high accuracy but lower coverage:

  • Green indicators when the AI confidently selects any category (leaf or parent)
  • Black indicators when the AI cannot confidently decide
  • No yellow indicators (top-ranked fallback is disabled)

What to Expect

Coverage: Expect significantly lower coverage than with the Maximum Coverage strategy. A meaningful portion of products will return no category.

Accuracy: Green results should be highly accurate since the system only returns categories it confidently selects. No yellow results exist with this strategy.

Specificity: You’ll get a mix of leaf and parent categories, all actively chosen by the AI. The system chooses parents when sibling leaves are ambiguous, ensuring semantic correctness.

Status indicators: Expect only green and black results. Green indicates confident AI selection, black indicates the AI couldn’t make a confident determination.

When to Use

Use precision-first when:

  • Accuracy is more important than coverage
  • You’re willing to manually classify products that return no category
  • Your platform rejects incorrect categorizations
  • You have resources to improve descriptions for black-result products

Handling Black Results

Products that return black indicators need attention:

  1. Improve descriptions: Add context, features, or clarifying details
  2. Add custom instructions: Provide disambiguation rules for recurring patterns
  3. Manual classification: Classify by hand if the description can’t be improved

Limitations

  • A significant portion of products return no category initially
  • Requires investment in description quality and custom instructions
  • Higher manual effort than the Maximum Coverage strategy
  • Not suitable when complete coverage is required

Strategy 3: Specific Categories Required

Priority: Always return leaf-level categories for platform requirements

Best for:

  • Platforms that only accept leaf categories (reject parent categories)
  • Product feeds with strict specificity requirements
  • When business rules mandate detailed classifications
  • Catalogs where every product clearly fits a specific category

Configuration

Taxonomy: [your choice]
Custom instructions: Recommended
Leaf only: ON
Use top-ranked: ON

How It Works

With leaf-only enabled, only leaf categories appear in candidates—no parents. The AI must choose between specific options. With top-ranked enabled, if the AI cannot decide, the highest-scored leaf is returned.

This guarantees specific categories but sacrifices some accuracy:

  • Green indicators when the AI confidently selects a leaf
  • Yellow indicators when the top-ranked leaf is used as a fallback
  • Black indicators only when no leaf candidates exist (rare)

What to Expect

Coverage: Expect very high coverage. Nearly every product will receive a leaf category, and black results should be rare.

Accuracy: Green results should be fairly accurate. Yellow results will have noticeably lower accuracy since they represent forced choices between ambiguous siblings. Expect a higher rate of wrong-but-specific classifications in yellow results.

Specificity: All returned categories will be leaf-level—no parents are possible with this configuration.

Status indicators: Expect a substantial number of yellow results alongside green results. The yellow results indicate products where the system had to choose between ambiguous leaf options using top-ranked scores.

When to Use

Use Specific Categories Required when:

  • Your platform rejects parent categories
  • Business rules require leaf-level classification
  • You’re feeding product data to systems with strict taxonomy requirements
  • You prefer specific (even if occasionally wrong) over general categories

Managing Yellow Results

With a substantial number of yellow results, accuracy management is critical:

  1. Write custom instructions: Provide disambiguation rules for common ambiguous products to reduce yellow results
  2. Improve descriptions: Add details that distinguish between sibling leaves
  3. Accept some imprecision: Yellow leaf results are still leaf-level, meeting platform requirements
  4. Prioritize review: Check yellow results before finalizing imports
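
One way to find the "common ambiguous products" worth a custom instruction is to count which categories show up most often among yellow results. The sketch below assumes each result is an object with `status` and `category` fields; that shape and the function name `yellowHotspots` are illustrative assumptions, not part of the product's API.

```javascript
// Hypothetical helper: rank categories by how often they appear as yellow
// (top-ranked fallback) results. Assumes results look like { status, category }.
function yellowHotspots(results) {
  const counts = new Map();
  for (const r of results) {
    if (r.status !== "yellow") continue;
    counts.set(r.category, (counts.get(r.category) ?? 0) + 1);
  }
  // Most frequent fallback categories first.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Example data (made up for illustration).
const sampleResults = [
  { status: "yellow", category: "Phone Cases" },
  { status: "green", category: "Phone Cases" },
  { status: "yellow", category: "Phone Cases" },
  { status: "yellow", category: "Screen Protectors" },
];
console.log(yellowHotspots(sampleResults));
```

Categories at the top of this list are good candidates for a disambiguation rule or for a closer look at the descriptions feeding them.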

Limitations

  • High percentage of yellow results (lower confidence)
  • Some incorrect specific classifications in yellow results
  • Requires robust custom instructions to manage ambiguity
  • Cannot fall back to safe parent categories
  • Not ideal when semantic correctness matters more than specificity

Comparing the Strategies

| Factor | Maximum Coverage | Precision First | Specific Categories |
| --- | --- | --- | --- |
| Coverage | Highest; nearly all products get categories | Lower; a meaningful portion get no category | Very high; nearly all products get categories |
| Status indicators | Mix of green and yellow results | Only green and black (no yellow) | Mix of green and yellow, substantial yellow portion |
| Accuracy (green) | Fairly accurate | Highly accurate | Fairly accurate |
| Accuracy (yellow) | Less accurate than green | N/A (no yellow results) | Noticeably less accurate than green |
| Specificity | Mix of leaf and parent categories | Mix of leaf and parent categories | Only leaf categories (no parents) |
| Manual review | Moderate | High (for black results) | Moderate to high |
| Custom instructions | Optional | Recommended for edge cases | Recommended to reduce yellow results |
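
The three configurations can be kept as named presets so that each request states its strategy explicitly. This sketch assumes the option names `leafsOnly` and `fallbackToBestGuess` from the Implementation snippet in this guide correspond to the "Leaf only" and "Use top-ranked" settings. That mapping is inferred, not confirmed, and `STRATEGIES` and `withStrategy` are hypothetical names.

```javascript
// Assumed mapping (inferred from this guide's Implementation snippet):
//   "Leaf only"      -> leafsOnly
//   "Use top-ranked" -> fallbackToBestGuess
const STRATEGIES = Object.freeze({
  maximumCoverage: { leafsOnly: false, fallbackToBestGuess: true },
  precisionFirst: { leafsOnly: false, fallbackToBestGuess: false },
  specificCategories: { leafsOnly: true, fallbackToBestGuess: true },
});

// Merge a preset into a product payload without mutating the product.
function withStrategy(product, strategyName) {
  const preset = STRATEGIES[strategyName];
  if (!preset) throw new Error(`Unknown strategy: ${strategyName}`);
  return { ...product, ...preset };
}

const payload = withStrategy(
  { description: "Trail running shoes, size 10" },
  "precisionFirst"
);
console.log(payload);
```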

Hybrid Approaches

You don’t need to apply one strategy to your entire catalog. Consider using different strategies for different product types:

By Product Value

  • High-value products: Precision First
  • Medium-value products: Maximum Coverage
  • Low-value products: Specific Categories

By Description Quality

  • Well-described products: Precision First or Specific Categories
  • Poorly-described products: Maximum Coverage

By Category Confidence

  1. First pass: Run Precision First to get high-confidence results
  2. Second pass: Run remaining products with Maximum Coverage
  3. Manual classification: Handle remaining black results
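
The three steps above can be sketched as a two-pass loop. `classifyFn` is a hypothetical stand-in for whatever classification call you use, assumed to resolve to an object with a `status` of green, yellow, or black. The option names `leafsOnly` and `fallbackToBestGuess` are borrowed from this guide's Implementation snippet, and products are assumed to carry an `id`.

```javascript
// Hypothetical sketch. classifyFn stands in for your classification call and is
// assumed to resolve to { status: "green" | "yellow" | "black", category }.
async function twoPassClassify(products, classifyFn) {
  const results = new Map();

  // First pass: Precision First settings (no top-ranked fallback).
  for (const product of products) {
    const result = await classifyFn({
      ...product,
      leafsOnly: false,
      fallbackToBestGuess: false,
    });
    results.set(product.id, result);
  }

  // Second pass: retry only the black results with Maximum Coverage settings.
  for (const product of products) {
    if (results.get(product.id).status !== "black") continue;
    const retry = await classifyFn({
      ...product,
      leafsOnly: false,
      fallbackToBestGuess: true,
    });
    results.set(product.id, retry);
  }

  // Whatever is still black goes to manual classification.
  const manualQueue = products.filter(
    (p) => results.get(p.id).status === "black"
  );
  return { results, manualQueue };
}
```

Because the first-pass green results are never re-run, they keep their higher confidence, and yellow results can only come from products the first pass could not classify.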

Implementation

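// Route each product to a strategy by price. `classify` stands in for your
// classification call, and the 1000 threshold is only an example value.
if (product.price > 1000) {
  // High-value products: Precision First settings
  classify({ ...product, leafsOnly: false, fallbackToBestGuess: false });
} else {
  // Standard products: Maximum Coverage settings
  classify({ ...product, leafsOnly: false, fallbackToBestGuess: true });
}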

Choosing Your Strategy

Ask yourself these questions:

  1. Does my platform reject parent categories?
     • Yes → Specific Categories Required
     • No → Continue to question 2
  2. How important is accuracy vs. coverage?
     • Accuracy critical → Precision First
     • Coverage critical → Maximum Coverage
     • Both important → Specific Categories Required
  3. Can I manually review and correct results?
     • Yes, extensive review → Maximum Coverage
     • Yes, some review → Specific Categories Required
     • No, must be accurate → Precision First
  4. How good are my product descriptions?
     • Excellent, detailed → Precision First or Specific Categories
     • Mixed quality → Maximum Coverage
     • Poor, inconsistent → Maximum Coverage
  5. What percentage of errors can I tolerate?
     • Very few errors → Precision First
     • Some errors acceptable → Specific Categories Required
     • More errors acceptable for coverage → Maximum Coverage
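
As a rough illustration, the first two questions can be written as a function. This sketch deliberately encodes only questions 1 and 2. The remaining questions refine the choice and may point in different directions, so treat the result as a starting point. The function name and input fields are hypothetical.

```javascript
// Hypothetical sketch of questions 1 and 2 only.
// priority: "accuracy" | "coverage" | "both"
function chooseStrategy({ platformRejectsParents, priority }) {
  // Question 1: a platform that rejects parent categories settles the choice.
  if (platformRejectsParents) return "Specific Categories Required";

  // Question 2: accuracy vs. coverage.
  switch (priority) {
    case "accuracy":
      return "Precision First";
    case "coverage":
      return "Maximum Coverage";
    case "both":
      return "Specific Categories Required";
    default:
      throw new Error(`Unknown priority: ${priority}`);
  }
}

console.log(chooseStrategy({ platformRejectsParents: false, priority: "accuracy" }));
```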

Testing Your Strategy

Before committing to a strategy:

  1. Test with representative products: Include diverse categories and quality levels
  2. Review all status indicators: Understand your green/yellow/black distribution
  3. Manually verify accuracy: Check a sample of results against expectations
  4. Observe patterns: Note which types of products succeed or fail with each strategy
  5. Compare against goals: Does this strategy meet your requirements?
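
Steps 2 and 3 can be sketched with two small helpers: one summarizes the green/yellow/black distribution, and the other compares a manually verified sample against the returned categories. The result shape (`id`, `status`, `category`) and both function names are assumptions made for illustration.

```javascript
// Hypothetical helpers; results are assumed to look like { id, status, category }.

// Count and share of each status indicator in a test run.
function statusDistribution(results) {
  const counts = { green: 0, yellow: 0, black: 0 };
  for (const r of results) {
    if (r.status in counts) counts[r.status] += 1;
  }
  const total = results.length;
  const share = {};
  for (const status of Object.keys(counts)) {
    share[status] = total === 0 ? 0 : counts[status] / total;
  }
  return { counts, share };
}

// Accuracy of green and yellow results against a manually verified sample.
// expectedById: Map of product id -> the category a reviewer considers correct.
function accuracyByStatus(results, expectedById) {
  const tally = {
    green: { correct: 0, total: 0 },
    yellow: { correct: 0, total: 0 },
  };
  for (const r of results) {
    // Skip black results and products that were not manually verified.
    if (!(r.status in tally) || !expectedById.has(r.id)) continue;
    tally[r.status].total += 1;
    if (expectedById.get(r.id) === r.category) tally[r.status].correct += 1;
  }
  // null means no verified sample exists for that status.
  const rate = ({ correct, total }) => (total === 0 ? null : correct / total);
  return { green: rate(tally.green), yellow: rate(tally.yellow) };
}

// Example data (made up for illustration).
const testRun = [
  { id: 1, status: "green", category: "Running Shoes" },
  { id: 2, status: "green", category: "Hiking Boots" },
  { id: 3, status: "yellow", category: "Sandals" },
  { id: 4, status: "black", category: null },
];
const verified = new Map([
  [1, "Running Shoes"],
  [2, "Hiking Boots"],
  [3, "Slippers"],
]);

console.log(statusDistribution(testRun));
console.log(accuracyByStatus(testRun, verified));
```

Running the same sample under each strategy and comparing these two summaries gives a concrete basis for step 5.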

If results don’t match your needs:

  • Try a different strategy
  • Improve product descriptions
  • Add custom instructions for edge cases
  • Use a hybrid approach

Next Steps

After selecting a strategy:

  1. Test in the Playground: Validate the configuration with your real products
  2. Write custom instructions: Handle edge cases specific to your catalog (see Writing Custom Instructions)
  3. Understand the pipeline: Learn how settings interact (see Classification Pipeline and Settings)
  4. Integrate with the API: Apply your configuration in production

Your strategy isn’t permanent—adjust as your catalog, descriptions, or requirements change.