Classification Strategies

Different business needs require different classification approaches. This guide explains three core strategies—when to use each, how to configure them, and what results to expect.

Understanding the Trade-offs

All classification strategies balance three factors:

Coverage: What percentage of your products get a category returned

Accuracy: How often the returned category is correct and appropriate

Specificity: Whether categories are detailed (leaf level) or general (may include parents)

No single configuration optimizes all three. Your strategy depends on which factors matter most for your use case.

Strategy 1: Maximum Coverage

Priority: Get a category for every product, even if confidence is lower

Best for:

Initial catalog imports where you’ll review results manually
High-volume operations where downstream processes need some categorization
Situations where having an approximate category is better than none
Products with limited or inconsistent descriptions

Configuration

Taxonomy: [your choice]
Custom instructions: Optional
Leaf only: OFF
Use top-ranked: ON

How It Works

With leaf-only disabled, the candidate set includes both specific and general categories. The AI can choose parents when uncertain about siblings. With top-ranked fallback enabled, even when the AI cannot decide, the system returns the highest-scored candidate.

This produces maximum coverage because you’ll get results in nearly every scenario:

Green indicators when the AI confidently selects a category
Yellow indicators when the system falls back to top-ranked scores
Black indicators only in rare cases where no relevant candidates exist
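
The outcome logic above can be modeled as a small function. This is only an illustrative sketch, not the service's actual implementation: the `resolveResult` name, the candidate shape (`{ category, score }`), and the use of plain strings for the status indicators are all assumptions made for this example.

```javascript
// Hypothetical model of how a category and its status indicator are chosen.
// candidates: array of { category, score }; aiChoice: a category string, or null
// when the AI could not decide. Both shapes are assumptions for illustration.
function resolveResult(candidates, aiChoice, useTopRanked) {
  if (aiChoice !== null) {
    // The AI confidently selected a category.
    return { category: aiChoice, status: "green" };
  }
  if (useTopRanked && candidates.length > 0) {
    // Fall back to the highest-scored candidate.
    const top = candidates.reduce((best, c) => (c.score > best.score ? c : best));
    return { category: top.category, status: "yellow" };
  }
  // No confident choice and no usable fallback.
  return { category: null, status: "black" };
}

const candidates = [
  { category: "Footwear", score: 0.62 },
  { category: "Footwear > Running Shoes", score: 0.71 },
];
console.log(resolveResult(candidates, null, true));
```

In this model, passing `false` for `useTopRanked` means an undecided product comes back black instead of yellow.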

What to Expect

Coverage: Expect the highest coverage of all strategies—nearly every product will receive a category.

Accuracy: Green results will be fairly accurate, but expect lower accuracy for yellow results. The top-ranked fallback mechanism prioritizes coverage over precision.

Specificity: You’ll get a mix of leaf and parent categories. The AI chooses the appropriate level based on confidence—specific when certain, general when ambiguous.

Status indicators: Expect mostly green and yellow results with very few black results. The yellow results represent products where the system fell back to top-ranked scores rather than confident AI selection.

When to Use

Use maximum coverage when:

You’re importing a large catalog and will review classifications afterward
Downstream systems require categories but can handle imperfect data
You prefer to manually correct wrong classifications rather than have missing ones
Product descriptions are inconsistent or low-quality

Limitations

Yellow results have lower accuracy than green results
Some clearly wrong classifications will appear in yellow results
Requires manual review and correction for production use
Not suitable for platforms with strict taxonomy requirements

Strategy 2: Precision First

Priority: Only return categories when highly confident they’re correct

Best for:

Production systems where accuracy is critical
Platforms with strict taxonomy requirements
High-value products where misclassification has consequences
When you prefer to manually classify uncertain products

Configuration

Taxonomy: [your choice]
Custom instructions: Recommended for edge cases
Leaf only: OFF
Use top-ranked: OFF

How It Works

With top-ranked disabled, the system only returns categories the AI actively selects. No score-based fallback exists. With leaf-only disabled, the AI can choose parent categories when siblings are ambiguous, ensuring returned results are semantically appropriate.

This produces high accuracy but lower coverage:

Green indicators when the AI confidently selects any category (leaf or parent)
Black indicators when the AI cannot confidently decide
No yellow indicators (top-ranked fallback is disabled)

What to Expect

Coverage: Expect significantly lower coverage than the Maximum Coverage strategy. A meaningful portion of products will return no category.

Accuracy: Green results should be highly accurate since the system only returns categories it confidently selects. No yellow results exist with this strategy.

Specificity: You’ll get a mix of leaf and parent categories, all actively chosen by the AI. The system chooses parents when sibling leaves are ambiguous, ensuring semantic correctness.

Status indicators: Expect only green and black results. Green indicates confident AI selection, black indicates the AI couldn’t make a confident determination.

When to Use

Use precision-first when:

Accuracy is more important than coverage
You’re willing to manually classify products that return no category
Your platform rejects incorrect categorizations
You have resources to improve descriptions for black-result products

Handling Black Results

Products that return black indicators need attention:

Improve descriptions: Add context, features, or clarifying details
Add custom instructions: Provide disambiguation rules for recurring patterns
Manual classification: Classify by hand if the description can’t be improved

Limitations

A significant portion of products return no category initially
Requires investment in description quality and custom instructions
Higher manual effort than the Maximum Coverage strategy
Not suitable when complete coverage is required

Strategy 3: Specific Categories Required

Priority: Always return leaf-level categories for platform requirements

Best for:

Platforms that only accept leaf categories (reject parent categories)
Product feeds with strict specificity requirements
When business rules mandate detailed classifications
Catalogs where every product clearly fits a specific category

Configuration

Taxonomy: [your choice]
Custom instructions: Recommended
Leaf only: ON
Use top-ranked: ON

How It Works

With leaf-only enabled, only leaf categories appear in candidates—no parents. The AI must choose between specific options. With top-ranked enabled, if the AI cannot decide, the highest-scored leaf is returned.

This guarantees specific categories but sacrifices some accuracy:

Green indicators when the AI confidently selects a leaf
Yellow indicators when top-ranked leaf is used as fallback
Black indicators only when no leaf candidates exist (rare)
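
The effect of leaf-only on the fallback can be illustrated with a short sketch. The candidate shape (`{ category, score, isLeaf }`) and the helper names `filterCandidates` and `topRanked` are hypothetical and exist only for this example.

```javascript
// Hypothetical candidate shape: { category, score, isLeaf }.
function filterCandidates(candidates, leafOnly) {
  // With leaf-only on, parent categories never reach the candidate set.
  return leafOnly ? candidates.filter((c) => c.isLeaf) : candidates;
}

function topRanked(candidates) {
  // Returns the highest-scored candidate, or null when the set is empty.
  return candidates.reduce(
    (best, c) => (best === null || c.score > best.score ? c : best),
    null
  );
}

const pool = [
  { category: "Kitchen", score: 0.8, isLeaf: false },
  { category: "Kitchen > Blenders", score: 0.55, isLeaf: true },
  { category: "Kitchen > Food Processors", score: 0.5, isLeaf: true },
];

// The parent scores highest, but leaf-only excludes it from the fallback.
const fallback = topRanked(filterCandidates(pool, true));
console.log(fallback.category);
```

In this sample the fallback is forced to pick between two closely scored siblings, which is the situation that yields wrong-but-specific yellow results.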

What to Expect

Coverage: Expect very high coverage. Nearly every product will receive a leaf category, and black results should be rare.

Accuracy: Green results should be fairly accurate. Yellow results will have noticeably lower accuracy since they represent forced choices between ambiguous siblings. Expect a higher rate of wrong-but-specific classifications in yellow results.

Specificity: All returned categories will be leaf-level—no parents are possible with this configuration.

Status indicators: Expect a substantial number of yellow results alongside green results. The yellow results indicate products where the system had to choose between ambiguous leaf options using top-ranked scores.

When to Use

Use Specific Categories Required when:

Your platform rejects parent categories
Business rules require leaf-level classification
You’re feeding product data to systems with strict taxonomy requirements
You prefer specific (even if occasionally wrong) over general categories

Managing Yellow Results

With a substantial number of yellow results, accuracy management is critical:

Write custom instructions: Provide disambiguation rules for common ambiguous products to reduce yellow results
Improve descriptions: Add details that distinguish between sibling leaves
Accept some imprecision: Yellow leaf results are still leaf-level, meeting platform requirements
Prioritize review: Check yellow results before finalizing imports
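
One way to prioritize review is to count which categories most often come back as yellow, since those are likely places where a disambiguation rule would help. The sketch below assumes you have collected results as `{ productId, category, status }` objects. That shape and the `yellowHotspots` helper are hypothetical, not part of the product.

```javascript
// Counts yellow results per returned category, most frequent first.
// The result shape { productId, category, status } is an assumption.
function yellowHotspots(results) {
  const counts = new Map();
  for (const r of results) {
    if (r.status !== "yellow") continue;
    counts.set(r.category, (counts.get(r.category) ?? 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

const sampleResults = [
  { productId: 1, category: "Kitchen > Blenders", status: "yellow" },
  { productId: 2, category: "Kitchen > Blenders", status: "yellow" },
  { productId: 3, category: "Kitchen > Kettles", status: "yellow" },
  { productId: 4, category: "Kitchen > Kettles", status: "green" },
];
console.log(yellowHotspots(sampleResults));
```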

Limitations

High percentage of yellow results (lower confidence)
Some incorrect specific classifications in yellow results
Requires robust custom instructions to manage ambiguity
Cannot fall back to safe parent categories
Not ideal when semantic correctness matters more than specificity

Comparing the Strategies

Factor	Maximum Coverage	Precision First	Specific Categories
Coverage	Highest—nearly all products get categories	Lower—meaningful portion get no category	Very high—nearly all products get categories
Status indicators	Mix of green and yellow results	Only green and black (no yellow)	Mix of green and yellow, substantial yellow portion
Accuracy (green)	Fairly accurate	Highly accurate	Fairly accurate
Accuracy (yellow)	Less accurate than green	N/A (no yellow results)	Noticeably less accurate than green
Specificity	Mix of leaf and parent categories	Mix of leaf and parent categories	Only leaf categories (no parents)
Manual review	Moderate	High (for black results)	Moderate to high
Custom instructions	Optional	Recommended for edge cases	Recommended to reduce yellow results
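
The three configurations can be captured as presets in client code. This sketch assumes the `leafsOnly` and `fallbackToBestGuess` parameters used elsewhere in this guide correspond to the "Leaf only" and "Use top-ranked" settings. The `STRATEGY_PRESETS` object and `buildRequest` helper are hypothetical names introduced for illustration.

```javascript
// Assumed mapping: leafsOnly = "Leaf only", fallbackToBestGuess = "Use top-ranked".
const STRATEGY_PRESETS = {
  maximumCoverage: { leafsOnly: false, fallbackToBestGuess: true },
  precisionFirst: { leafsOnly: false, fallbackToBestGuess: false },
  specificCategories: { leafsOnly: true, fallbackToBestGuess: true },
};

// Merges a product with the settings for the chosen strategy.
function buildRequest(product, strategy) {
  const preset = STRATEGY_PRESETS[strategy];
  if (!preset) {
    throw new Error(`Unknown strategy: ${strategy}`);
  }
  return { ...product, ...preset };
}

console.log(buildRequest({ description: "Cordless drill, 18V" }, "precisionFirst"));
```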

Hybrid Approaches

You don’t need to apply one strategy to your entire catalog. Consider using different strategies for different product types:

By Product Value

High-value products: Precision First
Medium-value products: Maximum Coverage
Low-value products: Specific Categories

By Description Quality

Well-described products: Precision First or Specific Categories
Poorly-described products: Maximum Coverage

By Category Confidence

First pass: Run Precision First to get high-confidence results
Second pass: Run remaining products with Maximum Coverage
Manual classification: Handle remaining black results
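
The two-pass flow can be sketched as follows. The `classify` function is passed in as a parameter so the example does not depend on any real client. It is assumed to return `{ category, status }` with a status of green, yellow, or black, and a real client would likely be asynchronous. `classifyInTwoPasses` is a hypothetical helper.

```javascript
// Sketch of the two-pass approach: Precision First, then Maximum Coverage.
// classify(request) is assumed to return { category, status }.
function classifyInTwoPasses(products, classify) {
  const classified = [];
  const needsManual = [];
  for (const product of products) {
    // First pass: no fallback, so only confident selections are returned.
    let result = classify({ ...product, leafsOnly: false, fallbackToBestGuess: false });
    if (result.status === "black") {
      // Second pass: retry with the top-ranked fallback enabled.
      result = classify({ ...product, leafsOnly: false, fallbackToBestGuess: true });
    }
    if (result.status === "black") {
      // Still no category: route to manual classification.
      needsManual.push(product);
    } else {
      classified.push({ product, ...result });
    }
  }
  return { classified, needsManual };
}
```

Keeping the status on each classified entry lets you tell first-pass green results apart from second-pass yellow ones during review.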

Implementation

if (product.price > 1000) {
  // High-value products: Precision First (no fallback)
  classify({ ...product, leafsOnly: false, fallbackToBestGuess: false })
} else {
  // Standard products: Maximum Coverage (fallback enabled)
  classify({ ...product, leafsOnly: false, fallbackToBestGuess: true })
}

Choosing Your Strategy

Ask yourself these questions:

Does my platform reject parent categories?

Yes → Specific Categories Required
No → Continue to the next question

How important is accuracy vs coverage?

Accuracy critical → Precision First
Coverage critical → Maximum Coverage
Both important → Specific Categories Required

Can I manually review and correct results?

Yes, extensive review → Maximum Coverage
Yes, some review → Specific Categories Required
No, must be accurate → Precision First

How good are my product descriptions?

Excellent, detailed → Precision First or Specific Categories
Mixed quality → Maximum Coverage
Poor, inconsistent → Maximum Coverage

What percentage of errors can I tolerate?

Very few errors → Precision First
Some errors acceptable → Specific Categories Required
More errors acceptable for coverage → Maximum Coverage

Testing Your Strategy

Before committing to a strategy:

Test with representative products: Include diverse categories and quality levels
Review all status indicators: Understand your green/yellow/black distribution
Manually verify accuracy: Check a sample of results against expectations
Observe patterns: Note which types of products succeed or fail with each strategy
Compare against goals: Does this strategy meet your requirements?
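
To make the green/yellow/black distribution concrete, a test run can be summarized with a few lines of code. This sketch assumes results have been collected as objects with a `status` field. The `summarize` helper is hypothetical, and coverage is computed as the share of products that received any category (green or yellow).

```javascript
// Tallies status indicators and computes coverage for a test run.
// The result shape { status } is an assumption for illustration.
function summarize(results) {
  const counts = { green: 0, yellow: 0, black: 0 };
  for (const r of results) {
    if (r.status in counts) {
      counts[r.status] += 1;
    }
  }
  const total = results.length;
  // Coverage: fraction of products that got a category back.
  const coverage = total === 0 ? 0 : (counts.green + counts.yellow) / total;
  return { counts, coverage };
}

const testRun = [
  { status: "green" },
  { status: "green" },
  { status: "yellow" },
  { status: "black" },
];
console.log(summarize(testRun));
```

Running the same sample through each strategy and comparing the summaries gives a quick view of the coverage trade-off before you verify accuracy by hand.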

If results don’t match your needs:

Try a different strategy
Improve product descriptions
Add custom instructions for edge cases
Use a hybrid approach

Next Steps

After selecting a strategy:

Test in the Playground: Validate your configuration with real products
Write custom instructions: Handle edge cases specific to your catalog (see Writing Custom Instructions)
Understand the pipeline: Learn how settings interact (see Classification Pipeline and Settings)
Integrate with API: Apply your configuration in production

Your strategy isn’t permanent—adjust as your catalog, descriptions, or requirements change.
