Classification Strategies
Different business needs require different classification approaches. This guide explains three core strategies—when to use each, how to configure them, and what results to expect.
Understanding the Trade-offs
All classification strategies balance three factors:
Coverage: What percentage of your products get a category returned
Accuracy: How often the returned category is correct and appropriate
Specificity: Whether categories are detailed (leaf level) or general (may include parents)
No single configuration optimizes all three. Your strategy depends on which factors matter most for your use case.
Strategy 1: Maximum Coverage
Priority: Get a category for every product, even if confidence is lower
Best for:
- Initial catalog imports where you’ll review results manually
- High-volume operations where downstream processes need some categorization
- Situations where having an approximate category is better than none
- Products with limited or inconsistent descriptions
Configuration
Taxonomy: [your choice]
Custom instructions: Optional
Leaf only: OFF
Use top-ranked: ON
How It Works
With leaf-only disabled, the candidate set includes both specific and general categories. The AI can choose parents when uncertain about siblings. With top-ranked fallback enabled, even when the AI cannot decide, the system returns the highest-scored candidate.
This produces maximum coverage because you’ll get results in nearly every scenario:
- Green indicators when the AI confidently selects a category
- Yellow indicators when the system falls back to top-ranked scores
- Black indicators only in rare cases where no relevant candidates exist
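The relationship between the top-ranked setting and the three indicators can be sketched as a small decision function. This is a conceptual model only, not the service's actual implementation: the function name and the input fields `aiSelected`, `hasCandidates`, and `useTopRanked` are hypothetical names chosen for illustration.

```javascript
// Conceptual model of how an indicator colour comes about.
// All names here are illustrative assumptions, not part of the product API.
function resolveIndicator({ aiSelected, hasCandidates, useTopRanked }) {
  // The AI actively picked a category.
  if (aiSelected) return "green";
  // The AI could not decide, but the fallback may return the highest-scored candidate.
  if (useTopRanked && hasCandidates) return "yellow";
  // Nothing to return: no candidates, or the fallback is switched off.
  return "black";
}

// With the fallback enabled, an undecided product still receives a category.
console.log(resolveIndicator({ aiSelected: false, hasCandidates: true, useTopRanked: true }));
```

Under this model, switching the fallback off turns every would-be yellow result into a black one, which is the coverage cost of stricter configurations.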
What to Expect
Coverage: Expect the highest coverage of all strategies—nearly every product will receive a category.
Accuracy: Green results will be fairly accurate, but expect lower accuracy for yellow results. The top-ranked fallback mechanism prioritizes coverage over precision.
Specificity: You’ll get a mix of leaf and parent categories. The AI chooses the appropriate level based on confidence—specific when certain, general when ambiguous.
Status indicators: Expect mostly green and yellow results with very few black results. The yellow results represent products where the system fell back to top-ranked scores rather than confident AI selection.
When to Use
Use maximum coverage when:
- You’re importing a large catalog and will review classifications afterward
- Downstream systems require categories but can handle imperfect data
- You prefer to manually correct wrong classifications rather than have missing ones
- Product descriptions are inconsistent or low-quality
Limitations
- Yellow results have lower accuracy than green results
- Some clearly wrong classifications will appear in yellow results
- Requires manual review and correction for production use
- Not suitable for platforms with strict taxonomy requirements
Strategy 2: Precision First
Priority: Only return categories when highly confident they’re correct
Best for:
- Production systems where accuracy is critical
- Platforms with strict taxonomy requirements
- High-value products where misclassification has consequences
- When you prefer to manually classify uncertain products
Configuration
Taxonomy: [your choice]
Custom instructions: Recommended for edge cases
Leaf only: OFF
Use top-ranked: OFF
How It Works
With top-ranked disabled, the system only returns categories the AI actively selects. No score-based fallback exists. With leaf-only disabled, the AI can choose parent categories when siblings are ambiguous, ensuring returned results are semantically appropriate.
This produces high accuracy but lower coverage:
- Green indicators when the AI confidently selects any category (leaf or parent)
- Black indicators when the AI cannot confidently decide
- No yellow indicators (top-ranked fallback is disabled)
What to Expect
Coverage: Expect significantly lower coverage than with the Maximum Coverage strategy. A meaningful portion of products will return no category.
Accuracy: Green results should be highly accurate since the system only returns categories it confidently selects. No yellow results exist with this strategy.
Specificity: You’ll get a mix of leaf and parent categories, all actively chosen by the AI. The system chooses parents when sibling leaves are ambiguous, ensuring semantic correctness.
Status indicators: Expect only green and black results. Green indicates confident AI selection, black indicates the AI couldn’t make a confident determination.
When to Use
Use precision-first when:
- Accuracy is more important than coverage
- You’re willing to manually classify products that return no category
- Your platform rejects incorrect categorizations
- You have resources to improve descriptions for black-result products
Handling Black Results
Products that return black indicators need attention:
- Improve descriptions: Add context, features, or clarifying details
- Add custom instructions: Provide disambiguation rules for recurring patterns
- Manual classification: Classify by hand if the description can’t be improved
Limitations
- A significant portion of products return no category initially
- Requires investment in description quality and custom instructions
- Higher manual effort than the Maximum Coverage strategy
- Not suitable when complete coverage is required
Strategy 3: Specific Categories Required
Priority: Always return leaf-level categories for platform requirements
Best for:
- Platforms that only accept leaf categories (reject parent categories)
- Product feeds with strict specificity requirements
- When business rules mandate detailed classifications
- Catalogs where every product clearly fits a specific category
Configuration
Taxonomy: [your choice]
Custom instructions: Recommended
Leaf only: ON
Use top-ranked: ON
How It Works
With leaf-only enabled, only leaf categories appear in candidates—no parents. The AI must choose between specific options. With top-ranked enabled, if the AI cannot decide, the highest-scored leaf is returned.
This guarantees specific categories but sacrifices some accuracy:
- Green indicators when the AI confidently selects a leaf
- Yellow indicators when the top-ranked leaf is used as a fallback
- Black indicators only when no leaf candidates exist (rare)
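To illustrate why leaf-only changes what the fallback returns, here is a minimal sketch of the candidate filtering described above. The candidate shape (`name`, `score`, `isLeaf`) and the function name are assumptions made for this example; the guide does not specify how candidates are represented internally.

```javascript
// Hypothetical candidate shape: { name: string, score: number, isLeaf: boolean }.
// Returns the highest-scored candidate, optionally restricted to leaf categories.
function pickTopRanked(candidates, { leafOnly }) {
  const pool = leafOnly ? candidates.filter((c) => c.isLeaf) : candidates;
  if (pool.length === 0) return null; // no candidate left: corresponds to a black result
  return pool.reduce((best, c) => (c.score > best.score ? c : best));
}

const candidates = [
  { name: "Footwear", score: 0.9, isLeaf: false },
  { name: "Running Shoes", score: 0.7, isLeaf: true },
  { name: "Hiking Boots", score: 0.6, isLeaf: true },
];

// Leaf-only removes the parent, so the fallback must pick one of the siblings.
console.log(pickTopRanked(candidates, { leafOnly: true }).name);
```

In this example the parent category scores highest, yet leaf-only forces a choice between two siblings. That forced choice is the source of the wrong-but-specific classifications that show up among yellow results.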
What to Expect
Coverage: Expect very high coverage. Nearly every product will receive a leaf category, and black results should be rare.
Accuracy: Green results should be fairly accurate. Yellow results will have noticeably lower accuracy since they represent forced choices between ambiguous siblings. Expect a higher rate of wrong-but-specific classifications in yellow results.
Specificity: All returned categories will be leaf-level—no parents are possible with this configuration.
Status indicators: Expect a substantial number of yellow results alongside green results. The yellow results indicate products where the system had to choose between ambiguous leaf options using top-ranked scores.
When to Use
Use the Specific Categories Required strategy when:
- Your platform rejects parent categories
- Business rules require leaf-level classification
- You’re feeding product data to systems with strict taxonomy requirements
- You prefer specific (even if occasionally wrong) over general categories
Managing Yellow Results
With a substantial number of yellow results, accuracy management is critical:
- Use custom instructions to reduce yellow results: Provide disambiguation rules for common ambiguous products
- Improve descriptions: Add details that distinguish between sibling leaves
- Accept some imprecision: Yellow leaf results are still leaf-level, meeting platform requirements
- Prioritize review: Check yellow results before finalizing imports
Limitations
- High percentage of yellow results (lower confidence)
- Some incorrect specific classifications in yellow results
- Requires robust custom instructions to manage ambiguity
- Cannot fall back to safe parent categories
- Not ideal when semantic correctness matters more than specificity
Comparing the Strategies
| Factor | Maximum Coverage | Precision First | Specific Categories |
|---|---|---|---|
| Coverage | Highest—nearly all products get categories | Lower—meaningful portion get no category | Very high—nearly all products get categories |
| Status indicators | Mix of green and yellow results | Only green and black (no yellow) | Mix of green and yellow, substantial yellow portion |
| Accuracy (green) | Fairly accurate | Highly accurate | Fairly accurate |
| Accuracy (yellow) | Less accurate than green | N/A (no yellow results) | Noticeably less accurate than green |
| Specificity | Mix of leaf and parent categories | Mix of leaf and parent categories | Only leaf categories (no parents) |
| Manual review | Moderate | High (for black results) | Moderate to high |
| Custom instructions | Optional | Recommended for edge cases | Recommended to reduce yellow results |
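The three configurations in this table can be captured as named presets. The sketch below reuses the `leafsOnly` and `fallbackToBestGuess` option names from the Implementation snippet in this guide; it assumes they correspond to the "Leaf only" and "Use top-ranked" settings respectively. The preset keys and the `optionsFor` helper are hypothetical.

```javascript
// Presets mirroring the three strategies.
// Assumption: leafsOnly maps to "Leaf only", fallbackToBestGuess maps to "Use top-ranked".
const STRATEGIES = Object.freeze({
  maximumCoverage: { leafsOnly: false, fallbackToBestGuess: true },
  precisionFirst: { leafsOnly: false, fallbackToBestGuess: false },
  specificCategories: { leafsOnly: true, fallbackToBestGuess: true },
});

// Returns a copy of the options for a strategy, or throws on an unknown name.
function optionsFor(strategy) {
  const preset = STRATEGIES[strategy];
  if (!preset) {
    throw new Error(`Unknown strategy: ${strategy}`);
  }
  return { ...preset };
}

console.log(optionsFor("precisionFirst"));
```

Keeping the presets in one place makes it easier to switch strategies per product type without scattering flag combinations through the code.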
Hybrid Approaches
You don’t need to apply one strategy to your entire catalog. Consider using different strategies for different product types:
By Product Value
- High-value products: Precision First
- Medium-value products: Maximum Coverage
- Low-value products: Specific Categories
By Description Quality
- Well-described products: Precision First or Specific Categories
- Poorly-described products: Maximum Coverage
By Category Confidence
- First pass: Run Precision First to get high-confidence results
- Second pass: Run remaining products with Maximum Coverage
- Manual classification: Handle remaining black results
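The two-pass approach could look roughly like the sketch below. It assumes a `classify` function that is passed in by the caller and returns an object with a `status` field ("green", "yellow", or "black") and a `category` field; that result shape is an assumption for illustration, not a documented response format. The option names follow the Implementation snippet in this guide.

```javascript
// Two-pass classification sketch.
// Assumption: classify(options) returns { status, category } synchronously.
// If your classify function returns a Promise, make this function async and await each call.
function classifyInTwoPasses(products, classify) {
  const classified = [];
  const needsManual = [];

  for (const product of products) {
    // First pass: Precision First (no top-ranked fallback).
    let result = classify({ ...product, leafsOnly: false, fallbackToBestGuess: false });

    // Second pass: Maximum Coverage, only for products the first pass left black.
    if (result.status === "black") {
      result = classify({ ...product, leafsOnly: false, fallbackToBestGuess: true });
    }

    if (result.status === "black") {
      needsManual.push(product); // still unresolved: queue for manual classification
    } else {
      classified.push({ product, status: result.status, category: result.category });
    }
  }

  return { classified, needsManual };
}
```

Products resolved in the first pass keep their green status, so you can still tell confident selections apart from second-pass fallbacks when reviewing.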
Implementation
// High-value products: Precision First (no fallback to the top-ranked candidate)
if (product.price > 1000) {
  classify({ ...product, leafsOnly: false, fallbackToBestGuess: false });
} else {
  // Standard products: Maximum Coverage (fallback enabled)
  classify({ ...product, leafsOnly: false, fallbackToBestGuess: true });
}
Choosing Your Strategy
Ask yourself these questions:
1. Does my platform reject parent categories?
   - Yes → Specific Categories Required
   - No → Continue to question 2
2. How important is accuracy vs. coverage?
   - Accuracy critical → Precision First
   - Coverage critical → Maximum Coverage
   - Both important → Specific Categories Required
3. Can I manually review and correct results?
   - Yes, extensive review → Maximum Coverage
   - Yes, some review → Specific Categories Required
   - No, must be accurate → Precision First
4. How good are my product descriptions?
   - Excellent, detailed → Precision First or Specific Categories
   - Mixed quality → Maximum Coverage
   - Poor, inconsistent → Maximum Coverage
5. What percentage of errors can I tolerate?
   - Very few errors → Precision First
   - Some errors acceptable → Specific Categories Required
   - More errors acceptable for coverage → Maximum Coverage
Testing Your Strategy
Before committing to a strategy:
- Test with representative products: Include diverse categories and quality levels
- Review all status indicators: Understand your green/yellow/black distribution
- Manually verify accuracy: Check a sample of results against expectations
- Observe patterns: Note which types of products succeed or fail with each strategy
- Compare against goals: Does this strategy meet your requirements?
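To review the green/yellow/black distribution from a test run, a small summary helper may be useful. This sketch assumes each result is an object with a `status` field holding "green", "yellow", or "black"; that shape is hypothetical. Coverage is computed as the share of products that received any category (green plus yellow), matching the definition at the top of this guide.

```javascript
// Summarizes indicator counts and coverage for a batch of results.
// Assumption: each result looks like { status: "green" | "yellow" | "black", ... }.
function summarizeIndicators(results) {
  const counts = { green: 0, yellow: 0, black: 0 };
  for (const { status } of results) {
    if (Object.hasOwn(counts, status)) counts[status] += 1; // ignore unknown statuses
  }
  const total = results.length;
  // Coverage: fraction of products with a category returned.
  const coverage = total === 0 ? 0 : (counts.green + counts.yellow) / total;
  return { counts, total, coverage };
}

const summary = summarizeIndicators([
  { status: "green" },
  { status: "green" },
  { status: "yellow" },
  { status: "black" },
]);
console.log(summary.counts, summary.coverage);
```

Running the same sample through each strategy and comparing the summaries gives a quick view of the coverage trade-off. Accuracy still has to be checked by hand against a sample of results.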
If results don’t match your needs:
- Try a different strategy
- Improve product descriptions
- Add custom instructions for edge cases
- Use a hybrid approach
Next Steps
After selecting a strategy:
- Test in the Playground: Validate the configuration against your real products
- Write custom instructions: Handle edge cases specific to your catalog (see Writing Custom Instructions)
- Understand the pipeline: Learn how settings interact (see Classification Pipeline and Settings)
- Integrate with API: Apply your configuration in production
Your strategy isn’t permanent—adjust as your catalog, descriptions, or requirements change.