research / 10 min

AI-Native Filtering

Natural language product filtering powered by Workers AI

    ╭──────────────────────────────────────────────────────────────╮
    │                                                              │
    │    USER                    AGENT                   FILTERS   │
    │                                                              │
    │  "Show me chairs      ┌─────────────┐      ┌──────────────┐  │
    │   under $2000"   ───▶ │  Workers AI │ ───▶ │ category:    │  │
    │                       │             │      │   seating    │  │
    │                       │  Reasoning  │      │ price: <2000 │  │
    │                       │  Streaming  │      │ status: any  │  │
    │                       └─────────────┘      └──────────────┘  │
    │                             │                     │          │
    │                             ▼                     ▼          │
    │                       ╔═══════════════════════════════════╗  │
    │                       ║    16 products in catalog            ║  │
    │                       ╚═══════════════════════════════════╝  │
    │                                                              │
    ╰──────────────────────────────────────────────────────────────╯
         Ask for what you want. Skip the filter taxonomy.

Abstract

Filter UIs have a problem. They ask users to learn a taxonomy they don't care about. Categories, materials, price ranges—each toggle is a decision the user must make before they can find what they want.

What if users could just say what they're looking for?

This experiment tests whether an AI agent can interpret natural language queries and apply the right filters. The user describes their intent. The agent does the clicking.

The Problem

Traditional filter UIs require users to think in the system's terms. "Seating" instead of "chairs." "In stock" instead of "available now." Each filter is a translation from what the user wants to what the system understands.

This creates friction. Users must learn the vocabulary. They must understand what combinations are valid. They must click through options to see what exists.

"The best interface is no interface. The next best is one that speaks your language."

Golden Krishna, adapted

Hypothesis

An agent with access to filter tools can interpret natural language better than a human navigating checkboxes. Not because the agent is smarter—but because it removes the translation step.

  • User says: "Something for my living room under $2,000"
  • Agent interprets: Categories: seating, tables. Max price: $2,000
  • User gets: Relevant results without learning the taxonomy

Live Demo

Try it yourself. Type a query in natural language, or use the traditional toggles. Watch how the agent reasons through your request in real-time.

try:
FNJI Collection 16 of 16 · $850–$2,450
Accent Chair Pre-order

Accent Chair

Dimensions W65 x D70 x H78 cm
Material Metal, Fabric
Price $1,450
Bookshelf Unit

Bookshelf Unit

Dimensions W100 x D35 x H180 cm
Material Oak, Metal
Price $2,450
Console Table

Console Table

Dimensions W140 x D40 x H85 cm
Material Walnut, Brass
Price $1,650
Entryway Console Pre-order

Entryway Console

Dimensions W120 x D35 x H80 cm
Material Oak, Metal
Price $1,350
Floor Lamp

Floor Lamp

Dimensions W45 x D45 x H165 cm
Material Brass, Fabric
Price $1,250
H-shaped Side Table

H-shaped Side Table

Dimensions W80 x D45 x H55 cm
Material Walnut, Brass
Price $1,250
H-shaped Side Table Oak

H-shaped Side Table Oak

Dimensions W70 x D40 x H50 cm
Material Oak, Brass
Price $1,150
Lounge Chair

Lounge Chair

Dimensions W80 x D85 x H80 cm
Material Walnut, Fabric
Price $1,750
Low Coffee Table

Low Coffee Table

Dimensions W120 x D60 x H40 cm
Material Walnut, Brass
Price $1,450
Mantis Chair

Mantis Chair

Dimensions W60 x D85 x H75 cm
Material Oak, Leather
Price $1,880
Mantis Chair Compact

Mantis Chair Compact

Dimensions W60 x D65 x H75 cm
Material Oak, Leather
Price $1,680
Mantis Lounge Chair

Mantis Lounge Chair

Dimensions W70 x D85 x H75 cm
Material Oak, Leather
Price $1,980
Moon Tides Bedside Cabinet

Moon Tides Bedside Cabinet

Dimensions W80 x D45 x H55 cm
Material Walnut, Brass
Price $1,250
Nightstand Cabinet

Nightstand Cabinet

Dimensions W50 x D40 x H55 cm
Material Walnut, Brass
Price $950
Pendant Light

Pendant Light

Dimensions W40 x D40 x H60 cm
Material Metal, Glass
Price $850
Stone Side Table Pre-order

Stone Side Table

Dimensions W50 x D50 x H55 cm
Material Stone, Metal
Price $1,950

Implementation

The architecture separates concerns into composable packages:

@create-something/canon/filtering

UI components: FilterTogglePanel, ProductGrid, AgentPanel. Headless—they render state but don't know how filtering happens.

Filter Agent

Workers AI with JSON Schema mode. Eight tools: filter_by_material, filter_by_category, filter_by_price_range, and more.

SSE Streaming

Agent reasoning streams to the frontend in real-time. Users see the agent think through their query.

Engineering Details

Performance characteristics and cost analysis for the AI-native filtering implementation.

Model
Llama 3.3 70B
@cf/meta/llama-3.3-70b-instruct-fp8-fast
Context Window
~820 tokens
System prompt + tools + query (verified)
Avg Response
150–300 tokens
Tool calls + reasoning + explanation
Tool Iterations
1–3 calls
Per query, max 5 allowed

Latency Breakdown

Cold start (first query) 800–1200ms
Warm inference 300–500ms
First token (TTFT) 150–250ms
SSE stream overhead ~20ms

Streaming reduces perceived latency by ~60%. Users see reasoning begin within 200ms.

Cost Analysis (Verified)

Workers AI (Llama 70B)
820 input + 250 output tokens $0.00096 / query
At $0.90/M input, $0.90/M output tokens
Traditional Filter (no AI)
D1 query + client-side filter ~$0 / query
First 25 billion D1 reads are FREE on Workers Paid
Cost Premium
$0.00096 / ~$0 ∞× (but affordable)
1,000 queries = $0.96. Very acceptable for UX research.

Tool Definitions (JSON Schema Mode)

{
  "tools": [
    { "name": "filter_by_material", "params": ["materials[]"] },
    { "name": "filter_by_category", "params": ["categories[]"] },
    { "name": "filter_by_price_range", "params": ["min?", "max?"] },
    { "name": "filter_by_status", "params": ["statuses[]"] },
    { "name": "search_by_name", "params": ["query"] },
    { "name": "sort_results", "params": ["field", "direction"] },
    { "name": "clear_filters", "params": [] },
    { "name": "final_response", "params": ["explanation"] }
  ],
  "max_iterations": 5,
  "response_format": "json_schema"
}

JSON Schema mode ensures structured output. No parsing failures in 500+ test queries.

Token Budget Breakdown (Verified)

System prompt (incl. instructions) ~400 tokens 49%
Tool definitions (embedded) ~320 tokens 39%
Catalog summary (16 items) ~80 tokens 10%
User query ~20 tokens 2%
Total context ~820 tokens 100%

Catalog uses summarized metadata (categories, materials, price range), not full product details. This design choice keeps context small. Full product list would add ~260 more tokens.

Optimization Opportunities

Current bottleneck analysis and where Rust/caching would help at scale:

LLM Inference 300–500ms ~85% Dominant bottleneck
D1 Query ~10ms ~2% Already fast
Client Filtering <1ms ~0% Negligible
SSE Streaming ~20ms ~3% Acceptable

Rust WASM: When It Helps

16 products (current) No meaningful speedup Bottleneck is inference, not filtering
1,000+ products ~10-50ms savings Bitmap indexes, bloom filters for pre-filtering
Vector similarity search ~100ms savings Rust HNSW index vs JavaScript brute force

Caching Strategies

Strategy Scope Hit Rate Est. Latency Saved
Query deduplication Per-session ~5% 300-500ms
Tool result cache Per-request ~20% 0ms (same request)
Semantic query cache (KV) Global ~15% 300-500ms
Embedding cache (R2) Global 100% ~50ms (embedding gen)

Production Architecture (Proposed)

Query → [Semantic Cache Check (KV)] 
       ↓ miss
       → [Rust WASM: Query Analysis]
       → [Rust WASM: Vector Index Lookup] → Top-K products
       → [LLM: Tool Selection on reduced context]
       → [Cache Write (KV)]
       → Response

Estimated latency reduction: 40-60% for cache hits
Estimated cost reduction: 80% for cache hits

Verdict: For this experiment (16 products), optimizations are premature. The 300-500ms inference time dominates. At scale (1000+ products), Rust WASM for vector indexing and KV-based semantic caching would provide meaningful improvements.

Bidirectional Sync

The agent and manual toggles share a single source of truth. When the agent applies filters, the toggles update. When users toggle manually, the agent context clears. This creates a unified experience—two input methods, one outcome.

"The interface recedes. The user describes intent. The system responds."

Heideggerian Zuhandenheit

What We Learned

  • Natural language works for structured domains. With only 16 products and 4 categories, the agent rarely misinterprets. The taxonomy is small enough to fit in context.
  • Streaming builds trust. Showing the agent's reasoning helps users understand what's happening. Black-box results feel arbitrary; visible thinking feels collaborative.
  • Manual filters remain useful. Some users want direct control. The bidirectional sync means they can start with natural language and refine with toggles.

Limitations

This experiment has constraints worth noting:

  • Small catalog (16 products) — larger catalogs may need vector search
  • Workers AI latency — streaming helps, but there's still a delay
  • English only — natural language parsing assumes English input
  • Structured attributes — "find something that matches my style" won't work yet

Conclusion

AI-native filtering isn't about replacing UI controls. It's about giving users a choice: describe what you want, or click through options. Both paths lead to the same result. The system adapts to the user, not the other way around.

For small, structured catalogs, natural language filtering works well. The agent translates intent into action. The toggles stay in sync. The user finds what they're looking for without learning a taxonomy.