01 Issue
Bug · Open · Swamp Club
Assignees: None

#736 Extension search returns edit-distance noise for short queries ("asdl" → "AWS DEADLINE")

Opened by keeb · 6/21/2026

Description

Registry extension search returns semantically unrelated results for short queries because the Atlas $search stage relies on aggressive fuzzy matching with no exact/prefix-match boosting tier.

Searching "asdl" returns @swamp/aws/deadline ("AWS DEADLINE infrastructure models") and @easel/ruckus. Neither is a sensible match: the user was trying to reach asdlc-prefixed extensions.

Steps to reproduce

  1. swamp extension search "asdl" --json
  2. Observe top hits are @swamp/aws/deadline and @easel/ruckus.

Root cause

In buildAtlasSearchStage (lib/infrastructure/mongo-extension-repository.ts), the compound query uses:

  • autocomplete on name with fuzzy: { maxEdits: 1, prefixLength: 1 }
  • text on description with fuzzy: { maxEdits: 1 }
  • minimumShouldMatch: 1
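
For reference, the three settings above would correspond to a $search stage shaped roughly like the sketch below. This is a reconstruction from the description only, not the actual body of buildAtlasSearchStage: the function name sketchCurrentStage and the interface names are hypothetical, and the placement of both clauses under should is an assumption inferred from minimumShouldMatch: 1.

```typescript
// Hypothetical reconstruction of the stage described in this issue.
// The real buildAtlasSearchStage may differ in shape and naming.
interface FuzzyOptions {
  maxEdits: number;
  prefixLength?: number;
}

interface SearchClause {
  autocomplete?: { query: string; path: string; fuzzy?: FuzzyOptions };
  text?: { query: string; path: string; fuzzy?: FuzzyOptions };
}

interface SearchStage {
  $search: {
    index: string;
    compound: { should: SearchClause[]; minimumShouldMatch: number };
  };
}

function sketchCurrentStage(query: string): SearchStage {
  return {
    $search: {
      index: "extensions_search",
      compound: {
        // Assumption: both clauses sit under `should`; every clause is fuzzy,
        // so nothing rewards an exact or true prefix hit on `name`.
        should: [
          {
            autocomplete: {
              query,
              path: "name",
              fuzzy: { maxEdits: 1, prefixLength: 1 },
            },
          },
          { text: { query, path: "description", fuzzy: { maxEdits: 1 } } },
        ],
        minimumShouldMatch: 1,
      },
    },
  };
}

console.log(JSON.stringify(sketchCurrentStage("asdl"), null, 2));
```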

With a 4-character query, edit-distance-1 variants dominate recall (e.g. asdl is one substitution away from asel, which is a substring of easel), and there is no higher-scoring clause for exact or true prefix matches on name, so fuzzy noise outranks or replaces real prefix matches.
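
The edit-distance arithmetic can be checked with a small sketch. It uses plain Levenshtein distance over same-length windows of a candidate name. That is only an approximation for illustration: it does not model how Atlas Search tokenizes autocomplete fields or scores fuzzy matches, and the helper names editDistance and nearWindows are hypothetical.

```typescript
// Plain Levenshtein distance (insert, delete, substitute each cost 1).
function editDistance(a: string, b: string): number {
  // prev[j] = distance between the first i-1 chars of a and the first j chars of b
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(prev[j]! + 1, curr[j - 1]! + 1, prev[j - 1]! + cost));
    }
    prev = curr;
  }
  return prev[b.length]!;
}

// Windows of `word` with the same length as `query` that lie within maxEdits.
function nearWindows(query: string, word: string, maxEdits: number): string[] {
  const hits: string[] = [];
  for (let i = 0; i + query.length <= word.length; i++) {
    const window = word.slice(i, i + query.length);
    if (editDistance(query, window) <= maxEdits) hits.push(window);
  }
  return hits;
}

console.log(editDistance("asdl", "asel")); // 1 (substitute d -> e)
console.log(nearWindows("asdl", "easel", 1)); // ["asel"]
console.log("asdlc".startsWith("asdl")); // true: the match the user wanted
```

With only four characters, a single allowed edit changes a quarter of the query, which is why near-miss windows like this become easy to hit.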

Proposed direction

Add a boosted exact/prefix-match clause on name (e.g. a high-boost phrase/autocomplete with no fuzz, plus a non-fuzzy text on name) ranked above the fuzzy fallback, and/or disable fuzz below a minimum query length. Confirm the regex fallback (buildRegexFilter) ranking stays substring-literal.
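
A minimal sketch of that direction, under stated assumptions: the function name sketchProposedStage, the boost values 10 and 5, and the MIN_FUZZY_QUERY_LENGTH threshold of 5 are all hypothetical and would need tuning against real queries. The non-fuzzy text clause on name also assumes the index maps name as a string field in addition to autocomplete, which this issue does not confirm.

```typescript
// Hypothetical sketch of the proposed tiering; values are placeholders.
interface Fuzz {
  maxEdits: number;
  prefixLength?: number;
}

interface OperatorBody {
  query: string;
  path: string;
  fuzzy?: Fuzz;
  score?: { boost: { value: number } };
}

interface TieredClause {
  autocomplete?: OperatorBody;
  text?: OperatorBody;
}

const MIN_FUZZY_QUERY_LENGTH = 5; // assumption: queries shorter than this get no fuzz

function sketchProposedStage(query: string) {
  const allowFuzz = query.length >= MIN_FUZZY_QUERY_LENGTH;

  const should: TieredClause[] = [
    // Tier 1: true prefix match on name, no fuzz, highest boost.
    { autocomplete: { query, path: "name", score: { boost: { value: 10 } } } },
    // Tier 2: exact token match on name, no fuzz (assumes a string mapping).
    { text: { query, path: "name", score: { boost: { value: 5 } } } },
  ];

  if (allowFuzz) {
    // Tier 3: the existing fuzzy clauses, now an unboosted fallback.
    should.push({
      autocomplete: { query, path: "name", fuzzy: { maxEdits: 1, prefixLength: 1 } },
    });
    should.push({ text: { query, path: "description", fuzzy: { maxEdits: 1 } } });
  } else {
    // Short query: keep description search but without fuzz.
    should.push({ text: { query, path: "description" } });
  }

  return {
    $search: {
      index: "extensions_search",
      compound: { should, minimumShouldMatch: 1 },
    },
  };
}

console.log(JSON.stringify(sketchProposedStage("asdl"), null, 2));
```

Under this sketch, "asdl" produces no fuzzy clauses at all, while a longer query such as "asdlc" keeps the fuzzy fallback but ranks it below the boosted tiers.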

Environment

swamp-club registry search (Atlas Search index extensions_search), prod.

02 Bog Flow
OPEN · TRIAGED · IN PROGRESS · SHIPPED

Open

6/21/2026, 5:58:22 PM

No activity in this phase yet.
