Indexer Capability
How a plugin identifies media files against an external metadata source, with rate-limit support for upstream API quotas.
An indexer plugin turns a discovered file into identified
metadata: a title, a year, a TMDb ID, a list of cast and crew, a
poster URL. It is the bridge between "Aviato found this file"
and "Aviato knows what this file is."
Aviato runs indexers as part of the ingestion pipeline. For each
discovered file it finds every plugin with the indexer
capability whose mediaTypes include the library's media type,
calls each one's indexer.supports(file) to ask whether it can
handle the file, and calls indexer.index(file, options) on the
first one that says
yes. The returned metadata is merged into the bundle and
persisted.
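The selection step described above can be sketched roughly as follows. This is an illustration only, not Aviato's actual implementation: the DiscoveredFile and IndexerPlugin shapes and the pickIndexer name are assumptions made for this example.

```typescript
// Hypothetical shapes for illustration; the real types live in @aviato/plugin-sdk.
interface DiscoveredFile {
  path: string
}

interface IndexerPlugin {
  id: string
  mediaTypes: string[]
  supports: (file: DiscoveredFile) => boolean
}

// Returns the first registered indexer whose mediaTypes include the library's
// media type and whose supports() accepts the file, or undefined if none does.
function pickIndexer(
  plugins: IndexerPlugin[],
  libraryMediaType: string,
  file: DiscoveredFile,
): IndexerPlugin | undefined {
  return plugins
    .filter((plugin) => plugin.mediaTypes.includes(libraryMediaType))
    .find((plugin) => plugin.supports(file))
}

const plugins: IndexerPlugin[] = [
  { id: 'aviato-musicbrainz', mediaTypes: ['music'], supports: (f) => /\.mp3$/i.test(f.path) },
  { id: 'aviato-tmdb', mediaTypes: ['movies', 'tv'], supports: (f) => /\.(mkv|mp4|avi)$/i.test(f.path) },
]

const chosen = pickIndexer(plugins, 'movies', { path: '/media/Example (2020).mkv' })
```

Array order stands in for registration order here, which matches the "first one that says yes" behaviour described above.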
Reference plugins (all open source):
- aviato-tmdb: movies and TV via The Movie Database.
- aviato-musicbrainz: music via MusicBrainz and Cover Art Archive.
- aviato-metadata-books: ebook metadata extraction (local, not remote, but uses the same indexer contract).
- aviato-indexer-photos: EXIF-based photo indexing.
Manifest
{
  "id": "aviato-tmdb",
  "name": "TMDb Indexer",
  "version": "1.0.0",
  "description": "Indexes movie and TV files via The Movie Database",
  "author": "Aviato",
  "license": "MIT",
  "engine": "bun",
  "entry": "src/index.ts",
  "aviato": { "minVersion": "0.1.0" },
  "capabilities": ["indexer"],
  "mediaTypes": ["movies", "tv"],
  "configuration": [
    { "key": "tmdbApiKey", "label": "TMDb API Key", "input": "text", "required": false },
    { "key": "language", "label": "Metadata language", "input": "text", "default": "en-US" }
  ],
  "rateLimit": {
    "maxConcurrency": 4,
    "requests": [
      { "max": 40, "window": "10s" }
    ]
  }
}
mediaTypes is required for indexers. Aviato uses it to
short-list which indexers to consider for each library: a
["music"] indexer will never be called for a movie file.
configuration declares user-editable settings, typically API
keys and region preferences. Aviato passes the user's saved
values into the plugin process via the AVIATO_PLUGIN_CONFIG
environment variable as JSON.
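A plugin might read that variable along these lines. This is a minimal sketch that assumes the JSON is a flat object keyed by the configuration keys from the manifest; the document does not specify the exact shape, and the TmdbPluginConfig and parsePluginConfig names are hypothetical.

```typescript
// Hypothetical config shape matching the two keys in the manifest above.
interface TmdbPluginConfig {
  tmdbApiKey?: string
  language: string
}

// Parses the raw JSON string found in AVIATO_PLUGIN_CONFIG.
// Assumption: a flat object keyed by configuration key. Missing or malformed
// input falls back to the manifest default for language.
function parsePluginConfig(raw: string | undefined): TmdbPluginConfig {
  const defaults: TmdbPluginConfig = { language: 'en-US' }
  if (!raw) return defaults
  try {
    const parsed: unknown = JSON.parse(raw)
    if (typeof parsed !== 'object' || parsed === null) return defaults
    const values = parsed as Record<string, unknown>
    return {
      tmdbApiKey: typeof values.tmdbApiKey === 'string' ? values.tmdbApiKey : undefined,
      language: typeof values.language === 'string' ? values.language : defaults.language,
    }
  } catch {
    // Malformed JSON: behave as if nothing was configured.
    return defaults
  }
}

const config = parsePluginConfig('{"tmdbApiKey":"abc123"}')
```

Inside a plugin process the call would be parsePluginConfig(process.env.AVIATO_PLUGIN_CONFIG); the function takes the raw string so it can be exercised without touching the environment.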
RPC contract
Aviato calls these methods on every running indexer plugin
(indexer.* namespace, with types exported from
@aviato/plugin-sdk):
| Method | Called when | Returns |
|---|---|---|
| supports | For every discovered file before calling index | boolean |
| index | After supports returns true | IndexResult |
| search | User-driven search (Add Library wizard, Fix Match) | SearchResult |
| getMatchDetail | User picked a candidate from search, fetch the full record | IndexResult |
| getEntityDetail (optional) | Opening an entity detail page (person, show, etc.) | EntityDetailResult |
supports
Synchronous gate. Decide whether this plugin should handle the
file. Cheap checks only, no network. Common patterns: extension
allow-list, filename regex, container probe of headers already
in metadata.
supports: ({ file }) => /\.(mkv|mp4|avi)$/i.test(file.path)
If multiple indexers return true, Aviato picks based on plugin priority (currently registration order; explicit ordering is a planned feature).
index
The main entry point. Receives the discovered file, library
options (libraryId, libraryType, mediaType, forceRefresh,
certificationCountry), and the in-flight metadata bundle from
earlier hooks. Returns identified metadata or an error.
index: async ({ file, options, metadata }) => {
  const candidate = await searchTmdb(parseTitle(file.path), parseYear(file.path))
  if (!candidate) {
    // Nothing matched upstream. Retrying will not help, so report a terminal failure.
    return { success: false, error: 'No match', retryable: false }
  }
  return {
    success: true,
    metadata: {
      title: candidate.title,
      fields: { overview: candidate.overview, releaseDate: candidate.release_date },
      canonicalIds: [{ provider: 'tmdb', id: String(candidate.id) }],
      artwork: [{ type: 'poster', url: posterUrl(candidate.poster_path) }],
      entities: candidate.credits.cast.map(toPersonEntity),
    }
  }
}
success: false with retryable: true tells the pipeline this
was a transient failure (network blip, upstream 503) and the job
will be retried according to retry policy. retryable: false is
terminal: the item ends up in the "needs review" bucket for the
user to fix manually.
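One way a plugin might decide between the two is to classify the upstream HTTP status. The sketch below is an assumption about a reasonable policy, not something the SDK prescribes; IndexFailure and failureFromStatus are hypothetical names.

```typescript
// Hypothetical failure shape mirroring the fields used in the example above.
interface IndexFailure {
  success: false
  error: string
  retryable: boolean
}

// Treats timeouts, rate limiting, and 5xx responses as transient;
// every other status is reported as terminal.
function failureFromStatus(httpStatus: number): IndexFailure {
  const transient = httpStatus === 408 || httpStatus === 429 || httpStatus >= 500
  return {
    success: false,
    error: `Upstream responded with HTTP ${httpStatus}`,
    retryable: transient,
  }
}

const unavailable = failureFromStatus(503) // transient: the job is retried
const notFound = failureFromStatus(404)    // terminal: the item goes to "needs review"
```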
search and getMatchDetail
These power the user-facing flows: the Add Library wizard
preview, and Fix Match. search returns lightweight
SearchCandidate rows (title, year, overview, image, canonical
IDs, optional confidence). getMatchDetail takes the canonical
IDs of a chosen candidate and returns the full IndexResult,
the same shape as index returns, just initiated by the user
instead of the pipeline.
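As a rough illustration of the two halves of this flow, the sketch below maps an upstream record to a lightweight candidate and then recovers the provider ID that a getMatchDetail handler would need. The property names on SearchCandidate are guesses based on the list above, and UpstreamMovie, toSearchCandidate, and findProviderId are hypothetical; check the types exported by @aviato/plugin-sdk for the real shapes.

```typescript
// Assumed shapes; property names are guesses, not the SDK's definitions.
interface CanonicalId {
  provider: string
  id: string
}

interface SearchCandidate {
  title: string
  year?: number
  overview?: string
  image?: string
  canonicalIds: CanonicalId[]
  confidence?: number
}

// Hypothetical upstream record.
interface UpstreamMovie {
  id: number
  title: string
  overview: string
  releaseDate: string // "YYYY-MM-DD", may be empty
  posterUrl?: string
}

// search side: build a lightweight row for the picker UI.
function toSearchCandidate(movie: UpstreamMovie): SearchCandidate {
  const year = Number.parseInt(movie.releaseDate.slice(0, 4), 10)
  return {
    title: movie.title,
    year: Number.isNaN(year) ? undefined : year,
    overview: movie.overview,
    image: movie.posterUrl,
    canonicalIds: [{ provider: 'tmdb', id: String(movie.id) }],
  }
}

// getMatchDetail side: recover the upstream ID from the chosen candidate.
function findProviderId(ids: CanonicalId[], provider: string): string | undefined {
  return ids.find((entry) => entry.provider === provider)?.id
}

const candidate = toSearchCandidate({
  id: 42,
  title: 'Example',
  overview: 'A short plot summary.',
  releaseDate: '2020-05-01',
})
const tmdbId = findProviderId(candidate.canonicalIds, 'tmdb')
```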
getEntityDetail (optional)
Lets Aviato fetch a full record for an entity (person, show,
etc.) on demand. Returns artwork, biography, related works, and
so on. The library plugin's entityRenderers[type] declares
how Aviato renders the returned fields.
Returned metadata
The shape returned in IndexResult.metadata
(LibraryItemMetadataSchema from @aviato/plugin-sdk):
| Field | Purpose |
|---|---|
| title | Canonical title, used for sorting, search, and display fallback. |
| fields | Key/value bag matching the library plugin's itemSchema. |
| canonicalIds | [{ provider, id, url? }]. At least one is strongly recommended. |
| entities | EntityReference[]: people, shows, seasons, etc. Aviato reconciles them into the entity graph. |
| artwork | ArtworkReference[]: posters, backdrops, banners. Aviato caches and serves them. |
fields keys must match the keys declared in the library
plugin's itemSchema. Anything else is silently dropped.
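Because unknown keys disappear without an error, it can help to check a fields bag against the schema keys during development. The helper below is a hypothetical sketch (partitionFields is not an SDK function) that shows which keys would survive and which would be dropped.

```typescript
// Splits a fields bag into keys declared by the library's itemSchema and
// keys that are not declared (which, per the text above, would be dropped).
function partitionFields(
  fields: Record<string, unknown>,
  schemaKeys: readonly string[],
): { kept: Record<string, unknown>; dropped: string[] } {
  const allowed = new Set(schemaKeys)
  const kept: Record<string, unknown> = {}
  const dropped: string[] = []
  for (const [key, value] of Object.entries(fields)) {
    if (allowed.has(key)) kept[key] = value
    else dropped.push(key)
  }
  return { kept, dropped }
}

const checked = partitionFields(
  { overview: 'A short plot summary.', releaseDate: '2020-01-01', tagline: 'Not in the schema' },
  ['overview', 'releaseDate'],
)
```

Logging checked.dropped in a test or a debug build is one way to catch a misspelled key before it silently vanishes.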
Rate limiting
External metadata providers enforce strict per-key quotas. TMDb caps free keys at around 40 requests per 10 s. MusicBrainz enforces 1 request per second per client. Coordinating those limits across the dozens of concurrent ingestion jobs Aviato runs is Aviato's job, not the plugin author's.
Declare quotas at the manifest level via rateLimit. Aviato
wraps every RPC call to your plugin in a per-plugin limiter
combining a concurrency semaphore and a time-window rate
limiter. Both constraints are enforced before any inbound RPC
reaches your handler.
"rateLimit": {
  "maxConcurrency": 4,
  "requests": [
    { "max": 40, "window": "10s" }
  ]
}
| Field | Effect |
|---|---|
| maxConcurrency | Maximum number of in-flight RPCs at once. Acquire-on-call, release-on-return. Defaults to Infinity when omitted. |
| requests[] | Sliding-window quotas. Each entry is { max, window } where window is "<n>[smh]". Sub-second windows like "500ms" are not supported; use seconds or larger. Multiple windows AND together: [{ max: 40, window: "10s" }, { max: 1000, window: "1h" }] enforces both. |
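The window format can be validated with a small parser like the one below. It is a sketch of the "<n>[smh]" grammar described in the table, not Aviato's own parser; parseWindow is a hypothetical name.

```typescript
// Milliseconds per supported unit.
const UNIT_MS = { s: 1000, m: 60000, h: 3600000 } as const

// Converts a window string of the form "<n>[smh]" to milliseconds.
// Throws on anything else, including sub-second values such as "500ms".
function parseWindow(spec: string): number {
  const match = /^(\d+)([smh])$/.exec(spec)
  if (!match) throw new Error(`Unsupported window: ${spec}`)
  const amount = Number(match[1])
  const unit = match[2] as keyof typeof UNIT_MS
  return amount * UNIT_MS[unit]
}

const tenSeconds = parseWindow('10s')
const oneHour = parseWindow('1h')
```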
How it works
Each plugin gets at most one limiter, created lazily on first RPC. Every inbound RPC call to the plugin does the following:
- await limiter.acquire(). Resolves immediately if a slot is free; otherwise queues until concurrency or window space frees up.
- Send the JSON-RPC request and await the reply.
- Release the slot, decrementing the in-flight count and letting the next waiter through.
Window counters slide rather than reset all at once: when the oldest request in a window expires, the count drops by one. The limiter serves waiters in FIFO order; there is no priority lane for any RPC type.
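The interaction between the semaphore and the sliding windows can be modelled in a few lines. The class below is a simplified, non-queuing model written for this page: it only answers whether a call could start at a given time, whereas the real limiter queues waiters. It also assumes that a request counts against a window from the moment it starts, which this document does not state. LimiterModel and WindowQuota are hypothetical names.

```typescript
interface WindowQuota {
  max: number
  windowMs: number
}

// Simplified model: tryAcquire() reports whether a call could start at `now`.
class LimiterModel {
  private inFlight = 0
  private startTimes: number[] = []
  private readonly maxConcurrency: number
  private readonly quotas: WindowQuota[]

  constructor(maxConcurrency: number, quotas: WindowQuota[]) {
    this.maxConcurrency = maxConcurrency
    this.quotas = quotas
  }

  tryAcquire(now: number): boolean {
    // Forget starts older than the longest window; they can no longer count.
    const longest = Math.max(0, ...this.quotas.map((q) => q.windowMs))
    this.startTimes = this.startTimes.filter((t) => now - t < longest)

    // Concurrency semaphore.
    if (this.inFlight >= this.maxConcurrency) return false

    // Every window must have room: quotas AND together.
    for (const quota of this.quotas) {
      const recent = this.startTimes.filter((t) => now - t < quota.windowMs).length
      if (recent >= quota.max) return false
    }

    this.inFlight += 1
    this.startTimes.push(now)
    return true
  }

  // Called when the RPC reply arrives.
  release(): void {
    if (this.inFlight > 0) this.inFlight -= 1
  }
}

const limiter = new LimiterModel(2, [{ max: 3, windowMs: 10000 }])
```

With these values, a third call at the same instant is refused by the semaphore, and a fourth call within ten seconds of the first is refused by the window even when nothing is in flight.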
Choosing values
Real-world configurations from the bundled plugins:
// aviato-musicbrainz: strict 1 RPS upstream limit
"rateLimit": { "maxConcurrency": 1 }
// aviato-tmdb: no global quota; rely on backoff
// (TMDb returns 429s with Retry-After, plugin handles them in-band)
Rules of thumb:
- Upstream has a hard rate limit? Set requests to match it, with a margin of about 10%. Don't hit the upstream cap precisely; leave headroom for retries.
- Single-flight upstream (sequential only)? Use maxConcurrency: 1 and drop the windows. The semaphore alone is enough.
- Upstream times out under burst load? Set maxConcurrency to the steady-state target even if there's no documented quota.
- No quota at all? Omit rateLimit. Defaulting to Infinity is fine for indexers that talk to the local filesystem (EXIF, NFO, EPUB parsers).
A misconfigured limiter is worse than no limiter:
over-restrictive values stall the pipeline, and over-permissive
values get the upstream key banned. When in doubt, start strict
and loosen after observing real traffic in
/admin/plugins/<id>/logs.
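The 10% margin from the rules of thumb is simple arithmetic, shown here as a small helper. withHeadroom and RequestQuota are hypothetical names used only for this example.

```typescript
interface RequestQuota {
  max: number
  window: string
}

// Shrinks a documented upstream cap by a percentage margin, rounding down so
// the declared quota never exceeds the intended budget. Never goes below 1.
function withHeadroom(upstream: RequestQuota, marginPercent = 10): RequestQuota {
  const max = Math.max(1, Math.floor((upstream.max * (100 - marginPercent)) / 100))
  return { max, window: upstream.window }
}

// An upstream cap of 40 requests per 10 s becomes a declared quota of 36.
const declared = withHeadroom({ max: 40, window: '10s' })
```

The result is what would go into the requests array of the manifest's rateLimit block.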
Inside the plugin
You don't write any rate-limit code in the plugin itself. The limiter sits in front of the JSON-RPC inbound queue, so by the time your handler runs, the budget has already been allocated. Just write your handler as if it runs in isolation:
import { createPlugin } from '@aviato/plugin-sdk'
createPlugin({
  indexer: {
    supports: ({ file }) => /\.mp3$/i.test(file.path),
    index: async ({ file, options, metadata }) => {
      // No rate limiting here; Aviato has already gated this call.
      const result = await fetch(`https://musicbrainz.org/...`)
      return { success: true, metadata: toLibraryItemMetadata(result) }
    },
    search: async ({ query }) => { /* ... */ },
    getMatchDetail: async ({ canonicalIds }) => { /* ... */ },
  },
})
If your upstream returns a 429 Too Many Requests, parse
Retry-After and either:
- Throw, in which case Aviato marks the job retryable, the limiter releases its slot, and the next attempt runs after the configured backoff.
- Await the retry-after duration in-handler.
Throwing is preferred because it frees the limiter slot for other queued calls to your plugin in the meantime, whereas waiting in-handler holds the slot for the whole delay.
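A sketch of the throwing approach follows. Retry-After can carry either a number of seconds or an HTTP-date, so the parser handles both. The UpstreamRateLimitedError class and the helper names are hypothetical, and whether Aviato reads any extra fields from a thrown error is not specified here.

```typescript
// Hypothetical error type carrying the suggested delay.
class UpstreamRateLimitedError extends Error {
  readonly retryAfterMs: number

  constructor(retryAfterMs: number) {
    super(`Upstream rate limited; retry after ${retryAfterMs} ms`)
    this.name = 'UpstreamRateLimitedError'
    this.retryAfterMs = retryAfterMs
  }
}

// Retry-After is either delay-seconds or an HTTP-date.
// Returns the delay in milliseconds, or fallbackMs when the header is
// absent or unparseable.
function parseRetryAfter(header: string | null, now: number, fallbackMs = 1000): number {
  if (header === null) return fallbackMs
  const trimmed = header.trim()
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000
  const date = Date.parse(trimmed)
  return Number.isNaN(date) ? fallbackMs : Math.max(0, date - now)
}

// Throws when the upstream signalled rate limiting; otherwise does nothing.
function throwIfRateLimited(httpStatus: number, retryAfter: string | null, now: number): void {
  if (httpStatus === 429) {
    throw new UpstreamRateLimitedError(parseRetryAfter(retryAfter, now))
  }
}
```

In a handler this would be called right after the upstream request, for example throwIfRateLimited(response.status, response.headers.get('Retry-After'), Date.now()).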
Lifecycle
The limiter belongs to the plugin process. When the plugin is
stopped or restarted, the limiter is destroyed, in-flight slots
are released, and queued waiters reject. After restart the
limiter is recreated from the current manifest, so editing
rateLimit and restarting the plugin applies the new values
without a server restart. All limiters tear down on graceful
Aviato shutdown.
See also
- Plugin system overview
- Hooks, events, and views covers pipeline.index.afterProcess, which fires after an indexer returns. Use it to enrich the bundle with data from other sources.
- Library capability declares the itemSchema and entitySchemas your indexer's fields and entities must match.
- All indexer types (IndexRequest, IndexResult, SearchRequest, SearchResult, LibraryItemMetadata, EntityReference, ArtworkReference) are exported from @aviato/plugin-sdk.