How semantic search works in the collection

Semantic search connects images, text, and user queries through a shared representation of meaning.
Instead of matching exact words, the system compares how similar different pieces of content are.

This allows users to find relevant works even when the wording differs.

From image to searchable content

The system starts with the image.

A locally hosted vision–language model (Qwen 3.5, 9B) analyses the image and generates a textual description of what is visible. The description focuses on observable elements such as objects, composition, and colours.

This step makes visual content searchable as text.

From text to meaning (embeddings)

All text in the system is converted into vector representations, called embeddings.

This includes:

generated image descriptions
existing metadata
user queries

Embeddings represent meaning rather than exact wording.
Texts with similar meaning will have similar vector representations.

A locally hosted embedding model (BAAI BGE-M3) is used for this step.

How search works

When a user enters a query:

The query is converted into an embedding
The system compares this embedding with stored vectors
Results are ranked based on similarity

This makes it possible to retrieve relevant results even when:

the same words are not used
the query is broad or descriptive
the user does not know the exact terminology

What is semantic search

Semantic search retrieves results based on meaning rather than exact keyword matches.

Traditional search:

matches words directly
depends on predefined metadata

Semantic search:

compares meaning across text and images
supports natural language queries
retrieves results based on similarity

This approach improves discovery and makes the collection easier to explore.

What the system does – and does not do

The system:

generates descriptions of visible content
connects queries and artworks based on similarity
supports exploratory search

The system does not:

interpret meaning or intent
replace existing metadata
guarantee correct or complete descriptions

Generated text is treated as a machine-produced description and can be reviewed and edited.

Why this matters

Semantic search changes how users interact with the collection.

It reduces the need to:

know specific terms
understand internal cataloguing
search using exact wording

Instead, users can describe what they are looking for and explore results based on meaning.