How semantic search works in the collection

Semantic search connects images, text, and user queries through a shared representation of meaning.
Instead of matching exact words, the system compares how similar different pieces of content are.

This allows users to find relevant works even when the wording differs.


From image to searchable content

The system starts with the image.

A locally hosted vision–language model (Qwen 3.5, 9B) analyses the image and generates a textual description of what is visible. The description focuses on observable elements such as objects, composition, and colours.

This step makes visual content searchable as text.


From text to meaning (embeddings)

All text in the system is converted into vector representations, called embeddings.

This includes:

  • generated image descriptions
  • existing metadata
  • user queries

Embeddings represent meaning rather than exact wording.
Texts with similar meaning will have similar vector representations.

A locally hosted embedding model (BAAI BGE-M3) is used for this step.


How search works

When a user enters a query:

  1. The query is converted into an embedding
  2. The system compares this embedding with stored vectors
  3. Results are ranked based on similarity

This makes it possible to retrieve relevant results even when:

  • the same words are not used
  • the query is broad or descriptive
  • the user does not know the exact terminology

What is semantic search

Semantic search retrieves results based on meaning rather than exact keyword matches.

Traditional search:

  • matches words directly
  • depends on predefined metadata

Semantic search:

  • compares meaning across text and images
  • supports natural language queries
  • retrieves results based on similarity

This approach improves discovery and makes the collection easier to explore.


What the system does – and does not do

The system:

  • generates descriptions of visible content
  • connects queries and artworks based on similarity
  • supports exploratory search

The system does not:

  • interpret meaning or intent
  • replace existing metadata
  • guarantee correct or complete descriptions

Generated text is treated as a machine-produced description and can be reviewed and edited.


Why this matters

Semantic search changes how users interact with the collection.

It reduces the need to:

  • know specific terms
  • understand internal cataloguing
  • search using exact wording

Instead, users can describe what they are looking for and explore results based on meaning.

Related pages