Hi,

we all know that text embeddings (e.g., SBERT, SimCSE, LLM embeddings) are very powerful. However, my little grudge with them has always been that it’s hard to say what’s really in them. Sure, matching them gives some value of “relatedness” or “similarity”, but that value is hard to interpret: text can be really diverse, and two texts are often similar in some respects but not in others.

Here’s an example:

“The man builds a tent”

“Two men build a tent”

A text embedding model such as SBERT gives these two sentences a high similarity score, which is fine, since they are in fact quite similar. But they’re similar because they’re mostly about the same stuff/topic, while they’re dissimilar in their use of number: in the first sentence there’s one man, in the second there are two!
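You can see this for yourself with the sentence-transformers library (the checkpoint name below is just an example, any SBERT-style model behaves similarly):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # example checkpoint

emb = model.encode(["The man builds a tent", "Two men build a tent"])
print(util.cos_sim(emb[0], emb[1]))  # a single high score, but no hint of *why*
```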

My idea was to fine-tune the text embedding model so that we get multiple sub-embeddings for which we know what’s in them. This way, we can inspect how the overall score comes about. E.g., in the example above, we’d get a high score since the sentences share a topic and the “topic” sub-embeddings match well, but the score also gets modulated slightly downwards since the “number” sub-embeddings, whose task is to capture quantification/number information, differ.

I’ve written some code that lets you structure text embeddings into interpretable semantic features according to your use case. The basic steps are really simple:

  1. Define a few interpretable metrics that measure similarity with respect to certain aspects you’re interested in (e.g., polarity, negative/positive sentiment, topic… and so on, you can be creative!).

  2. Assign each metric some part of the embedding.

  3. Fine-tune a sentence embedding model on the metric scores, such that the information gets pushed to the assigned parts and your interpretable metrics get reflected (see the sketch after this list).
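To make the three steps concrete, here’s a minimal sketch. The metric, the dimension layout, and the loss are all hypothetical toy choices of mine; see the repo for the actual setup:

```python
import torch
import torch.nn.functional as F

# Step 1: an interpretable metric is just a function scoring one aspect
# of similarity between two texts (this one is a deliberately crude toy).
def number_similarity(text_a: str, text_b: str) -> float:
    number_words = {"one", "two", "three", "a", "an", "several", "many"}
    nums_a = {w for w in text_a.lower().split() if w in number_words}
    nums_b = {w for w in text_b.lower().split() if w in number_words}
    return 1.0 if nums_a == nums_b else 0.0

# Step 2: assign each metric its own slice of the embedding dimensions;
# the remaining dimensions are left as a residual sub-embedding.
FEATURE_SLICES = {
    "number": slice(0, 16),
    "topic": slice(16, 272),
    # residual: slice(272, embedding_dim)
}

# Step 3: during fine-tuning, push the cosine similarity of each
# sub-embedding towards the corresponding metric score for a sentence pair.
def feature_loss(emb_a, emb_b, metric_scores):
    """emb_*: (batch, dim) embeddings; metric_scores: name -> (batch,) targets."""
    loss = torch.tensor(0.0)
    for name, sl in FEATURE_SLICES.items():
        if name not in metric_scores:
            continue
        sub_sim = F.cosine_similarity(emb_a[:, sl], emb_b[:, sl])
        loss = loss + F.mse_loss(sub_sim, metric_scores[name])
    return loss
```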

During training we take care not to mess up the model: we control the information routing process by ensuring that the overall similarity of the embeddings stays about the same as the similarity computed with a frozen copy of the embedding model.
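This control can be sketched as a simple consistency term (the exact form is my assumption here; the point is just to keep the tuned model’s overall similarity close to the frozen teacher’s):

```python
import torch
import torch.nn.functional as F

def consistency_loss(emb_a, emb_b, frozen_emb_a, frozen_emb_b):
    """Penalize drift of the overall similarity away from the frozen model."""
    sim_tuned = F.cosine_similarity(emb_a, emb_b)
    with torch.no_grad():  # the frozen teacher gives fixed targets
        sim_frozen = F.cosine_similarity(frozen_emb_a, frozen_emb_b)
    return F.mse_loss(sim_tuned, sim_frozen)
```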

In the end, the final text embedding is structured into different sub-embeddings. You can use these sub-embeddings for fine-grained semantic search or clustering, or simply to explain a similarity rating of the embedding model.
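For the “explain a score” use case, reading off per-feature similarities is then trivial (the slice layout and the expected output below are purely illustrative):

```python
import numpy as np

FEATURE_SLICES = {"number": slice(0, 16), "topic": slice(16, 272)}

def explain_similarity(emb_a: np.ndarray, emb_b: np.ndarray) -> dict:
    """Cosine similarity per sub-embedding of two structured embeddings."""
    def cos(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    return {name: cos(emb_a[sl], emb_b[sl]) for name, sl in FEATURE_SLICES.items()}

# For the tent sentences we’d expect something like
# {'topic': high, 'number': low} -- same topic, different quantities.
```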

Here’s the code for structuring your custom embeddings:

https://github.com/flipz357/S3BERT

The code is released under the free and open MIT license.