Clarification regarding dimensions for gtr-t5-large embedding model

#3
by ksridhar-123 - opened

Hello!

I have been using the sentence-transformers/gtr-t5-large embedding model for a while now. For the last couple of days I have been running it inside a Linux Docker container I created, and there I sometimes get 1024-dimensional vector embeddings instead of the stated 768. My dependencies in the Docker container are as follows:

torch==2.6.0
transformers==4.44.2
sentence_transformers==3.0.1
fastapi==0.115.4
uvicorn==0.32.0

I download the model at container startup using the SentenceTransformer("sentence-transformers/gtr-t5-large") Python API, and I can see that the model hasn't changed on Hugging Face for about a year now. Running the model directly on my local host machine correctly yields 768-dimensional embeddings.

Can anyone help me understand why I could be seeing this issue? If more information is needed, please let me know.

Thanks for the help!

@tomaarsen If you could help me understand the issue here, I would greatly appreciate it! Thanks!

Sentence Transformers org

Hello!

I really don't know why this is happening. It seems like the Dense layer isn't always being applied? https://huggingface.co/sentence-transformers/gtr-t5-large/tree/main/2_Dense
That layer maps the raw 1024-dimensional outputs down to 768 dimensions, as that's faster to retrieve with: https://huggingface.co/sentence-transformers/gtr-t5-large/blob/main/2_Dense/config.json

But why would it sometimes not work? No clue, and I also don't know what may have changed. You're pinned to a reasonable sentence-transformers version, that hasn't updated, and the model also hasn't updated :/

  • Tom Aarsen

Thanks @tomaarsen for the reply! Yes, it looks like the dense layer is not being applied, as you pointed out. Out of curiosity, what does the truncate_dim parameter in the SentenceTransformer constructor do? Would setting that to 768 help in any way? Thanks!

Sentence Transformers org

It just truncates to 768. Sadly, that is likely not the same as applying the trained dense linear layer, assuming that the Dense module not being applied is indeed the issue. I think it requires more experiments to figure out what's happening.

You should be able to use model.get_sentence_embedding_dimension() to get the dimensionality of the model; for this model it should normally return 768.

  • Tom Aarsen

Thanks @tomaarsen. I did some digging and found the issue. I had the TRANSFORMERS_CACHE environment variable set in the container to download the model to a specific location. Changing it to HF_HOME solved the problem. I did get deprecation warnings about the TRANSFORMERS_CACHE variable, but I didn't think that would result in the dense layer being skipped (maybe you know better!). I confirmed it was indeed the cause by repeatedly switching between the two environment variables; with HF_HOME the issue is gone!
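For anyone hitting the same thing, the fix above can be sketched as follows (the cache path is hypothetical; the key point is setting HF_HOME before any Hugging Face import, since the cache location is resolved at import time):

```python
import os

# Point the Hugging Face cache at the desired location via HF_HOME instead of
# the deprecated TRANSFORMERS_CACHE, and do it *before* any Hugging Face import.
os.environ["HF_HOME"] = "/models/hf-cache"   # hypothetical cache directory
os.environ.pop("TRANSFORMERS_CACHE", None)   # ensure the old variable is unset

# Only now import and load the model, e.g.:
# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer("sentence-transformers/gtr-t5-large")
```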

Thanks for your time and help on this!
