For my use case, I wanted whole-paragraph embeddings (IIRC these models are trained on sentence pairs capped at 256 tokens)
Their simple suggestion for extending to longer texts is to pool/average the sentence embeddings - but I'm not sure I want that; for one thing, it implies the order of sentences doesn't matter. If I were forced to use sentence transformers for my use case, the real fix would be to train an actual pooling model on top of the sentence embedder, but I didn't want to do that either. At that point I stopped looking into it, but I'm sure there are newer models out now with both better encoders and support for much longer texts. The one nice thing about the sentence transformer models, though, is that they're much more lightweight than e.g. a 7B-param language model.
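To make the pooling idea concrete, here's a minimal sketch of what that averaging looks like with the sentence-transformers library (the model name and the naive sentence splitting are just illustrative assumptions, not a recommendation):

```python
# Sketch: mean-pooling sentence embeddings into a paragraph embedding.
# Model choice and the period-based sentence split are assumptions for illustration.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed small sentence encoder

paragraph = (
    "First sentence of the paragraph. Second sentence with more detail. "
    "Third sentence wrapping things up."
)

# Naive sentence split; a real pipeline would use a proper sentence tokenizer.
sentences = [s.strip() for s in paragraph.split(".") if s.strip()]

# Encode each sentence independently (each stays under the encoder's token limit).
sentence_embeddings = model.encode(sentences)  # shape: (num_sentences, dim)

# Mean-pool into a single paragraph vector. Note this is order-invariant:
# shuffling the sentences yields the same embedding, which is exactly the
# limitation mentioned above.
paragraph_embedding = np.mean(sentence_embeddings, axis=0)
print(paragraph_embedding.shape)
```

The mean is permutation-invariant, which is why shuffling the sentences gives an identical paragraph vector - that's the property I was uneasy about.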