Text Embedder¶
The Task¶
This task consists of creating sentence embeddings: vector representations of sentences that can be used for downstream tasks.
The TextEmbedder implementation relies on components from sentence-transformers.
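As background: sentence-transformers backbones such as all-MiniLM-L6-v2 typically produce one embedding per token and then mean-pool them, weighted by the attention mask, into a single sentence vector. Here is a minimal, dependency-free sketch of masked mean pooling with hypothetical toy values (the real implementation operates on batched tensors):

```python
def mean_pool(token_embeddings, attention_mask):
    """Masked mean pooling: average token vectors, ignoring padding tokens."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, mask in zip(token_embeddings, attention_mask):
        if mask:  # only real (non-padding) tokens contribute
            total = [t + v for t, v in zip(total, vec)]
            count += 1
    return [t / count for t in total]

# Toy example: three 2-dimensional token vectors, the last one is padding (mask = 0)
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```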
Example¶
Let’s look at an example of generating sentence embeddings.
We start by loading some sentences for prediction with the TextClassificationData
class.
Next, we create our TextEmbedder
with a pretrained backbone from the HuggingFace hub.
Finally, we create a Trainer
and generate sentence embeddings.
Here’s the full example:
```python
import flash
import torch
from flash.text import TextClassificationData, TextEmbedder

# 1. Create the DataModule with sentences to embed
datamodule = TextClassificationData.from_lists(
    predict_data=[
        "Turgid dialogue, feeble characterization - Harvey Keitel a judge?.",
        "The worst movie in the history of cinema.",
        "I come from Bulgaria where it 's almost impossible to have a tornado.",
    ],
    batch_size=4,
)

# 2. Create the TextEmbedder with a pretrained backbone from the HuggingFace Hub
model = TextEmbedder(backbone="sentence-transformers/all-MiniLM-L6-v2")

# 3. Create the Trainer and generate sentence embeddings
trainer = flash.Trainer(gpus=torch.cuda.device_count())
predictions = trainer.predict(model, datamodule=datamodule)
print(predictions)
```
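A common downstream use of the generated embeddings is comparing sentences by cosine similarity. The sketch below uses only the standard library, and the two vectors are hypothetical stand-ins for real embedding outputs (actual embeddings from this backbone have many more dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional embeddings for two sentences
emb1 = [0.1, 0.3, 0.5]
emb2 = [0.2, 0.1, 0.4]
print(round(cosine_similarity(emb1, emb2), 3))
```

Values close to 1.0 indicate semantically similar sentences; values near 0 indicate unrelated ones.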
To learn how to view the available backbones / heads for this task, see Backbones and Heads.