pe.embedding package

class pe.embedding.Embedding

Bases: ABC

The abstract class that computes the embedding of samples.

property column_name

The column name to be used in the data frame.

abstract compute_embedding(data)

Compute the embedding of samples.

Parameters:

data (pe.data.Data) – The data whose embedding is to be computed

filter_uncomputed_rows(data)

merge_computed_rows(data, computed_data)

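
The interface above can be illustrated with a small, self-contained sketch. It does not import pe: the `Embedding` class below is a local stand-in, a plain list of dicts stands in for pe.data.Data, and the `TextLength` subclass is hypothetical. The behavior given to `filter_uncomputed_rows` and `merge_computed_rows` (skip rows that already carry an embedding, then merge the new results back) is an assumption inferred from the method names, not taken from the library source.

```python
from abc import ABC, abstractmethod


class Embedding(ABC):
    """Local stand-in for pe.embedding.Embedding (not the library's code)."""

    @property
    def column_name(self):
        # Assumption: derive the column name from the class name.
        return f"embedding.{type(self).__name__}"

    @abstractmethod
    def compute_embedding(self, data):
        """Return `data` with an embedding stored under `column_name`."""

    def filter_uncomputed_rows(self, data):
        # Assumed behavior: keep only the rows that have no embedding yet.
        return [row for row in data if row.get(self.column_name) is None]

    def merge_computed_rows(self, data, computed_data):
        # Assumed behavior: previously embedded rows first, then the new ones.
        done = [row for row in data if row.get(self.column_name) is not None]
        return done + computed_data


class TextLength(Embedding):
    """Toy embedding: a one-dimensional vector holding the text length."""

    def compute_embedding(self, data):
        todo = self.filter_uncomputed_rows(data)
        computed = [
            {**row, self.column_name: [float(len(row["text"]))]} for row in todo
        ]
        return self.merge_computed_rows(data, computed)


# The second row already has an embedding, so only the first one is computed.
rows = [{"text": "hello"}, {"text": "hi", "embedding.TextLength": [2.0]}]
result = TextLength().compute_embedding(rows)
```

Because `compute_embedding` is abstract, the stand-in base class cannot be instantiated directly; only concrete subclasses such as `TextLength` can.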
class pe.embedding.FLDInception(res=None)

Bases: Embedding

Compute the Inception embedding of images using the FLD library.

__init__(res=None)

Constructor.

Parameters:

res (int, optional) – The resolution of the images. If given, the images are resized to (res, res) before the embedding is computed. If None, the images are not resized. Defaults to None.

compute_embedding(data)

Compute the Inception embedding of images.

Parameters:

data (pe.data.Data) – The data object containing the images

Returns:

The data object with the computed embedding

Return type:

pe.data.Data

class pe.embedding.Inception(res, device='cuda', batch_size=2000)

Bases: Embedding

Compute the Inception embedding of images.

__init__(res, device='cuda', batch_size=2000)

Constructor.

Parameters:
  • res (int) – The resolution of the images. The images will be resized to (res, res) before computing the embedding

  • device (str, optional) – The device to use for computing the embedding, defaults to 'cuda'

  • batch_size (int, optional) – The batch size to use for computing the embedding, defaults to 2000
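
To make the role of batch_size concrete, here is a minimal sketch of batched processing in plain Python. It is a generic illustration, not the library's implementation: `iter_batches`, `embed_in_batches`, and the `embed_batch` callback are hypothetical names introduced here.

```python
def iter_batches(items, batch_size=2000):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(items, embed_batch, batch_size=2000):
    """Apply `embed_batch` to each batch and concatenate the results in order."""
    embeddings = []
    for batch in iter_batches(items, batch_size):
        embeddings.extend(embed_batch(batch))
    return embeddings


# 4500 items with the default batch size give two full batches and a remainder.
batch_sizes = [len(b) for b in iter_batches(list(range(4500)), batch_size=2000)]
```

A smaller batch size lowers peak memory use at the cost of more model calls, which is the usual reason to tune this parameter.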

compute_embedding(data)

Compute the Inception embedding of images.

Parameters:

data (pe.data.Data) – The data object containing the images

Returns:

The data object with the computed embedding

Return type:

pe.data.Data
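
A possible way to call this class, using only the constructor and method documented above, is sketched below. The helper name `embed_images` is hypothetical, and res=299 is an illustrative value rather than a documented default. Building the pe.data.Data object is outside the scope of this page, so it is taken as an argument.

```python
def embed_images(data, res=299, device="cuda", batch_size=2000):
    """Return `data` with Inception embeddings attached.

    `data` is expected to be a pe.data.Data object that contains images.
    """
    # Imported lazily so that this helper can be defined without pe installed.
    from pe.embedding import Inception

    embedding = Inception(res=res, device=device, batch_size=batch_size)
    return embedding.compute_embedding(data)
```

The returned object is again a pe.data.Data, so the helper can be chained with other steps that consume the same type.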

class pe.embedding.SentenceTransformer(model, batch_size=2000)

Bases: Embedding

Compute the Sentence Transformers embedding of text.

__init__(model, batch_size=2000)

Constructor.

Parameters:
  • model (str) – The Sentence Transformers model to use

  • batch_size (int, optional) – The batch size to use for computing the embedding, defaults to 2000

property column_name

The column name to be used in the data frame.

compute_embedding(data)

Compute the Sentence Transformers embedding of text.

Parameters:

data (pe.data.Data) – The data object containing the text

Returns:

The data object with the computed embedding

Return type:

pe.data.Data
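
As a usage sketch under stated assumptions: the helper below builds the embedding, computes it, and also returns column_name so that the caller can locate the embedding column afterwards. The helper name `embed_text` is hypothetical, and "all-MiniLM-L6-v2" is only an example of a Sentence Transformers model name. Whether the string is passed unchanged to the underlying library is an assumption.

```python
def embed_text(data, model="all-MiniLM-L6-v2", batch_size=2000):
    """Return (embedded_data, column_name) for a pe.data.Data object with text."""
    # Imported lazily so that this helper can be defined without pe installed.
    from pe.embedding import SentenceTransformer

    embedding = SentenceTransformer(model=model, batch_size=batch_size)
    embedded = embedding.compute_embedding(data)
    # column_name identifies where the embedding is stored in the data frame.
    return embedded, embedding.column_name
```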
