aac_metrics.classes.clap_sim module

class CLAPSim(
return_all_scores: True = True,
*,
clap_method: 'audio' | 'text' = 'text',
clap_model: str | CLAPWrapper = DEFAULT_CLAP_SIM_MODEL,
device: str | device | None = 'cuda_if_available',
batch_size: int | None = 32,
reset_state: bool = True,
seed: int | None = 42,
verbose: int = 0,
)
class CLAPSim(
return_all_scores: False,
*,
clap_method: 'audio' | 'text' = 'text',
clap_model: str | CLAPWrapper = DEFAULT_CLAP_SIM_MODEL,
device: str | device | None = 'cuda_if_available',
batch_size: int | None = 32,
reset_state: bool = True,
seed: int | None = 42,
verbose: int = 0,
)

Bases: Generic[T_CLAPOut], AACMetric[T_CLAPOut]

Cosine similarity between Contrastive Language-Audio Pretraining (CLAP) embeddings.

The implementation is based on the msclap PyPI package. Note: Instances of this class are not picklable.

For more information, see clap_sim().
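
To make the score range concrete, here is a minimal pure-Python sketch of cosine similarity between two vectors. It only illustrates the quantity the metric reports and why its values fall between min_value (-1.0) and max_value (1.0). It is not the library's implementation, and the helper name `cosine_similarity` and the toy vectors are made up for illustration; real CLAP embeddings come from the model.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy 3-dimensional "embeddings" (illustrative values, not CLAP outputs).
cand = [1.0, 2.0, 3.0]
same_direction = [2.0, 4.0, 6.0]    # parallel to cand: similarity close to 1
opposite = [-1.0, -2.0, -3.0]       # anti-parallel: similarity close to -1
orthogonal = [3.0, 0.0, -1.0]       # dot product with cand is 0

for name, vec in [("same", same_direction), ("opposite", opposite), ("orthogonal", orthogonal)]:
    print(name, cosine_similarity(cand, vec))
```

Because the score depends only on the angle between the two embeddings, scaling either vector leaves it unchanged, which is why higher_is_better is True and the value is bounded.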

compute() → T_CLAPOut
extra_repr() → str

Return the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

full_state_update : ClassVar[bool | None] = False
get_output_names() → tuple[str, ...]
higher_is_better : ClassVar[bool | None] = True
is_differentiable : ClassVar[bool | None] = False
max_value : ClassVar[float] = 1.0
min_value : ClassVar[float] = -1.0
reset() → None
training : bool
update(
candidates: list[str],
mult_references_or_audio_paths: list[list[str]] | list[str],
) → None
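
The following sketch shows one plausible way to drive the update() / compute() cycle using only the signatures listed on this page. Several things are assumptions rather than documented behaviour: that candidates and the second argument are paired one-to-one, that clap_method='text' expects a list of reference captions per candidate, and that "cpu" is an accepted device string. The helpers `check_batch` and `clap_sim_text_scores` are hypothetical and not part of aac_metrics. The structure of the returned value depends on return_all_scores; see clap_sim() for details.

```python
def check_batch(candidates, mult_references_or_audio_paths):
    """Hypothetical pre-check: one reference entry per candidate (assumed pairing)."""
    if len(candidates) != len(mult_references_or_audio_paths):
        raise ValueError(
            f"got {len(candidates)} candidates but "
            f"{len(mult_references_or_audio_paths)} reference entries"
        )


def clap_sim_text_scores(candidates, mult_references):
    """Sketch: score candidate captions against text references with CLAPSim."""
    # Imported lazily so the helper above can be used without aac_metrics installed.
    from aac_metrics.classes.clap_sim import CLAPSim

    check_batch(candidates, mult_references)

    # Keyword names are taken from the class signature on this page.
    metric = CLAPSim(clap_method="text", device="cpu")
    metric.update(candidates, mult_references)
    return metric.compute()


# Example inputs (made-up captions): one list of references per candidate.
candidates = ["a dog barks in the distance"]
mult_references = [["a dog is barking far away", "distant barking of a dog"]]
check_batch(candidates, mult_references)
```

Calling `clap_sim_text_scores(candidates, mult_references)` requires aac_metrics and the CLAP model weights to be available. With clap_method='audio', the type hint suggests the second argument would instead be a flat list of audio file paths.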