aac_metrics.classes.mace module

class MACE(return_all_scores: bool = True, *, mace_method: Literal['text', 'audio', 'combined'] = 'text', penalty: float = 0.3, clap_model: str | CLAPWrapper = 'MS-CLAP-2023', seed: int | None = 42, echecker: str | BERTFlatClassifier = 'echecker_clotho_audiocaps_base', echecker_tokenizer: AutoTokenizer | None = None, error_threshold: float = 0.97, device: str | device | None = 'cuda_if_available', batch_size: int | None = 32, reset_state: bool = True, return_probs: bool = False, verbose: int = 0)

Bases: AACMetric[tuple[MACEScores, MACEScores] | Tensor]

Multimodal Audio-Caption Evaluation class (MACE).

MACE is a metric designed for evaluating automated audio captioning (AAC) systems. Unlike metrics that compare machine-generated captions solely to human references, MACE can also use the audio itself. By integrating both audio and text, it produces assessments that align better with human judgments.

The implementation is based on the original MACE implementation (the original author has agreed to the inclusion of their code in aac-metrics under the MIT license). Note: Instances of this class are not picklable.

Paper: https://arxiv.org/pdf/2411.00321
Original author: Satvik Dixit
Original implementation: https://github.com/satvik-dixit/mace/tree/main

For more information, see mace().

compute() → tuple[MACEScores, MACEScores] | Tensor

extra_repr() → str

Return the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

full_state_update: ClassVar[bool | None] = False

get_output_names() → tuple[str, ...]

higher_is_better: ClassVar[bool | None] = True

is_differentiable: ClassVar[bool | None] = False

max_value: ClassVar[float] = 1.0

min_value: ClassVar[float] = -1.0
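
The bounds above (-1.0 to 1.0) are what one would expect from a cosine similarity scaled by a factor between 0 and 1. This page does not state how `penalty` and `error_threshold` are combined, so the following is only a minimal sketch under an assumption: that MACE applies a FENSE-style multiplicative penalty when the fluency-error probability exceeds the threshold. The function name `penalized_score` and its arguments are hypothetical and are not part of aac-metrics.

```python
def penalized_score(
    similarity: float,
    error_prob: float,
    penalty: float = 0.3,
    error_threshold: float = 0.97,
) -> float:
    """Hypothetical illustration, not the library's code.

    ASSUMPTION: a caption is flagged as disfluent when its error
    probability exceeds `error_threshold`, and a flagged caption has its
    similarity scaled by (1 - penalty).
    """
    has_error = error_prob > error_threshold
    factor = 1.0 - penalty if has_error else 1.0
    return similarity * factor


# A flagged caption loses 30% of its similarity under the default penalty.
flagged = penalized_score(0.8, error_prob=0.99)
# An unflagged caption keeps its similarity unchanged.
clean = penalized_score(0.8, error_prob=0.50)
```

Under this assumption the score can never leave the range of the underlying similarity, which is consistent with the `min_value` and `max_value` listed here.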

reset() → None

training: bool

update(candidates: list[str], mult_references: list[list[str]] | None = None, audio_paths: list[str] | None = None) → None
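
A typical call sequence for the signatures above might look like the sketch below. It relies only on the constructor, `update()` and `compute()` as documented on this page. The helpers `check_batch` and `score_captions` are hypothetical, as is the assumption that each optional input, when given, must contain one entry per candidate. The library import is placed inside the function so the sketch can be loaded without aac-metrics installed; actually running `score_captions` requires the package and its pretrained models.

```python
from __future__ import annotations


def check_batch(
    candidates: list[str],
    mult_references: list[list[str]] | None = None,
    audio_paths: list[str] | None = None,
) -> int:
    """Hypothetical helper: check that optional inputs align with candidates."""
    n = len(candidates)
    optional = (("mult_references", mult_references), ("audio_paths", audio_paths))
    for name, values in optional:
        if values is not None and len(values) != n:
            raise ValueError(f"{name} has {len(values)} items, expected {n}")
    return n


def score_captions(
    candidates: list[str],
    mult_references: list[list[str]] | None = None,
    audio_paths: list[str] | None = None,
    mace_method: str = "text",
):
    """Hypothetical wrapper around the documented update()/compute() calls."""
    check_batch(candidates, mult_references, audio_paths)

    # Lazy import: needs aac-metrics and its model weights at call time.
    from aac_metrics.classes.mace import MACE

    metric = MACE(mace_method=mace_method, device="cpu")
    metric.update(candidates, mult_references, audio_paths)
    # Returns tuple[MACEScores, MACEScores] | Tensor, per the compute() signature.
    return metric.compute()
```

Which of `mult_references` and `audio_paths` is required presumably depends on `mace_method`; the documentation of mace() is the place to confirm this.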