aac_metrics.functional.spider_fl module

class SPIDErFLScores

Bases: dict

cider_d: Tensor

fer: Tensor

spice: Tensor

spider: Tensor

spider_fl: Tensor

spider_fl(candidates: list[str], mult_references: list[list[str]], return_all_scores: bool = True, *, n: int = 4, sigma: float = 6.0, tokenizer: Callable[[str], list[str]] = str.split, return_tfidf: bool = False, cache_path: str | Path | None = None, java_path: str | Path | None = None, tmp_path: str | Path | None = None, n_threads: int | None = None, java_max_memory: str = '8G', timeout: None | int | Iterable[int] = None, echecker: str | BERTFlatClassifier = 'echecker_clotho_audiocaps_base', echecker_tokenizer: AutoTokenizer | None = None, error_threshold: float = 0.9, device: str | torch.device | None = 'cuda_if_available', batch_size: int | None = 32, reset_state: bool = True, return_probs: bool = True, penalty: float = 0.9, verbose: int = 0) → tuple[SPIDErFLScores, SPIDErFLScores] | Tensor

Combination of SPIDEr with a fluency error detector.

Original implementation: https://github.com/felixgontier/dcase-2023-baseline/blob/main/metrics.py#L48.

Warning

This metric requires at least 2 candidates with 2 sets of references; otherwise it raises a ValueError.
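The input layout implied by this warning can be checked before calling the metric. The sketch below is illustrative only: check_spider_fl_inputs is a hypothetical helper that is not part of aac_metrics, and it assumes the warning means one set of references per candidate, with at least two candidates in total.

```python
def check_spider_fl_inputs(candidates, mult_references, min_size=2):
    """Hypothetical pre-check (not part of aac_metrics).

    Assumes one list of reference sentences per candidate sentence,
    and at least `min_size` candidates overall.
    """
    if len(candidates) != len(mult_references):
        raise ValueError(
            f"Expected one set of references per candidate, got "
            f"{len(candidates)} candidates and {len(mult_references)} sets."
        )
    if len(candidates) < min_size:
        raise ValueError(
            f"Expected at least {min_size} candidates, got {len(candidates)}."
        )


# Two candidates, each paired with its own list of references.
candidates = ["a man speaks", "birds sing in the forest"]
mult_references = [
    ["a man is speaking", "someone talks"],
    ["birds are singing", "chirping birds in a forest"],
]
check_spider_fl_inputs(candidates, mult_references)
```

With inputs shaped this way, candidates[i] is scored against every sentence in mult_references[i].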

Parameters:

candidates – The list of sentences to evaluate.
mult_references – The list of lists of sentences used as targets.
return_all_scores – If True, returns a tuple containing the global and local scores. Otherwise, returns a scalar tensor containing the main global score. Defaults to True.
n – Maximal number of n-grams taken into account. Defaults to 4.
sigma – Standard deviation parameter used for the Gaussian penalty. Defaults to 6.0.
tokenizer – The fast tokenizer used to split sentences into words. Defaults to str.split.
return_tfidf – If True, returns the list of dictionaries containing the tf-idf scores of n-grams in the sents_score output. Defaults to False.
cache_path – The path to the external code directory. Defaults to the value returned by get_default_cache_path().
java_path – The path to the java executable. Defaults to the value returned by get_default_java_path().
tmp_path – Temporary directory path. Defaults to the value returned by get_default_tmp_path().
n_threads – Number of threads used to compute SPICE. If None, the default value of the java program is used. Defaults to None.
java_max_memory – The maximal java memory used. Defaults to “8G”.
timeout – The number of seconds before killing the java subprogram. If a list is given, the program is restarted when the i-th timeout is reached. If None, no timeout is used. Defaults to None.
echecker – The echecker model used to detect fluency errors. Can be “echecker_clotho_audiocaps_base”, “echecker_clotho_audiocaps_tiny”, “none” or None. Defaults to “echecker_clotho_audiocaps_base”.
echecker_tokenizer – The tokenizer of the echecker model. If None and echecker is not None, this value is inferred from echecker.model_type. Defaults to None.
error_threshold – The threshold used to detect fluency errors with the echecker model. Defaults to 0.9.
device – The PyTorch device used to run pre-trained models. If “cuda_if_available”, CUDA is used when available. Defaults to “cuda_if_available”.
batch_size – The batch size of the sBERT and echecker models. Defaults to 32.
reset_state – If True, resets the state of the PyTorch global generator after the initialization of the pre-trained models. Defaults to True.
return_probs – If True, returns each individual error probability given by the fluency detector model. Defaults to True.
penalty – The penalty coefficient applied when a fluency error is detected. A higher value lowers the SPIDEr score more. Defaults to 0.9.
verbose – The verbosity level. Defaults to 0.

Returns:

A tuple of global and local scores, or a scalar tensor with the main global score.
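To make the roles of error_threshold and penalty concrete, here is a minimal pure-Python sketch of how the scores could be combined. It is not the library's code. It assumes that SPIDEr is the mean of CIDEr-D and SPICE, that a sentence is flagged when its error probability is strictly above error_threshold, that the penalised score is spider * (1 - penalty * fer), and that the global score is the mean of the local scores. All function names and input values below are hypothetical.

```python
def spider_from(cider_d, spice):
    # SPIDEr is taken here as the average of CIDEr-D and SPICE.
    return (cider_d + spice) / 2.0


def fluency_errors(error_probs, error_threshold=0.9):
    # Assumption: a sentence counts as erroneous when its probability
    # is strictly greater than the threshold.
    return [1.0 if p > error_threshold else 0.0 for p in error_probs]


def spider_fl_from_scores(spider_scores, fer_scores, penalty=0.9):
    # Assumption: each local score is scaled by (1 - penalty) when an
    # error is flagged, and left unchanged otherwise.
    local = [s * (1.0 - penalty * f) for s, f in zip(spider_scores, fer_scores)]
    # Assumption: the global score is the mean of the local scores.
    global_score = sum(local) / len(local)
    return global_score, local


# Made-up per-sentence values, for illustration only.
cider_d = [0.8, 0.4]
spice = [0.2, 0.1]
error_probs = [0.95, 0.10]  # only the first sentence exceeds 0.9

spider = [spider_from(c, s) for c, s in zip(cider_d, spice)]
fer = fluency_errors(error_probs)
global_score, local_scores = spider_fl_from_scores(spider, fer)
print(global_score, local_scores)
```

Under these assumptions, the first sentence keeps about a tenth of its SPIDEr score because it is flagged, while the second is unchanged. With penalty set to 0.0 the result reduces to plain SPIDEr.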