aac_metrics.functional.spider_fl module

class SPIDErFLScores

Bases: dict

cider_d : Tensor

fer : Tensor

spice : Tensor

spider : Tensor

spider_fl : Tensor

spider_fl(candidates: list[str], mult_references: list[list[str]], return_all_scores: bool = True, *, n: int = 4, sigma: float = 6.0, tokenizer: typing.Callable[[str], list[str]] = str.split, return_tfidf: bool = False, cache_path: str | pathlib.Path | None = None, java_path: str | pathlib.Path | None = None, tmp_path: str | pathlib.Path | None = None, n_threads: int | None = None, java_max_memory: str = '8G', timeout: None | int | typing.Iterable[int] = None, echecker: str | aac_metrics.functional.fer.BERTFlatClassifier = 'echecker_clotho_audiocaps_base', echecker_tokenizer: transformers.models.auto.tokenization_auto.AutoTokenizer | None = None, error_threshold: float = 0.9, device: str | torch.device | None = 'cuda_if_available', batch_size: int | None = 32, reset_state: bool = True, return_probs: bool = True, penalty: float = 0.9, verbose: int = 0) → tuple[SPIDErFLScores, SPIDErFLScores] | Tensor

Combination of SPIDEr with a fluency error detector.

Original implementation: https://github.com/felixgontier/dcase-2023-baseline/blob/main/metrics.py#L48.

Warning

This metric requires at least 2 candidates with 2 sets of references; otherwise it raises a ValueError.
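The requirement in the warning can be checked up front, before any model is loaded. The helper below is a minimal sketch: `check_spider_fl_inputs` is a hypothetical name that is not part of aac_metrics, and the check that both lists have the same length is an assumption about how candidates and reference sets are paired.

```python
def check_spider_fl_inputs(candidates, mult_references):
    """Hypothetical pre-check mirroring the documented warning (not library code)."""
    # Assumption: one set of references is expected per candidate.
    if len(candidates) != len(mult_references):
        raise ValueError(
            f"Got {len(candidates)} candidates but {len(mult_references)} sets of references."
        )
    # Documented constraint: at least 2 candidates with 2 sets of references.
    if len(candidates) < 2:
        raise ValueError("SPIDEr-FL needs at least 2 candidates with 2 sets of references.")


candidates = ["a man is speaking", "birds are singing"]
mult_references = [
    ["a man speaks", "someone is talking"],
    ["birds sing in a forest", "a bird is chirping"],
]
check_spider_fl_inputs(candidates, mult_references)  # passes silently
```

Running such a check first gives an early, readable error instead of a failure after the pre-trained models have been initialized.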

Parameters:

candidates: The list of sentences to evaluate.
mult_references: The list of lists of sentences used as targets.
return_all_scores: If True, returns a tuple containing the global and local scores. Otherwise returns a scalar tensor containing the main global score. defaults to True.
n: Maximal number of n-grams taken into account. defaults to 4.
sigma: Standard deviation parameter used for gaussian penalty. defaults to 6.0.
tokenizer: The fast tokenizer used to split sentences into words. defaults to str.split.
return_tfidf: If True, returns the list of dictionaries containing the tf-idf scores of n-grams in the sents_score output. defaults to False.
cache_path: The path to the external code directory. defaults to the value returned by get_default_cache_path().
java_path: The path to the java executable. defaults to the value returned by get_default_java_path().
tmp_path: Temporary directory path. defaults to the value returned by get_default_tmp_path().
n_threads: Number of threads used to compute SPICE. If None, the default value of the Java program is used. defaults to None.
java_max_memory: The maximal java memory used. defaults to “8G”.
timeout: The number of seconds before killing the java subprogram. If a list is given, it will restart the program if the i-th timeout is reached. If None, no timeout will be used. defaults to None.
echecker: The echecker model used to detect fluency errors. Can be “echecker_clotho_audiocaps_base”, “echecker_clotho_audiocaps_tiny”, “none” or None. defaults to “echecker_clotho_audiocaps_base”.
echecker_tokenizer: The tokenizer of the echecker model. If None and echecker is not None, this value will be inferred with echecker.model_type. defaults to None.
error_threshold: The threshold used to detect fluency errors for echecker model. defaults to 0.9.
device: The PyTorch device used to run pre-trained models. If “cuda_if_available”, it will use cuda if available. defaults to “cuda_if_available”.
batch_size: The batch size of the echecker model. defaults to 32.
reset_state: If True, reset the state of the PyTorch global generator after the initialization of the pre-trained models. defaults to True.
return_probs: If True, return each individual error probability given by the fluency detector model. defaults to True.
penalty: The penalty coefficient applied when a fluency error is detected. A higher value lowers the penalized score further. defaults to 0.9.
verbose: The verbose level. defaults to 0.
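How error_threshold and penalty might interact can be illustrated with a small pure-Python sketch. It assumes that a sentence is flagged when its error probability exceeds the threshold, that a flagged sentence has its SPIDEr score multiplied by (1 - penalty), and that the global score is the mean of the per-sentence scores. These are assumptions about the combination rule, not a copy of the library code, and `spider_fl_sketch` is a hypothetical name.

```python
def spider_fl_sketch(spider_scores, error_probs, error_threshold=0.9, penalty=0.9):
    """Illustrative combination of SPIDEr scores with fluency error flags."""
    sentence_scores = []
    for spider, prob in zip(spider_scores, error_probs):
        # Assumption: the fluency error flag is binary (1.0 if flagged, else 0.0).
        fer = 1.0 if prob > error_threshold else 0.0
        # Assumption: a flagged sentence keeps only (1 - penalty) of its SPIDEr score.
        sentence_scores.append(spider * (1.0 - penalty * fer))
    # Assumption: the global score is the mean of the per-sentence scores.
    corpus_score = sum(sentence_scores) / len(sentence_scores)
    return corpus_score, sentence_scores


# Made-up values: the first sentence is flagged (0.95 > 0.9), the second is not.
corpus_score, sentence_scores = spider_fl_sketch([0.5, 0.25], [0.95, 0.1])
```

With the default penalty of 0.9, a flagged sentence in this sketch retains about a tenth of its SPIDEr score, while unflagged sentences are unchanged.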

Returns:

A tuple of global and local scores, or a scalar tensor with the main global score.
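Because the return shape depends on return_all_scores, calling code may want to handle both cases. The sketch below uses plain floats and lists as stand-ins for tensors so that it runs without the library; `main_global_score` is a hypothetical helper, and treating the `spider_fl` entry as the main global score is an assumption based on the fields of SPIDErFLScores.

```python
def main_global_score(result):
    """Return the main global score from either documented return shape (illustrative)."""
    if isinstance(result, tuple):
        # return_all_scores=True: (global scores, local scores), both dict-like.
        global_scores, _local_scores = result
        # Assumption: "spider_fl" is the main global score.
        return global_scores["spider_fl"]
    # return_all_scores=False: the result is already the scalar score.
    return result


# Stand-in values with the SPIDErFLScores keys; real outputs would hold tensors.
global_scores = {"cider_d": 0.4, "fer": 0.5, "spice": 0.2, "spider": 0.3, "spider_fl": 0.15}
local_scores = {
    "cider_d": [0.6, 0.2],
    "fer": [1.0, 0.0],
    "spice": [0.4, 0.0],
    "spider": [0.5, 0.1],
    "spider_fl": [0.05, 0.1],
}

score_from_tuple = main_global_score((global_scores, local_scores))
score_from_scalar = main_global_score(0.15)
```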