aac_metrics.functional.spider_fl module

class SPIDErFLScores

Bases: dict

cider_d: Tensor
fer: Tensor
spice: Tensor
spider: Tensor
spider_fl: Tensor
spider_fl(candidates: list[str], mult_references: list[list[str]], return_all_scores: bool = True, *, n: int = 4, sigma: float = 6.0, tokenizer: typing.Callable[[str], list[str]] = str.split, return_tfidf: bool = False, cache_path: str | pathlib.Path | None = None, java_path: str | pathlib.Path | None = None, tmp_path: str | pathlib.Path | None = None, n_threads: int | None = None, java_max_memory: str = '8G', timeout: None | int | typing.Iterable[int] = None, echecker: str | aac_metrics.functional.fer.BERTFlatClassifier = 'echecker_clotho_audiocaps_base', echecker_tokenizer: transformers.models.auto.tokenization_auto.AutoTokenizer | None = None, error_threshold: float = 0.9, device: str | torch.device | None = 'cuda_if_available', batch_size: int | None = 32, reset_state: bool = True, return_probs: bool = True, penalty: float = 0.9, verbose: int = 0) → tuple[SPIDErFLScores, SPIDErFLScores] | Tensor

Combination of SPIDEr with a fluency error detector.

Warning

This metric requires at least 2 candidates with 2 sets of references; otherwise it raises a ValueError.
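The constraint above can be checked before calling the metric. The following is a minimal sketch: `check_spider_fl_inputs` is a hypothetical helper, not part of aac_metrics, and the length-match check is an assumption based on each candidate being paired with one set of references.

```python
def check_spider_fl_inputs(candidates, mult_references):
    """Hypothetical pre-check; NOT part of aac_metrics."""
    # Assumption: one set of references per candidate.
    if len(candidates) != len(mult_references):
        raise ValueError(
            f"Got {len(candidates)} candidates but "
            f"{len(mult_references)} sets of references."
        )
    # Documented constraint: at least 2 candidates with 2 sets of references.
    if len(candidates) < 2:
        raise ValueError("spider_fl needs at least 2 candidates.")


# Two candidates, each with its own set of references: passes silently.
check_spider_fl_inputs(
    ["a dog barks", "rain falls"],
    [["a dog is barking"], ["rain is falling", "it rains"]],
)
```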

Parameters:
  • candidates – The list of sentences to evaluate.

  • mult_references – The list of list of sentences used as target.

  • return_all_scores – If True, returns a tuple containing the global and local scores. Otherwise returns a scalar tensor containing the main global score. defaults to True.

  • n – Maximal number of n-grams taken into account. defaults to 4.

  • sigma – Standard deviation parameter used for the Gaussian penalty. defaults to 6.0.

  • tokenizer – The fast tokenizer used to split sentences into words. defaults to str.split.

  • return_tfidf – If True, returns the list of dictionaries containing the tf-idf scores of n-grams in the sents_score output. defaults to False.

  • cache_path – The path to the external code directory. defaults to the value returned by get_default_cache_path().

  • java_path – The path to the java executable. defaults to the value returned by get_default_java_path().

  • tmp_path – Temporary directory path. defaults to the value returned by get_default_tmp_path().

  • n_threads – Number of threads used to compute SPICE. If None, the default of the Java program is used. defaults to None.

  • java_max_memory – The maximum memory the Java program may use. defaults to “8G”.

  • timeout – The number of seconds before killing the java subprogram. If a list is given, it will restart the program if the i-th timeout is reached. If None, no timeout will be used. defaults to None.

  • echecker – The echecker model used to detect fluency errors. Can be “echecker_clotho_audiocaps_base”, “echecker_clotho_audiocaps_tiny”, “none” or None. defaults to “echecker_clotho_audiocaps_base”.

  • echecker_tokenizer – The tokenizer of the echecker model. If None and echecker is not None, this value will be inferred with echecker.model_type. defaults to None.

  • error_threshold – The threshold used to detect fluency errors for echecker model. defaults to 0.9.

  • device – The PyTorch device used to run pre-trained models. If “cuda_if_available”, it will use cuda if available. defaults to “cuda_if_available”.

  • batch_size – The batch size of the echecker model. defaults to 32.

  • reset_state – If True, reset the state of the PyTorch global generator after the initialization of the pre-trained models. defaults to True.

  • return_probs – If True, return each individual error probability given by the fluency detector model. defaults to True.

  • penalty – The penalty coefficient applied. A higher value lowers the SPIDEr scores more strongly when an error is detected. defaults to 0.9.

  • verbose – The verbose level. defaults to 0.
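How error_threshold and penalty interact can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: it assumes the fluency error flag is 1 when the detector's error probability exceeds error_threshold, and that the penalized score is spider * (1 - penalty * flag). The function name `penalized_score` is hypothetical.

```python
def penalized_score(spider, error_prob, error_threshold=0.9, penalty=0.9):
    """Hypothetical per-sentence sketch; NOT the aac_metrics implementation."""
    # Assumption: an error is flagged when the probability exceeds the threshold.
    fer = 1.0 if error_prob > error_threshold else 0.0
    # Assumption: a flagged sentence keeps only (1 - penalty) of its SPIDEr score.
    return spider * (1.0 - penalty * fer)


# No error flagged: the SPIDEr score is left unchanged.
unchanged = penalized_score(spider=0.5, error_prob=0.2)
# Error flagged: with penalty=0.9 roughly 10% of the score remains.
reduced = penalized_score(spider=0.5, error_prob=0.95)
```

Under these assumptions, setting penalty to 0.0 would disable the fluency penalty entirely, and 1.0 would zero out every flagged sentence.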

Returns:

A tuple of global and local scores, or a scalar tensor with the main global score.
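A minimal usage sketch follows. It assumes aac_metrics is installed together with a Java runtime, the external SPICE code and the echecker weights; the example sentences and the wrapper name `run_example` are made up for illustration. The import is placed inside the function so the snippet can be loaded without the package.

```python
# Illustrative inputs: two candidates, each with its own set of references.
CANDIDATES = ["a man is speaking", "rain falls on a roof"]
MULT_REFERENCES = [
    ["a man speaks", "someone is talking"],
    ["rain is falling hard on a surface"],
]


def run_example():
    """Hypothetical wrapper around spider_fl; requires aac_metrics at call time."""
    from aac_metrics.functional.spider_fl import spider_fl

    # return_all_scores defaults to True, so a tuple of two
    # SPIDErFLScores dicts (global, then local) is returned.
    global_scores, local_scores = spider_fl(CANDIDATES, MULT_REFERENCES)
    return global_scores["spider_fl"], local_scores["spider_fl"]
```

Passing return_all_scores=False instead would return only a scalar tensor with the main global score.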