Metrics
BLEU
Used to evaluate generated translations against ground-truth reference translations.
Compares n-grams of the generated translation to the n-grams of the reference translation; an n-gram is simply a chunk of n consecutive words.
The score averages n-gram precision over different values of n (typically 1 through 4), with a brevity penalty so that very short outputs don't score artificially well. A minimal sketch follows.
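A minimal sketch of the core computation, assuming a single reference and whitespace tokenization (real implementations such as sacrebleu add smoothing and corpus-level aggregation):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Geometric mean of clipped n-gram precisions times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clip each candidate n-gram count by its count in the reference,
        # so repeating a matched word can't inflate precision.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        precisions.append(overlap / max(sum(cand_counts.values()), 1))
    if min(precisions) == 0:
        return 0.0  # unsmoothed BLEU is zero if any n-gram level has no match
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty punishes candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * geo_mean

cand = "the cat sat on the mat".split()
ref = "the cat sat on a mat".split()
print(bleu(cand, ref))  # ~0.54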
ROUGE
Used for text summarization.
Compares n-grams of the generated summary to the n-grams of the ground-truth reference summary.
Calculates recall and precision of the n-gram overlap, then combines them into an F1 score.
Can do this for different values of n (ROUGE-1, ROUGE-2, etc.); see the sketch below.
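A minimal ROUGE-N sketch under the same single-reference, whitespace-tokenization assumptions; in practice one would typically use a library such as rouge-score:

```python
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def rouge_n(candidate, reference, n=2):
    """ROUGE-N: n-gram overlap scored as recall, precision, and F1."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    # Count n-grams present in both, clipped by their count in each side.
    overlap = sum(min(c, cand_counts[g]) for g, c in ref_counts.items())
    recall = overlap / max(sum(ref_counts.values()), 1)
    precision = overlap / max(sum(cand_counts.values()), 1)
    f1 = 0.0 if recall + precision == 0 else 2 * recall * precision / (recall + precision)
    return {"recall": recall, "precision": precision, "f1": f1}

summary = "the model summarizes the document".split()
reference = "the model summarizes the input document well".split()
print(rouge_n(summary, reference, n=2))
# {'recall': 0.5, 'precision': 0.75, 'f1': 0.6}
```

ROUGE emphasizes recall (how much of the reference is covered), whereas BLEU is precision-oriented; reporting the F1 balances the two.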
Perplexity
Note: the relationship between perplexity and bits-per-character (the information content of a sentence) is what ties perplexity/cross-entropy back to its information-theoretic motivation.
Metric used to evaluate language models.
Perplexity is the exponential of the cross-entropy loss of the model on the evaluation text: perplexity = exp(cross-entropy in nats) = 2^(cross-entropy in bits). Lower is better; a perplexity of k means the model is, on average, as uncertain as a uniform choice among k tokens.
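A small sketch of the computation, assuming we already have the natural-log probability the model assigned to each ground-truth token:

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(average negative log-likelihood per token).

    token_logprobs: natural-log probabilities the model assigned to
    each ground-truth token in the sequence.
    """
    cross_entropy = -sum(token_logprobs) / len(token_logprobs)  # nats/token
    return math.exp(cross_entropy)

# A model that assigns probability 0.25 to every token has cross entropy
# ln(4) nats/token = 2 bits/token, hence perplexity 4: it is as uncertain
# as a uniform choice among 4 tokens.
logprobs = [math.log(0.25)] * 10
print(perplexity(logprobs))  # 4.0
```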
