Metrics

BLEU

Used to evaluate machine-translated text against ground-truth reference translations.

Compares n-grams of the generated translation to the n-grams of the ground-truth reference translation. An n-gram is essentially a chunk of n consecutive words.

Takes the (geometric) average of n-gram precision over different values of n (typically n = 1 to 4), multiplied by a brevity penalty that discourages overly short outputs (see the sketch below).
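A minimal sketch of this computation, assuming a single reference and whitespace-tokenized input (the function names are illustrative; production implementations such as sacreBLEU also handle multiple references, tokenization, and smoothing):

```python
from collections import Counter
import math

def ngrams(tokens, n):
    # All contiguous chunks of n consecutive words.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    # Geometric mean of modified n-gram precisions, times a brevity penalty.
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clip candidate counts by reference counts ("modified" precision).
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(log_precisions) / max_n)

candidate = "the cat sat on the mat".split()
reference = "the cat is on the mat".split()
print(round(bleu(candidate, reference, max_n=2), 3))  # ~0.707
```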

ROUGE

Used to evaluate text summarization.

Compares n-grams of the generated summary to the n-grams of the ground-truth reference summary.

Calculates the recall and precision of the n-gram overlap, then combines them into an F1 score.

Can be computed for different values of n (e.g., ROUGE-1 for unigrams, ROUGE-2 for bigrams), as in the sketch below.
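A minimal sketch of ROUGE-N under the same assumptions (single reference, whitespace tokenization; the function name is illustrative, and real toolkits also report variants like ROUGE-L over longest common subsequences):

```python
from collections import Counter

def ngrams(tokens, n):
    # All contiguous chunks of n consecutive words.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def rouge_n(summary, reference, n=1):
    # Precision, recall, and F1 over clipped n-gram overlap.
    sum_counts = Counter(ngrams(summary, n))
    ref_counts = Counter(ngrams(reference, n))
    overlap = sum(min(c, ref_counts[g]) for g, c in sum_counts.items())
    precision = overlap / max(sum(sum_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1

summary = "the cat sat on the mat".split()
reference = "a cat was sitting on the mat".split()
print(rouge_n(summary, reference, n=1))  # (precision, recall, f1)
```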

Perplexity

Metric to evaluate language models.

Perplexity is the exponential of the cross-entropy loss of the model's output: lower cross-entropy means lower perplexity.

There is also a relationship between perplexity and bits-per-character (viewing a sentence in terms of its information content), which connects perplexity/cross-entropy to its information-theoretic motivation; see the sketch below.
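A minimal sketch of both quantities, assuming per-token log-probabilities (natural log) are already available from a language model (the values below are made up for illustration):

```python
import math

def perplexity(token_log_probs):
    # Cross-entropy = average negative log-likelihood per token (in nats);
    # perplexity is its exponential: exp(cross_entropy).
    cross_entropy = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(cross_entropy)

def bits_per_character(token_log_probs, num_chars):
    # Total information content of the sentence in bits
    # (nats -> bits via division by ln 2), normalized per character.
    total_bits = -sum(token_log_probs) / math.log(2)
    return total_bits / num_chars

# Hypothetical per-token log-probabilities for a 4-token sentence.
log_probs = [-2.1, -0.3, -1.7, -0.9]
print(perplexity(log_probs))              # exp(1.25) ~= 3.49
print(bits_per_character(log_probs, 20))  # ~0.36 bits per character
```

Lower cross-entropy means the model needs fewer bits to encode the sentence, which in turn means lower perplexity; this is the information-theoretic link noted above.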
