BLEU Bilingual Evaluation Understudy
Tags: #nlp #BLEU #evaluation
Introduction
BLEU (Bilingual Evaluation Understudy)
Equation
$$\text{BLEU}_{w}(\hat{S};S)=BP(\hat{S};S) \times \exp\left(\sum^{\infty}_{n=1}w_{n} \ln p_{n}(\hat{S};S)\right)$$
$$p_{n}(\hat{S};S)=\frac{\sum^{M}_{i=1}\sum_{s \in G_{n}(\hat{y}^{(i)})} \min\left(C(s,\hat{y}^{(i)}),\max_{y \in S_{i}} C(s,y)\right)}{\sum^{M}_{i=1}\sum_{s \in G_{n}(\hat{y}^{(i)})}C(s, \hat{y}^{(i)})}$$
$$p_{n}(\hat{y};y)=\frac{\sum_{s \in G_{n}(\hat{y})} \min(C(s,\hat{y}), C(s,y))}{\sum_{s \in G_{n}(\hat{y})}C(s, \hat{y})}$$
$$BP(\hat{S};S) = e^{-(r/c-1)^{+}}$$
LaTeX Code
\text{BLEU}_{w}(\hat{S};S)=BP(\hat{S};S) \times \exp\left(\sum^{\infty}_{n=1}w_{n} \ln p_{n}(\hat{S};S)\right), p_{n}(\hat{S};S)=\frac{\sum^{M}_{i=1}\sum_{s \in G_{n}(\hat{y}^{(i)})} \min\left(C(s,\hat{y}^{(i)}),\max_{y \in S_{i}} C(s,y)\right)}{\sum^{M}_{i=1}\sum_{s \in G_{n}(\hat{y}^{(i)})}C(s, \hat{y}^{(i)})}, p_{n}(\hat{y};y)=\frac{\sum_{s \in G_{n}(\hat{y})} \min(C(s,\hat{y}), C(s,y))}{\sum_{s \in G_{n}(\hat{y})}C(s, \hat{y})}, BP(\hat{S};S) = e^{-(r/c-1)^{+}}
Explanation
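$$\hat{S}$$: the candidate corpus of $$M$$ sentences; $$S$$: the reference corpus, where $$S_{i}$$ is the set of reference sentences for the $$i$$-th candidate sentence $$\hat{y}$$.
$$w_{n}$$: the weight given to n-grams of order $$n$$; the weights are non-negative and sum to 1.
$$p_{n}(\hat{S};S)$$: the modified n-gram precision, which generalizes the single-sentence-pair n-gram precision $$p_{n}(\hat{y};y)$$ to a whole corpus with several references per candidate.
$$C(s,y)$$: the substring count, i.e. the number of times the n-gram $$s$$ appears as a substring of $$y$$. Taking the minimum with the reference count clips each candidate n-gram, so repeating a matching n-gram more often than any reference does earns no extra credit.
$$G_{n}(y)$$: the set of distinct n-grams in the sentence $$y$$.
$$BP(\hat{S};S)$$: the brevity penalty, which penalizes candidates that are too short. Without it, a very short candidate containing only a few safe n-grams could reach a high precision. Here $$c$$ is the length of the candidate corpus, $$r$$ is the effective reference corpus length, and $$(x)^{+}=\max(x,0)$$, so the penalty equals 1 when $$c \ge r$$ and $$e^{1-r/c}$$ otherwise.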
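The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation, and it rests on several assumptions that the formulas do not fix: sentences are already tokenized into lists of words, the weights default to 1/4 for n = 1..4 and zero beyond, the effective reference length r picks, for each candidate, the reference whose length is closest (ties go to the shorter one), and the score is defined as 0 when any weighted precision is 0. The function names are hypothetical.

```python
from collections import Counter
from math import exp, log


def ngram_counts(tokens, n):
    """Return C(s, y) for every n-gram s in G_n(y), as a Counter."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidates, references, n):
    """Corpus-level p_n: clipped n-gram matches over all candidate n-grams."""
    clipped, total = 0, 0
    for cand, refs in zip(candidates, references):
        cand_counts = ngram_counts(cand, n)
        # max over the references y in S_i of C(s, y)
        max_ref = Counter()
        for ref in refs:
            for s, count in ngram_counts(ref, n).items():
                max_ref[s] = max(max_ref[s], count)
        clipped += sum(min(count, max_ref[s]) for s, count in cand_counts.items())
        total += sum(cand_counts.values())
    return clipped / total if total else 0.0


def brevity_penalty(candidates, references):
    """BP = exp(-(r/c - 1)^+), with r chosen by closest reference length (assumption)."""
    c = sum(len(cand) for cand in candidates)
    if c == 0:
        return 0.0
    r = sum(
        min((abs(len(ref) - len(cand)), len(ref)) for ref in refs)[1]
        for cand, refs in zip(candidates, references)
    )
    return exp(-max(r / c - 1.0, 0.0))


def bleu(candidates, references, weights=(0.25, 0.25, 0.25, 0.25)):
    """BLEU_w = BP * exp(sum_n w_n * ln p_n), truncated to len(weights) orders."""
    log_sum = 0.0
    for n, w in enumerate(weights, start=1):
        p = modified_precision(candidates, references, n)
        if p == 0.0:
            return 0.0  # ln 0 is undefined; treating the score as 0 is a convention
        log_sum += w * log(p)
    return brevity_penalty(candidates, references) * exp(log_sum)


if __name__ == "__main__":
    cand = "the the the the the the the".split()
    refs = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
    # "the" occurs 7 times in the candidate but at most 2 times in any reference,
    # so the clipped unigram count is 2 out of 7.
    print(modified_precision([cand], [refs], 1))
```

The example shows why clipping matters: without the minimum, the degenerate candidate would get a unigram precision of 7/7.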