Perplexity of Language Model
Tags: #nlp #LLM #metricEquation
$$\text{PPL}(X) = \exp \left\{ -\frac{1}{t} \sum_{i=1}^{t} \log p_{\theta}(x_{i} \mid x_{<i}) \right\}$$
Latex Code
\text{PPL}(X) = \exp \left\{ -\frac{1}{t} \sum_{i=1}^{t} \log p_{\theta}(x_{i} \mid x_{<i}) \right\}
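As a quick sanity check, the equation can be evaluated by hand: average the negative log-probabilities of the tokens, then exponentiate. The sketch below does this in plain Python with made-up next-token probabilities (the values are hypothetical, purely for illustration); it also checks the equivalent view of perplexity as the inverse geometric mean of the per-token probabilities.

Python Code
import math

# Hypothetical next-token probabilities p_theta(x_i | x_<i)
# for a 4-token sequence (made-up values, for illustration only).
probs = [0.2, 0.5, 0.1, 0.4]
t = len(probs)

# Average negative log-likelihood: -1/t * sum_i log p(x_i | x_<i)
avg_nll = -sum(math.log(p) for p in probs) / t
ppl = math.exp(avg_nll)
print(f"PPL = {ppl:.2f}")  # PPL = 3.98

# Equivalent view: PPL = (prod_i p_i) ** (-1/t),
# the inverse geometric mean of the token probabilities.
assert abs(ppl - math.prod(probs) ** (-1 / t)) < 1e-9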
Introduction
$$ X $$: denotes the tokenized input sequence $$ X = (x_{0}, x_{1}, \dots, x_{t}) $$ with sequence length $$t$$.
$$ \text{PPL}(X) $$: denotes the perplexity of the fixed-length tokenized sequence $$X$$.
$$ p_{\theta}(x_{i} \mid x_{<i}) $$: denotes the probability the language model assigns to the next token $$x_{i}$$ given the sequence of tokens $$x_{<i}$$ preceding the i-th token; a practical way to compute this quantity is shown in the sketch below.
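In practice, the average negative log-likelihood inside the exponent is exactly the cross-entropy loss that causal language models in the Hugging Face transformers library return when given labels, so perplexity is just the exponential of that loss. Below is a minimal sketch following this recipe (the gpt2 checkpoint and the example sentence are assumptions; any causal LM works the same way).

Python Code
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: "gpt2" as the checkpoint; any causal LM can be substituted.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "Perplexity measures how well a language model predicts a sequence."
input_ids = tokenizer(text, return_tensors="pt").input_ids

with torch.no_grad():
    # With labels=input_ids, the model shifts the targets internally and
    # returns the mean next-token cross-entropy: -1/t * sum log p(x_i | x_<i).
    loss = model(input_ids, labels=input_ids).loss

ppl = torch.exp(loss).item()
print(f"PPL = {ppl:.2f}")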
References
Perplexity of fixed-length models (Hugging Face Transformers documentation)
Wikipedia: Perplexity