
Cheatsheet of LaTeX Code for the Most Popular Machine Learning Equations


In this blog, we summarize the LaTeX code for the most popular machine learning equations, including distance measures between data distributions, generative models, and more. Common distance measures include KL-Divergence, JS-Divergence, Wasserstein Distance (Optimal Transport), and Maximum Mean Discrepancy (MMD). We provide the LaTeX code for these machine learning models in the following sections; in the second section we cover the LaTeX code for generative models, including Generative Adversarial Networks (GAN), Variational AutoEncoder (VAE), and Diffusion Models (DDPM).

    Generative Models

  • Generative Adversarial Networks (GAN)

    Latex Code
            \min_{G} \max_{D} V(D,G)=\mathbb{E}_{x \sim p_{data}(x)}[\log D(x)]+\mathbb{E}_{z \sim p_{z}(z)}[\log(1-D(G(z)))]
            
    Explanation

    The LaTeX code for the GAN minimax objective is shown above: the discriminator D is trained to maximize the value function V(D, G), while the generator G is trained to minimize it, as sketched below. See the paper Generative Adversarial Networks for more details.
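
    Below is a minimal NumPy sketch (not from the paper) of how the value function V(D, G) can be estimated from batches of discriminator outputs; the names gan_value, d_real, and d_fake are illustrative placeholders rather than a real implementation.

            import numpy as np

            def gan_value(d_real, d_fake):
                """Monte Carlo estimate of V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))].

                d_real: discriminator probabilities D(x) for a batch of real samples x ~ p_data.
                d_fake: discriminator probabilities D(G(z)) for a batch of generated samples, z ~ p_z.
                """
                eps = 1e-12  # numerical safety for log
                return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))

            # Example: a discriminator that scores real samples 0.9 and fakes 0.2 on average.
            print(gan_value(np.full(64, 0.9), np.full(64, 0.2)))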

  • Variational AutoEncoder (VAE)

    Estimating the Log-likelihood and Posterior
    Latex Code
            \log p_{\theta}(x)=\mathbb{E}_{q_{\phi}(z|x)}[\log p_{\theta}(x)] \\
            =\mathbb{E}_{q_{\phi}(z|x)}[\log \frac{p_{\theta}(x,z)}{p_{\theta}(z|x)}] \\
            =\mathbb{E}_{q_{\phi}(z|x)}[\log [\frac{p_{\theta}(x,z)}{q_{\phi}(z|x)} \times \frac{q_{\phi}(z|x)}{p_{\theta}(z|x)}]] \\
            =\mathbb{E}_{q_{\phi}(z|x)}[\log [\frac{p_{\theta}(x,z)}{q_{\phi}(z|x)} ]] +D_{KL}(q_{\phi}(z|x) || p_{\theta}(z|x))\\
            
    Explanation

    The marginal log-likelihood splits into two terms: the evidence lower bound (the expectation term), which can be estimated by sampling from the approximate posterior, and the KL divergence between the approximate posterior and the intractable true posterior. Because the KL divergence is non-negative, the first term is a lower bound on the log-likelihood.

    Evidence Lower Bound
    Latex Code
                \mathbb{L}_{\theta,\phi}(\mathbf{x})=\mathbb{E}_{q_{\phi}(\mathbf{z}|\mathbf{x})}[\log p_{\theta}(\mathbf{x},\mathbf{z})-\log q_{\phi}(\mathbf{z}|\mathbf{x}) ]
            
    Explanation

    The ELBO rewrites the first term of the decomposition above as the expected log joint minus the log approximate posterior. Maximizing it with respect to both the generative parameters and the variational parameters trains the decoder and the encoder jointly.

    Reparameterization trick
    Latex Code
                z = \mu + \sigma \odot \epsilon, \quad \epsilon \sim \mathcal{N}(0, \mathbf{I})
            
    Explanation

    Sampling z directly from the approximate posterior is not differentiable with respect to the encoder parameters; the reparameterization trick instead draws epsilon from a standard Gaussian and expresses z as a deterministic function of the mean, the standard deviation, and epsilon, so gradients can flow through the encoder. The VAE LaTeX code is shown above, and a sketch follows below. See the paper Auto-Encoding Variational Bayes for more details.
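
    Below is a minimal NumPy sketch of the reparameterization trick and a single-sample ELBO, under the common assumptions of a diagonal Gaussian encoder and a unit-variance Gaussian decoder; the function names reparameterize and elbo are illustrative, not from the paper.

            import numpy as np

            rng = np.random.default_rng(0)

            def reparameterize(mu, log_var):
                """z = mu + sigma * eps with eps ~ N(0, I), so z is differentiable in mu and sigma."""
                eps = rng.standard_normal(mu.shape)
                return mu + np.exp(0.5 * log_var) * eps

            def elbo(x, x_recon, mu, log_var):
                """Single-sample ELBO = E_q[log p(x|z)] - KL(q(z|x) || N(0, I)),
                assuming a unit-variance Gaussian decoder and a diagonal Gaussian encoder."""
                recon_log_lik = -0.5 * np.sum((x - x_recon) ** 2)              # log p(x|z) up to an additive constant
                kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)  # closed-form Gaussian KL
                return recon_log_lik - kl

            # Toy usage with a 4-dimensional latent and an 8-dimensional data point.
            mu, log_var = np.zeros(4), np.zeros(4)
            z = reparameterize(mu, log_var)
            print(elbo(np.ones(8), np.ones(8) * 0.9, mu, log_var))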

  • Diffusion Models (DDPM)

    Explanation

    See the paper Denoising Diffusion Probabilistic Models for more details, and the blog post https://lilianweng.github.io/posts/2021-07-11-diffusion-models/ for a thorough derivation.

    1.1 Forward Process
    Latex Code
                q(x_{t}|x_{t-1})=\mathcal{N}(x_{t};\sqrt{1-\beta_{t}}x_{t-1},\beta_{t}I) \\
                q(x_{1:T}|x_{0})=\prod_{t=1}^{T}q(x_{t}|x_{t-1})
            
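    As an illustration, the sketch below draws one forward step x_t ~ q(x_t | x_{t-1}) in NumPy; the linear beta schedule is an assumption made for the example, not prescribed by the equation above.

            import numpy as np

            rng = np.random.default_rng(0)

            T = 1000
            betas = np.linspace(1e-4, 0.02, T)  # assumed linear noise schedule beta_1, ..., beta_T

            def q_step(x_prev, t):
                """One forward step: x_t ~ N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I). t is 1-based."""
                eps = rng.standard_normal(x_prev.shape)
                return np.sqrt(1.0 - betas[t - 1]) * x_prev + np.sqrt(betas[t - 1]) * eps
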
    1.2 Forward Process Reparameterization Trick
    Latex Code
                x_{t}=\sqrt{\alpha_{t}}x_{t-1}+\sqrt{1-\alpha_{t}}\epsilon_{t-1} \\
                =\sqrt{\alpha_{t}\alpha_{t-1}}x_{t-2}+\sqrt{1-\alpha_{t}\alpha_{t-1}}\bar{\epsilon}_{t-2} \\
                =\dots \\
                =\sqrt{\bar{\alpha}_{t}}x_{0}+\sqrt{1-\bar{\alpha}_{t}}\epsilon \\
                \alpha_{t}=1-\beta_{t}, \quad \bar{\alpha}_{t}=\prod_{i=1}^{t}\alpha_{i}
            
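    The closed form above means x_t can be sampled directly from x_0 without iterating through the chain. A minimal NumPy sketch, again assuming a linear beta schedule:

            import numpy as np

            rng = np.random.default_rng(0)

            T = 1000
            betas = np.linspace(1e-4, 0.02, T)   # assumed linear schedule
            alphas = 1.0 - betas                 # alpha_t = 1 - beta_t
            alpha_bars = np.cumprod(alphas)      # alpha_bar_t = prod_{i<=t} alpha_i

            def q_sample(x0, t):
                """Sample x_t ~ N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I) in one shot. t is 1-based."""
                eps = rng.standard_normal(x0.shape)
                return np.sqrt(alpha_bars[t - 1]) * x0 + np.sqrt(1.0 - alpha_bars[t - 1]) * eps
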
    1.3 Reverse Process


    Latex Code
                p_\theta(\mathbf{x}_{0:T}) = p(\mathbf{x}_T) \prod^T_{t=1} p_\theta(\mathbf{x}_{t-1} \vert \mathbf{x}_t) \\
                p_\theta(\mathbf{x}_{t-1} \vert \mathbf{x}_t) = \mathcal{N}(\mathbf{x}_{t-1}; \boldsymbol{\mu}_\theta(\mathbf{x}_t, t), \boldsymbol{\Sigma}_\theta(\mathbf{x}_t, t))
            
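    A minimal sketch of one ancestral sampling step from the learned reverse process; mu_theta and sigma_theta stand in for the learned network outputs and are hypothetical callables introduced for illustration.

            import numpy as np

            rng = np.random.default_rng(0)

            def p_sample(x_t, t, mu_theta, sigma_theta):
                """One reverse step: x_{t-1} ~ N(mu_theta(x_t, t), sigma_theta(x_t, t)^2 * I).

                mu_theta, sigma_theta: placeholder callables for the learned mean and standard deviation.
                """
                noise = rng.standard_normal(x_t.shape) if t > 1 else np.zeros_like(x_t)  # no noise on the last step
                return mu_theta(x_t, t) + sigma_theta(x_t, t) * noise
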
    1.4 Reverse Process Variational Lower Bound


    Latex Code
                \begin{aligned}
                - \log p_\theta(\mathbf{x}_0) 
                &\leq - \log p_\theta(\mathbf{x}_0) + D_\text{KL}(q(\mathbf{x}_{1:T}\vert\mathbf{x}_0) \| p_\theta(\mathbf{x}_{1:T}\vert\mathbf{x}_0) ) \\
                &= -\log p_\theta(\mathbf{x}_0) + \mathbb{E}_{\mathbf{x}_{1:T}\sim q(\mathbf{x}_{1:T} \vert \mathbf{x}_0)} \Big[ \log\frac{q(\mathbf{x}_{1:T}\vert\mathbf{x}_0)}{p_\theta(\mathbf{x}_{0:T}) / p_\theta(\mathbf{x}_0)} \Big] \\
                &= -\log p_\theta(\mathbf{x}_0) + \mathbb{E}_q \Big[ \log\frac{q(\mathbf{x}_{1:T}\vert\mathbf{x}_0)}{p_\theta(\mathbf{x}_{0:T})} + \log p_\theta(\mathbf{x}_0) \Big] \\
                &= \mathbb{E}_q \Big[ \log \frac{q(\mathbf{x}_{1:T}\vert\mathbf{x}_0)}{p_\theta(\mathbf{x}_{0:T})} \Big] \\
                \text{Let }L_\text{VLB} 
                &= \mathbb{E}_{q(\mathbf{x}_{0:T})} \Big[ \log \frac{q(\mathbf{x}_{1:T}\vert\mathbf{x}_0)}{p_\theta(\mathbf{x}_{0:T})} \Big] \geq - \mathbb{E}_{q(\mathbf{x}_0)} \log p_\theta(\mathbf{x}_0)
                \end{aligned}
            
    1.5 Reverse Process Variational Lower Bound Decomposed into Multiple KL-Divergence Terms

    Latex Code
                \begin{aligned}L_\text{VLB} &= \mathbb{E}_{q(\mathbf{x}_{0:T})} \Big[ \log\frac{q(\mathbf{x}_{1:T}\vert\mathbf{x}_0)}{p_\theta(\mathbf{x}_{0:T})} \Big] \\&= \mathbb{E}_q \Big[ \log\frac{\prod_{t=1}^T q(\mathbf{x}_t\vert\mathbf{x}_{t-1})}{ p_\theta(\mathbf{x}_T) \prod_{t=1}^T p_\theta(\mathbf{x}_{t-1} \vert\mathbf{x}_t) } \Big] \\&= \mathbb{E}_q [\underbrace{D_\text{KL}(q(\mathbf{x}_T \vert \mathbf{x}_0) \parallel p_\theta(\mathbf{x}_T))}_{L_T} + \sum_{t=2}^T \underbrace{D_\text{KL}(q(\mathbf{x}_{t-1} \vert \mathbf{x}_t, \mathbf{x}_0) \parallel p_\theta(\mathbf{x}_{t-1} \vert\mathbf{x}_t))}_{L_{t-1}} \underbrace{- \log p_\theta(\mathbf{x}_0 \vert \mathbf{x}_1)}_{L_0} ]\end{aligned}
            
    1.6 Reverse Process Variational Lower Bound Loss Function


    Latex Code
                \begin{aligned}
                L_\text{VLB} &= L_T + L_{T-1} + \dots + L_0 \\
                \text{where } L_T &= D_\text{KL}(q(\mathbf{x}_T \vert \mathbf{x}_0) \parallel p_\theta(\mathbf{x}_T)) \\
                L_t &= D_\text{KL}(q(\mathbf{x}_t \vert \mathbf{x}_{t+1}, \mathbf{x}_0) \parallel p_\theta(\mathbf{x}_t \vert\mathbf{x}_{t+1})) \text{ for }1 \leq t \leq T-1 \\
                L_0 &= - \log p_\theta(\mathbf{x}_0 \vert \mathbf{x}_1)
                \end{aligned}
            
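    Each L_t term compares two Gaussians, so it can be evaluated in closed form. The sketch below is a generic diagonal-Gaussian KL helper; computing the actual means and variances of q(x_{t-1} | x_t, x_0) and p_theta(x_{t-1} | x_t) requires the posterior formulas from the DDPM paper, which are not reproduced here.

            import numpy as np

            def gaussian_kl(mu_q, var_q, mu_p, var_p):
                """KL(N(mu_q, diag(var_q)) || N(mu_p, diag(var_p))), summed over dimensions."""
                return 0.5 * np.sum(np.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)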
