RECOMMEND
In this blog, we will summarize the LaTeX code for the most popular machine learning equations, including multiple distance measures, generative models, etc. There are various measures of the distance between data distributions, including KL-Divergence, JS-Divergence, Wasserstein Distance (Optimal Transport), Maximum Mean Discrepancy (MMD) and so on. We will provide the LaTeX code for machine learning models in the following sections, and the second section covers generative models: Generative Adversarial Networks (GAN), Variational AutoEncoder (VAE) and Diffusion Models (DDPM).
In this blog, we will give you a brief introduction to what multimodal models are and what multimodal generative models can accomplish. OpenAI released its latest text-to-video multimodal generative model, "SORA", in February 2024, and it quickly became extremely popular. SORA can generate short videos up to 1 minute long. Before SORA, various companies had already released many generative multimodal models, such as BLIP, BLIP2, FLAMINGO, FLAVA, etc. We will summarize a complete list of these time-tested multimodal generative models and introduce their model architectures (text and image encoders), training process, tasks, LaTeX equations of the loss functions, Vision Language capabilities (such as text-to-image, text-to-video, text-to-audio, visual question answering), etc. Tag: Multimodal, AIGC, Large Language Model
OTHER
In this blog, we will summarize the LaTeX code of equations for Diffusion Models, which are among the top-performing generative models, alongside GAN, VAE and flow-based models. The basic idea of diffusion models is to inject random noise into the feature vector step by step in a forward process modeled as a Markov chain, and then to gradually reconstruct the feature vector in the reverse process for generation. See the blog post below as a reference for more details: Weng, Lilian. (Jul 2021). What are diffusion models? Lil'Log. lilianweng.github.io/posts/2021-07-11-diffusion-models/
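As a rough illustration of the forward and reverse processes described above, the two Markov chain transitions can be written in LaTeX as below. This is a minimal sketch that assumes the common DDPM notation, which is not defined in the text above: x_t is the noised feature vector at step t, \beta_t is a variance schedule, and \mu_\theta, \Sigma_\theta are the learned mean and covariance of the reverse step.

```latex
% Forward process: add Gaussian noise at each step of the Markov chain
% (assumed notation: \beta_t is the noise variance at step t)
q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\ \sqrt{1-\beta_t}\, x_{t-1},\ \beta_t \mathbf{I}\right)

% Reverse process: a learned Gaussian transition that gradually removes noise
% (assumed notation: \mu_\theta and \Sigma_\theta are predicted by the model)
p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)
```

Both lines are meant to be placed inside a math environment such as equation or align.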