Transformer

Tags: #machine learning #nlp #gpt

Equation

$$\text{Attention}(Q, K, V) = \text{softmax}(\frac{QK^T}{\sqrt{d_k}})V$$

Latex Code

    \text{Attention}(Q, K, V) = \text{softmax}(\frac{QK^T}{\sqrt{d_k}})V

Introduction

The Transformer is a sequence-to-sequence architecture introduced in "Attention Is All You Need" (Vaswani et al., 2017). Its core building block is scaled dot-product attention, the equation above, which lets every position in a sequence attend to every other position without recurrence or convolution.

Explanation

Here Q, K, and V are the query, key, and value matrices, and d_k is the dimension of the keys and queries. Each entry of QK^T measures how strongly one query matches one key; dividing by \sqrt{d_k} keeps these scores from growing with the dimension, which would otherwise push the softmax into regions with vanishingly small gradients. The softmax turns each row of scaled scores into attention weights that sum to 1, and multiplying by V returns, for each query, the corresponding weighted sum of the value vectors.
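
To make the formula concrete, the following is a minimal NumPy sketch of scaled dot-product attention. The function name scaled_dot_product_attention and the toy shapes are illustrative assumptions, not taken from any particular library.

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        """Compute Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

        Q: (n_q, d_k) queries, K: (n_k, d_k) keys, V: (n_k, d_v) values.
        Returns an (n_q, d_v) matrix of attended values.
        """
        d_k = K.shape[-1]
        # Scale the dot products by sqrt(d_k) so their magnitude does not
        # grow with the key dimension.
        scores = Q @ K.T / np.sqrt(d_k)
        # Row-wise softmax; subtracting the row max is for numerical stability.
        scores -= scores.max(axis=-1, keepdims=True)
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V

    # Example: 3 queries attending over 5 key/value pairs.
    rng = np.random.default_rng(0)
    Q = rng.standard_normal((3, 4))   # d_k = 4
    K = rng.standard_normal((5, 4))
    V = rng.standard_normal((5, 8))   # d_v = 8
    print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 8)

Each row of the weight matrix sums to 1, so the output for each query is a convex combination of the value vectors.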
