Cross-Attention in Transformer Architecture
[https://vaclavkosar.com/ml/cross-attention-in-transformer-architecture] - - public:isaac
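Cross-attention merges two embedding sequences regardless of modality: queries come from one sequence, keys and values from the other. Example: image and text in the Stable Diffusion U-Net. It is the same mechanism as encoder-decoder attention in the original Transformer.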
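A minimal single-head sketch of the idea, in plain Python with no dependencies. It is not taken from the linked article or from the Stable Diffusion code: the function names (`matmul`, `softmax`, `random_matrix`, `cross_attention`), the toy sizes, and the random weights are all assumptions made for illustration. The point it shows is that the two sequences may differ in both length and width, since each gets its own projection before the scores are computed.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Hypothetical helper: small random matrix standing in for learned weights."""
    return [[rng.gauss(0.0, 1.0) / math.sqrt(rows) for _ in range(cols)]
            for _ in range(rows)]


def cross_attention(x, context, w_q, w_k, w_v):
    """Single-head cross-attention.

    x:       (n_x, d_x) sequence that produces the queries (e.g. image latents)
    context: (n_c, d_c) sequence that produces keys and values (e.g. text tokens)
    Returns (output, weights): output is (n_x, d_v), weights is (n_x, n_c).
    """
    q = matmul(x, w_q)        # (n_x, d_k)
    k = matmul(context, w_k)  # (n_c, d_k)
    v = matmul(context, w_v)  # (n_c, d_v)
    scale = 1.0 / math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose to (d_k, n_c)
    scores = [[s * scale for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]  # each row sums to 1
    return matmul(weights, v), weights


rng = random.Random(0)
n_x, d_x = 4, 6  # assumed toy sizes: 4 image positions of width 6
n_c, d_c = 3, 5  # assumed toy sizes: 3 text tokens of width 5
d_k, d_v = 8, 6  # assumed projection widths

x = random_matrix(n_x, d_x, rng)
context = random_matrix(n_c, d_c, rng)
w_q = random_matrix(d_x, d_k, rng)
w_k = random_matrix(d_c, d_k, rng)
w_v = random_matrix(d_c, d_v, rng)

out, attn = cross_attention(x, context, w_q, w_k, w_v)
print(len(out), len(out[0]))    # one output row per query position
print(len(attn), len(attn[0]))  # one weight per (query, context) pair
```

The output keeps the length of the query sequence, so in the image-and-text case the image side stays the same shape while each position becomes a weighted mix of text values. Real implementations typically add multiple heads, an output projection, and a residual connection, all omitted here.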