Attention Models

Writing

What are the most important components when you write a 5-paragraph essay?

What are some things you do not want to see in an essay?

Translation

What are the most important words in this sentence for translation?

Simpson’s paradox is a phenomenon in probability and statistics in which a trend appears in several groups of data but disappears or reverses when the groups are combined.

Attention Models

Attention models extend the Encoder-Decoder model by incorporating a mechanism known as attention.

Attention incorporates the encoder's short-term memories (the hidden states it produces for each input) directly into the decoder's inputs, so the decoder can focus on the most relevant parts of the input at each step.
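The idea can be sketched in a few lines of NumPy. This is a minimal, hypothetical example of dot-product attention (the values and dimensions are illustrative, not taken from the lecture): the decoder's current hidden state is scored against each encoder hidden state, the scores are normalized with a softmax, and the resulting weighted sum of encoder states (the context vector) is what gets combined with the decoder's input.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Hypothetical toy values: 3 encoder hidden states (one per input token)
# and one decoder hidden state, each of dimension 4.
encoder_states = np.array([
    [0.1, 0.3, 0.2, 0.5],
    [0.7, 0.1, 0.4, 0.2],
    [0.2, 0.6, 0.1, 0.3],
])
decoder_state = np.array([0.4, 0.2, 0.5, 0.1])

# 1. Similarity score between the decoder state and each encoder state
scores = encoder_states @ decoder_state      # shape (3,)

# 2. Softmax turns the scores into attention weights that sum to 1
weights = softmax(scores)

# 3. The context vector is the attention-weighted sum of encoder states;
#    it is then combined with the decoder's input at this step
context = weights @ encoder_states           # shape (4,)
```

The inputs the decoder "attends to" most are simply the ones whose encoder hidden states score highest against its current state.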

Encoder-Decoder

An image displaying the Encoder-Decoder Architecture with corresponding LSTM cells.

LSTM Cell

An image displaying the three gates of a long short-term memory (LSTM) cell.
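As a reminder of what each cell in the diagram is doing, here is a minimal sketch of a single LSTM step. The parameter layout (one stacked weight matrix for all four transformations) is an assumption for compactness, not the lecture's notation; the three gates and the two memories match the standard LSTM formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W (4n x d), U (4n x n), and b (4n,) stack the
    parameters for the forget, input, and output gates plus the
    candidate cell state, where n is the hidden size."""
    z = W @ x + U @ h_prev + b
    n = len(h_prev)
    f = sigmoid(z[0:n])        # forget gate: what to drop from long-term memory
    i = sigmoid(z[n:2*n])      # input gate: how much new information to store
    o = sigmoid(z[2*n:3*n])    # output gate: how much memory to expose
    g = np.tanh(z[3*n:4*n])    # candidate cell state
    c = f * c_prev + i * g     # new long-term memory (cell state)
    h = o * np.tanh(c)         # new short-term memory (hidden state)
    return h, c
```

It is these short-term memories `h`, one per input token, that the attention mechanism later draws on.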

Attention Model

An image displaying an Encoder-Decoder architecture with attention incorporated into the first (and only) LSTM cell.

Attention Model

An image displaying an Encoder-Decoder architecture with attention incorporated into the second LSTM cell.

Attention Model

An image displaying an Encoder-Decoder architecture with attention incorporated into the third LSTM cell.

StatQuest

YouTube video