I am trying to implement attention in PyTorch as it is explained in this video: https://www.youtube.com/watch?v=yGTUuEx3GkA. The main idea can be seen on the slide at 12:3