I am trying to implement zooming on a canvas which should focus on a pivot point. Zooming works fine, but afterwards the user should be able to select elements on the canvas.
Okay, so you're basically trying to figure out what a given X/Y coordinate corresponds to after the view has been scaled by a factor s around a pivot point {Px, Py}.
So, let's try to break it down.
For the sake of argument, let's assume that Px = Py = 0, and that s = 2. This means the view was zoomed by a factor of 2 around the top left corner of the view.
In this case, the screen coordinate {0, 0} corresponds to {0, 0} in the view, because that point is the only point which hasn't changed. Generally speaking, if the screen coordinate is equal to the pivot point, then there is no change.
What happens if the user clicks on some other point, let's say {2, 3}? In this case, what was once {2, 3} is now twice as far from the pivot point (which is {0, 0}), and so the corresponding position is {4, 6}.
All this is easy when the pivot point is {0, 0}, but what happens when it's not?
Well, let's look at another case: the pivot point is now the bottom right corner of the view, {w, h}, where w is the view's width and h is its height. Again, if the user clicks on the pivot point itself, then the corresponding position is also {w, h}. But let's say the user clicks on some other position, for example {w - 2, h - 3}. The same logic applies here: with s = 2, the offset from the pivot doubles, so the translated position is {w - 4, h - 6}.
To generalize, what we're trying to do is convert the coordinate we received into the corresponding coordinate in the zoomed view. To do that, we need to apply to this X/Y coordinate the same operation that the zoom applied to every pixel in the view.
Step 1 - translate the X/Y position so that the pivot point becomes the origin:
X = X - Px
Y = Y - Py
Step 2 - Then we scale X & Y:
X = X * s
Y = Y * s
Step 3 - Then we translate back:
X = X + Px
Y = Y + Py
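The three steps above can be sketched as a single function. This is only a minimal illustration in Python: the name zoom_point and the sample values w = 100, h = 80 are made up for the example and are not part of any canvas API.

```python
def zoom_point(x, y, px, py, s):
    """Map a pre-zoom point to its position after scaling by s around (px, py).

    Hypothetical helper for illustration; not taken from any canvas library.
    """
    # Step 1: translate so the pivot point becomes the origin.
    x -= px
    y -= py
    # Step 2: scale.
    x *= s
    y *= s
    # Step 3: translate back.
    x += px
    y += py
    return x, y


# Pivot at the top left corner, s = 2.
print(zoom_point(2, 3, 0, 0, 2))  # (4, 6)

# Pivot at the bottom right corner of an assumed 100 x 80 view, s = 2.
w, h = 100, 80
print(zoom_point(w - 2, h - 3, w, h, 2))  # (96, 74), i.e. {w - 4, h - 6}
```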
If we apply this to the last example I gave (I will only demonstrate for X):
Original value: X = w - 2, Px = w, s = 2
Step 1: X <-- X - Px = w - 2 - w = -2
Step 2: X <-- X * s = -2 * 2 = -4
Step 3: X <-- X + Px = -4 + w = w - 4
Once you apply this to any X/Y coordinate that is expressed relative to the view as it was before the zoom, you get the same point expressed relative to the zoomed state.
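For selecting elements you may need the opposite direction. I'm assuming here that the click arrives in zoomed coordinates while your elements are stored in pre-zoom coordinates; if that matches your setup, you can run the same three steps with a division instead of a multiplication. A rough sketch in Python, where both function names are hypothetical:

```python
def zoom_point(x, y, px, py, s):
    """Forward mapping: pre-zoom point -> position in the zoomed view."""
    return (x - px) * s + px, (y - py) * s + py


def unzoom_point(x, y, px, py, s):
    """Inverse mapping: point in the zoomed view -> pre-zoom position.

    Hypothetical helper; assumes s is non-zero.
    """
    if s == 0:
        raise ValueError("scale factor must be non-zero")
    # Translate to the pivot, divide by the scale, translate back.
    return (x - px) / s + px, (y - py) / s + py


# A click at (96, 74) in a view zoomed by 2 around (100, 80)
# lands on whatever was at (98, 77) before the zoom.
print(unzoom_point(96, 74, 100, 80, 2))  # (98.0, 77.0)
```

Applying zoom_point to the result gives back the original click position, which is a quick way to check that the two mappings agree.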
Hope this helps.