Vector
Intro
Vectors are the most fundamental concept in linear algebra, and they are used universally in machine learning algorithms. One simple application is to write a familiar pair of simultaneous equations in vector form. Take, for example,

$$2a + 3b = 8$$
$$10a + b = 13 \tag{1}$$

Equation (1) can be written as

$$a\begin{pmatrix}2\\10\end{pmatrix} + b\begin{pmatrix}3\\1\end{pmatrix} = \begin{pmatrix}8\\13\end{pmatrix}$$

This vector form is what we will use most of the time to write out machine learning equations.
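As a quick check, here is a minimal NumPy sketch that solves the illustrative system above (the coefficients are just example values) and verifies the vector-form identity:

```python
import numpy as np

# Coefficient matrix and right-hand side of the example system
# 2a + 3b = 8, 10a + b = 13
A = np.array([[2.0, 3.0],
              [10.0, 1.0]])
y = np.array([8.0, 13.0])

a, b = np.linalg.solve(A, y)

# Vector form: a * (2, 10) + b * (3, 1) should reproduce (8, 13)
result = a * np.array([2.0, 10.0]) + b * np.array([3.0, 1.0])
print(result)                   # [ 8. 13.]
print(np.allclose(result, y))   # True
```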
In addition, we can summarize an object's features in a vector. For example, if we have a house of 120 square meters with 2 bedrooms and 1 bathroom located at the city center, we can specify the house parameters as

$$x_{house}=\begin{pmatrix}120\\2\\1\\1\end{pmatrix}$$

The first entry represents the area of the house. The second and third entries represent the number of bedrooms and bathrooms respectively. The last entry is a boolean (0 or 1) value that indicates whether the house is at the city center (value 1) or outside the city center (value 0). This vectorized expression can then be used as input to our machine learning algorithms.
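In code, such a feature vector is simply an array. A minimal sketch of the house example:

```python
import numpy as np

# Feature vector: [area_m2, bedrooms, bathrooms, is_city_center]
x_house = np.array([120.0, 2.0, 1.0, 1.0])
print(x_house.shape)  # (4,) - a 4-dimensional feature vector
```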
Lastly, we can also express our model parameters as a vector. A normal distribution has two parameters, μ and σ, that specify its center and spread. So a normal distribution can be represented by the parameter vector

$$p = \begin{pmatrix}\mu\\\sigma\end{pmatrix}$$

In machine learning, we keep optimizing the model parameters so that they better fit the actual data. During this process, we are effectively updating the μ and σ entries of our model vector in an iterative manner.
In this chapter, we will cover some essential topics on vectors. We will start by explaining the basic vector operations. Then we will introduce one of the most important vector operations, the dot product. After that, we will see how to use the dot product to calculate the angle between two vectors and how to perform vector projections. Lastly, we will discuss the basis against which vectors are referenced and how to change a vector's basis.
Basic Vector Operations
There are 4 basic operations on vectors, namely addition, subtraction, scalar multiplication, and modulus.
To explain these operations, we first define two vectors $r$ and $s$, where

$$r=\begin{pmatrix}4\\3\end{pmatrix}, \quad s=\begin{pmatrix}-1\\2\end{pmatrix}$$

Plotted on a graph, $r$ points 4 units right and 3 units up, while $s$ points 1 unit left and 2 units up.
Addition
To add vectors $r$ and $s$, we add the elements of $r$ and $s$ together, element by element:

$$r+s=\begin{pmatrix}4+(-1)\\3+2\end{pmatrix}=\begin{pmatrix}3\\5\end{pmatrix}$$

This can be shown graphically as follows. We shift vector $s$ parallel to itself so that it starts at the end of vector $r$; the resultant vector $r+s$ then runs from the beginning of $r$ to the end of the shifted $s$.

It is also worth noting that vector addition is commutative, i.e. $r+s=s+r$, and associative, i.e. $(r+s)+t=r+(s+t)$.
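Here is a minimal NumPy sketch of vector addition, using the example vectors $r$ and $s$ defined above (the extra vector $t$ is only there to check associativity):

```python
import numpy as np

r = np.array([4.0, 3.0])
s = np.array([-1.0, 2.0])
t = np.array([2.0, -5.0])   # arbitrary helper vector for the associativity check

print(r + s)                                   # [3. 5.]
print(np.array_equal(r + s, s + r))            # True: addition is commutative
print(np.allclose((r + s) + t, r + (s + t)))   # True: addition is associative
```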
Subtraction
Vector subtraction is similar to vector addition. We subtract each element of one vector from the corresponding element of the other:

$$r-s=\begin{pmatrix}4-(-1)\\3-2\end{pmatrix}=\begin{pmatrix}5\\1\end{pmatrix}$$

To solve this graphically, there are essentially two steps involved. First, we calculate the negative of vector $s$:

$$-s=\begin{pmatrix}1\\-2\end{pmatrix}$$

Then, we perform a normal vector addition of vector $r$ and vector $-s$: the vector $-s$ is shifted parallel to itself to the end of vector $r$, and the resultant vector runs from the beginning of $r$ to the end of the shifted $-s$.
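A similar sketch for subtraction, confirming that subtracting $s$ is the same as adding $-s$:

```python
import numpy as np

r = np.array([4.0, 3.0])
s = np.array([-1.0, 2.0])

print(-s)                                # [ 1. -2.]  the negative of s
print(r - s)                             # [5. 1.]
print(np.array_equal(r - s, r + (-s)))   # True: subtraction adds the negative
```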
Scalar Multiplication
Scalar multiplication calculates a multiple of a vector. Again, the operation is performed element-wise:

$$2r = \begin{pmatrix}2\times 4\\2\times 3\end{pmatrix} = \begin{pmatrix}8\\6\end{pmatrix}$$

It is equivalent to performing vector addition multiple times:

$$2r = r + r$$

Graphically, this extends the vector along the line on which it lies. Multiplying a vector by a negative scalar works almost the same way, except that the resulting vector points in the opposite direction along that line.
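A short sketch of scalar multiplication, including the repeated-addition view and the negative-scalar case:

```python
import numpy as np

r = np.array([4.0, 3.0])

print(2 * r)                          # [8. 6.]
print(np.array_equal(2 * r, r + r))   # True: 2r is r added to itself
print(-1 * r)                         # [-4. -3.]  same line, opposite direction
```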
Modulus
Lastly, the modulus of a vector is the length of that vector. By Pythagoras' theorem, the square of the hypotenuse (the longest side of a right triangle) is equal to the sum of the squares of the other two sides. To calculate the modulus of vector $r$, note that $r$ has a horizontal length of 4 and a vertical length of 3. Therefore,

$$|r| = \sqrt{4^2 + 3^2} = \sqrt{25} = 5$$

The modulus operation is represented by two vertical bars enclosing the vector, as in $|r|$. It is not limited to 2-dimensional space. The modulus of a vector with more dimensions is calculated the same way - take the square root of the sum of squares of the vector's components:

$$|r| = \sqrt{\sum_i r_i^2}$$
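A quick sketch of the modulus calculation; `np.linalg.norm` computes exactly this square root of the sum of squares:

```python
import numpy as np

r = np.array([4.0, 3.0])

# Square root of the sum of squared components...
print(np.sqrt(np.sum(r ** 2)))   # 5.0
# ...which is what np.linalg.norm computes
print(np.linalg.norm(r))         # 5.0

# The same formula works in any number of dimensions
v = np.array([1.0, 2.0, 2.0])
print(np.linalg.norm(v))         # 3.0
```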
That is all you need to know about the basic vector operations. Let’s move on to our next topic for more advanced vector operations.
Dot Product
The dot product, sometimes called the inner product, is one of the most important vector operations. You are going to see it a lot later when we dive into the derivations of different machine learning algorithms. It is also the foundation for calculating the angle between two vectors and for projecting one vector onto another.

We have learnt that scalar multiplication multiplies a vector by a scalar. The dot product, on the other hand, multiplies a vector by another vector. In general, for $n$-dimensional vectors $r$ and $s$, the dot product $r\cdot s$ evaluates to a scalar:

$$r\cdot s = \sum_{i=1}^{n} r_i s_i$$

For our previously defined vectors $r$ and $s$,

$$r\cdot s = 4\times(-1) + 3\times 2 = 2$$
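A minimal sketch of the dot product, computed both element-wise and with NumPy's built-in:

```python
import numpy as np

r = np.array([4.0, 3.0])
s = np.array([-1.0, 2.0])

# Element-wise products, summed up
print(np.sum(r * s))   # 2.0
# Equivalent built-in (r @ s does the same)
print(np.dot(r, s))    # 2.0
```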
There are three notable properties of the dot product:

- Dot products are commutative: $r\cdot s = s\cdot r$.
- Dot products are distributive over addition: $r\cdot(s+t) = r\cdot s + r\cdot t$.
- Dot products are not associative: $r\cdot(s\cdot t) \neq (r\cdot s)\cdot t$ in general. In fact, since $s\cdot t$ is a scalar, $r\cdot(s\cdot t)$ is a scalar multiplication rather than a dot product.

It is also interesting to note that the dot product of a vector with itself is equal to the square of its modulus:

$$r\cdot r = \sum_i r_i^2 = |r|^2 \tag{2}$$
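These properties and equation (2) are easy to verify numerically. A small sketch, again with an arbitrary helper vector $t$:

```python
import numpy as np

r = np.array([4.0, 3.0])
s = np.array([-1.0, 2.0])
t = np.array([2.0, -5.0])   # arbitrary helper vector for the distributive check

print(np.dot(r, s) == np.dot(s, r))                               # True: commutative
print(np.isclose(np.dot(r, s + t), np.dot(r, s) + np.dot(r, t)))  # True: distributive
print(np.isclose(np.dot(r, r), np.linalg.norm(r) ** 2))           # True: r.r == |r|^2
```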
Calculating the Angle Between Two Vectors
Now we are ready to derive the angle between two vectors using what we have learnt about the dot product.

First, let's refresh our memory of the cosine rule. Given the lengths of two sides of a triangle ($a$ and $b$) and the angle θ between them, we can calculate the length of the opposite side ($c$) using the following formula:

$$c^2 = a^2 + b^2 - 2ab\cos\theta \tag{3}$$

If we take side $a$ to be our vector $r$ and side $b$ to be our vector $s$, then side $c$ is the vector $r-s$.

Equation (3) can then be rewritten as

$$|r-s|^2 = |r|^2 + |s|^2 - 2|r||s|\cos\theta \tag{4}$$

Recall from equation (2) that the square of the modulus of a vector is equal to the dot product of the vector with itself.

So now we have $|r|$, $|s|$, and $|r-s|$ related by equation (4); how can we calculate the angle θ between $r$ and $s$?

On the left-hand side of equation (4), we can convert $|r-s|^2$ into dot products:

$$|r-s|^2 = (r-s)\cdot(r-s) = r\cdot r - 2\,r\cdot s + s\cdot s = |r|^2 - 2\,r\cdot s + |s|^2$$

Substituting this back into equation (4), we get

$$|r|^2 - 2\,r\cdot s + |s|^2 = |r|^2 + |s|^2 - 2|r||s|\cos\theta$$

$$\cos\theta = \frac{r\cdot s}{|r||s|} \tag{5}$$

Therefore, the angle θ between vectors $r$ and $s$ can be calculated from the dot product of $r$ and $s$ and the moduli of $r$ and $s$.
We are also interested in some special angles θ between $r$ and $s$. For example:

- When θ = 0°, $r$ and $s$ point in the same direction, and $\cos\theta=\frac{r\cdot s}{|r||s|}=1$, so $r\cdot s = |r||s|$.
- When θ = 90°, $r$ and $s$ are orthogonal to each other, and $\cos\theta=\frac{r\cdot s}{|r||s|}=0$, so $r\cdot s = 0$.
- When θ = 180°, $r$ and $s$ point in opposite directions, and $\cos\theta=\frac{r\cdot s}{|r||s|}=-1$, so $r\cdot s = -|r||s|$.
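A short sketch computing θ for our example vectors, plus a check that an orthogonal pair (here a hypothetical pair $u$, $v$) gives a dot product of 0:

```python
import numpy as np

r = np.array([4.0, 3.0])
s = np.array([-1.0, 2.0])

cos_theta = np.dot(r, s) / (np.linalg.norm(r) * np.linalg.norm(s))
theta = np.arccos(cos_theta)   # angle in radians
print(np.degrees(theta))       # about 79.7 degrees

# An orthogonal pair gives a dot product of 0, i.e. theta = 90 degrees
u, v = np.array([1.0, 0.0]), np.array([0.0, 2.0])
print(np.dot(u, v))            # 0.0
```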
Vector Projection
Another important concept involving vectors is projection. For vectors $r$ and $s$, we can draw a line from the end of $s$ down to $r$ such that the line is perpendicular to $r$. The length from the origin to the point where this perpendicular meets $r$ represents the projection of vector $s$ onto vector $r$.

We know from basic trigonometry that

$$\cos\theta = \frac{\text{adjacent}}{\text{hypotenuse}} = \frac{\text{projection}}{|s|} \tag{6}$$

Substituting equation (5) into (6), we get

$$\frac{\text{projection}}{|s|} = \frac{r\cdot s}{|r||s|} \quad\Longrightarrow\quad \text{projection} = \frac{r\cdot s}{|r|}$$

$\frac{r\cdot s}{|r|}$ is called the scalar projection of vector $s$ onto $r$. It has only magnitude, but no direction. In order to find the direction of the projection, we need to use the following formula:

$$\frac{r\cdot s}{|r|}\frac{r}{|r|} = \frac{r\cdot s}{r\cdot r}\,r$$

$\frac{r}{|r|}$ is the unit-length vector in the direction of $r$. Multiplying the scalar projection by this unit-length vector gives us the projection in the direction of $r$. This is called the vector projection of $s$ onto $r$.
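The two projection formulas translate directly into code; a minimal sketch with hypothetical helper functions `scalar_projection` and `vector_projection`:

```python
import numpy as np

def scalar_projection(s, r):
    """Length of the shadow that s casts on r."""
    return np.dot(r, s) / np.linalg.norm(r)

def vector_projection(s, r):
    """Scalar projection times the unit vector along r."""
    return np.dot(r, s) / np.dot(r, r) * r

r = np.array([4.0, 3.0])
s = np.array([-1.0, 2.0])

print(scalar_projection(s, r))   # 0.4          (= 2/5)
print(vector_projection(s, r))   # [0.32 0.24]  (= (8/25, 6/25))
```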
With this, we have concluded our discussion of the dot product operation and the calculation of angles and projections between two vectors. In the next topic, we will see these concepts in action when they are applied to changing the basis of a vector.
Changing Basis
So far we have only seen vectors in their own coordinate system. It is worthwhile to define the coordinate system, or basis, against which our vectors are referenced.

We can express a 2-dimensional vector as a sum of two basis vectors. For our vector $r$, we can define 2 basis vectors $e_1=\begin{pmatrix}1\\0\end{pmatrix}$ and $e_2=\begin{pmatrix}0\\1\end{pmatrix}$ such that

$$r = 4e_1 + 3e_2 = \begin{pmatrix}4\\3\end{pmatrix}$$
However, the choice of basis vectors $e_1$ and $e_2$ is arbitrary; it depends entirely on how the coordinate system is set up. You might want two basis vectors that are of unequal lengths, or that are not orthogonal to each other. Let's see what happens when we change to a different set of basis vectors.
For example, we can define a new set of basis vectors $b_1=\begin{pmatrix}2\\1\end{pmatrix}$ and $b_2=\begin{pmatrix}-2\\4\end{pmatrix}$, where $b_1$ and $b_2$ are themselves written in terms of the basis vectors $e_1$ and $e_2$. What is our vector $r$ expressed in $b_1$ and $b_2$?
This is where vector projection comes into play. We need to calculate the vector projection of $r$ onto each of the new basis vectors $b_1$ and $b_2$.
To calculate the vector projection of $r$ onto $b_1$,

$$\frac{r\cdot b_1}{|b_1|} = \frac{4\times 2 + 3\times 1}{\sqrt{2^2+1^2}} = \frac{11}{\sqrt{5}}$$

$\frac{r\cdot b_1}{|b_1|}$ gives us the scalar projection of $r$ onto $b_1$. By dividing that by the magnitude of $b_1$ once more, we find that the projection is $\frac{11}{5}$ times the length of $b_1$; thus the vector projection of $r$ onto $b_1$ is

$$\frac{r\cdot b_1}{|b_1|^2}\, b_1 = \frac{11}{5}\begin{pmatrix}2\\1\end{pmatrix}$$

Similarly, we can calculate the vector projection of $r$ onto $b_2$:

$$\frac{r\cdot b_2}{|b_2|^2}\, b_2 = \frac{4\times(-2)+3\times 4}{(-2)^2+4^2}\begin{pmatrix}-2\\4\end{pmatrix} = \frac{1}{5}\begin{pmatrix}-2\\4\end{pmatrix}$$

So our vector $r$ can be expressed as the vector sum of $\frac{11}{5}b_1$ and $\frac{1}{5}b_2$; in the new basis, its coordinates are

$$r_b = \begin{pmatrix}11/5\\1/5\end{pmatrix}$$

If we evaluate this expression by substituting vectors $b_1$ and $b_2$ in the original $e_1$, $e_2$ basis, we get back our original vector $r$:

$$\frac{11}{5}\begin{pmatrix}2\\1\end{pmatrix} + \frac{1}{5}\begin{pmatrix}-2\\4\end{pmatrix} = \begin{pmatrix}22/5 - 2/5\\11/5 + 4/5\end{pmatrix} = \begin{pmatrix}4\\3\end{pmatrix}$$
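The whole change of basis fits in a few lines of NumPy; a sketch using the example basis above (valid only because $b_1\cdot b_2=0$):

```python
import numpy as np

r = np.array([4.0, 3.0])     # vector in the standard e1, e2 basis
b1 = np.array([2.0, 1.0])    # new basis vectors (orthogonal example above)
b2 = np.array([-2.0, 4.0])

# Coordinates in the new basis via projection
c1 = np.dot(r, b1) / np.dot(b1, b1)   # 11/5 = 2.2
c2 = np.dot(r, b2) / np.dot(b2, b2)   # 1/5  = 0.2
print(c1, c2)                         # 2.2 0.2

# Substituting back into the original basis recovers r
print(c1 * b1 + c2 * b2)              # [4. 3.]
```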
Note that $b_1$ and $b_2$ here are orthogonal to each other. We can verify this by calculating the cosine of the angle θ between $b_1$ and $b_2$:

$$\cos\theta = \frac{b_1\cdot b_2}{|b_1||b_2|} = \frac{2\times(-2) + 1\times 4}{|b_1||b_2|} = 0$$

Since $\cos\theta = 0$, θ = 90°.
So we have successfully converted our vector $r$ from the original basis vectors $e_1$ and $e_2$ to the new basis vectors $b_1$ and $b_2$. This method of changing basis works as long as the new basis vectors are orthogonal to each other. The more general case, where the new basis vectors may have any angle between them, involves a matrix operation that will be covered in the next chapter.
When we extend this method to a space of 3 or more dimensions, it is critical that each additional basis vector is not a linear combination of the existing ones. This property is called linear independence. It means we cannot find values ⍺ and β that satisfy the linear equation below, so that $b_3$ does not lie in the same plane as $b_1$ and $b_2$:

$$b_3 = \alpha b_1 + \beta b_2$$
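One quick way to check linear independence numerically is the matrix rank; a sketch with a hypothetical third basis vector $b_3$:

```python
import numpy as np

b1 = np.array([2.0, 1.0, 0.0])
b2 = np.array([-2.0, 4.0, 0.0])
b3 = np.array([0.0, 0.0, 1.0])   # hypothetical third basis vector

# If the rank equals the number of vectors, none of them
# is a linear combination of the others
B = np.stack([b1, b2, b3])
print(np.linalg.matrix_rank(B))  # 3 -> linearly independent

# A vector in the b1-b2 plane breaks independence
b3_bad = 2.0 * b1 + 1.0 * b2     # alpha = 2, beta = 1
print(np.linalg.matrix_rank(np.stack([b1, b2, b3_bad])))  # 2 -> dependent
```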
That is it! We have completed our discussion on vectors in linear algebra. You have built a solid foundation for what we will explore further in future chapters.
(Inspired by the Mathematics for Machine Learning lecture series from Imperial College London)
Source: CSDN
Author: Lin D.
Link: https://blog.csdn.net/datascientistlin/article/details/103874756