I\'m new to machine learning and neural networks. I know how to build a nonlinear classification model, but my current problem has a continuous output. I\'ve been searching for
I'm sorry but the current answers are either incorrect or incomplete. The activation function is NOT necessarily what makes a neural network non-linear
To understand this we need to realize that we are talking about non-linearity in the parameters. For example, notice that the following regression predicted values are considered linear predictions, despite non-linear transformations of the inputs because the parameters are linear:
Now for simplicity, let us consider a single neuron, single layer neural network:
If the transfer function is linear then:
As you have already probably noticed, this is basically a linear regression. Even if we were to add multiple inputs and neurons, each with a linear activation function, we would now only have an ensemble of regressions (all linear in their parameters and therefore this simple neural network is linear):
Now going back to (3), let's add a layer, so that we have a neural network with 2 layers, one neuron each (both with linear activation functions):
(first layer)
(second layer)
Now notice:
Reduces to:
Where and
Which means that our two layered network (each with a single neuron) is not linear in its parameters despite every activation function in the network being linear.
Thus the answer to your question, "what makes a neural network non-linear" is: non-linearity in the parameters.
This non-linearity in the parameters comes about two ways: 1) having more than one layer with neurons in your network (as exhibited above), and 2) having non-linear activation functions.
For an example on non-linearity coming about through non-linear activation functions, suppose our input space, weights, and biases are all constrained such that they are all strictly positive (for simplicity). Now using (2) (single layer, single neuron) and the activation function , we have the following:
Which Reduces to:
Where , , and
Now, ignoring what issues this neural network has, it should be clear, that at the very least, it is non-linear in the parameters and that non-linearity has been introduced solely by choice of the activation function.
Finally, yes neural networks can model complex data structures that cannot be modeled by using linear models (e.g. simple regression). For an example on this see the classic XOr problem: https://medium.com/@jayeshbahire/the-xor-problem-in-neural-networks-50006411840b