Most modern non-embedded CPUs (x86 and the larger ARM cores, for example) have hardware instructions to compute square roots directly. The hardware implementation backing these instructions varies, but typically is a variant on the schoolbook digit-by-digit algorithm (though not always in base two; base four or sixteen can also be used). These are typically among the slowest basic arithmetic operations on a CPU; timings like 16-64 cycles are not uncommon, and these instructions are often not pipelined.
On CPUs that lack direct hardware square root instructions (Itanium, PPC, others), the typical approach is to generate an initial estimate (either with an instruction that produces the estimate, or with a lookup table) and then refine that estimate using an iterative method (Newton or Goldschmidt usually). You might track down some of Peter Markstein or Roger Golliver's writings on the subject if you're interested.
More complex mathematical functions (like trig operations) are typically computed by reducing the argument into some fundamental domain and then approximating it with a polynomial or rational function. You can look at the sources of any of several math libraries that are available online for more detail (fdlibm is a good starting point).
The x86 instruction set provides a number of instructions that support mathematical functions like exp, log, and sin, but these are not commonly used anymore, because good software library implementations give better performance.