Machine learning is simply a generic term to define a variety of learning algorithms that produce a quasi learning from examples (unlabeled/labeled). The actual accuracy/error is entirely determined by the quality of training/test data you provide to your learning algorithm. This can be measured using a convergence rate. The reason you provide examples is because you want the learning algorithm of your choice to be able to informatively by guidance make generalization. The algorithms can be classed into two main areas supervised learning(classification) and unsupervised learning(clustering) techniques. It is extremely important that you make an informed decision on how you plan on separating your training and test data sets as well as the quality that you provide to your learning algorithm. When you providing data sets you want to also be aware of things like over fitting and maintaining a sense of healthy bias in your examples. The algorithm then basically learns wrote to wrote on the basis of generalization it achieves from the data you have provided to it both for training and then for testing in process you try to get your learning algorithm to produce new examples on basis of your targeted training. In clustering there is very little informative guidance the algorithm basically tries to produce through measures of patterns between data to build related sets of clusters e.g kmeans/knearest neighbor.
some good books:
Introduction to ML (Nilsson/Stanford),
Gaussian Process for ML,
Introduction to ML (Alpaydin),
Information Theory Inference and Learning Algorithms (very useful book),
Machine Learning (Mitchell),
Pattern Recognition and Machine Learning (standard ML course book at Edinburgh and various Unis but relatively a heavy reading with math),
Data Mining and Practical Machine Learning with Weka (work through the theory using weka and practice in Java)
Reinforcement Learning there is a free book online you can read:
http://www.cs.ualberta.ca/~sutton/book/ebook/the-book.html
IR, IE, Recommenders, and Text/Data/Web Mining in general use alot of Machine Learning principles. You can even apply Metaheuristic/Global Optimization Techniques here to further automate your learning processes. e.g apply an evolutionary technique like GA (genetic algorithm) to optimize your neural network based approach (which may use some learning algorithm). You can approach it purely in form of a probablistic machine learning approach for example bayesian learning. Most of these algorithms all have a very heavy use of statistics. Concepts of convergence and generalization are important to many of these learning algorithms.