There are numerous classic books:
- David MacKay's classic, Information Theory, Inference, and Learning Algorithms (freely available online)
- Norvig's AIMA, of which a new version came out recently
- Bishop's Neural Networks for Pattern Recognition
- Bishop's Pattern Recognition and Machine Learning
The first two are the easiest; the second one also covers more than machine learning. However, there is little "pragmatic" or "engineering" material in them, and the math is quite demanding, but so is the whole field. I guess you will do best with O'Reilly's Programming Collective Intelligence, because it focuses on programming.