If the PDF is a scan of printed text, extracting the text yourself will be hard, since it involves image processing, character recognition and so on. A PDF will generally store scanned pages internally as images, typically JPEGs, so there is no text to read out directly. You are better off using a third-party OCR tool for this.
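To check whether a file is likely a scan before reaching for OCR, you can look for embedded JPEG streams. This is only a rough sketch: it relies on `/DCTDecode` being the PDF filter name for JPEG data, and the function names `count_jpeg_streams` and `count_jpeg_streams_in_file` are made up for illustration. It searches the raw bytes instead of parsing the PDF, so treat the result as a hint, not a definitive answer.

```python
import re
from pathlib import Path

# /DCTDecode is the filter name PDF uses for JPEG-compressed streams.
JPEG_FILTER = re.compile(rb"/DCTDecode\b")


def count_jpeg_streams(pdf_bytes: bytes) -> int:
    """Count occurrences of the JPEG filter name in raw PDF bytes (heuristic)."""
    return len(JPEG_FILTER.findall(pdf_bytes))


def count_jpeg_streams_in_file(path: str) -> int:
    """Same heuristic, reading the PDF from disk (hypothetical helper)."""
    return count_jpeg_streams(Path(path).read_bytes())
```

If this reports roughly one JPEG per page and a normal text extractor returns nothing, the document is probably image-only and needs OCR. Scans stored with other image filters will not be counted.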