I want to prevent automated HTML scraping of one of our sites without affecting legitimate spidering (Googlebot, etc.). Is there something that already exists to accomplish this?
robots.txt only works if the spider honors it. You can create an HttpModule to filter out spiders that you don't want crawling your site, as in the sketch below.
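Here is a minimal sketch of what such a module could look like, assuming you identify unwanted scrapers by their User-Agent header. The class name and the blocklist entries are just illustrative placeholders, not a vetted list, and anything that sends a forged User-Agent will slip past this check:

    using System;
    using System.Web;

    // Sketch of an IHttpModule that rejects requests whose User-Agent
    // matches a blocklist, while leaving well-known crawlers alone.
    public class ScraperFilterModule : IHttpModule
    {
        // Hypothetical examples of user-agent fragments to block.
        private static readonly string[] BlockedAgents =
        {
            "curl", "wget", "python-requests", "scrapy"
        };

        public void Init(HttpApplication application)
        {
            // Inspect every request as early as possible in the pipeline.
            application.BeginRequest += OnBeginRequest;
        }

        private static void OnBeginRequest(object sender, EventArgs e)
        {
            var app = (HttpApplication)sender;
            string userAgent = app.Request.UserAgent ?? string.Empty;

            foreach (string blocked in BlockedAgents)
            {
                if (userAgent.IndexOf(blocked, StringComparison.OrdinalIgnoreCase) >= 0)
                {
                    // Refuse the request and stop further processing.
                    app.Response.StatusCode = 403;
                    app.CompleteRequest();
                    return;
                }
            }
        }

        public void Dispose() { }
    }

You would then register the module in web.config (under `<httpModules>` for the classic pipeline or `<system.webServer>/<modules>` for the integrated pipeline) so it runs for every request.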