Question
I am trying to run Scrapy or Portia on a Microsoft Azure Web App. I have installed Scrapy by creating a virtual environment:
D:\Python27\Scripts\virtualenv.exe D:\home\Python
And then installed Scrapy:
D:\home\Python\Scripts\pip install Scrapy
The installation seemed to work. But executing a spider returns the following output:
D:\home\Python\Scripts\tutorial>d:\home\python\scripts\scrapy.exe crawl example
2015-09-13 23:09:31 [scrapy] INFO: Scrapy 1.0.3 started (bot: tutorial)
2015-09-13 23:09:31 [scrapy] INFO: Optional features available: ssl, http11
2015-09-13 23:09:31 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'tutorial.spiders', 'SPIDER_MODULES': ['tutorial.spiders'], 'BOT_NAME': 'tutorial'}
2015-09-13 23:09:34 [scrapy] INFO: Enabled extensions: CloseSpider, TelnetConsole, LogStats, CoreStats, SpiderState
Unhandled error in Deferred:
2015-09-13 23:09:35 [twisted] CRITICAL: Unhandled error in Deferred:
Traceback (most recent call last):
File "D:\home\Python\lib\site-packages\scrapy\cmdline.py", line 150, in _run_command
cmd.run(args, opts)
File "D:\home\Python\lib\site-packages\scrapy\commands\crawl.py", line 57, in run
self.crawler_process.crawl(spname, **opts.spargs)
File "D:\home\Python\lib\site-packages\scrapy\crawler.py", line 153, in crawl
d = crawler.crawl(*args, **kwargs)
File "D:\home\Python\lib\site-packages\twisted\internet\defer.py", line 1274, in unwindGenerator
return _inlineCallbacks(None, gen, Deferred())
--- <exception caught here> ---
File "D:\home\Python\lib\site-packages\twisted\internet\defer.py", line 1128, in _inlineCallbacks
result = g.send(result)
File "D:\home\Python\lib\site-packages\scrapy\crawler.py", line 71, in crawl
self.engine = self._create_engine()
File "D:\home\Python\lib\site-packages\scrapy\crawler.py", line 83, in _create_engine
return ExecutionEngine(self, lambda _: self.stop())
File "D:\home\Python\lib\site-packages\scrapy\core\engine.py", line 66, in __init__
self.downloader = downloader_cls(crawler)
File "D:\home\Python\lib\site-packages\scrapy\core\downloader\__init__.py", line 65, in __init__
self.handlers = DownloadHandlers(crawler)
File "D:\home\Python\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 23, in __init__
cls = load_object(clspath)
File "D:\home\Python\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
mod = import_module(module)
File "D:\Python27\Lib\importlib\__init__.py", line 37, in import_module
__import__(name)
File "D:\home\Python\lib\site-packages\scrapy\core\downloader\handlers\s3.py", line 6, in <module>
from .http import HTTPDownloadHandler
File "D:\home\Python\lib\site-packages\scrapy\core\downloader\handlers\http.py", line 5, in <module>
from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
File "D:\home\Python\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 15, in <module>
from scrapy.xlib.tx import Agent, ProxyAgent, ResponseDone, \
File "D:\home\Python\lib\site-packages\scrapy\xlib\tx\__init__.py", line 3, in <module>
from twisted.web import client
File "D:\home\Python\lib\site-packages\twisted\web\client.py", line 42, in <module>
from twisted.internet.endpoints import TCP4ClientEndpoint, SSL4ClientEndpoint
File "D:\home\Python\lib\site-packages\twisted\internet\endpoints.py", line 34, in <module>
from twisted.internet.stdio import StandardIO, PipeAddress
File "D:\home\Python\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
from twisted.internet import _win32stdio
File "D:\home\Python\lib\site-packages\twisted\internet\_win32stdio.py", line 7, in <module>
import win32api
exceptions.ImportError: No module named win32api
2015-09-13 23:09:35 [twisted] CRITICAL:
The documentation (http://doc.scrapy.org/en/latest/intro/install.html) says that I have to install pywin32. I don't know how to download or install it from the command line, since I am working inside the Web App environment.
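To confirm that the missing module is the only problem before launching a crawl, a small check along these lines could be run with the virtual environment's interpreter. This is only a sketch: the helper name missing_modules is hypothetical, and the list of modules to test is an assumption taken from the traceback above.

```python
import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported by this interpreter."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Raised when the module is not installed in this environment.
            missing.append(name)
    return missing


if __name__ == "__main__":
    # win32api is the module the traceback reports as missing;
    # twisted is checked alongside it for comparison.
    for name in missing_modules(["win32api", "twisted"]):
        print("missing: " + name)
```

Saved under a name of your choice and run with D:\home\Python\Scripts\python.exe, it prints one line per module that fails to import and nothing if both are available.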
Is it even possible to run Scrapy or Portia on an Azure Web App or do I have to use a fully fledged Virtual Machine on Azure?
Thank you!
Answer 1:
You can't run general-purpose Windows applications "on" an Azure Web App. Anything that runs on Azure as a web app has to be built specifically for that environment, so you will have to use a full-fledged Virtual Machine on Azure.
That said, Azure Web Apps can apparently run some Python apps if they are built on certain frameworks: https://azure.microsoft.com/en-us/documentation/articles/web-sites-python-configure/
Source: https://stackoverflow.com/questions/32555510/how-to-run-scrapy-portia-on-azure-web-app