I have a method that scrapes a web page and saves data into a file (see below for an example code). I need to test that the resulting data is well-formed.
The problem is, the data is received from a series of calls, and further calls use the results of previous ones. What is worse, many of the calls involved are done on the same objects (a Webdriver
, a WebDriverWait
and the expected_conditions
module), with different arguments.
I see that unittest.mock.Mock
can mock the result of a simple call, or a series of simple calls, but can't see how to implement something entangled like this. The only way I see is to manually reimplement each and every call the method makes, and copy the arguments I pass in the method into those implementations so that they know what to return for each call. And do that again for every other test case. This sounds like an absolute nightmare to write and maintain: several times more code than the tests themselves and near 1:1 duplication with the code. So I refuse to proceed until someone tells me that there's a better way or proves that there is none and everyone really does it like this (which I don't believe) and e.g. rewrites all the tests each time a label on the page changes (which is an implementation detail, so normally, it shouldn't affect test code at all).
Sample code (adapted for http://example.com):
import selenium.webdriver
from selenium.webdriver.common.by import By as by
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
def dump_accreditation_data(d, w, i, path):
f = codecs.open(os.path.join(path, "%d.txt" % i), "w", encoding="utf-8")
u = u'http://example.com/%s/accreditation' % i
d.get(u)
# page load
w.until(EC.visibility_of_element_located((by.XPATH,"//p"))) #the real code has a more complex expression here with national characters
w.until_not(EC.visibility_of_element_located((by.CSS_SELECTOR, '.waiter')))
print >> f, u
# organization name
e = w.until(EC.visibility_of_element_located((
by.CSS_SELECTOR, 'h1'
)))
org_name = e.text
print >> f, org_name
del e
#etc
e = d.find_element_by_xpath(u'//a[text()="More information..."')
print >> f, e.get_attribute('href')
#How it's supposed to be used:
d = selenium.webdriver.Firefox()
w = WebDriverWait(d, 10)
dump_accreditation_data(d, w, 123, "<output_path>")
For the code as it is, I agree, unit-testing the way you describe does not make much sense. But, that is not just because it would be a lot of work: The goal of testing is, certainly, to find errors in the code. The goal of unit-testing is, to find those errors that can be found in the isolated unit. But, a significant part of your example code is related to interaction with external libraries.
There is comparably little code on the algorithmic level, for example:
os.path.join(path, "%d.txt" % i)
or
u = u'http://example.com/%s/accreditation' % i
or the creation of the output file content.
That is, if there are bugs in the code, they are more likely to be on the interaction level: Calling the right library functions in the right order with the right parameters, parameters having the correct formats etc. - With mocks of the libraries, however, you will not find the interaction bugs, because the mocks are implemented by you and will just reflect your (potentially wrong) understanding of the library behaviour.
My suggestion for testing this code is: Separate the algorithmic code from the code that does the interaction with the libraries. You could, for example, create small helper functions to compute the output file name and the input url. You could, in the interaction dominated part of the code, extract all the data from the web page, and then (in a separate function) create the output file content using all that data.
These helper functions can then all be tested using unit-testing. The rest of the functionality you would test with integration testing.
来源:https://stackoverflow.com/questions/54016901/mock-a-series-of-interdependent-calls