Question
I am validating a CSV file with Cerberus, but I am struggling with what I'd assume is some basic logic.
Scenario:
A CSV file has 2 columns. Column 2 is required to have a value only if Column 1 has a value. If Column 1 is empty, then Column 2 should also be empty.
I thought this would be one of the most straightforward rules to write, but so far nothing is working as expected.
Below is the same logic using Python dictionaries.
from cerberus import Validator

v = Validator()
schema = {
    "col1": {"required": False},
    "col2": {"required": True, "dependencies": "col1"},
}
document = {
    "col1": "a",
    "col2": "",
}
v.validate(document, schema)  # This returns True!? Why?
v.errors  # {}
I would have expected an error for Column 2 here because Column 1 has been provided, but the result is True, meaning no error.
I've checked the issues raised on GitHub but can't find any obvious solution.
Answer 1:
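Note
The evaluation of this rule (dependencies) does not consider any constraints defined with the required rule.
In your document both fields are present (col2 as an empty string), so required and dependencies are both satisfied regardless of the values.
Whatever "required" is set to, the result is the same: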
from cerberus import Validator

v = Validator()
document = {
    "col1": "a",
    "col2": "",
}

schema = {
    "col1": {"required": False},
    "col2": {"required": True, "dependencies": "col1"},
}
print(v.validate(document, schema))  # True
print(v.errors)  # {}

schema = {
    "col1": {"required": True},
    "col2": {"required": True, "dependencies": "col1"},
}
print(v.validate(document, schema))  # True
print(v.errors)  # {}

schema = {
    "col1": {"required": True},
    "col2": {"required": False, "dependencies": "col1"},
}
print(v.validate(document, schema))  # True
print(v.errors)  # {}
http://docs.python-cerberus.org/en/stable/validation-rules.html#dependencies
Update:
A solution for your condition, "make col2 mandatory if col1 has a value in it":
To apply more sophisticated rules, create a custom Validator as shown below:
from cerberus import Validator

class MyValidator(Validator):
    def _validate_depends_on_col1(self, depends_on_col1, field, value):
        """Test if a field value is set depending on `col1` field value.

        The rule's arguments are validated against this schema:
        {'type': 'boolean'}
        """
        # Report an error when the rule is enabled, col1 is truthy and this field is falsy.
        if depends_on_col1 and self.document.get('col1', None) and not value:
            self._error(field, f"`{field}` cannot be empty given that `col1` has a value")

v = MyValidator()
schema = {
    "col1": {"required": False},
    "col2": {"required": True, "depends_on_col1": True},
}
print(v.validate({"col1": "a", "col2": ""}, schema))  # False
print(v.errors)  # {'col2': ['`col2` cannot be empty given that `col1` has a value']}
print(v.validate({"col1": "", "col2": ""}, schema))  # True
print(v.errors)  # {}
print(v.validate({"col1": 0, "col2": "aaa"}, schema))  # True
print(v.errors)  # {}
Note that you need to settle on a convention for which col1 values should be treated as empty, and adjust the custom validator rule accordingly.
Extended version that lets you specify the "dependency" field name:
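The rule above relies on Python truthiness, so a col1 value of 0 or False is treated as empty, while a whitespace-only string counts as a value. As a minimal sketch of one possible convention (the helper name `has_value` is hypothetical, not part of Cerberus), the emptiness check could be made explicit:

```python
def has_value(value):
    """Hypothetical convention: None and blank strings are empty; everything else is a value."""
    if value is None:
        return False
    if isinstance(value, str):
        # Whitespace-only cells are common in CSV exports, so treat them as empty.
        return value.strip() != ""
    # Numbers (including 0) and booleans count as real values under this convention.
    return True


print(has_value("a"), has_value("  "), has_value(0), has_value(None))
```

Inside the custom rule, the truthiness tests could then be replaced by has_value(self.document.get('col1')) and not has_value(value), if that convention suits your data.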
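from cerberus import Validator

class MyValidator(Validator):
    def _validate_depends_on_col(self, col_name, field, value):
        """Test if a field value is set depending on `col_name` field value.

        The rule's arguments are validated against this schema:
        {'type': 'string'}
        """
        # Report an error when the named column is truthy and this field is falsy.
        if col_name and self.document.get(col_name, None) and not value:
            self._error(field, f"`{field}` cannot be empty given that `{col_name}` has a value")

v = MyValidator()
document = {"col1": "a", "col2": ""}
schema = {
    "col1": {"required": False},
    "col2": {"required": True, "depends_on_col": "col1"},
}
print(v.validate(document, schema))  # False
print(v.errors)  # {'col2': ['`col2` cannot be empty given that `col1` has a value']}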
http://docs.python-cerberus.org/en/stable/customize.html
Answer 2:
Assuming that you have transformed your CSV input into a list of documents, you could first preprocess the documents to remove the col2 field where it is empty:
for document in documents:
    if not document["col2"]:
        document.pop("col2")
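For completeness, here is a rough sketch of how the list of documents might be built from the CSV in the first place, using the standard library's csv.DictReader. The CSV content and the header names col1 and col2 are assumptions for illustration; a real file would be opened with open() instead of io.StringIO.

```python
import csv
import io

# Hypothetical CSV content standing in for the real file.
csv_text = "col1,col2\na,b\nc,\n,\n"

# Each row becomes a dict keyed by the header names; empty cells become "".
documents = list(csv.DictReader(io.StringIO(csv_text)))

# Drop col2 wherever it is empty, so presence-based rules see it as missing.
for document in documents:
    if not document["col2"]:
        document.pop("col2")

print(documents)
```

After this step, an empty second column shows up as a missing col2 key rather than as an empty string.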
Then this schema would do the job:
{"col1": {
    "oneof": [
        {"empty": True},
        {"empty": False, "dependencies": "col2"}
    ]
}}
Mind that the dependencies and required rules don't consider the value of a field, only whether the field is present in the document.
Source: https://stackoverflow.com/questions/56705509/dependencies-validation-using-cerberus