Question
I have a Python test script that requires a configuration file. The configuration file is expected to be in JSON format.
But some of the users of my test script dislike the JSON format because it's unreadable.
So I changed my test script so that it expects the configuration file in YAML format, then converts the YAML file to a JSON file.
I would prefer that the function that loads the configuration file handle both JSON and YAML. Is there a method in either the yaml or json module that returns a Boolean telling me whether the configuration file is JSON or YAML?
My workaround right now is to use two try/except clauses:
import os
import sys
import json
import yaml

# This is the configuration file - my script gets it from argparser but in
# this example, let's just say it is some file that I don't know what the format
# is
config_file = "some_config_file"

in_fh = open(config_file, "r")

config_dict = dict()
valid_json = True
valid_yaml = True

try:
    config_dict = json.load(in_fh)
except:
    print("Error trying to load the config file in JSON format")
    valid_json = False

try:
    config_dict = yaml.load(in_fh)
except:
    print("Error trying to load the config file in YAML format")
    valid_yaml = False

in_fh.close()

if not valid_yaml and not valid_json:
    print("The config file is neither JSON nor YAML")
    sys.exit(1)
Now, there is a Python module I found on the Internet called isityaml that can be used to test for YAML. But I'd prefer not to install another package because I have to install this on several test hosts.
Do the json and yaml modules have a method that returns a Boolean indicating whether the input is in their respective format?
config_file = "sample_config_file"

# I would like some method like this
if json.is_json(in_fh):
    config_dict = json.load(in_fh)
Answer 1:
From looking at the json and yaml modules' documentation, it looks like they don't offer any appropriate methods. However, a common Python idiom is EAFP ("easier to ask forgiveness than permission"); in other words, go ahead and try the operation, and deal with exceptions if they arise.
import json
import yaml

def load_config(config_file):
    with open(config_file, "r") as in_fh:
        # Read the file into memory as a string so that we can try
        # parsing it twice without seeking back to the beginning and
        # re-reading.
        config = in_fh.read()

    config_dict = dict()
    valid_json = True
    valid_yaml = True

    try:
        config_dict = json.loads(config)
    except ValueError:
        print("Error trying to load the config file in JSON format")
        valid_json = False

    try:
        config_dict = yaml.safe_load(config)
    except yaml.YAMLError:
        print("Error trying to load the config file in YAML format")
        valid_yaml = False

    # Mirror the check from the question: give up if neither parser worked.
    if not valid_json and not valid_yaml:
        raise ValueError("The config file is neither JSON nor YAML")
    return config_dict
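The messages above only say that parsing failed, not where. As a minimal sketch, assuming Python 3.5 or later (where json.JSONDecodeError is available), the exception itself carries the position of the problem. The helper name describe_json_error is hypothetical and only used for illustration.

```python
import json


def describe_json_error(text):
    """Return None if text is valid JSON, else a short error description.

    Hypothetical helper, not part of the json module.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        # lineno and colno are the 1-based position reported by the parser.
        return "line {}, column {}: {}".format(exc.lineno, exc.colno, exc.msg)
    return None


print(describe_json_error('{"a": 1}'))
print(describe_json_error('{"a": }'))
```

The same idea should carry over to the YAML branch, since PyYAML's exceptions also describe where parsing stopped when printed.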
You could make your own is_json or is_yaml function if you wanted. This would involve processing the configuration twice, but that may be okay for your purposes.
import json
import yaml

def try_as(loader, s, on_error):
    try:
        loader(s)
        return True
    except on_error:
        return False

def is_json(s):
    return try_as(json.loads, s, ValueError)

def is_yaml(s):
    # yaml.YAMLError is the base class, so parser errors are caught as
    # well as scanner errors.
    return try_as(yaml.safe_load, s, yaml.YAMLError)
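If what you want is a single answer about the format rather than two Booleans, the order of the checks matters: most JSON documents also parse as YAML, so JSON has to be tried first. The following is a minimal sketch of that idea. The function name classify_config is hypothetical, and the YAML branch assumes PyYAML is installed; it is imported lazily so that the JSON path does not depend on it.

```python
import json


def classify_config(text):
    """Return "json", "yaml", or None for a configuration string.

    Hypothetical helper. JSON is checked first because a YAML parser
    accepts most JSON input and would report it as "yaml".
    """
    try:
        json.loads(text)
        return "json"
    except ValueError:
        pass

    # Assumption: PyYAML is available. Imported here so that plain JSON
    # input can be classified without it.
    import yaml

    try:
        yaml.safe_load(text)
        return "yaml"
    except yaml.YAMLError:
        return None


print(classify_config('{"name": "test", "retries": 3}'))
```

Keep in mind that YAML is very permissive: almost any plain text is a valid YAML scalar, so a result of "yaml" means only that the text is not JSON and that the YAML parser did not reject it.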
Finally, as @user2357112 alluded to, "every JSON file is also a valid YAML file" (as of YAML 1.2), so you should be able to unconditionally process everything as YAML, assuming you have a YAML 1.2-compatible parser (PyYAML, the package behind import yaml, isn't one).
Answer 2:
From your import yaml I conclude that you are using the old PyYAML package. That package only supports YAML 1.1 (from 2005), and the format specified there is not a full superset of JSON. With YAML 1.2 (released in 2009), the YAML format became a superset of JSON.
The package ruamel.yaml (disclaimer: I am the author of that package) supports YAML 1.2. You can install it in your Python virtual environment with pip install ruamel.yaml. By replacing PyYAML with ruamel.yaml (rather than adding a package), you can just do:
import os
from ruamel.yaml import YAML

config_file = "some_config_file"

yaml = YAML()
with open(config_file, "r") as in_fh:
    config_dict = yaml.load(in_fh)
and load the file into config_dict, not caring whether the input is YAML or JSON and with no need to test for either format.
Answer 3:
Years later, I ran into the same problem. I fully agree with EAFP, but I was still looking for the best way to detect whether a configuration file is JSON or YAML. My code has methods that tell the user where the mistake is in a JSON file and where it is in a YAML file. try/except did not handle this the way I wanted, and my eyes bleed when I see those nested blocks.
This is not perfect and still has minor issues, but the basic concept fits my needs. I'd say "good enough".
My solution is to look for standalone commas in the configuration file. If the file contains standalone commas (the separators in JSON), it is a JSON file; if there are none, it is YAML. In my YAML files I use commas only in comments (between " ") and in lists (between [ ]). Maybe someone will find it useful.
import re
from pathlib import Path

# Find all commas which are standalone
# - not between quotes - comments, answers
# - not between brackets - lists
commas = re.compile(r',(?=(?![\"]*[\s\w\?\.\"\!\-\_]*,))(?=(?![^\[]*\]))')

# The original snippet was the body of a function; the name detect_format
# is a placeholder.
def detect_format(file_path):
    signs = commas.findall(Path(file_path).read_text())
    return "json" if len(signs) > 0 else "yaml"

file_path = Path("example_file.cfg")
file_format = detect_format(file_path)
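To get a feel for how this heuristic behaves, here is a small sketch that applies the same regular expression to in-memory strings instead of a file. The function name guess_format_by_commas and the sample inputs are made up for illustration. The first input is reported as "json" and the second as "yaml". The third case shows one of the "minor issues": a JSON object with a single key contains no commas at all, so it is reported as "yaml" too.

```python
import re

# Pattern copied unchanged from the answer above.
commas = re.compile(r',(?=(?![\"]*[\s\w\?\.\"\!\-\_]*,))(?=(?![^\[]*\]))')


def guess_format_by_commas(text):
    # Hypothetical helper: "json" if at least one standalone comma is found.
    return "json" if commas.search(text) else "yaml"


samples = [
    '{"a": 1, "b": 2}',          # JSON object with a separator comma
    'a: 1\nitems: [1, 2, 3]\n',  # YAML whose only commas are inside [ ]
    '{"a": 1}',                  # JSON without any comma
]
for sample in samples:
    print(repr(sample), "->", guess_format_by_commas(sample))
```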
Source: https://stackoverflow.com/questions/44338881/is-there-a-way-to-determine-whether-a-file-is-in-yaml-or-json-format