easy save/load of data in python

小蘑菇 2020-12-28 09:35

What is the easiest way to save and load data in python, preferably in a human-readable output format?

The data I am saving/loading consists of two vectors of floats.

7 Answers
  • 2020-12-28 09:42

    As I commented in the accepted answer, using numpy this can be done with a simple one-liner:

    Assuming you have numpy imported as np (which is common practice),

    np.savetxt('xy.txt', np.array([x, y]).T, fmt="%.3f", header="x   y")
    

    will save the data in the given (optional) format, and

    x, y = np.loadtxt('xy.txt', unpack=True)
    

    will load it.

    The file xy.txt will then look like:

    # x   y
    1.000 1.000
    1.500 2.250
    2.000 4.000
    2.500 6.250
    3.000 9.000
    

    Note that the format string fmt=... is optional, but if the goal is human readability it can prove quite useful. If used, it is specified with the usual printf-like codes (in this example: a floating-point number with 3 decimals).

  • 2020-12-28 09:48

    A simple serialization format that is easy for both humans and computers to read is JSON.

    You can use the json Python module.
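
    For the two float vectors from the question, a minimal sketch could look like this (json is in the standard library; the file name xy.json and the sample values are just placeholders):

    import json

    x = [1.0, 1.5, 2.0]
    y = [1.0, 2.25, 4.0]

    # save: dump both vectors into one human-readable JSON file
    with open('xy.json', 'w') as f:
        json.dump({'x': x, 'y': y}, f, indent=2)

    # load: read them back as plain Python lists
    with open('xy.json') as f:
        data = json.load(f)
    x, y = data['x'], data['y']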

  • 2020-12-28 09:55

    Here is an example of the encoder you would probably want to write for your Body class:

    # add this to your code
    import json
    import numpy as np

    class BodyEncoder(json.JSONEncoder):
        def default(self, obj):
            if isinstance(obj, np.ndarray):
                return obj.tolist()
            if hasattr(obj, '__jsonencode__'):
                return obj.__jsonencode__()
            if isinstance(obj, set):
                return list(obj)
            return obj.__dict__

        # Here you construct the way your objects are rebuilt from the loaded data;
        # you need to customize this function
        @staticmethod
        def deserialize(data):
            bodies = [Body(d["name"], d["mass"], np.array(d["p"]), np.array(d["v"]))
                      for d in data["bodies"]]
            axis_range = data["axis_range"]
            timescale = data["timescale"]
            return bodies, axis_range, timescale

        # Here you construct the way your data is dumped to a file;
        # you need to customize this function
        @staticmethod
        def serialize(data):
            with open(FILE_NAME, 'w+') as file:  # FILE_NAME is a path constant you define elsewhere
                json.dump(data, file, cls=BodyEncoder, indent=4)
            print("Dumping Parameters of the Latest Run")
            print(json.dumps(data, cls=BodyEncoder, indent=4))
    

    Here is an example of the class I want to serialize:

    class Body(object):
        # you do not need to change your class structure
        def __init__(self, name, mass, p, v=(0.0, 0.0, 0.0)):
            # init variables like normal
            self.name = name
            self.mass = mass
            self.p = p
            self.v = v
            self.f = np.array([0.0, 0.0, 0.0])
    
        def attraction(self, other):
            # not important functions that I wrote...
            pass
    

    Here is how to serialize:

    # you need to customize this function
    def serialize_everything():
        bodies, axis_range, timescale = generate_data_to_serialize()
    
        data = {"bodies": bodies, "axis_range": axis_range, "timescale": timescale}
        BodyEncoder.serialize(data)
    

    Here is how to load it back:

    def dump_everything():
        with open(FILE_NAME, "r") as f:
            data = json.load(f)
        return BodyEncoder.deserialize(data)
    
  • 2020-12-28 09:57

    There are several options; I don't know exactly what you'd like. If the two vectors have the same length, you could use numpy.savetxt() to save your vectors, say x and y, as columns:

     import numpy
     # saving:
     with open("data", "w") as f:
         f.write("# x y\n")                       # column names
         numpy.savetxt(f, numpy.array([x, y]).T)
     # loading:
     x, y = numpy.loadtxt("data", unpack=True)
    

    If you are dealing with larger vectors of floats, you should probably use NumPy anyway.

  • 2020-12-28 10:02

    Since we're talking about a human editing the file, I assume we're talking about relatively little data.

    How about the following skeleton implementation? It simply saves the data as key=value pairs and works with lists, tuples, and many other things.

        # eval() rebuilds the Python literal on load, so only use this on files you trust
        def save(fname, **kwargs):
            with open(fname, "wt") as f:
                for k, v in kwargs.items():
                    f.write("%s=%s\n" % (k, repr(v)))

        def load(fname):
            ret = {}
            with open(fname, "rt") as f:
                for line in f:
                    k, v = line.strip().split("=", 1)
                    ret[k] = eval(v)
            return ret

        x = [1, 2, 3]
        y = [2.0, 1e15, -10.3]
        save("data.txt", x=x, y=y)
        d = load("data.txt")
        print(d["x"])
        print(d["y"])
    
  • 2020-12-28 10:06
    • If it should be human-readable, I'd also go with JSON, unless you need to exchange it with enterprise-type people; they like XML better. :-)

    • If it should be human-editable and isn't too complex, I'd probably go with some sort of INI-like format, for example configparser (see the sketch after this list).

    • If it is complex, and doesn't need to be exchanged, I'd go with just pickling the data, unless it's very complex, in which case I'd use ZODB.

    • If it's a LOT of data, and needs to be exchanged, I'd use SQL.

    That pretty much covers it, I think.
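
    Since the question is about two vectors of floats, here is a minimal sketch of the INI-style option with configparser (the file name vectors.ini and the section/key names are just placeholders):

    import configparser

    x = [1.0, 1.5, 2.0]
    y = [1.0, 2.25, 4.0]

    # save: store each vector as a comma-separated string in one section
    config = configparser.ConfigParser()
    config["vectors"] = {
        "x": ", ".join(str(v) for v in x),
        "y": ", ".join(str(v) for v in y),
    }
    with open("vectors.ini", "w") as f:
        config.write(f)

    # load: parse the comma-separated strings back into lists of floats
    config = configparser.ConfigParser()
    config.read("vectors.ini")
    x = [float(v) for v in config["vectors"]["x"].split(",")]
    y = [float(v) for v in config["vectors"]["y"].split(",")]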
