Tensorboard histograms to matplotlib

野性不改 2021-02-10 00:48

I would like to "dump" the TensorBoard histograms and plot them with matplotlib, so that I get plots better suited to a scientific paper.

I managed to hack the way through t

4 answers
  • 2021-02-10 01:18

    A good solution is the one from @khuesmann, but it only lets you retrieve the accumulated histogram, not the histogram per step, which is the one actually shown in TensorBoard.

    As far as I understand, TensorBoard usually compresses the histograms to reduce the memory needed to store the data (imagine storing a 2D histogram over 4 million steps: the memory use grows quickly). If you want the distribution, these compressed histograms are accessible like this:

    from tensorboard.backend.event_processing.event_accumulator import EventAccumulator
    
    n2n = EventAccumulator(PATH)
    n2n.Reload()
    
    # Check the tags under histograms and choose the one you want
    n2n.Tags()
    
    # This returns the list of compressed histograms used by TensorBoard,
    # one entry per step, each with its wall time
    n2n.CompressedHistograms(HISTOGRAM_TAG)
    

    The only problem is that it compresses each histogram to nine percentiles (in basis points: 0, 668, 1587, 3085, 5000, 6915, 8413, 9332, 10000), which correspond to (-Inf, -1.5, -1, -0.5, 0, 0.5, 1, 1.5, Inf) standard deviations of a normal distribution. Check the code here.
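
    As a quick check of the correspondence above, the interior basis points can be recomputed from the standard normal CDF. This is only a sketch using Python's standard library; the name `STD_DEVS` is introduced here for illustration and is not taken from TensorBoard.

```python
from statistics import NormalDist

# Standard deviations quoted above, without the -Inf / Inf endpoints
# (those map to 0 and 10000 basis points by definition).
STD_DEVS = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]

# A basis point is 1/100 of a percent, so scale the CDF by 10000.
basis_points = [round(NormalDist().cdf(z) * 10000) for z in STD_DEVS]
print(basis_points)  # [668, 1587, 3085, 5000, 6915, 8413, 9332]
```

    So each compressed histogram stores the values found at those quantiles of the data, and the basis points are chosen to sit where a normal distribution would place half-sigma steps.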

    I haven't read much, but it wouldn't be hard to reconstruct the temporal histograms that tensorboard shows. If I find a way to do it, I will post it here.

  • 2021-02-10 01:25

    In order to plot a tensorboard histogram with matplotlib I am doing the following:

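    import numpy as np
    from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

    def load_histograms(path, step_count):
        # Keep up to step_count histogram events per tag instead of the default subsample
        event_acc = EventAccumulator(path, size_guidance={
            'histograms': step_count,
        })
        event_acc.Reload()
        tags = event_acc.Tags()
        result = {}
        for hist in tags['histograms']:
            histograms = event_acc.Histograms(hist)
            # One array per step (a list, since steps may hold different sample counts);
            # np.int was removed from recent NumPy releases, so use the builtin int
            result[hist] = [np.repeat(np.array(h.histogram_value.bucket_limit),
                                      np.array(h.histogram_value.bucket).astype(int))
                            for h in histograms]
        return result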
    

    h.histogram_value.bucket_limit gives me the values (strictly, the upper edge of each bucket) and h.histogram_value.bucket the count for each of them. So when I repeat the values accordingly (np.repeat(...)), I get, for each step, a large array of the expected size. These arrays can now be plotted with the default matplotlib logic.
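
    Repeating every value can use a lot of memory when the counts are large. A possible alternative, sketched below, is to draw the buckets directly as bars. It assumes, following the `HistogramProto` definition, that `bucket_limit[i]` is the upper edge of bucket `i`, and that the outermost edges can be clipped to the histogram's `min` and `max`. The function names and the sample numbers are made up for illustration.

```python
def bucket_bars(bucket_limit, bucket, lo, hi):
    """Return (left_edges, widths, counts) for the non-empty buckets.

    bucket_limit[i] is taken as the upper edge of bucket i; the lower edge
    is the previous limit. Edges are clipped to [lo, hi] because the first
    bucket is open below and the last limit can be a huge sentinel value.
    """
    lefts, widths, counts = [], [], []
    lower = lo
    for upper, count in zip(bucket_limit, bucket):
        upper = min(max(upper, lo), hi)
        if count > 0:
            lefts.append(lower)
            widths.append(upper - lower)
            counts.append(count)
        lower = upper
    return lefts, widths, counts


def plot_bars(lefts, widths, counts):
    # matplotlib is imported here so bucket_bars stays usable without it
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(lefts, counts, width=widths, align="edge", edgecolor="black")
    ax.set_xlabel("value")
    ax.set_ylabel("count")
    return fig


# Hypothetical histogram for a single step (not real TensorBoard output)
limits = [-1.0, 0.0, 1.0, 1.7976931348623157e308]
counts_in = [2.0, 0.0, 5.0, 3.0]
lefts, widths, counts = bucket_bars(limits, counts_in, lo=-1.5, hi=2.0)
```

    With real data, `lo` and `hi` would come from `h.histogram_value.min` and `h.histogram_value.max`, and `plot_bars(lefts, widths, counts)` returns a figure that can be saved with `fig.savefig(...)`.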

  • 2021-02-10 01:34

    The best solution is to load all events and reconstruct all the histograms (as in the answer by @khuesmann), using EventFileLoader rather than EventAccumulator. This gives you one histogram per wall time and step, like the ones TensorBoard plots. It can be extended to return a list of actions by timestep and wall time.

    Don't forget to check which tag you will use.

    from tensorboard.backend.event_processing.event_file_loader import EventFileLoader
    # Just in case: PATH_OF_FILE is the path of the event file, not the folder
    loader = EventFileLoader(PATH_OF_FILE)
    
    # Where to store values
    wtimes, steps, actions = [], [], []
    for event in loader.Load():
        wtime = event.wall_time
        step = event.step
        # Check every summary value in the event, not only the first one
        for summary in event.summary.value:
            if summary.tag == HISTOGRAM_TAG:
                wtimes += [wtime] * int(summary.histo.num)
                steps += [step] * int(summary.histo.num)
    
                # Repeat each bucket limit by the number of samples in that bucket
                for num, val in zip(summary.histo.bucket, summary.histo.bucket_limit):
                    actions += [val] * int(num)
    

    Bear in mind that TensorFlow approximates the actions and treats them as continuous variables, so even if you have discrete actions (e.g. 0, 1, 3) you will end up with actions such as 0.2, 0.4, 0.9, 1.4 ... In that case, rounding the values will do.
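
    Plain rounding is enough when the discrete actions are consecutive integers. For a set with gaps such as 0, 1, 3, one option is to snap each reconstructed value to the nearest valid action. The sketch below is an illustration only: `snap_to_actions` and the sample values are hypothetical, not part of TensorBoard or of the code above.

```python
from collections import Counter


def snap_to_actions(values, allowed):
    """Map each approximate value to the closest entry of `allowed`."""
    return [min(allowed, key=lambda a: abs(a - v)) for v in values]


# Hypothetical reconstructed values for the discrete actions 0, 1 and 3
approx = [0.2, 0.4, 0.9, 1.4, 2.2]
snapped = snap_to_actions(approx, allowed=(0, 1, 3))
counts = Counter(snapped)
print(snapped)  # [0, 0, 1, 1, 3]
```

    Here 2.2 snaps to 3, whereas `round(2.2)` would give 2, which is not a valid action in this example. The resulting `counts` can be passed to a bar plot directly.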

  • 2021-02-10 01:38

    Why not just download the raw data (as CSV or JSON) from the tensorboard plots that you want, to import and use with matplotlib? See: Can I export a tensorflow summary to CSV?
