How do you edit an existing Tensorboard Training Loss summary?

南楼画角 提交于 2020-08-08 03:43:24

问题


I've trained my network and generated some training/validation losses which I saved via the following code example (example of training loss only, validation is perfectly equivalent):

valid_summary_writer = tf.summary.create_file_writer("/path/to/logs/")
with train_summary_writer.as_default():
    tf.summary.scalar('Training Loss', data=epoch_loss, step=current_step)

After training I would then like to view the loss curves using Tensorboard. However because I saved the loss curves under the names 'Training Loss' and 'Validation Loss' these curves are plotted on separate graphs. I know that I should change the name to be simply 'loss' to solve this problem for future writes to the log directory. But how do I edit my existing log files for the training/validation losses to account for this?

I attempted to modify the following post's solution: https://stackoverflow.com/a/55061404 which edits the steps of a log file and re-writes the file; where my version involves changing the tags in the file. But I had no success in this area. It also requires importing older Tensorflow code through 'tf.compat.v1'. Is there a way to achieve this (maybe in TF 2.X)?

I had thought to simply acquire the loss and step values from each log directory containing the losses and write them to new log files via my previous working method, but I only managed to obtain the step, and not the loss value itself. Has anyone had any success here?

---=== EDIT ===---

I managed to fix the problem using the code from @jhedesa

I had to slightly alter the way that the function "rename_events_dir" was called as I am using Tensorflow collaboratively inside of a Google Colab Notebook. To do this I changed the final part of the code which read:

if __name__ == '__main__':
    if len(sys.argv) != 5:
        print(f'{sys.argv[0]} <input dir> <output dir> <old tags> <new tag>',
              file=sys.stderr)
        sys.exit(1)
    input_dir, output_dir, old_tags, new_tag = sys.argv[1:]
    old_tags = old_tags.split(';')
    rename_events_dir(input_dir, output_dir, old_tags, new_tag)
    print('Done')

To read this:

rootpath = '/path/to/model/'
dirlist = [dirname for dirname in os.listdir(rootpath) if dirname not in ['train', 'valid']]
for dirname in dirlist:
  rename_events_dir(rootpath + dirname + '/train', rootpath + '/train', 'Training Loss', 'loss')
  rename_events_dir(rootpath + dirname + '/valid', rootpath + '/valid', 'Validation Loss', 'loss')

Notice that I called "rename_events_dir" twice, once for editing the tags for the training loss, and once for the validation loss tags. I could have used the previous method of calling the code by setting "old_tags = 'Training Loss;Validation Loss'" and using "old_tags = old_tags.split(';')" to split the tags. I used my method simply to understand the code and how it processed the data.


回答1:


As mentioned in How to load selected range of samples in Tensorboard, TensorBoard events are actually stored record files, so you can read them and process them as such. Here is a script similar to the one posted there but for the purpose of renaming events, and updated to work in TF 2.x.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# rename_events.py

import sys
from pathlib import Path
import os
# Use this if you want to avoid using the GPU
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
import tensorflow as tf
from tensorflow.core.util.event_pb2 import Event

def rename_events(input_path, output_path, old_tags, new_tag):
    # Make a record writer
    with tf.io.TFRecordWriter(str(output_path)) as writer:
        # Iterate event records
        for rec in tf.data.TFRecordDataset([str(input_path)]):
            # Read event
            ev = Event()
            ev.MergeFromString(rec.numpy())
            # Check if it is a summary
            if ev.summary:
                # Iterate summary values
                for v in ev.summary.value:
                    # Check if the tag should be renamed
                    if v.tag in old_tags:
                        # Rename with new tag name
                        v.tag = new_tag
            writer.write(ev.SerializeToString())

def rename_events_dir(input_dir, output_dir, old_tags, new_tag):
    input_dir = Path(input_dir)
    output_dir = Path(output_dir)
    # Make output directory
    output_dir.mkdir(parents=True, exist_ok=True)
    # Iterate event files
    for ev_file in input_dir.glob('**/*.tfevents*'):
        # Make directory for output event file
        out_file = Path(output_dir, ev_file.relative_to(input_dir))
        out_file.parent.mkdir(parents=True, exist_ok=True)
        # Write renamed events
        rename_events(ev_file, out_file, old_tags, new_tag)

if __name__ == '__main__':
    if len(sys.argv) != 5:
        print(f'{sys.argv[0]} <input dir> <output dir> <old tags> <new tag>',
              file=sys.stderr)
        sys.exit(1)
    input_dir, output_dir, old_tags, new_tag = sys.argv[1:]
    old_tags = old_tags.split(';')
    rename_events_dir(input_dir, output_dir, old_tags, new_tag)
    print('Done')

You would use it like this:

> python rename_events.py my_log_dir renamed_log_dir "Training Loss;Validation Loss" loss


来源:https://stackoverflow.com/questions/60079644/how-do-you-edit-an-existing-tensorboard-training-loss-summary

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!