How to feed caffe multi label data in HDF5 format?

抹茶落季 2020-11-30 07:20

I want to use caffe with a vector label, not an integer. I have checked some answers, and it seems HDF5 is a better way. But then I'm stuck with an error.

2 Answers
  • 2020-11-30 07:27

    Answer to this question's title:

    The HDF5 file should have two datasets in its root, named "data" and "label", respectively. Each has shape (number of samples, dimensionality). I'm using only one-dimensional data here, so I'm not sure about the ordering of channel, width, and height (for 4-D image data, Caffe orders blobs as N x C x H x W); for flat vectors it may not matter. The dtype should be float or double.

    Sample code creating the training set with h5py:

    import h5py
    import numpy as np
    
    f = h5py.File('train.h5', 'w')
    # 1200 data, each is a 128-dim vector
    f.create_dataset('data', (1200, 128), dtype='f8')
    # Data's labels, each is a 4-dim vector
    f.create_dataset('label', (1200, 4), dtype='f4')
    
    # Fill in data with fixed patterns
    # Normalize values to [0, 1], or SigmoidCrossEntropyLoss will not work
    for i in range(1200):
        a = np.empty(128)
        if i % 4 == 0:
            for j in range(128):
                a[j] = j / 128.0
            l = [1, 0, 0, 0]
        elif i % 4 == 1:
            for j in range(128):
                a[j] = (128 - j) / 128.0
            l = [1, 0, 1, 0]
        elif i % 4 == 2:
            for j in range(128):
                a[j] = (j % 6) / 128.0
            l = [0, 1, 1, 0]
        else:
            for j in range(128):
                a[j] = (j % 4) * 4 / 128.0
            l = [1, 0, 1, 1]
        f['data'][i] = a
        f['label'][i] = l
    
    f.close()
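
    A quick sanity check after writing the file (just a verification sketch, not required):

    with h5py.File('train.h5', 'r') as f:
        print(f['data'].shape, f['data'].dtype)    # (1200, 128) float64
        print(f['label'].shape, f['label'].dtype)  # (1200, 4) float32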
    

    Also, the accuracy layer is not needed; simply removing it is fine. The next problem is the loss layer. Since SoftmaxWithLoss treats the label as a single class index (the dimension with the maximal value), it can't be used for a multi-label problem. Thanks to Adian and Shai, I found that SigmoidCrossEntropyLoss works well in this case.
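
    For reference, the SigmoidCrossEntropyLoss layer that the NetSpec code below generates should look roughly like this in prototxt form (blob names taken from the net definition below):

    layer {
      name: "loss"
      type: "SigmoidCrossEntropyLoss"
      bottom: "ip3"
      bottom: "label"
      top: "loss"
    }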

    Below is the full code: data creation, network training, and getting the test results:

    main.py (modified from the Caffe LeNet example)

    import os, sys
    
    PROJECT_HOME = '.../project/'
    CAFFE_HOME = '.../caffe/'
    os.chdir(PROJECT_HOME)
    
    sys.path.insert(0, CAFFE_HOME + 'caffe/python')
    import caffe, h5py
    
    from pylab import *
    from caffe import layers as L
    
    def net(hdf5, batch_size):
        n = caffe.NetSpec()
        n.data, n.label = L.HDF5Data(batch_size=batch_size, source=hdf5, ntop=2)
        n.ip1 = L.InnerProduct(n.data, num_output=50, weight_filler=dict(type='xavier'))
        n.relu1 = L.ReLU(n.ip1, in_place=True)
        n.ip2 = L.InnerProduct(n.relu1, num_output=50, weight_filler=dict(type='xavier'))
        n.relu2 = L.ReLU(n.ip2, in_place=True)
        n.ip3 = L.InnerProduct(n.relu2, num_output=4, weight_filler=dict(type='xavier'))
        n.loss = L.SigmoidCrossEntropyLoss(n.ip3, n.label)
        return n.to_proto()
    
    with open(PROJECT_HOME + 'auto_train.prototxt', 'w') as f:
        f.write(str(net(PROJECT_HOME + 'train.h5list', 50)))
    with open(PROJECT_HOME + 'auto_test.prototxt', 'w') as f:
        f.write(str(net(PROJECT_HOME + 'test.h5list', 20)))
    
    caffe.set_device(0)
    caffe.set_mode_gpu()
    solver = caffe.SGDSolver(PROJECT_HOME + 'auto_solver.prototxt')
    
    solver.net.forward()
    solver.test_nets[0].forward()
    solver.step(1)
    
    niter = 200
    test_interval = 10
    train_loss = zeros(niter)
    test_acc = zeros(int(np.ceil(niter * 1.0 / test_interval)))
    print(len(test_acc))
    output = zeros((niter, 8, 4))
    
    # The main solver loop
    for it in range(niter):
        solver.step(1)  # SGD by Caffe
        train_loss[it] = solver.net.blobs['loss'].data
        solver.test_nets[0].forward(start='data')
        output[it] = solver.test_nets[0].blobs['ip3'].data[:8]
    
        if it % test_interval == 0:
            print('Iteration', it, 'testing...')
            correct = 0
            # Blob .data arrays are views into the net's memory, so they
            # reflect each new forward pass below
            data = solver.test_nets[0].blobs['ip3'].data
            label = solver.test_nets[0].blobs['label'].data
            for test_it in range(100):
                solver.test_nets[0].forward()
                # Positive values map to label 1, negative values to label 0
                for i in range(len(data)):
                    for j in range(len(data[i])):
                        if data[i][j] > 0 and label[i][j] == 1:
                            correct += 1
                        elif data[i][j] <= 0 and label[i][j] == 0:
                            correct += 1
            test_acc[int(it / test_interval)] = correct * 1.0 / (len(data) * len(data[0]) * 100)
    
    # Training and testing done; output the convergence graph
    fig, ax1 = subplots()
    ax2 = ax1.twinx()
    ax1.plot(arange(niter), train_loss)
    ax2.plot(test_interval * arange(len(test_acc)), test_acc, 'r')
    ax1.set_xlabel('iteration')
    ax1.set_ylabel('train loss')
    ax2.set_ylabel('test accuracy')
    fig.savefig('converge.png')

    # Check the result of the last batch
    print(solver.test_nets[0].blobs['ip3'].data)
    print(solver.test_nets[0].blobs['label'].data)
    

    The h5list files simply contain the path of an HDF5 file on each line:

    train.h5list

    /home/foo/bar/project/train.h5
    

    test.h5list

    /home/foo/bar/project/test.h5
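
    These list files can also be written from Python, if you prefer (a small convenience sketch, not something Caffe requires):

    import os

    for name in ['train', 'test']:
        with open(name + '.h5list', 'w') as f:
            f.write(os.path.abspath(name + '.h5') + '\n')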
    

    and the solver (note: with the test batch size of 20 set above, test_iter: 10 covers 200 test samples per test pass):

    auto_solver.prototxt

    train_net: "auto_train.prototxt"
    test_net: "auto_test.prototxt"
    test_iter: 10
    test_interval: 20
    base_lr: 0.01
    momentum: 0.9
    weight_decay: 0.0005
    lr_policy: "inv"
    gamma: 0.0001
    power: 0.75
    display: 100
    max_iter: 10000
    snapshot: 5000
    snapshot_prefix: "sed"
    solver_mode: GPU
    

    Convergence graph (converge.png):

    Last batch result:

    [[ 35.91593933 -37.46276474 -6.2579031 -6.30313492]
    [ 42.69248581 -43.00864792 13.19664764 -3.35134125]
    [ -1.36403108 1.38531208 2.77786589 -0.34310576]
    [ 2.91686511 -2.88944006 4.34043217 0.32656598]
    ...
    [ 35.91593933 -37.46276474 -6.2579031 -6.30313492]
    [ 42.69248581 -43.00864792 13.19664764 -3.35134125]
    [ -1.36403108 1.38531208 2.77786589 -0.34310576]
    [ 2.91686511 -2.88944006 4.34043217 0.32656598]]
    
    [[ 1. 0. 0. 0.]
    [ 1. 0. 1. 0.]
    [ 0. 1. 1. 0.]
    [ 1. 0. 1. 1.]
    ...
    [ 1. 0. 0. 0.]
    [ 1. 0. 1. 0.]
    [ 0. 1. 1. 0.]
    [ 1. 0. 1. 1.]]
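
    To turn these raw ip3 outputs into 0/1 predictions, threshold at zero; for a sigmoid, x > 0 is equivalent to sigmoid(x) > 0.5 (a small sketch using the solver object from main.py above):

    import numpy as np

    scores = solver.test_nets[0].blobs['ip3'].data
    pred = (scores > 0).astype(int)  # sigmoid(x) > 0.5  <=>  x > 0
    print(pred)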
    

    I think this code still has many things that could be improved. Any suggestions are appreciated.

  • 2020-11-30 07:41

    Your accuracy layer makes no sense.

    The way the accuracy layer works: in caffe, the accuracy layer expects two inputs:
    (i) a predicted probability vector, and
    (ii) the corresponding ground-truth label as a scalar integer.
    The accuracy layer then checks whether the probability of the predicted label is indeed the maximal one (or within the top_k).
    Therefore, if you have to classify C different classes, your inputs are going to be N-by-C (where N is the batch size) predicted probabilities for N samples belonging to each of the C classes, plus N scalar labels.

    The way it is defined in your net: you feed the accuracy layer N-by-4 predictions and N-by-4 labels -- this makes no sense for caffe.
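
    If you still want an accuracy number for this multi-label setup, compute it in Python instead of using an accuracy layer (a minimal sketch; scores and labels stand for the N-by-4 output and label blobs of the test net):

    import numpy as np

    def multilabel_accuracy(scores, labels):
        # scores: N-by-C raw (pre-sigmoid) outputs, thresholded at 0
        # labels: N-by-C binary ground truth
        return ((scores > 0) == (labels > 0.5)).mean()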
