PyTorch/LArCV Classification Example with Data Set (v0.1.0)

Posted on Tue 09 January 2018 in tutorial by Taritree

PyTorch Classification Example

In this notebook, we're going to use ResNet-18, implemented in PyTorch, to classify the 5-particle example training data.

This tutorial walks through the steps needed to load images stored in LArCV files and train a network on them. For more details on how to use PyTorch, refer to the official PyTorch tutorials.

This notebook aims to be self-contained in terms of code. However, you can also find the code separated into different files in the following repositories.

You will also need the training data. Go to the open data page and download either the 5k or 50k training/validation samples.

In [1]:
# Import our modules

# python
import os,sys
import shutil
import time
import traceback

# numpy
import numpy as np

# torch
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.backends.cudnn as cudnn
import torch.distributed as dist
import torch.optim
import torch.utils.data
import torch.utils.data.distributed
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torchvision.models as models

# ROOT/LArCV
import ROOT
from larcv import larcv

%matplotlib notebook
import matplotlib.pyplot as plt
Welcome to JupyROOT 6.12/04

Set the GPU to use

In [2]:
torch.cuda.device( 1 )
Out[2]:
<torch.cuda.device at 0x7f6d5c4d7ed0>
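
Note that a bare call to torch.cuda.device(1) only constructs a context manager; it does not by itself change the default GPU. Two explicit ways to select GPU 1 (a short sketch, not part of the original notebook):

# set the default GPU globally
torch.cuda.set_device(1)

# or scope the selection to a block of code
with torch.cuda.device(1):
    pass  # allocations inside this block land on GPU 1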

Setup Data IO

Location of data on your local machine

Set the path to the data files in this block.

In [3]:
path_to_train_data="/home/taritree/working/dlphysics/testset/train_50k.root"
path_to_test_data="/home/taritree/working/dlphysics/testset/test_40k.root"
if not os.path.exists(path_to_train_data):
    print "Could not find the training data file."
if not os.path.exists(path_to_test_data):
    print "Could not find the validation data file."

Define LArCVDataset

First, we define a class that will load our data. There are many ways to do this. Here we create a concrete subclass of PyTorch's Dataset class; such datasets can also be fed to PyTorch's DataLoader, though we do not use DataLoader in this tutorial.

In [4]:
# from: https://github.com/deeplearnphysics/larcvdataset

larcv.PSet # touch this to force libBase to load, which has CreatePSetFromFile
from larcv.dataloader2 import larcv_threadio
from torch.utils.data import Dataset

class LArCVDataset(Dataset):
    """ LArCV data set interface for PyTorch"""

    def __init__( self, cfg, fillername, verbosity=0, loadallinmem=False, randomize_inmem_data=True, max_inmem_events=-1 ):
        self.verbosity = verbosity
        self.batchsize = None
        self.randomize_inmem_data = randomize_inmem_data
        self.max_inmem_events = max_inmem_events
        self.loadallinmem = loadallinmem
        self.cfg = cfg  

        # we setup the larcv threadfiller class, which handles io from larcv files
        # this follows steps from larcv tutorials
        
        # setup cfg dictionary needed for larcv_threadio      
        self.filler_cfg = {}
        self.filler_cfg["filler_name"] = fillername
        self.filler_cfg["verbosity"]   = self.verbosity
        self.filler_cfg["filler_cfg"]  = self.cfg
        if not os.path.exists(self.cfg):
            raise ValueError("Could not find filler configuration file: %s"%(self.cfg))

        # read the first line of the config file, which should contain the name of the config parameter set
        linepset = open(self.cfg,'r').readlines()
        self.cfgname = linepset[0].split(":")[0].strip()

        # we load the pset ourselves, since we want the values in the 'ProcessName' list;
        # these become the names of the data products loaded, stored in self.datalist
        self.pset = larcv.CreatePSetFromFile(self.cfg,self.cfgname).get("larcv::PSet")(self.cfgname)
        datastr_v = self.pset.get("std::vector<std::string>")("ProcessName")
        self.datalist = []
        for i in range(0,datastr_v.size()):
            self.datalist.append(datastr_v[i])
        
        # finally, configure io
        self.io = larcv_threadio()        
        self.io.configure(self.filler_cfg)
        
        if self.loadallinmem:
            self._loadinmem()

    def __len__(self):
        if not self.loadallinmem:
            return int(self.io.fetch_n_entries())
        else:
            return int(self.alldata[self.datalist[0]].shape[0])

    def __getitem__(self, idx):
        # note: idx is effectively ignored: in streaming mode larcv_threadio
        # serves the next batch, and in in-memory mode we draw random indices
        if not self.loadallinmem:
            self.io.next()
            out = {}
            for name in self.datalist:
                out[name] = self.io.fetch_data(name).data()
        else:
            indices = np.random.randint(len(self),size=self.batchsize)
            out = {}
            for name in self.datalist:
                out[name] = np.zeros( (self.batchsize,self.alldata[name].shape[1]), self.alldata[name].dtype )
                for n,idx in enumerate(indices):
                    out[name][n,:] = self.alldata[name][idx,:]
        return out
        
    def __str__(self):
        return self.dumpcfg()
    
    def _loadinmem(self):
        """load data into memory"""
        nevents = int(self.io.fetch_n_entries())
        if self.max_inmem_events>0 and nevents>self.max_inmem_events:
            nevents = self.max_inmem_events

        print "Attempting to load all ",nevents," into memory. good luck"
        start = time.time()

        # start threadio
        self.start(1)

        # get one data element to get shape
        self.io.next()
        firstout = {}
        for name in self.datalist:
            firstout[name] = self.io.fetch_data(name).data()
        self.alldata = {}
        for name in self.datalist:
            self.alldata[name] = np.zeros( (nevents,firstout[name].shape[1]), firstout[name].dtype )
            self.alldata[name][0] = firstout[name][0,:]
        for i in range(1,nevents):
            self.io.next()
            if i%1000==0:
                print "loading event %d of %d"%(i,nevents)
            for name in self.datalist:
                out = self.io.fetch_data(name).data()
                self.alldata[name][i,:] = out[0,:]

        print "elapsed time to bring data into memory: ",time.time()-start,"sec"
        self.stop()

    def start(self,batchsize):
        """exposes larcv_threadio::start which is used to start the thread managers"""
        self.batchsize = batchsize
        self.io.start_manager(self.batchsize)

    def stop(self):
        """ stops the thread managers"""
        self.io.stop_manager()

    def dumpcfg(self):
        """return the contents of the configuration file as a string"""
        return open(self.cfg).read()
        

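As an aside, the Dataset protocol that LArCVDataset implements boils down to __len__ and __getitem__. A minimal toy version (purely illustrative; the class name and random data are made up here) looks like this:

import numpy as np
from torch.utils.data import Dataset

class ToyDataset(Dataset):
    """illustrative stand-in: random single-plane images and integer labels"""
    def __init__(self, nevents=100):
        self.images = np.random.rand(nevents, 1, 256, 256).astype(np.float32)
        self.labels = np.random.randint(0, 5, size=nevents)
    def __len__(self):
        return self.images.shape[0]
    def __getitem__(self, idx):
        return {"image": self.images[idx], "label": self.labels[idx]}
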
Write configuration files for the LArCV ThreadFiller class

We define the configurations in this block, then write to file. We will load the files later when we create LArCVDataset instances for both the training and test data.

A note: the two configurations need different names, and the ProcessName entries have to differ as well. This is because of the way the threads are managed.

In [5]:
train_cfg="""ThreadProcessor: {
  Verbosity:3
  NumThreads: 3
  NumBatchStorage: 3
  RandomAccess: true
  InputFiles: ["%s"]  
  ProcessName: ["image","label"]
  ProcessType: ["BatchFillerImage2D","BatchFillerPIDLabel"]
  ProcessList: {
    image: {
      Verbosity:3
      ImageProducer: "data"
      Channels: [2]
      EnableMirror: true
    }
    label: {
      Verbosity:3
      ParticleProducer: "mctruth"
      PdgClassList: [2212,11,211,13,22]
    }
  }
}
"""%(path_to_train_data)

test_cfg="""ThreadProcessorTest: {
  Verbosity:3
  NumThreads: 2
  NumBatchStorage: 2
  RandomAccess: true
  InputFiles: ["%s"]
  ProcessName: ["imagetest","labeltest"]
  ProcessType: ["BatchFillerImage2D","BatchFillerPIDLabel"]
  ProcessList: {
    imagetest: {
      Verbosity:3
      ImageProducer: "data"
      Channels: [2]
      EnableMirror: false
    }
    labeltest: {
      Verbosity:3
      ParticleProducer: "mctruth"
      PdgClassList: [2212,11,211,13,22]
    }
  }
}
"""%(path_to_test_data)

train_cfg_out = open("train_dataloader.cfg",'w')
print >> train_cfg_out,train_cfg
train_cfg_out.close()

test_cfg_out  = open("valid_dataloader.cfg",'w')
print >> test_cfg_out,test_cfg
test_cfg_out.close()
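
As a quick sanity check (a sketch, not in the original notebook), note that the parameter-set name LArCVDataset parses out is just the first token of the file we wrote:

# the token before the first ':' is the PSet name that
# LArCVDataset.__init__ extracts from the first line
with open("train_dataloader.cfg",'r') as f:
    firstline = f.readline()
print firstline.split(":")[0].strip()   # -> ThreadProcessor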

Setup Network

Define network

We use ResNet-18 as implemented in the torchvision module. We reproduce it here with one slight modification: we change the number of input channels from 3 to 1. The original ResNet expects an RGB image; for our example, we only use the image from one plane of our hypothetical LArTPC detector.

The original implementation can be found in the torchvision repository.
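
Alternatively, if you would rather not copy the code, here is a sketch of adapting the stock torchvision model directly (the replaced layers get fresh random weights):

import torch.nn as nn
import torchvision.models as models

# stock ResNet-18 with a 5-class head...
alt_model = models.resnet18(pretrained=False, num_classes=5)
# ...swap the 3-channel input convolution for a 1-channel one
alt_model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
# ...and match the avgpool change made below, needed for 256x256 inputs
alt_model.avgpool = nn.AvgPool2d(7, stride=2)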

In [6]:
import torch.nn as nn
import math

# define convolution without bias that we will use throughout the network
def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=1, bias=False)


# implements one ResNet unit
class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = nn.BatchNorm2d(planes)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out
    
# define the network. It provides options for the number of output classes and input channels.
class ResNet(nn.Module):

    def __init__(self, block, layers, num_classes=1000, input_channels=3):
        """
        inputs
        ------
        block: type of resnet unit
        layers: list of 4 ints. defines number of basic block units in each set of resnet units
        num_classes: output classes
        input_channels: number of channels in input images
        """
        self.inplanes = 64
        super(ResNet, self).__init__()
        self.conv1 = nn.Conv2d(input_channels, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
        
        # we changed the stride of avgpool from 1 (in the original) to 2 to accommodate our 256x256 inputs
        self.avgpool = nn.AvgPool2d(7, stride=2)

        # I've added dropout to the network
        self.dropout = nn.Dropout2d(p=0.5,inplace=True)

        #print "block.expansion=",block.expansion                                                                                                                                                           
        self.fc = nn.Linear(512 * block.expansion, num_classes)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

    def _make_layer(self, block, planes, blocks, stride=1):
        downsample = None
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.inplanes, planes * block.expansion,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(planes * block.expansion),
            )

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample))
        self.inplanes = planes * block.expansion
        for i in range(1, blocks):
            layers.append(block(self.inplanes, planes))

        return nn.Sequential(*layers)
    
    def forward(self, x):

        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.avgpool(x)
        x = self.dropout(x)
        #print "avepool: ",x.data.shape                                                                                                                                                                     
        x = x.view(x.size(0), -1)
        #print "view: ",x.data.shape                                                                                                                                                                        
        x = self.fc(x)

        return x


    
# define a helper function for ResNet-18
def resnet18(pretrained=False, **kwargs):
    """Constructs a ResNet-18 model.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
    """
    model = ResNet(BasicBlock, [2, 2, 2, 2], **kwargs)
    if pretrained:
        # note: this branch needs torchvision's model_zoo and model_urls, which
        # we have not imported; the ImageNet weights would not fit our 1-channel
        # first convolution anyway, so we always call this with pretrained=False
        model.load_state_dict(model_zoo.load_url(model_urls['resnet18']))
    return model
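
To check the plumbing, we can push a dummy batch through the network (a sketch with arbitrary numbers; not part of the original notebook):

import torch
from torch.autograd import Variable

net = resnet18(num_classes=5, input_channels=1)
dummy = Variable(torch.randn(2, 1, 256, 256))  # batch of 2 single-plane images
print net(dummy).size()                        # -> (2, 5): one score per class per event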

Create instance of network

In [7]:
model = resnet18(pretrained=False,num_classes=5, input_channels=1)
model.cuda()
Out[7]:
ResNet(
  (conv1): Conv2d (1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
  (relu): ReLU(inplace)
  (maxpool): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), dilation=(1, 1))
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d (64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d (64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d (64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d (64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True)
    )
  )
  (layer2): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d (64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
      (downsample): Sequential(
        (0): Conv2d (64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d (128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True)
    )
  )
  (layer3): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d (128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d (256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
      (downsample): Sequential(
        (0): Conv2d (128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d (256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d (256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
    )
  )
  (layer4): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d (256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d (512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
      (downsample): Sequential(
        (0): Conv2d (256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d (512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
      (relu): ReLU(inplace)
      (conv2): Conv2d (512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
    )
  )
  (avgpool): AvgPool2d(kernel_size=7, stride=2, padding=0, ceil_mode=False, count_include_pad=True)
  (dropout): Dropout2d(p=0.5, inplace)
  (fc): Linear(in_features=512, out_features=5)
)

Define loss function

In [8]:
criterion = nn.CrossEntropyLoss().cuda()
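
CrossEntropyLoss combines LogSoftmax and NLLLoss, and it expects integer class indices rather than one-hot vectors, which is why the larcv labels get argmax'ed in the training loop below. A tiny illustration with made-up numbers:

import torch
import torch.nn as nn
from torch.autograd import Variable

scores = Variable(torch.Tensor([[2.0, 0.5, 0.1, 0.1, 0.1]]))  # raw outputs for 1 event, 5 classes
target = Variable(torch.LongTensor([0]))                      # true class index
print nn.CrossEntropyLoss()(scores, target).data[0]           # small loss: class 0 scores highest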

Define optimizer and set training parameters

In [9]:
lr = 1.0e-3
momentum = 0.9
weight_decay = 1.0e-3
batchsize = 50
batchsize_valid = 500
start_epoch = 0
epochs      = 100
nbatches_per_iteration = 10000/batchsize       # 10k images per training iteration -> 200 batches
nbatches_per_valid     = 1000/batchsize_valid  # 1k images per validation pass -> 2 batches

# We use SGD
optimizer = torch.optim.SGD(model.parameters(), lr, momentum=momentum, weight_decay=weight_decay)

Define training and validation steps

We define functions and classes to help us perform training.

Define an object that will help us track averages

In [10]:
class AverageMeter(object):
    """Computes and stores the average and current value"""
    def __init__(self):
        self.reset()

    def reset(self):
        self.val = 0
        self.avg = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count

Training step

In [11]:
def train(train_loader, model, criterion, optimizer, nbatches, iteration, print_freq):
    batch_time = AverageMeter()
    data_time = AverageMeter()
    format_time = AverageMeter()
    train_time = AverageMeter()
    losses = AverageMeter()
    top1 = AverageMeter()

    # switch to train mode                                                                                                                                                                                  
    model.train()

    for i in range(0,nbatches):                                                                                                                                                   
        batchstart = time.time()

        end = time.time()
        data = train_loader[i]
        # measure data loading time                                                                                                                                                                         
        data_time.update(time.time() - end)

        end = time.time()
        img = data["image"]
        lbl = data["label"]
        img_np = np.zeros( (img.shape[0], 1, 256, 256), dtype=np.float32 )
        lbl_np = np.zeros( (lbl.shape[0] ), dtype=np.int )
        # batch loop                                                                                                                                                                                        
        for j in range(img.shape[0]):
            imgtmp = img[j].reshape( (256,256) )
            img_np[j,0,:,:] = padandcropandflip(imgtmp) # data augmentation                                                                                                                                 
            lbl_np[j] = np.argmax(lbl[j])
        input  = torch.from_numpy(img_np).cuda()
        target = torch.from_numpy(lbl_np).cuda()

        # measure data formatting time                                                                                                                                                                      
        format_time.update(time.time() - end)

        # convert into torch variable
        input_var = torch.autograd.Variable(input)
        target_var = torch.autograd.Variable(target)

        # compute output                                                                                                                                                                                    
        end = time.time()
        output = model(input_var)
        loss = criterion(output, target_var)

        # measure accuracy and record loss                                                                                                                                                                  
        prec1 = accuracy(output.data, target, topk=(1,))
        losses.update(loss.data[0], input.size(0))
        top1.update(prec1[0], input.size(0))
        
        # compute gradient and do SGD step                                                                                                                                                                  
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        train_time.update(time.time()-end)

        # measure elapsed time                                                                                                                                                                              
        batch_time.update(time.time() - batchstart)
        
        if print_freq>0 and i % print_freq == 0:
            status = (iteration,i,nbatches,
                      batch_time.val,batch_time.avg,
                      data_time.val,data_time.avg,
                      format_time.val,format_time.avg,
                      train_time.val,train_time.avg,
                      losses.val,losses.avg,
                      top1.val,top1.avg)
            print "Iteration: [%d][%d/%d]\tTime %.3f (%.3f)\tData %.3f (%.3f)\tFormat %.3f (%.3f)\tTrain %.3f (%.3f)\tLoss %.3f (%.3f)\tPrec@1 %.3f (%.3f)"%status
            
    return losses.avg,top1.avg

Validation step

Here we process the test data and accumulate the accuracy.

In [12]:
def validate(val_loader, model, criterion, nbatches, print_freq):
    batch_time = AverageMeter()
    losses = AverageMeter()
    top1 = AverageMeter()

    # switch to evaluate mode                                                                                                                                                                               
    model.eval()

    end = time.time()
    for i in range(0,nbatches):
        data = val_loader[i]
        img = data["imagetest"]
        lbl = data["labeltest"]
        img_np = np.zeros( (img.shape[0], 1, 256, 256), dtype=np.float32 )
        lbl_np = np.zeros( (lbl.shape[0] ), dtype=np.int )
        for j in range(img.shape[0]):
            img_np[j,0,:,:] = img[j].reshape( (256,256) )
            lbl_np[j] = np.argmax(lbl[j])
        input  = torch.from_numpy(img_np).cuda()
        target = torch.from_numpy(lbl_np).cuda()

        # convert into torch variables; volatile=True turns off gradient
        # bookkeeping, saving memory during inference
        input_var = torch.autograd.Variable(input, volatile=True)
        target_var = torch.autograd.Variable(target, volatile=True)

        # compute output                                                                                                                                                                                    
        output = model(input_var)
        loss = criterion(output, target_var)

        # measure accuracy and record loss                                                                                                                                                                  
        prec1 = accuracy(output.data, target, topk=(1,))
        losses.update(loss.data[0], input.size(0))
        top1.update(prec1[0], input.size(0))

        # measure elapsed time                                                                                                                                                                              
        batch_time.update(time.time() - end)
        end = time.time()
        if print_freq>0 and i % print_freq == 0:
            status = (i,nbatches,batch_time.val,batch_time.avg,losses.val,losses.avg,top1.val,top1.avg)
            print "Test: [%d/%d]\tTime %.3f (%.3f)\tLoss %.3f (%.3f)\tPrec@1 %.3f (%.3f)"%status
 
    #print "Test:Result* Prec@1 %.3f\tLoss %.3f"%(top1.avg,losses.avg)
    
    return float(top1.avg),float(losses.avg)

Utility functions

In [13]:
def adjust_learning_rate(optimizer, epoch, lr):
    """Sets the learning rate. Here we keep it fixed; the commented-out
    lines show example decay schedules."""
    #lr = lr * (0.5 ** (epoch // 300))
    #lr = lr*0.992
    #print "adjust learning rate to ",lr
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr

def accuracy(output, target, topk=(1,)):
    """Computes the precision@k for the specified values of k"""
    maxk = max(topk)
    batch_size = target.size(0)

    _, pred = output.topk(maxk, 1, True, True)
    pred = pred.t()
    correct = pred.eq(target.view(1, -1).expand_as(pred))

    res = []
    for k in topk:
        correct_k = correct[:k].view(-1).float().sum(0, keepdim=True)
        res.append(correct_k.mul_(100.0 / batch_size))
    return res

def dump_lr_schedule( startlr, numepochs ):
    for epoch in range(0,numepochs):
        lr = startlr*(0.5**(epoch//300))
        if epoch%10==0:
            print "Epoch [%d] lr=%.3e"%(epoch,lr)
    print "Epoch [%d] lr=%.3e"%(epoch,lr)
    return

def padandcropandflip(npimg2d):
    """data augmentation: pad to 264x264, random flips along each axis, then a random 256x256 crop"""
    imgpad  = np.zeros( (264,264), dtype=np.float32 )
    imgpad[4:256+4,4:256+4] = npimg2d[:,:]
    if np.random.rand()>0.5:
        imgpad = np.flip( imgpad, 0 )
    if np.random.rand()>0.5:
        imgpad = np.flip( imgpad, 1 )
    randx = np.random.randint(0,8)
    randy = np.random.randint(0,8)
    return imgpad[randx:randx+256,randy:randy+256]

def save_checkpoint(state, is_best, p, filename='checkpoint.pth.tar'):
    if p>0:
        filename = "checkpoint.%dth.tar"%(p)
    torch.save(state, filename)
    if is_best:
        shutil.copyfile(filename, 'model_best.pth.tar')
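
As a small check of accuracy() (made-up scores; two of the three top-1 predictions are correct):

import torch

output = torch.Tensor([[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]])  # scores for 3 events, 2 classes
target = torch.LongTensor([0, 1, 1])                         # the third event is misclassified
print accuracy(output, target, topk=(1,))[0][0]              # -> ~66.7 (percent)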

Load the datasets and start data loading threads

Training data

For the training data, we load everything into memory. Since we need to draw many, many batches to train the network, reducing the time to fetch a batch of images will pay off in the long run.

However, we first pay an upfront cost: this step takes a LONG time.

In [14]:
#capevents = 500 # first two lines are for debug: capping events to keep this step short
#iotrain = LArCVDataset("train_dataloader.cfg", "ThreadProcessor", loadallinmem=True, max_inmem_events=capevents)
iotrain = LArCVDataset("train_dataloader.cfg", "ThreadProcessor", loadallinmem=True)
iotrain.start(batchsize)
Attempting to load all  50000  into memory. good luck
loading event 1000 of 50000
loading event 2000 of 50000
...
loading event 49000 of 50000
elapsed time to bring data into memory:  2608.85350108 sec
ThreadProcessor : {
  InputFiles : ["/home/taritree/working/dlphysics/testset/train_50k.root"]
  NumBatchStorage : 3
  NumThreads : 3
  ProcessName : ["image","label"]
  ProcessType : ["BatchFillerImage2D","BatchFillerPIDLabel"]
  RandomAccess : true
  Verbosity : 3
  ProcessList : {
    image : {
      Channels : [2]
      EnableMirror : true
      ImageProducer : "data"
      Verbosity : 3
    }

    label : {
      ParticleProducer : "mctruth"
      PdgClassList : [2212,11,211,13,22]
      Verbosity : 3
    }

  }

}

 setting verbosity 3
Error in <TProtoClass::FindDataMember>: data member with index 0 is not found in class thread
Error in <CreateRealData>: Cannot find data member # 0 of class thread for parent larcv::ThreadProcessor!

Validation data

For the validation data, we do not load everything into memory at once. We run validation only periodically, in between many training steps; during those training steps, the thread filler loads the next validation data into memory in the background.

In [15]:
iovalid = LArCVDataset("valid_dataloader.cfg", "ThreadProcessorTest")
iovalid.start(batchsize_valid)
 setting verbosity 3

Training Loop

For each iteration of the training loop, we

  • set the learning rate
  • perform a training iteration, which involves forward and backward passes over the number of batches set in nbatches_per_iteration
  • run a validation iteration over nbatches_per_valid
  • for both the training and validation iterations, save the average loss and average accuracy over the batches; values are stored in numpy arrays
  • update the plots of training versus validation loss and accuracy
  • every 10 epochs (i.e. 50 iterations), save the state of the model and the optimizer (a restore sketch follows this list)
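
For reference, a saved checkpoint could later be restored like this (a sketch using the keys written by save_checkpoint above):

checkpoint = torch.load("model_best.pth.tar")
model.load_state_dict(checkpoint['state_dict'])
optimizer.load_state_dict(checkpoint['optimizer'])
best_prec1  = checkpoint['best_prec1']
start_epoch = checkpoint['epoch']  # note: the loop below stores iteration+1 under 'epoch'
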
In [16]:
best_prec1 = 0.0

# define plots
fig,axs = plt.subplots(1,2)
axs[0].set_xlabel('epoch')
axs[0].set_ylabel('loss')
axs[0].set_ylim(5e-2,10.0)
axs[0].set_yscale("log")

axs[1].set_xlabel('epoch')
axs[1].set_ylabel('accuracy')
axs[1].set_ylim(0.0,100.0)

# iterations:
# there are 50k events in the training set
# we use 10k images per training iteration
# therefore, 1 epoch is 5 iterations
start_iteration = 5*start_epoch
end_iteration   = 5*epochs
num_iterations  = end_iteration - start_iteration
x = np.linspace(start_epoch,epochs,num_iterations)

# numpy arrays for loss and accuracy
y_train_loss = np.zeros(num_iterations)
y_train_acc  = np.zeros(num_iterations)
y_valid_loss = np.zeros(num_iterations)
y_valid_acc  = np.zeros(num_iterations)


for iiter in range(0,num_iterations):
    
    iteration = start_iteration + iiter
    epoch = float(iteration)/5.0
    
    # set the learning rate
    adjust_learning_rate(optimizer, iteration, lr)
    iterout = "Iteration [%d]: "%(iteration)
    for param_group in optimizer.param_groups:
        iterout += "lr=%.3e"%(param_group['lr'])
    print iterout

    # train for one iteration                                                                                                                                                                               
    try:
        train_ave_loss, train_ave_acc = train(iotrain, model, criterion, optimizer, 
                                              nbatches_per_iteration, iteration, -1)
    except Exception,e:
        print "Error in training routine!"
        print e.message
        print e.__class__.__name__
        traceback.print_exc()
        break
    print "Iteration [%d] train aveloss=%.3f aveacc=%.3f"%(iteration,
                                                           train_ave_loss,
                                                           train_ave_acc)
    y_train_loss[iiter] = train_ave_loss
    y_train_acc[iiter]  = train_ave_acc

    # evaluate on validation set                                                                                                                                                                        
    try:
        prec1,valid_loss = validate(iovalid, model, criterion, nbatches_per_valid, -1)
    except Exception,e:
        print "Error in validation routine!"
        print e.message
        print e.__class__.__name__
        traceback.print_exc()
        break
    print "Test[%d]:Result* Prec@1 %.3f\tLoss %.3f"%(iteration,prec1,valid_loss)
    y_valid_loss[iiter] = valid_loss
    y_valid_acc[iiter]  = prec1
        
    # plot up to current iteration
    axs[0].plot(x[:iiter+1],y_train_loss[:iiter+1],'b')
    axs[0].plot(x[:iiter+1],y_valid_loss[:iiter+1],'r')
    
    # plot up to current iteration
    axs[1].plot(x[:iiter+1],y_train_acc[:iiter+1],'b')
    axs[1].plot(x[:iiter+1],y_valid_acc[:iiter+1],'r')
    
    fig.canvas.draw()
    
    # remember best prec@1 and save checkpoint                                                                                                                                                          
    is_best = prec1 > best_prec1
    best_prec1 = max(prec1, best_prec1)
    save_checkpoint({
        'epoch': iteration + 1,
        'state_dict': model.state_dict(),
        'best_prec1': best_prec1,
        'optimizer' : optimizer.state_dict(),
    }, is_best, -1)
    if iteration%50==0:
        save_checkpoint({
            'epoch': iteration + 1,
            'state_dict': model.state_dict(),
            'best_prec1': best_prec1,
            'optimizer' : optimizer.state_dict(),
        }, False, iteration)
Iteration [0]: lr=1.000e-03
Iteration [0] train aveloss=1.516 aveacc=32.930
Test[0]:Result* Prec@1 43.700	Loss 1.348
Iteration [1]: lr=1.000e-03
Iteration [1] train aveloss=1.299 aveacc=44.890
Test[1]:Result* Prec@1 52.100	Loss 1.148
Iteration [2]: lr=1.000e-03
Iteration [2] train aveloss=1.055 aveacc=53.830
Test[2]:Result* Prec@1 59.500	Loss 0.893
Iteration [3]: lr=1.000e-03
Iteration [3] train aveloss=0.920 aveacc=58.850
Test[3]:Result* Prec@1 59.000	Loss 0.852
Iteration [4]: lr=1.000e-03
Iteration [4] train aveloss=0.866 aveacc=60.510
Test[4]:Result* Prec@1 61.400	Loss 0.859
Iteration [5]: lr=1.000e-03
Iteration [5] train aveloss=0.826 aveacc=62.280
Test[5]:Result* Prec@1 63.900	Loss 0.744
Iteration [6]: lr=1.000e-03
Iteration [6] train aveloss=0.817 aveacc=62.480
Test[6]:Result* Prec@1 64.200	Loss 0.728
Iteration [7]: lr=1.000e-03
Iteration [7] train aveloss=0.793 aveacc=63.020
Test[7]:Result* Prec@1 64.400	Loss 0.775
Iteration [8]: lr=1.000e-03
Iteration [8] train aveloss=0.767 aveacc=63.340
Test[8]:Result* Prec@1 66.000	Loss 0.696
Iteration [9]: lr=1.000e-03
Iteration [9] train aveloss=0.753 aveacc=63.620
Test[9]:Result* Prec@1 66.400	Loss 0.677
Iteration [10]: lr=1.000e-03
Iteration [10] train aveloss=0.730 aveacc=64.260
Test[10]:Result* Prec@1 63.200	Loss 0.734
Iteration [11]: lr=1.000e-03
Iteration [11] train aveloss=0.743 aveacc=64.010
Test[11]:Result* Prec@1 66.500	Loss 0.674
Iteration [12]: lr=1.000e-03
Iteration [12] train aveloss=0.728 aveacc=65.130
Test[12]:Result* Prec@1 66.300	Loss 0.704
Iteration [13]: lr=1.000e-03
Iteration [13] train aveloss=0.719 aveacc=64.820
Test[13]:Result* Prec@1 64.400	Loss 0.687
Iteration [14]: lr=1.000e-03
Iteration [14] train aveloss=0.740 aveacc=64.320
Test[14]:Result* Prec@1 66.300	Loss 0.663
Iteration [15]: lr=1.000e-03
Iteration [15] train aveloss=0.701 aveacc=65.680
Test[15]:Result* Prec@1 65.300	Loss 0.708
Iteration [16]: lr=1.000e-03
Iteration [16] train aveloss=0.697 aveacc=66.300
Test[16]:Result* Prec@1 65.000	Loss 0.731
Iteration [17]: lr=1.000e-03
Iteration [17] train aveloss=0.701 aveacc=64.950
Test[17]:Result* Prec@1 67.700	Loss 0.642
Iteration [18]: lr=1.000e-03
Iteration [18] train aveloss=0.691 aveacc=65.720
Test[18]:Result* Prec@1 69.500	Loss 0.622
Iteration [19]: lr=1.000e-03
Iteration [19] train aveloss=0.690 aveacc=66.900
Test[19]:Result* Prec@1 67.400	Loss 0.696
Iteration [20]: lr=1.000e-03
Iteration [20] train aveloss=0.692 aveacc=66.050
Test[20]:Result* Prec@1 68.900	Loss 0.644
Iteration [21]: lr=1.000e-03
Iteration [21] train aveloss=0.682 aveacc=66.110
Test[21]:Result* Prec@1 68.400	Loss 0.647
Iteration [22]: lr=1.000e-03
Iteration [22] train aveloss=0.691 aveacc=65.720
Test[22]:Result* Prec@1 67.300	Loss 0.666
Iteration [23]: lr=1.000e-03
Iteration [23] train aveloss=0.676 aveacc=67.310
Test[23]:Result* Prec@1 69.100	Loss 0.652
Iteration [24]: lr=1.000e-03
Iteration [24] train aveloss=0.663 aveacc=67.610
Test[24]:Result* Prec@1 67.000	Loss 0.643
Iteration [25]: lr=1.000e-03
Iteration [25] train aveloss=0.660 aveacc=67.830
Test[25]:Result* Prec@1 68.400	Loss 0.627
Iteration [26]: lr=1.000e-03
Iteration [26] train aveloss=0.663 aveacc=67.840
Test[26]:Result* Prec@1 70.900	Loss 0.615
Iteration [27]: lr=1.000e-03
Iteration [27] train aveloss=0.665 aveacc=67.270
Test[27]:Result* Prec@1 70.100	Loss 0.620
Iteration [28]: lr=1.000e-03
Iteration [28] train aveloss=0.650 aveacc=67.610
Test[28]:Result* Prec@1 67.400	Loss 0.644
Iteration [29]: lr=1.000e-03
Iteration [29] train aveloss=0.660 aveacc=68.390
Test[29]:Result* Prec@1 71.000	Loss 0.621
Iteration [30]: lr=1.000e-03
Iteration [30] train aveloss=0.653 aveacc=68.340
Test[30]:Result* Prec@1 68.300	Loss 0.637
Iteration [31]: lr=1.000e-03
Iteration [31] train aveloss=0.646 aveacc=69.090
Test[31]:Result* Prec@1 69.200	Loss 0.615
Iteration [32]: lr=1.000e-03
Iteration [32] train aveloss=0.637 aveacc=69.270
Test[32]:Result* Prec@1 73.100	Loss 0.587
Iteration [33]: lr=1.000e-03
Iteration [33] train aveloss=0.658 aveacc=68.120
Test[33]:Result* Prec@1 71.700	Loss 0.618
Iteration [34]: lr=1.000e-03
Iteration [34] train aveloss=0.625 aveacc=70.150
Test[34]:Result* Prec@1 71.400	Loss 0.599
Iteration [35]: lr=1.000e-03
Iteration [35] train aveloss=0.646 aveacc=69.010
Test[35]:Result* Prec@1 73.000	Loss 0.594
Iteration [36]: lr=1.000e-03
Iteration [36] train aveloss=0.638 aveacc=69.400
Test[36]:Result* Prec@1 71.100	Loss 0.613
Iteration [37]: lr=1.000e-03
Iteration [37] train aveloss=0.626 aveacc=70.460
Test[37]:Result* Prec@1 70.700	Loss 0.615
Iteration [38]: lr=1.000e-03
Iteration [38] train aveloss=0.615 aveacc=70.800
Test[38]:Result* Prec@1 69.200	Loss 0.638
Iteration [39]: lr=1.000e-03
Iteration [39] train aveloss=0.609 aveacc=71.630
Test[39]:Result* Prec@1 71.600	Loss 0.612
Iteration [40]: lr=1.000e-03
Iteration [40] train aveloss=0.609 aveacc=70.820
Test[40]:Result* Prec@1 69.900	Loss 0.652
Iteration [41]: lr=1.000e-03
Iteration [41] train aveloss=0.611 aveacc=71.200
Test[41]:Result* Prec@1 74.400	Loss 0.565
Iteration [42]: lr=1.000e-03
Iteration [42] train aveloss=0.614 aveacc=71.860
Test[42]:Result* Prec@1 73.000	Loss 0.589
Iteration [43]: lr=1.000e-03
Iteration [43] train aveloss=0.609 aveacc=71.900
Test[43]:Result* Prec@1 74.100	Loss 0.609
Iteration [44]: lr=1.000e-03
Iteration [44] train aveloss=0.597 aveacc=72.570
Test[44]:Result* Prec@1 71.500	Loss 0.635
Iteration [45]: lr=1.000e-03
Iteration [45] train aveloss=0.593 aveacc=73.240
Test[45]:Result* Prec@1 74.800	Loss 0.542
Iteration [46]: lr=1.000e-03
Iteration [46] train aveloss=0.589 aveacc=73.530
Test[46]:Result* Prec@1 72.600	Loss 0.611
Iteration [47]: lr=1.000e-03
Iteration [47] train aveloss=0.583 aveacc=73.500
Test[47]:Result* Prec@1 74.300	Loss 0.575
Iteration [48]: lr=1.000e-03
Iteration [48] train aveloss=0.580 aveacc=74.210
Test[48]:Result* Prec@1 79.000	Loss 0.534
Iteration [49]: lr=1.000e-03
Iteration [49] train aveloss=0.589 aveacc=73.800
Test[49]:Result* Prec@1 73.300	Loss 0.589
Iteration [50]: lr=1.000e-03
Iteration [50] train aveloss=0.566 aveacc=75.180
Test[50]:Result* Prec@1 78.900	Loss 0.517
Iteration [51]: lr=1.000e-03
Iteration [51] train aveloss=0.571 aveacc=74.870
Test[51]:Result* Prec@1 75.700	Loss 0.537
Iteration [52]: lr=1.000e-03
Iteration [52] train aveloss=0.555 aveacc=76.270
Test[52]:Result* Prec@1 76.400	Loss 0.574
Iteration [53]: lr=1.000e-03
Iteration [53] train aveloss=0.567 aveacc=75.590
Test[53]:Result* Prec@1 77.300	Loss 0.543
Iteration [54]: lr=1.000e-03
Iteration [54] train aveloss=0.535 aveacc=77.280
Test[54]:Result* Prec@1 71.100	Loss 0.664
Iteration [55]: lr=1.000e-03
Iteration [55] train aveloss=0.551 aveacc=76.490
Test[55]:Result* Prec@1 73.500	Loss 0.628
Iteration [56]: lr=1.000e-03
Iteration [56] train aveloss=0.537 aveacc=76.860
Test[56]:Result* Prec@1 73.500	Loss 0.710
Iteration [57]: lr=1.000e-03
Iteration [57] train aveloss=0.528 aveacc=77.840
Test[57]:Result* Prec@1 75.600	Loss 0.563
Iteration [58]: lr=1.000e-03
Iteration [58] train aveloss=0.532 aveacc=77.470
Test[58]:Result* Prec@1 76.400	Loss 0.525
Iteration [59]: lr=1.000e-03
Iteration [59] train aveloss=0.522 aveacc=78.630
Test[59]:Result* Prec@1 79.400	Loss 0.505
Iteration [60]: lr=1.000e-03
Iteration [60] train aveloss=0.540 aveacc=77.500
Test[60]:Result* Prec@1 79.900	Loss 0.495
Iteration [61]: lr=1.000e-03
Iteration [61] train aveloss=0.506 aveacc=78.630
Test[61]:Result* Prec@1 78.700	Loss 0.500
Iteration [62]: lr=1.000e-03
Iteration [62] train aveloss=0.496 aveacc=79.010
Test[62]:Result* Prec@1 75.100	Loss 0.575
Iteration [63]: lr=1.000e-03
Iteration [63] train aveloss=0.515 aveacc=78.500
Test[63]:Result* Prec@1 77.500	Loss 0.555
Iteration [64]: lr=1.000e-03
Iteration [64] train aveloss=0.508 aveacc=78.900
Test[64]:Result* Prec@1 78.600	Loss 0.508
Iteration [65]: lr=1.000e-03
Iteration [65] train aveloss=0.500 aveacc=79.450
Test[65]:Result* Prec@1 76.800	Loss 0.542
Iteration [66]: lr=1.000e-03
Iteration [66] train aveloss=0.499 aveacc=79.760
Test[66]:Result* Prec@1 79.900	Loss 0.485
Iteration [67]: lr=1.000e-03
Iteration [67] train aveloss=0.505 aveacc=79.570
Test[67]:Result* Prec@1 80.300	Loss 0.504
Iteration [68]: lr=1.000e-03
Iteration [68] train aveloss=0.484 aveacc=80.400
Test[68]:Result* Prec@1 80.800	Loss 0.462
Iteration [69]: lr=1.000e-03
Iteration [69] train aveloss=0.485 aveacc=80.140
Test[69]:Result* Prec@1 79.400	Loss 0.491
Iteration [70]: lr=1.000e-03
Iteration [70] train aveloss=0.483 aveacc=79.860
Test[70]:Result* Prec@1 77.800	Loss 0.496
Iteration [71]: lr=1.000e-03
Iteration [71] train aveloss=0.481 aveacc=80.830
Test[71]:Result* Prec@1 81.900	Loss 0.483
Iteration [72]: lr=1.000e-03
Iteration [72] train aveloss=0.469 aveacc=81.060
Test[72]:Result* Prec@1 80.500	Loss 0.478
Iteration [73]: lr=1.000e-03
Iteration [73] train aveloss=0.489 aveacc=80.470
Test[73]:Result* Prec@1 78.500	Loss 0.553
Iteration [74]: lr=1.000e-03
Iteration [74] train aveloss=0.474 aveacc=80.960
Test[74]:Result* Prec@1 79.200	Loss 0.560
Iteration [75]: lr=1.000e-03
Iteration [75] train aveloss=0.467 aveacc=80.810
Test[75]:Result* Prec@1 79.000	Loss 0.503
Iteration [76]: lr=1.000e-03
Iteration [76] train aveloss=0.482 aveacc=79.980
Test[76]:Result* Prec@1 79.800	Loss 0.478
Iteration [77]: lr=1.000e-03
Iteration [77] train aveloss=0.467 aveacc=81.060
Test[77]:Result* Prec@1 74.200	Loss 0.590
Iteration [78]: lr=1.000e-03
Iteration [78] train aveloss=0.458 aveacc=81.330
Test[78]:Result* Prec@1 79.500	Loss 0.521
Iteration [79]: lr=1.000e-03
Iteration [79] train aveloss=0.449 aveacc=81.510
Test[79]:Result* Prec@1 79.600	Loss 0.492
Iteration [80]: lr=1.000e-03
Iteration [80] train aveloss=0.459 aveacc=81.170
Test[80]:Result* Prec@1 81.700	Loss 0.460
Iteration [81]: lr=1.000e-03
Iteration [81] train aveloss=0.451 aveacc=81.900
Test[81]:Result* Prec@1 81.500	Loss 0.492
Iteration [82]: lr=1.000e-03
Iteration [82] train aveloss=0.451 aveacc=82.020
Test[82]:Result* Prec@1 84.300	Loss 0.433
Iteration [83]: lr=1.000e-03
Iteration [83] train aveloss=0.446 aveacc=82.040
Test[83]:Result* Prec@1 83.700	Loss 0.416
Iteration [84]: lr=1.000e-03
Iteration [84] train aveloss=0.449 aveacc=81.760
Test[84]:Result* Prec@1 77.900	Loss 0.540
Iteration [85]: lr=1.000e-03
Iteration [85] train aveloss=0.456 aveacc=81.580
Test[85]:Result* Prec@1 82.600	Loss 0.419
Iteration [86]: lr=1.000e-03
Iteration [86] train aveloss=0.453 aveacc=81.590
Test[86]:Result* Prec@1 82.000	Loss 0.508
Iteration [87]: lr=1.000e-03
Iteration [87] train aveloss=0.430 aveacc=82.460
Test[87]:Result* Prec@1 78.800	Loss 0.505
Iteration [88]: lr=1.000e-03
Iteration [88] train aveloss=0.449 aveacc=82.210
Test[88]:Result* Prec@1 80.500	Loss 0.490
Iteration [89]: lr=1.000e-03
Iteration [89] train aveloss=0.442 aveacc=82.410
Test[89]:Result* Prec@1 79.100	Loss 0.494
Iteration [90]: lr=1.000e-03
Iteration [90] train aveloss=0.425 aveacc=82.650
Test[90]:Result* Prec@1 84.200	Loss 0.421
Iteration [91]: lr=1.000e-03
Iteration [91] train aveloss=0.426 aveacc=82.930
Test[91]:Result* Prec@1 79.600	Loss 0.504
Iteration [92]: lr=1.000e-03
Iteration [92] train aveloss=0.439 aveacc=82.460
Test[92]:Result* Prec@1 81.800	Loss 0.492
Iteration [93]: lr=1.000e-03
Iteration [93] train aveloss=0.444 aveacc=82.220
Test[93]:Result* Prec@1 82.800	Loss 0.415
Iteration [94]: lr=1.000e-03
Iteration [94] train aveloss=0.439 aveacc=83.040
Test[94]:Result* Prec@1 82.500	Loss 0.433
Iteration [95]: lr=1.000e-03
Iteration [95] train aveloss=0.429 aveacc=82.790
Test[95]:Result* Prec@1 78.300	Loss 0.541
Iteration [96]: lr=1.000e-03
Iteration [96] train aveloss=0.421 aveacc=83.000
Test[96]:Result* Prec@1 79.000	Loss 0.522
Iteration [97]: lr=1.000e-03
Iteration [97] train aveloss=0.420 aveacc=83.300
Test[97]:Result* Prec@1 81.300	Loss 0.469
Iteration [98]: lr=1.000e-03
Iteration [98] train aveloss=0.431 aveacc=82.560
Test[98]:Result* Prec@1 84.700	Loss 0.377
Iteration [99]: lr=1.000e-03
Iteration [99] train aveloss=0.437 aveacc=82.600
Test[99]:Result* Prec@1 81.000	Loss 0.476
Iteration [100]: lr=1.000e-03
Iteration [100] train aveloss=0.420 aveacc=82.920
Test[100]:Result* Prec@1 83.100	Loss 0.425
Iteration [101]: lr=1.000e-03
Iteration [101] train aveloss=0.431 aveacc=82.850
Test[101]:Result* Prec@1 82.900	Loss 0.429
Iteration [102]: lr=1.000e-03
Iteration [102] train aveloss=0.428 aveacc=82.920
Test[102]:Result* Prec@1 80.800	Loss 0.451
Iteration [103]: lr=1.000e-03
Iteration [103] train aveloss=0.412 aveacc=83.820
Test[103]:Result* Prec@1 82.300	Loss 0.472
Iteration [104]: lr=1.000e-03
Iteration [104] train aveloss=0.410 aveacc=83.400
Test[104]:Result* Prec@1 83.600	Loss 0.426
Iteration [105]: lr=1.000e-03
Iteration [105] train aveloss=0.420 aveacc=83.780
Test[105]:Result* Prec@1 84.300	Loss 0.405
Iteration [106]: lr=1.000e-03
Iteration [106] train aveloss=0.393 aveacc=84.330
Test[106]:Result* Prec@1 83.800	Loss 0.415
Iteration [107]: lr=1.000e-03
Iteration [107] train aveloss=0.407 aveacc=83.810
Test[107]:Result* Prec@1 82.500	Loss 0.430
Iteration [108]: lr=1.000e-03
Iteration [108] train aveloss=0.405 aveacc=83.960
Test[108]:Result* Prec@1 79.000	Loss 0.559
Iteration [109]: lr=1.000e-03
Iteration [109] train aveloss=0.395 aveacc=84.550
Test[109]:Result* Prec@1 83.300	Loss 0.399
Iteration [110]: lr=1.000e-03
Iteration [110] train aveloss=0.390 aveacc=84.820
Test[110]:Result* Prec@1 82.700	Loss 0.394
Iteration [111]: lr=1.000e-03
Iteration [111] train aveloss=0.397 aveacc=84.160
Test[111]:Result* Prec@1 82.700	Loss 0.448
Iteration [112]: lr=1.000e-03
Iteration [112] train aveloss=0.411 aveacc=83.760
Test[112]:Result* Prec@1 84.700	Loss 0.393
Iteration [113]: lr=1.000e-03
Iteration [113] train aveloss=0.397 aveacc=84.430
Test[113]:Result* Prec@1 85.900	Loss 0.390
Iteration [114]: lr=1.000e-03
Iteration [114] train aveloss=0.401 aveacc=84.370
Test[114]:Result* Prec@1 85.700	Loss 0.376
Iteration [115]: lr=1.000e-03
Iteration [115] train aveloss=0.388 aveacc=84.850
Test[115]:Result* Prec@1 81.600	Loss 0.425
Iteration [116]: lr=1.000e-03
Iteration [116] train aveloss=0.405 aveacc=84.180
Test[116]:Result* Prec@1 81.700	Loss 0.519
Iteration [117]: lr=1.000e-03
Iteration [117] train aveloss=0.398 aveacc=84.390
Test[117]:Result* Prec@1 81.300	Loss 0.473
Iteration [118]: lr=1.000e-03
Iteration [118] train aveloss=0.392 aveacc=84.540
Test[118]:Result* Prec@1 81.400	Loss 0.472
Iteration [119]: lr=1.000e-03
Iteration [119] train aveloss=0.392 aveacc=84.450
Test[119]:Result* Prec@1 79.100	Loss 0.536
Iteration [120]: lr=1.000e-03
Iteration [120] train aveloss=0.384 aveacc=85.390
Test[120]:Result* Prec@1 82.900	Loss 0.419
Iteration [121]: lr=1.000e-03
Iteration [121] train aveloss=0.388 aveacc=84.250
Test[121]:Result* Prec@1 83.500	Loss 0.434
Iteration [122]: lr=1.000e-03
Iteration [122] train aveloss=0.379 aveacc=85.390
Test[122]:Result* Prec@1 86.100	Loss 0.381
Iteration [123]: lr=1.000e-03
Iteration [123] train aveloss=0.391 aveacc=84.620
Test[123]:Result* Prec@1 85.900	Loss 0.357
Iteration [124]: lr=1.000e-03
Iteration [124] train aveloss=0.385 aveacc=84.860
Test[124]:Result* Prec@1 81.200	Loss 0.433
Iteration [125]: lr=1.000e-03
Iteration [125] train aveloss=0.379 aveacc=85.180
Test[125]:Result* Prec@1 83.600	Loss 0.385
Iteration [126]: lr=1.000e-03
Iteration [126] train aveloss=0.382 aveacc=85.020
Test[126]:Result* Prec@1 80.000	Loss 0.505
Iteration [127]: lr=1.000e-03
Iteration [127] train aveloss=0.384 aveacc=84.720
Test[127]:Result* Prec@1 83.700	Loss 0.394
Iteration [128]: lr=1.000e-03
Iteration [128] train aveloss=0.385 aveacc=85.200
Test[128]:Result* Prec@1 79.000	Loss 0.648
Iteration [129]: lr=1.000e-03
Iteration [129] train aveloss=0.387 aveacc=84.860
Test[129]:Result* Prec@1 84.300	Loss 0.402
Iteration [130]: lr=1.000e-03
Iteration [130] train aveloss=0.392 aveacc=84.590
Test[130]:Result* Prec@1 82.500	Loss 0.431
Iteration [131]: lr=1.000e-03
Iteration [131] train aveloss=0.374 aveacc=85.110
Test[131]:Result* Prec@1 83.900	Loss 0.408
Iteration [132]: lr=1.000e-03
Iteration [132] train aveloss=0.383 aveacc=84.760
Test[132]:Result* Prec@1 82.100	Loss 0.518
Iteration [133]: lr=1.000e-03
Iteration [133] train aveloss=0.374 aveacc=85.180
Test[133]:Result* Prec@1 83.700	Loss 0.427
Iteration [134]: lr=1.000e-03
Iteration [134] train aveloss=0.380 aveacc=85.020
Test[134]:Result* Prec@1 83.000	Loss 0.426
Iteration [135]: lr=1.000e-03
Iteration [135] train aveloss=0.377 aveacc=85.150
Test[135]:Result* Prec@1 83.300	Loss 0.420
Iteration [136]: lr=1.000e-03
Iteration [136] train aveloss=0.379 aveacc=85.260
Test[136]:Result* Prec@1 82.100	Loss 0.473
Iteration [137]: lr=1.000e-03
Iteration [137] train aveloss=0.363 aveacc=85.690
Test[137]:Result* Prec@1 82.300	Loss 0.439
Iteration [138]: lr=1.000e-03
Iteration [138] train aveloss=0.370 aveacc=85.760
Test[138]:Result* Prec@1 85.100	Loss 0.373
Iteration [139]: lr=1.000e-03
Iteration [139] train aveloss=0.375 aveacc=85.210
Test[139]:Result* Prec@1 81.600	Loss 0.473
Iteration [140]: lr=1.000e-03
Iteration [140] train aveloss=0.380 aveacc=84.790
Test[140]:Result* Prec@1 84.800	Loss 0.415
Iteration [141]: lr=1.000e-03
Iteration [141] train aveloss=0.364 aveacc=85.520
Test[141]:Result* Prec@1 84.100	Loss 0.449
Iteration [142]: lr=1.000e-03
Iteration [142] train aveloss=0.367 aveacc=85.750
Test[142]:Result* Prec@1 82.400	Loss 0.437
Iteration [143]: lr=1.000e-03
Iteration [143] train aveloss=0.363 aveacc=85.610
Test[143]:Result* Prec@1 81.900	Loss 0.435
Iteration [144]: lr=1.000e-03
Iteration [144] train aveloss=0.348 aveacc=86.510
Test[144]:Result* Prec@1 82.900	Loss 0.441
Iteration [145]: lr=1.000e-03
Iteration [145] train aveloss=0.366 aveacc=85.860
Test[145]:Result* Prec@1 84.800	Loss 0.407
Iteration [146]: lr=1.000e-03
Iteration [146] train aveloss=0.367 aveacc=85.360
Test[146]:Result* Prec@1 83.500	Loss 0.429
Iteration [147]: lr=1.000e-03
Iteration [147] train aveloss=0.363 aveacc=85.760
Test[147]:Result* Prec@1 84.300	Loss 0.385
Iteration [148]: lr=1.000e-03
Iteration [148] train aveloss=0.363 aveacc=85.450
Test[148]:Result* Prec@1 86.800	Loss 0.358
Iteration [149]: lr=1.000e-03
Iteration [149] train aveloss=0.363 aveacc=85.670
Test[149]:Result* Prec@1 84.900	Loss 0.435
Iteration [150]: lr=1.000e-03
Iteration [150] train aveloss=0.369 aveacc=85.820
Test[150]:Result* Prec@1 81.700	Loss 0.449
Iteration [151]: lr=1.000e-03
Iteration [151] train aveloss=0.364 aveacc=85.790
Test[151]:Result* Prec@1 85.100	Loss 0.377
Iteration [152]: lr=1.000e-03
Iteration [152] train aveloss=0.365 aveacc=85.140
Test[152]:Result* Prec@1 83.500	Loss 0.429
Iteration [153]: lr=1.000e-03
Iteration [153] train aveloss=0.341 aveacc=86.880
Test[153]:Result* Prec@1 86.800	Loss 0.348
Iteration [154]: lr=1.000e-03
Iteration [154] train aveloss=0.353 aveacc=86.210
Test[154]:Result* Prec@1 87.200	Loss 0.330
Iteration [155]: lr=1.000e-03
Iteration [155] train aveloss=0.351 aveacc=86.310
Test[155]:Result* Prec@1 80.200	Loss 0.519
Iteration [156]: lr=1.000e-03
Iteration [156] train aveloss=0.354 aveacc=86.370
Test[156]:Result* Prec@1 85.300	Loss 0.372
Iteration [157]: lr=1.000e-03
Iteration [157] train aveloss=0.368 aveacc=85.830
Test[157]:Result* Prec@1 83.900	Loss 0.391
Iteration [158]: lr=1.000e-03
Iteration [158] train aveloss=0.359 aveacc=85.880
Test[158]:Result* Prec@1 83.200	Loss 0.475
Iteration [159]: lr=1.000e-03
Iteration [159] train aveloss=0.363 aveacc=85.900
Test[159]:Result* Prec@1 84.100	Loss 0.429
Iteration [160]: lr=1.000e-03
Iteration [160] train aveloss=0.356 aveacc=86.250
Test[160]:Result* Prec@1 80.400	Loss 0.694
Iteration [161]: lr=1.000e-03
Iteration [161] train aveloss=0.371 aveacc=85.300
Test[161]:Result* Prec@1 85.900	Loss 0.389
Iteration [162]: lr=1.000e-03
Iteration [162] train aveloss=0.344 aveacc=86.790
Test[162]:Result* Prec@1 85.700	Loss 0.369
Iteration [163]: lr=1.000e-03
Iteration [163] train aveloss=0.342 aveacc=86.270
Test[163]:Result* Prec@1 82.700	Loss 0.415
Iteration [164]: lr=1.000e-03
Iteration [164] train aveloss=0.348 aveacc=86.390
Test[164]:Result* Prec@1 83.900	Loss 0.382
Iteration [165]: lr=1.000e-03
Iteration [165] train aveloss=0.346 aveacc=86.130
Test[165]:Result* Prec@1 86.000	Loss 0.357
Iteration [166]: lr=1.000e-03
Iteration [166] train aveloss=0.348 aveacc=86.210
Test[166]:Result* Prec@1 85.500	Loss 0.394
Iteration [167]: lr=1.000e-03
Iteration [167] train aveloss=0.354 aveacc=86.350
Test[167]:Result* Prec@1 88.100	Loss 0.335
Iteration [168]: lr=1.000e-03
Iteration [168] train aveloss=0.339 aveacc=86.560
Test[168]:Result* Prec@1 86.800	Loss 0.344
Iteration [169]: lr=1.000e-03
Iteration [169] train aveloss=0.356 aveacc=86.110
Test[169]:Result* Prec@1 84.900	Loss 0.376
Iteration [170]: lr=1.000e-03
Iteration [170] train aveloss=0.356 aveacc=86.210
Test[170]:Result* Prec@1 83.200	Loss 0.440
Iteration [171]: lr=1.000e-03
Iteration [171] train aveloss=0.344 aveacc=86.920
Test[171]:Result* Prec@1 83.300	Loss 0.420
Iteration [172]: lr=1.000e-03
Iteration [172] train aveloss=0.344 aveacc=86.710
Test[172]:Result* Prec@1 84.400	Loss 0.395
Iteration [173]: lr=1.000e-03
Iteration [173] train aveloss=0.354 aveacc=85.980
Test[173]:Result* Prec@1 85.100	Loss 0.382
Iteration [174]: lr=1.000e-03
Iteration [174] train aveloss=0.341 aveacc=86.710
Test[174]:Result* Prec@1 85.900	Loss 0.388
Iteration [175]: lr=1.000e-03
Iteration [175] train aveloss=0.352 aveacc=86.500
Test[175]:Result* Prec@1 82.900	Loss 0.405
Iteration [176]: lr=1.000e-03
Iteration [176] train aveloss=0.337 aveacc=87.220
Test[176]:Result* Prec@1 83.200	Loss 0.420
Iteration [177]: lr=1.000e-03
Iteration [177] train aveloss=0.351 aveacc=86.510
Test[177]:Result* Prec@1 78.300	Loss 0.534
Iteration [178]: lr=1.000e-03
Iteration [178] train aveloss=0.348 aveacc=86.560
Test[178]:Result* Prec@1 86.200	Loss 0.363
Iteration [179]: lr=1.000e-03
Iteration [179] train aveloss=0.341 aveacc=86.580
Test[179]:Result* Prec@1 84.800	Loss 0.380
Iteration [180]: lr=1.000e-03
Iteration [180] train aveloss=0.345 aveacc=86.220
Test[180]:Result* Prec@1 83.300	Loss 0.440
Iteration [181]: lr=1.000e-03
Iteration [181] train aveloss=0.330 aveacc=87.280
Test[181]:Result* Prec@1 87.000	Loss 0.343
Iteration [182]: lr=1.000e-03
Iteration [182] train aveloss=0.341 aveacc=86.740
Test[182]:Result* Prec@1 81.400	Loss 0.462
Iteration [183]: lr=1.000e-03
Iteration [183] train aveloss=0.344 aveacc=86.400
Test[183]:Result* Prec@1 84.000	Loss 0.415
Iteration [184]: lr=1.000e-03
Iteration [184] train aveloss=0.348 aveacc=86.210
Test[184]:Result* Prec@1 74.500	Loss 1.096
Iteration [185]: lr=1.000e-03
Iteration [185] train aveloss=0.340 aveacc=86.820
Test[185]:Result* Prec@1 85.400	Loss 0.405
Iteration [186]: lr=1.000e-03
Iteration [186] train aveloss=0.330 aveacc=87.560
Test[186]:Result* Prec@1 84.600	Loss 0.407
Iteration [187]: lr=1.000e-03
Iteration [187] train aveloss=0.337 aveacc=86.630
Test[187]:Result* Prec@1 66.800	Loss 1.879
Iteration [188]: lr=1.000e-03
Iteration [188] train aveloss=0.332 aveacc=86.890
Test[188]:Result* Prec@1 84.900	Loss 0.405
Iteration [189]: lr=1.000e-03
Iteration [189] train aveloss=0.330 aveacc=87.070
Test[189]:Result* Prec@1 84.900	Loss 0.385
Iteration [190]: lr=1.000e-03
Iteration [190] train aveloss=0.334 aveacc=87.260
Test[190]:Result* Prec@1 83.100	Loss 0.473
Iteration [191]: lr=1.000e-03
Iteration [191] train aveloss=0.331 aveacc=86.980
Test[191]:Result* Prec@1 83.900	Loss 0.406
Iteration [192]: lr=1.000e-03
Iteration [192] train aveloss=0.336 aveacc=86.530
Test[192]:Result* Prec@1 81.900	Loss 0.477
Iteration [193]: lr=1.000e-03
Iteration [193] train aveloss=0.325 aveacc=87.080
Test[193]:Result* Prec@1 87.400	Loss 0.319
Iteration [194]: lr=1.000e-03
Iteration [194] train aveloss=0.327 aveacc=87.270
Test[194]:Result* Prec@1 81.600	Loss 0.471
Iteration [195]: lr=1.000e-03
Iteration [195] train aveloss=0.338 aveacc=87.060
Test[195]:Result* Prec@1 84.000	Loss 0.406
Iteration [196]: lr=1.000e-03
Iteration [196] train aveloss=0.333 aveacc=87.300
Test[196]:Result* Prec@1 85.000	Loss 0.368
Iteration [197]: lr=1.000e-03
Iteration [197] train aveloss=0.331 aveacc=86.860
Test[197]:Result* Prec@1 84.900	Loss 0.382
Iteration [198]: lr=1.000e-03
Iteration [198] train aveloss=0.342 aveacc=86.580
Test[198]:Result* Prec@1 84.300	Loss 0.398
Iteration [199]: lr=1.000e-03
Iteration [199] train aveloss=0.325 aveacc=87.350
Test[199]:Result* Prec@1 83.800	Loss 0.406
Iteration [200]: lr=1.000e-03
Iteration [200] train aveloss=0.333 aveacc=87.050
Test[200]:Result* Prec@1 77.900	Loss 0.713
Iteration [201]: lr=1.000e-03
Iteration [201] train aveloss=0.324 aveacc=87.390
Test[201]:Result* Prec@1 73.700	Loss 1.004
Iteration [202]: lr=1.000e-03
Iteration [202] train aveloss=0.333 aveacc=87.070
Test[202]:Result* Prec@1 86.300	Loss 0.353
Iteration [203]: lr=1.000e-03
Iteration [203] train aveloss=0.323 aveacc=87.410
Test[203]:Result* Prec@1 55.000	Loss 3.883
Iteration [204]: lr=1.000e-03
Iteration [204] train aveloss=0.339 aveacc=86.930
Test[204]:Result* Prec@1 83.000	Loss 0.428
Iteration [205]: lr=1.000e-03
Iteration [205] train aveloss=0.331 aveacc=86.920
Test[205]:Result* Prec@1 82.100	Loss 0.537
Iteration [206]: lr=1.000e-03
Iteration [206] train aveloss=0.325 aveacc=87.090
Test[206]:Result* Prec@1 85.600	Loss 0.385
Iteration [207]: lr=1.000e-03
Iteration [207] train aveloss=0.327 aveacc=87.390
Test[207]:Result* Prec@1 86.700	Loss 0.358
Iteration [208]: lr=1.000e-03
Iteration [208] train aveloss=0.316 aveacc=87.590
Test[208]:Result* Prec@1 84.500	Loss 0.408
Iteration [209]: lr=1.000e-03
Iteration [209] train aveloss=0.327 aveacc=87.450
Test[209]:Result* Prec@1 77.300	Loss 0.575
Iteration [210]: lr=1.000e-03
Iteration [210] train aveloss=0.329 aveacc=87.360
Test[210]:Result* Prec@1 85.800	Loss 0.370
Iteration [211]: lr=1.000e-03
Iteration [211] train aveloss=0.324 aveacc=87.650
Test[211]:Result* Prec@1 83.300	Loss 0.410
Iteration [212]: lr=1.000e-03
Iteration [212] train aveloss=0.330 aveacc=87.130
Test[212]:Result* Prec@1 85.900	Loss 0.375
Iteration [213]: lr=1.000e-03
Iteration [213] train aveloss=0.324 aveacc=87.480
Test[213]:Result* Prec@1 84.500	Loss 0.396
Iteration [214]: lr=1.000e-03
Iteration [214] train aveloss=0.314 aveacc=87.740
Test[214]:Result* Prec@1 83.700	Loss 0.407
Iteration [215]: lr=1.000e-03
Iteration [215] train aveloss=0.325 aveacc=86.890
Test[215]:Result* Prec@1 80.500	Loss 0.489
Iteration [216]: lr=1.000e-03
Iteration [216] train aveloss=0.323 aveacc=87.430
Test[216]:Result* Prec@1 82.600	Loss 0.442
Iteration [217]: lr=1.000e-03
Iteration [217] train aveloss=0.326 aveacc=87.070
Test[217]:Result* Prec@1 82.900	Loss 0.415
Iteration [218]: lr=1.000e-03
Iteration [218] train aveloss=0.319 aveacc=87.480
Test[218]:Result* Prec@1 86.600	Loss 0.359
Iteration [219]: lr=1.000e-03
Iteration [219] train aveloss=0.328 aveacc=87.150
Test[219]:Result* Prec@1 85.000	Loss 0.384
Iteration [220]: lr=1.000e-03
Iteration [220] train aveloss=0.325 aveacc=87.240
Test[220]:Result* Prec@1 85.500	Loss 0.357
Iteration [221]: lr=1.000e-03
Iteration [221] train aveloss=0.321 aveacc=87.600
Test[221]:Result* Prec@1 87.100	Loss 0.334
Iteration [222]: lr=1.000e-03
Iteration [222] train aveloss=0.325 aveacc=87.050
Test[222]:Result* Prec@1 83.400	Loss 0.423
Iteration [223]: lr=1.000e-03
Iteration [223] train aveloss=0.324 aveacc=87.280
Test[223]:Result* Prec@1 85.300	Loss 0.392
Iteration [224]: lr=1.000e-03
Iteration [224] train aveloss=0.313 aveacc=88.110
Test[224]:Result* Prec@1 84.500	Loss 0.390
Iteration [225]: lr=1.000e-03
Iteration [225] train aveloss=0.313 aveacc=87.870
Test[225]:Result* Prec@1 85.100	Loss 0.386
Iteration [226]: lr=1.000e-03
Iteration [226] train aveloss=0.327 aveacc=87.450
Test[226]:Result* Prec@1 84.700	Loss 0.392
Iteration [227]: lr=1.000e-03
Iteration [227] train aveloss=0.308 aveacc=87.970
Test[227]:Result* Prec@1 84.400	Loss 0.457
Iteration [228]: lr=1.000e-03
Iteration [228] train aveloss=0.321 aveacc=87.500
Test[228]:Result* Prec@1 85.000	Loss 0.362
Iteration [229]: lr=1.000e-03
Iteration [229] train aveloss=0.316 aveacc=87.760
Test[229]:Result* Prec@1 74.300	Loss 1.261
Iteration [230]: lr=1.000e-03
Iteration [230] train aveloss=0.320 aveacc=87.690
Test[230]:Result* Prec@1 78.900	Loss 0.705
Iteration [231]: lr=1.000e-03
Iteration [231] train aveloss=0.318 aveacc=87.850
Test[231]:Result* Prec@1 86.400	Loss 0.353
Iteration [232]: lr=1.000e-03
Iteration [232] train aveloss=0.315 aveacc=87.740
Test[232]:Result* Prec@1 83.800	Loss 0.437
Iteration [233]: lr=1.000e-03
Iteration [233] train aveloss=0.311 aveacc=88.120
Test[233]:Result* Prec@1 85.500	Loss 0.377
Iteration [234]: lr=1.000e-03
Iteration [234] train aveloss=0.302 aveacc=88.410
Test[234]:Result* Prec@1 87.600	Loss 0.303
Iteration [235]: lr=1.000e-03
Iteration [235] train aveloss=0.302 aveacc=88.390
Test[235]:Result* Prec@1 73.600	Loss 0.932
Iteration [236]: lr=1.000e-03
Iteration [236] train aveloss=0.320 aveacc=87.420
Test[236]:Result* Prec@1 86.100	Loss 0.339
Iteration [237]: lr=1.000e-03
Iteration [237] train aveloss=0.315 aveacc=87.790
Test[237]:Result* Prec@1 76.000	Loss 1.096
Iteration [238]: lr=1.000e-03
Iteration [238] train aveloss=0.311 aveacc=88.040
Test[238]:Result* Prec@1 85.800	Loss 0.372
Iteration [239]: lr=1.000e-03
Iteration [239] train aveloss=0.310 aveacc=88.250
Test[239]:Result* Prec@1 84.600	Loss 0.397
Iteration [240]: lr=1.000e-03
Iteration [240] train aveloss=0.300 aveacc=88.220
Test[240]:Result* Prec@1 55.500	Loss 5.103
Iteration [241]: lr=1.000e-03
Iteration [241] train aveloss=0.302 aveacc=88.120
Test[241]:Result* Prec@1 84.600	Loss 0.412
Iteration [242]: lr=1.000e-03
Iteration [242] train aveloss=0.309 aveacc=88.050
Test[242]:Result* Prec@1 85.600	Loss 0.381
Iteration [243]: lr=1.000e-03
Iteration [243] train aveloss=0.308 aveacc=87.990
Test[243]:Result* Prec@1 86.300	Loss 0.348
Iteration [244]: lr=1.000e-03
Iteration [244] train aveloss=0.293 aveacc=89.000
Test[244]:Result* Prec@1 85.900	Loss 0.353
Iteration [245]: lr=1.000e-03
Iteration [245] train aveloss=0.305 aveacc=88.300
Test[245]:Result* Prec@1 81.900	Loss 0.423
Iteration [246]: lr=1.000e-03
Iteration [246] train aveloss=0.308 aveacc=87.840
Test[246]:Result* Prec@1 84.300	Loss 0.389
Iteration [247]: lr=1.000e-03
Iteration [247] train aveloss=0.298 aveacc=88.380
Test[247]:Result* Prec@1 87.000	Loss 0.354
Iteration [248]: lr=1.000e-03
Iteration [248] train aveloss=0.310 aveacc=87.910
Test[248]:Result* Prec@1 75.900	Loss 0.634
Iteration [249]: lr=1.000e-03
Iteration [249] train aveloss=0.300 aveacc=88.000
Test[249]:Result* Prec@1 85.800	Loss 0.352
Iteration [250]: lr=1.000e-03
Iteration [250] train aveloss=0.290 aveacc=88.740
Test[250]:Result* Prec@1 87.900	Loss 0.308
Iteration [251]: lr=1.000e-03
Iteration [251] train aveloss=0.308 aveacc=87.670
Test[251]:Result* Prec@1 85.400	Loss 0.372
Iteration [252]: lr=1.000e-03
Iteration [252] train aveloss=0.294 aveacc=88.860
Test[252]:Result* Prec@1 86.600	Loss 0.349
Iteration [253]: lr=1.000e-03
Iteration [253] train aveloss=0.304 aveacc=88.530
Test[253]:Result* Prec@1 85.300	Loss 0.379
Iteration [254]: lr=1.000e-03
Iteration [254] train aveloss=0.303 aveacc=88.550
Test[254]:Result* Prec@1 84.400	Loss 0.395
Iteration [255]: lr=1.000e-03
Iteration [255] train aveloss=0.292 aveacc=88.940
Test[255]:Result* Prec@1 74.900	Loss 1.040
Iteration [256]: lr=1.000e-03
Iteration [256] train aveloss=0.308 aveacc=87.830
Test[256]:Result* Prec@1 83.100	Loss 0.453
Iteration [257]: lr=1.000e-03
Iteration [257] train aveloss=0.302 aveacc=88.090
Test[257]:Result* Prec@1 81.100	Loss 0.470
Iteration [258]: lr=1.000e-03
Iteration [258] train aveloss=0.296 aveacc=88.390
Test[258]:Result* Prec@1 53.100	Loss 4.688
Iteration [259]: lr=1.000e-03
Iteration [259] train aveloss=0.301 aveacc=88.310
Test[259]:Result* Prec@1 56.600	Loss 3.432
Iteration [260]: lr=1.000e-03
Iteration [260] train aveloss=0.298 aveacc=88.300
Test[260]:Result* Prec@1 84.500	Loss 0.391
Iteration [261]: lr=1.000e-03
Iteration [261] train aveloss=0.295 aveacc=88.420
Test[261]:Result* Prec@1 86.400	Loss 0.375
Iteration [262]: lr=1.000e-03
Iteration [262] train aveloss=0.302 aveacc=88.340
Test[262]:Result* Prec@1 69.500	Loss 1.245
Iteration [263]: lr=1.000e-03
Iteration [263] train aveloss=0.302 aveacc=88.160
Test[263]:Result* Prec@1 82.500	Loss 0.427
Iteration [264]: lr=1.000e-03
Iteration [264] train aveloss=0.293 aveacc=88.540
Test[264]:Result* Prec@1 85.000	Loss 0.413
Iteration [265]: lr=1.000e-03
Iteration [265] train aveloss=0.297 aveacc=88.450
Test[265]:Result* Prec@1 79.000	Loss 0.546
Iteration [266]: lr=1.000e-03
Iteration [266] train aveloss=0.282 aveacc=88.970
Test[266]:Result* Prec@1 75.900	Loss 0.950
Iteration [267]: lr=1.000e-03
Iteration [267] train aveloss=0.286 aveacc=89.120
Test[267]:Result* Prec@1 86.700	Loss 0.343
Iteration [268]: lr=1.000e-03
Iteration [268] train aveloss=0.302 aveacc=88.540
Test[268]:Result* Prec@1 85.800	Loss 0.361
Iteration [269]: lr=1.000e-03
Iteration [269] train aveloss=0.294 aveacc=88.670
Test[269]:Result* Prec@1 86.600	Loss 0.358
Iteration [270]: lr=1.000e-03
Iteration [270] train aveloss=0.286 aveacc=89.440
Test[270]:Result* Prec@1 85.000	Loss 0.371
Iteration [271]: lr=1.000e-03
Iteration [271] train aveloss=0.297 aveacc=88.160
Test[271]:Result* Prec@1 76.900	Loss 0.972
Iteration [272]: lr=1.000e-03
Iteration [272] train aveloss=0.293 aveacc=88.710
Test[272]:Result* Prec@1 74.500	Loss 0.798
Iteration [273]: lr=1.000e-03
Iteration [273] train aveloss=0.295 aveacc=88.530
Test[273]:Result* Prec@1 86.700	Loss 0.328
Iteration [274]: lr=1.000e-03
Iteration [274] train aveloss=0.283 aveacc=89.170
Test[274]:Result* Prec@1 68.300	Loss 0.912
Iteration [275]: lr=1.000e-03
Iteration [275] train aveloss=0.286 aveacc=89.270
Test[275]:Result* Prec@1 83.100	Loss 0.528
Iteration [276]: lr=1.000e-03
Iteration [276] train aveloss=0.291 aveacc=89.060
Test[276]:Result* Prec@1 84.800	Loss 0.382
Iteration [277]: lr=1.000e-03
Iteration [277] train aveloss=0.280 aveacc=89.430
Test[277]:Result* Prec@1 64.500	Loss 2.230
Iteration [278]: lr=1.000e-03
Iteration [278] train aveloss=0.288 aveacc=88.960
Test[278]:Result* Prec@1 84.000	Loss 0.446
Iteration [279]: lr=1.000e-03
Iteration [279] train aveloss=0.280 aveacc=89.120
Test[279]:Result* Prec@1 64.600	Loss 2.299
Iteration [280]: lr=1.000e-03
Iteration [280] train aveloss=0.296 aveacc=88.440
Test[280]:Result* Prec@1 83.800	Loss 0.502
Iteration [281]: lr=1.000e-03
Iteration [281] train aveloss=0.289 aveacc=88.680
Test[281]:Result* Prec@1 73.000	Loss 1.374
Iteration [282]: lr=1.000e-03
Iteration [282] train aveloss=0.283 aveacc=89.000
Test[282]:Result* Prec@1 86.500	Loss 0.387
Iteration [283]: lr=1.000e-03
Iteration [283] train aveloss=0.294 aveacc=88.650
Test[283]:Result* Prec@1 83.500	Loss 0.376
Iteration [284]: lr=1.000e-03
Iteration [284] train aveloss=0.291 aveacc=88.580
Test[284]:Result* Prec@1 76.700	Loss 0.831
Iteration [285]: lr=1.000e-03
Iteration [285] train aveloss=0.289 aveacc=88.800
Test[285]:Result* Prec@1 87.700	Loss 0.302
Iteration [286]: lr=1.000e-03
Iteration [286] train aveloss=0.279 aveacc=89.100
Test[286]:Result* Prec@1 84.500	Loss 0.412
Iteration [287]: lr=1.000e-03
Iteration [287] train aveloss=0.292 aveacc=89.040
Test[287]:Result* Prec@1 47.300	Loss 7.311
Iteration [288]: lr=1.000e-03
Iteration [288] train aveloss=0.282 aveacc=89.290
Test[288]:Result* Prec@1 77.900	Loss 0.607
Iteration [289]: lr=1.000e-03
Iteration [289] train aveloss=0.290 aveacc=88.780
Test[289]:Result* Prec@1 78.400	Loss 0.541
Iteration [290]: lr=1.000e-03
Iteration [290] train aveloss=0.277 aveacc=89.000
Test[290]:Result* Prec@1 56.500	Loss 4.201
Iteration [291]: lr=1.000e-03
Iteration [291] train aveloss=0.283 aveacc=89.270
Test[291]:Result* Prec@1 85.000	Loss 0.381
Iteration [292]: lr=1.000e-03
Iteration [292] train aveloss=0.271 aveacc=89.700
Test[292]:Result* Prec@1 80.500	Loss 0.526
Iteration [293]: lr=1.000e-03
Iteration [293] train aveloss=0.289 aveacc=88.730
Test[293]:Result* Prec@1 78.400	Loss 0.546
Iteration [294]: lr=1.000e-03
Iteration [294] train aveloss=0.277 aveacc=89.260
Test[294]:Result* Prec@1 46.300	Loss 8.920
Iteration [295]: lr=1.000e-03
Iteration [295] train aveloss=0.263 aveacc=90.100
Test[295]:Result* Prec@1 81.600	Loss 0.473
Iteration [296]: lr=1.000e-03
Iteration [296] train aveloss=0.271 aveacc=89.600
Test[296]:Result* Prec@1 74.000	Loss 0.728
Iteration [297]: lr=1.000e-03
Iteration [297] train aveloss=0.275 aveacc=89.680
Test[297]:Result* Prec@1 41.000	Loss 12.024
Iteration [298]: lr=1.000e-03
Iteration [298] train aveloss=0.272 aveacc=89.660
Test[298]:Result* Prec@1 85.200	Loss 0.382
Iteration [299]: lr=1.000e-03
Iteration [299] train aveloss=0.285 aveacc=88.990
Test[299]:Result* Prec@1 49.300	Loss 6.753
Iteration [300]: lr=1.000e-03
Iteration [300] train aveloss=0.286 aveacc=88.620
Test[300]:Result* Prec@1 44.900	Loss 9.932
Iteration [301]: lr=1.000e-03
Iteration [301] train aveloss=0.283 aveacc=89.190
Test[301]:Result* Prec@1 86.900	Loss 0.343
Iteration [302]: lr=1.000e-03
Iteration [302] train aveloss=0.280 aveacc=89.430
Test[302]:Result* Prec@1 81.000	Loss 0.606
Iteration [303]: lr=1.000e-03
Iteration [303] train aveloss=0.280 aveacc=88.870
Test[303]:Result* Prec@1 84.000	Loss 0.421
Iteration [304]: lr=1.000e-03
Iteration [304] train aveloss=0.275 aveacc=89.290
Test[304]:Result* Prec@1 85.000	Loss 0.380
Iteration [305]: lr=1.000e-03
Iteration [305] train aveloss=0.276 aveacc=89.390
Test[305]:Result* Prec@1 87.400	Loss 0.344
Iteration [306]: lr=1.000e-03
Iteration [306] train aveloss=0.279 aveacc=89.060
Test[306]:Result* Prec@1 85.100	Loss 0.410
Iteration [307]: lr=1.000e-03
Iteration [307] train aveloss=0.271 aveacc=89.720
Test[307]:Result* Prec@1 87.300	Loss 0.341
Iteration [308]: lr=1.000e-03
Iteration [308] train aveloss=0.284 aveacc=89.240
Test[308]:Result* Prec@1 77.000	Loss 0.698
Iteration [309]: lr=1.000e-03
Iteration [309] train aveloss=0.270 aveacc=89.510
Test[309]:Result* Prec@1 80.600	Loss 0.467
Iteration [310]: lr=1.000e-03
Iteration [310] train aveloss=0.276 aveacc=89.600
Test[310]:Result* Prec@1 82.200	Loss 0.458
Iteration [311]: lr=1.000e-03
Iteration [311] train aveloss=0.268 aveacc=89.520
Test[311]:Result* Prec@1 88.300	Loss 0.340
Iteration [312]: lr=1.000e-03
Iteration [312] train aveloss=0.275 aveacc=89.500
Test[312]:Result* Prec@1 85.400	Loss 0.396
Iteration [313]: lr=1.000e-03
Iteration [313] train aveloss=0.271 aveacc=89.660
Test[313]:Result* Prec@1 87.900	Loss 0.328
Iteration [314]: lr=1.000e-03
Iteration [314] train aveloss=0.268 aveacc=89.830
Test[314]:Result* Prec@1 70.400	Loss 1.560
Iteration [315]: lr=1.000e-03
Iteration [315] train aveloss=0.272 aveacc=89.460
Test[315]:Result* Prec@1 67.400	Loss 1.614
Iteration [316]: lr=1.000e-03
Iteration [316] train aveloss=0.270 aveacc=89.660
Test[316]:Result* Prec@1 85.600	Loss 0.365
Iteration [317]: lr=1.000e-03
Iteration [317] train aveloss=0.270 aveacc=89.580
Test[317]:Result* Prec@1 86.200	Loss 0.365
Iteration [318]: lr=1.000e-03
Iteration [318] train aveloss=0.275 aveacc=89.720
Test[318]:Result* Prec@1 80.800	Loss 0.488
Iteration [319]: lr=1.000e-03
Iteration [319] train aveloss=0.263 aveacc=90.160
Test[319]:Result* Prec@1 84.000	Loss 0.440
Iteration [320]: lr=1.000e-03
Iteration [320] train aveloss=0.282 aveacc=89.070
Test[320]:Result* Prec@1 85.600	Loss 0.388
Iteration [321]: lr=1.000e-03
Iteration [321] train aveloss=0.277 aveacc=89.040
Test[321]:Result* Prec@1 84.500	Loss 0.427
Iteration [322]: lr=1.000e-03
Iteration [322] train aveloss=0.270 aveacc=89.270
Test[322]:Result* Prec@1 47.900	Loss 7.543
Iteration [323]: lr=1.000e-03
Iteration [323] train aveloss=0.271 aveacc=89.780
Test[323]:Result* Prec@1 85.200	Loss 0.396
Iteration [324]: lr=1.000e-03
Iteration [324] train aveloss=0.274 aveacc=89.570
Test[324]:Result* Prec@1 83.800	Loss 0.399
Iteration [325]: lr=1.000e-03
Iteration [325] train aveloss=0.266 aveacc=89.800
Test[325]:Result* Prec@1 85.200	Loss 0.368
Iteration [326]: lr=1.000e-03
Iteration [326] train aveloss=0.267 aveacc=90.040
Test[326]:Result* Prec@1 45.000	Loss 12.132
Iteration [327]: lr=1.000e-03
Iteration [327] train aveloss=0.267 aveacc=89.820
Test[327]:Result* Prec@1 87.400	Loss 0.309
Iteration [328]: lr=1.000e-03
Iteration [328] train aveloss=0.258 aveacc=90.070
Test[328]:Result* Prec@1 69.200	Loss 1.440
Iteration [329]: lr=1.000e-03
Iteration [329] train aveloss=0.264 aveacc=89.560
Test[329]:Result* Prec@1 84.100	Loss 0.459
Iteration [330]: lr=1.000e-03
Iteration [330] train aveloss=0.258 aveacc=90.190
Test[330]:Result* Prec@1 78.600	Loss 0.528
Iteration [331]: lr=1.000e-03
Iteration [331] train aveloss=0.273 aveacc=89.460
Test[331]:Result* Prec@1 85.600	Loss 0.387
Iteration [332]: lr=1.000e-03
Iteration [332] train aveloss=0.253 aveacc=90.140
Test[332]:Result* Prec@1 86.300	Loss 0.401
Iteration [333]: lr=1.000e-03
Iteration [333] train aveloss=0.267 aveacc=89.950
Test[333]:Result* Prec@1 84.700	Loss 0.354
Iteration [334]: lr=1.000e-03
Iteration [334] train aveloss=0.269 aveacc=89.370
Test[334]:Result* Prec@1 65.500	Loss 2.042
Iteration [335]: lr=1.000e-03
Iteration [335] train aveloss=0.257 aveacc=90.200
Test[335]:Result* Prec@1 85.100	Loss 0.378
Iteration [336]: lr=1.000e-03
Iteration [336] train aveloss=0.260 aveacc=90.250
Test[336]:Result* Prec@1 77.200	Loss 0.653
Iteration [337]: lr=1.000e-03
Iteration [337] train aveloss=0.255 aveacc=90.210
Test[337]:Result* Prec@1 79.300	Loss 0.590
Iteration [338]: lr=1.000e-03
Iteration [338] train aveloss=0.266 aveacc=89.850
Test[338]:Result* Prec@1 76.700	Loss 0.713
Iteration [339]: lr=1.000e-03
Iteration [339] train aveloss=0.250 aveacc=90.460
Test[339]:Result* Prec@1 83.800	Loss 0.430
Iteration [340]: lr=1.000e-03
Iteration [340] train aveloss=0.265 aveacc=90.020
Test[340]:Result* Prec@1 67.600	Loss 1.007
Iteration [341]: lr=1.000e-03
Iteration [341] train aveloss=0.258 aveacc=90.240
Test[341]:Result* Prec@1 81.300	Loss 0.624
Iteration [342]: lr=1.000e-03
Iteration [342] train aveloss=0.267 aveacc=90.330
Test[342]:Result* Prec@1 39.600	Loss 12.274
Iteration [343]: lr=1.000e-03
Iteration [343] train aveloss=0.260 aveacc=90.190
Test[343]:Result* Prec@1 85.300	Loss 0.400
Iteration [344]: lr=1.000e-03
Iteration [344] train aveloss=0.264 aveacc=89.770
Test[344]:Result* Prec@1 85.800	Loss 0.370
Iteration [345]: lr=1.000e-03
Iteration [345] train aveloss=0.252 aveacc=90.390
Test[345]:Result* Prec@1 83.100	Loss 0.621
Iteration [346]: lr=1.000e-03
Iteration [346] train aveloss=0.257 aveacc=90.360
Test[346]:Result* Prec@1 42.200	Loss 13.378
Iteration [347]: lr=1.000e-03
Iteration [347] train aveloss=0.266 aveacc=89.900
Test[347]:Result* Prec@1 67.200	Loss 1.983
Iteration [348]: lr=1.000e-03
Iteration [348] train aveloss=0.251 aveacc=90.540
Test[348]:Result* Prec@1 83.000	Loss 0.425
Iteration [349]: lr=1.000e-03
Iteration [349] train aveloss=0.254 aveacc=90.110
Test[349]:Result* Prec@1 68.900	Loss 1.944
Iteration [350]: lr=1.000e-03
Iteration [350] train aveloss=0.251 aveacc=90.320
Test[350]:Result* Prec@1 85.500	Loss 0.420
Iteration [351]: lr=1.000e-03
Iteration [351] train aveloss=0.256 aveacc=90.280
Test[351]:Result* Prec@1 87.900	Loss 0.346
Iteration [352]: lr=1.000e-03
Iteration [352] train aveloss=0.243 aveacc=90.860
Test[352]:Result* Prec@1 85.100	Loss 0.390
Iteration [353]: lr=1.000e-03
Iteration [353] train aveloss=0.253 aveacc=90.460
Test[353]:Result* Prec@1 84.700	Loss 0.419
Iteration [354]: lr=1.000e-03
Iteration [354] train aveloss=0.259 aveacc=89.830
Test[354]:Result* Prec@1 85.600	Loss 0.377
Iteration [355]: lr=1.000e-03
Iteration [355] train aveloss=0.244 aveacc=90.710
Test[355]:Result* Prec@1 70.600	Loss 1.331
Iteration [356]: lr=1.000e-03
Iteration [356] train aveloss=0.271 aveacc=89.490
Test[356]:Result* Prec@1 80.500	Loss 0.581
Iteration [357]: lr=1.000e-03
Iteration [357] train aveloss=0.258 aveacc=90.050
Test[357]:Result* Prec@1 71.100	Loss 0.947
Iteration [358]: lr=1.000e-03
Iteration [358] train aveloss=0.244 aveacc=90.610
Test[358]:Result* Prec@1 84.800	Loss 0.421
Iteration [359]: lr=1.000e-03
Iteration [359] train aveloss=0.265 aveacc=89.810
Test[359]:Result* Prec@1 85.000	Loss 0.399
Iteration [360]: lr=1.000e-03
Iteration [360] train aveloss=0.244 aveacc=91.020
Test[360]:Result* Prec@1 85.000	Loss 0.393
Iteration [361]: lr=1.000e-03
Iteration [361] train aveloss=0.252 aveacc=90.400
Test[361]:Result* Prec@1 57.000	Loss 4.272
Iteration [362]: lr=1.000e-03
Iteration [362] train aveloss=0.253 aveacc=90.440
Test[362]:Result* Prec@1 77.400	Loss 0.670
Iteration [363]: lr=1.000e-03
Iteration [363] train aveloss=0.255 aveacc=90.420
Test[363]:Result* Prec@1 84.100	Loss 0.461
Iteration [364]: lr=1.000e-03
Iteration [364] train aveloss=0.252 aveacc=90.240
Test[364]:Result* Prec@1 84.300	Loss 0.500
Iteration [365]: lr=1.000e-03
Iteration [365] train aveloss=0.247 aveacc=90.990
Test[365]:Result* Prec@1 72.000	Loss 1.744
Iteration [366]: lr=1.000e-03
Iteration [366] train aveloss=0.253 aveacc=90.700
Test[366]:Result* Prec@1 75.800	Loss 0.753
Iteration [367]: lr=1.000e-03
Iteration [367] train aveloss=0.246 aveacc=90.620
Test[367]:Result* Prec@1 85.500	Loss 0.446
Iteration [368]: lr=1.000e-03
Iteration [368] train aveloss=0.243 aveacc=90.750
Test[368]:Result* Prec@1 83.900	Loss 0.426
Iteration [369]: lr=1.000e-03
Iteration [369] train aveloss=0.252 aveacc=90.190
Test[369]:Result* Prec@1 84.700	Loss 0.409
Iteration [370]: lr=1.000e-03
Iteration [370] train aveloss=0.253 aveacc=90.250
Test[370]:Result* Prec@1 86.100	Loss 0.365
Iteration [371]: lr=1.000e-03
Iteration [371] train aveloss=0.251 aveacc=90.770
Test[371]:Result* Prec@1 69.300	Loss 1.722
Iteration [372]: lr=1.000e-03
Iteration [372] train aveloss=0.248 aveacc=90.890
Test[372]:Result* Prec@1 74.500	Loss 0.806
Iteration [373]: lr=1.000e-03
Iteration [373] train aveloss=0.243 aveacc=90.830
Test[373]:Result* Prec@1 85.700	Loss 0.377
Iteration [374]: lr=1.000e-03
Iteration [374] train aveloss=0.249 aveacc=90.640
Test[374]:Result* Prec@1 79.700	Loss 0.535
Iteration [375]: lr=1.000e-03
Iteration [375] train aveloss=0.240 aveacc=91.060
Test[375]:Result* Prec@1 86.000	Loss 0.379
Iteration [376]: lr=1.000e-03
Iteration [376] train aveloss=0.245 aveacc=90.740
Test[376]:Result* Prec@1 84.500	Loss 0.446
Iteration [377]: lr=1.000e-03
Iteration [377] train aveloss=0.241 aveacc=91.050
Test[377]:Result* Prec@1 66.000	Loss 2.398
Iteration [378]: lr=1.000e-03
Iteration [378] train aveloss=0.241 aveacc=90.880
Test[378]:Result* Prec@1 86.200	Loss 0.377
Iteration [379]: lr=1.000e-03
Iteration [379] train aveloss=0.244 aveacc=90.690
Test[379]:Result* Prec@1 84.000	Loss 0.458
Iteration [380]: lr=1.000e-03
Iteration [380] train aveloss=0.244 aveacc=90.990
Test[380]:Result* Prec@1 86.100	Loss 0.386
Iteration [381]: lr=1.000e-03
Iteration [381] train aveloss=0.236 aveacc=91.040
Test[381]:Result* Prec@1 83.100	Loss 0.466
Iteration [382]: lr=1.000e-03
Iteration [382] train aveloss=0.235 aveacc=91.120
Test[382]:Result* Prec@1 51.800	Loss 5.644
Iteration [383]: lr=1.000e-03
Iteration [383] train aveloss=0.245 aveacc=90.900
Test[383]:Result* Prec@1 83.900	Loss 0.432
Iteration [384]: lr=1.000e-03
Iteration [384] train aveloss=0.245 aveacc=90.610
Test[384]:Result* Prec@1 42.500	Loss 11.027
Iteration [385]: lr=1.000e-03
Iteration [385] train aveloss=0.233 aveacc=91.350
Test[385]:Result* Prec@1 84.900	Loss 0.404
Iteration [386]: lr=1.000e-03
Iteration [386] train aveloss=0.237 aveacc=90.870
Test[386]:Result* Prec@1 85.400	Loss 0.392
Iteration [387]: lr=1.000e-03
Iteration [387] train aveloss=0.233 aveacc=91.090
Test[387]:Result* Prec@1 84.400	Loss 0.454
Iteration [388]: lr=1.000e-03
Iteration [388] train aveloss=0.246 aveacc=90.680
Test[388]:Result* Prec@1 58.300	Loss 4.107
Iteration [389]: lr=1.000e-03
Iteration [389] train aveloss=0.237 aveacc=90.970
Test[389]:Result* Prec@1 69.500	Loss 2.124
Iteration [390]: lr=1.000e-03
Iteration [390] train aveloss=0.227 aveacc=91.610
Test[390]:Result* Prec@1 84.000	Loss 0.395
Iteration [391]: lr=1.000e-03
Iteration [391] train aveloss=0.236 aveacc=91.010
Test[391]:Result* Prec@1 81.500	Loss 0.511
Iteration [392]: lr=1.000e-03
Iteration [392] train aveloss=0.239 aveacc=90.860
Test[392]:Result* Prec@1 84.700	Loss 0.423
Iteration [393]: lr=1.000e-03
Iteration [393] train aveloss=0.232 aveacc=91.130
Test[393]:Result* Prec@1 77.000	Loss 0.688
Iteration [394]: lr=1.000e-03
Iteration [394] train aveloss=0.239 aveacc=90.840
Test[394]:Result* Prec@1 80.100	Loss 0.810
Iteration [395]: lr=1.000e-03
Iteration [395] train aveloss=0.232 aveacc=90.880
Test[395]:Result* Prec@1 84.800	Loss 0.411
Iteration [396]: lr=1.000e-03
Iteration [396] train aveloss=0.236 aveacc=91.230
Test[396]:Result* Prec@1 48.400	Loss 10.558
Iteration [397]: lr=1.000e-03
Iteration [397] train aveloss=0.236 aveacc=91.310
Test[397]:Result* Prec@1 86.300	Loss 0.362
Iteration [398]: lr=1.000e-03
Iteration [398] train aveloss=0.238 aveacc=91.130
Test[398]:Result* Prec@1 82.400	Loss 0.463
Iteration [399]: lr=1.000e-03
Iteration [399] train aveloss=0.234 aveacc=90.800
Test[399]:Result* Prec@1 85.900	Loss 0.381
Iteration [400]: lr=1.000e-03
Iteration [400] train aveloss=0.234 aveacc=91.120
Test[400]:Result* Prec@1 85.700	Loss 0.417
Iteration [401]: lr=1.000e-03
Iteration [401] train aveloss=0.245 aveacc=90.920
Test[401]:Result* Prec@1 85.400	Loss 0.391
Iteration [402]: lr=1.000e-03
Iteration [402] train aveloss=0.228 aveacc=91.480
Test[402]:Result* Prec@1 85.200	Loss 0.409
Iteration [403]: lr=1.000e-03
Iteration [403] train aveloss=0.228 aveacc=91.100
Test[403]:Result* Prec@1 45.900	Loss 10.156
Iteration [404]: lr=1.000e-03
Iteration [404] train aveloss=0.241 aveacc=90.550
Test[404]:Result* Prec@1 85.000	Loss 0.383
Iteration [405]: lr=1.000e-03
Iteration [405] train aveloss=0.232 aveacc=91.150
Test[405]:Result* Prec@1 71.500	Loss 1.012
Iteration [406]: lr=1.000e-03
Iteration [406] train aveloss=0.233 aveacc=91.470
Test[406]:Result* Prec@1 86.600	Loss 0.332
Iteration [407]: lr=1.000e-03
Iteration [407] train aveloss=0.228 aveacc=91.140
Test[407]:Result* Prec@1 79.200	Loss 0.763
Iteration [408]: lr=1.000e-03
Iteration [408] train aveloss=0.230 aveacc=91.440
Test[408]:Result* Prec@1 86.300	Loss 0.436
Iteration [409]: lr=1.000e-03
Iteration [409] train aveloss=0.222 aveacc=91.620
Test[409]:Result* Prec@1 85.100	Loss 0.396
Iteration [410]: lr=1.000e-03
Iteration [410] train aveloss=0.237 aveacc=91.270
Test[410]:Result* Prec@1 86.300	Loss 0.379
Iteration [411]: lr=1.000e-03
Iteration [411] train aveloss=0.229 aveacc=91.200
Test[411]:Result* Prec@1 86.900	Loss 0.384
Iteration [412]: lr=1.000e-03
Iteration [412] train aveloss=0.234 aveacc=91.290
Test[412]:Result* Prec@1 78.300	Loss 0.989
Iteration [413]: lr=1.000e-03
Iteration [413] train aveloss=0.224 aveacc=91.780
Test[413]:Result* Prec@1 55.900	Loss 4.943
Iteration [414]: lr=1.000e-03
Iteration [414] train aveloss=0.232 aveacc=91.320
Test[414]:Result* Prec@1 83.800	Loss 0.425
Iteration [415]: lr=1.000e-03
Iteration [415] train aveloss=0.223 aveacc=91.750
Test[415]:Result* Prec@1 55.200	Loss 4.850
Iteration [416]: lr=1.000e-03
Iteration [416] train aveloss=0.222 aveacc=91.490
Test[416]:Result* Prec@1 79.700	Loss 0.532
Iteration [417]: lr=1.000e-03
Iteration [417] train aveloss=0.232 aveacc=91.300
Test[417]:Result* Prec@1 81.700	Loss 0.457
Iteration [418]: lr=1.000e-03
Iteration [418] train aveloss=0.216 aveacc=92.120
Test[418]:Result* Prec@1 39.100	Loss 13.251
Iteration [419]: lr=1.000e-03
Iteration [419] train aveloss=0.220 aveacc=91.890
Test[419]:Result* Prec@1 84.400	Loss 0.448
Iteration [420]: lr=1.000e-03
Iteration [420] train aveloss=0.234 aveacc=91.410
Test[420]:Result* Prec@1 57.600	Loss 4.367
Iteration [421]: lr=1.000e-03
Iteration [421] train aveloss=0.229 aveacc=91.030
Test[421]:Result* Prec@1 87.000	Loss 0.374
Iteration [422]: lr=1.000e-03
Iteration [422] train aveloss=0.230 aveacc=91.540
Test[422]:Result* Prec@1 80.700	Loss 0.545
Iteration [423]: lr=1.000e-03
Iteration [423] train aveloss=0.224 aveacc=91.530
Test[423]:Result* Prec@1 71.100	Loss 1.655
Iteration [424]: lr=1.000e-03
Iteration [424] train aveloss=0.221 aveacc=91.740
Test[424]:Result* Prec@1 81.400	Loss 0.518
Iteration [425]: lr=1.000e-03
Iteration [425] train aveloss=0.236 aveacc=91.120
Test[425]:Result* Prec@1 85.500	Loss 0.370
Iteration [426]: lr=1.000e-03
Iteration [426] train aveloss=0.218 aveacc=91.760
Test[426]:Result* Prec@1 85.200	Loss 0.402
Iteration [427]: lr=1.000e-03
Iteration [427] train aveloss=0.209 aveacc=92.250
Test[427]:Result* Prec@1 41.000	Loss 12.466
Iteration [428]: lr=1.000e-03
Iteration [428] train aveloss=0.222 aveacc=91.660
Test[428]:Result* Prec@1 85.100	Loss 0.412
Iteration [429]: lr=1.000e-03
Iteration [429] train aveloss=0.229 aveacc=91.350
Test[429]:Result* Prec@1 87.000	Loss 0.393
Iteration [430]: lr=1.000e-03
Iteration [430] train aveloss=0.208 aveacc=92.540
Test[430]:Result* Prec@1 49.600	Loss 5.950
Iteration [431]: lr=1.000e-03
Iteration [431] train aveloss=0.224 aveacc=91.750
Test[431]:Result* Prec@1 82.600	Loss 0.431
Iteration [432]: lr=1.000e-03
Iteration [432] train aveloss=0.219 aveacc=91.810
Test[432]:Result* Prec@1 83.600	Loss 0.510
Iteration [433]: lr=1.000e-03
Iteration [433] train aveloss=0.234 aveacc=91.210
Test[433]:Result* Prec@1 82.700	Loss 0.545
Iteration [434]: lr=1.000e-03
Iteration [434] train aveloss=0.207 aveacc=92.480
Test[434]:Result* Prec@1 86.100	Loss 0.393
Iteration [435]: lr=1.000e-03
Iteration [435] train aveloss=0.227 aveacc=91.750
Test[435]:Result* Prec@1 85.200	Loss 0.396
Iteration [436]: lr=1.000e-03
Iteration [436] train aveloss=0.219 aveacc=91.450
Test[436]:Result* Prec@1 85.900	Loss 0.374
Iteration [437]: lr=1.000e-03
Iteration [437] train aveloss=0.214 aveacc=91.760
Test[437]:Result* Prec@1 43.100	Loss 10.707
Iteration [438]: lr=1.000e-03
Iteration [438] train aveloss=0.215 aveacc=92.060
Test[438]:Result* Prec@1 65.200	Loss 2.112
Iteration [439]: lr=1.000e-03
Iteration [439] train aveloss=0.214 aveacc=91.820
Test[439]:Result* Prec@1 84.500	Loss 0.434
Iteration [440]: lr=1.000e-03
Iteration [440] train aveloss=0.215 aveacc=91.900
Test[440]:Result* Prec@1 83.100	Loss 0.484
Iteration [441]: lr=1.000e-03
Iteration [441] train aveloss=0.212 aveacc=92.190
Test[441]:Result* Prec@1 82.900	Loss 0.476
Iteration [442]: lr=1.000e-03
Iteration [442] train aveloss=0.205 aveacc=92.450
Test[442]:Result* Prec@1 86.900	Loss 0.373
Iteration [443]: lr=1.000e-03
Iteration [443] train aveloss=0.206 aveacc=92.620
Test[443]:Result* Prec@1 80.000	Loss 0.588
Iteration [444]: lr=1.000e-03
Iteration [444] train aveloss=0.209 aveacc=91.950
Test[444]:Result* Prec@1 84.500	Loss 0.444
Iteration [445]: lr=1.000e-03
Iteration [445] train aveloss=0.216 aveacc=92.030
Test[445]:Result* Prec@1 76.400	Loss 1.028
Iteration [446]: lr=1.000e-03
Iteration [446] train aveloss=0.207 aveacc=92.520
Test[446]:Result* Prec@1 47.700	Loss 10.043
Iteration [447]: lr=1.000e-03
Iteration [447] train aveloss=0.209 aveacc=92.460
Test[447]:Result* Prec@1 46.900	Loss 9.025
Iteration [448]: lr=1.000e-03
Iteration [448] train aveloss=0.217 aveacc=91.730
Test[448]:Result* Prec@1 82.000	Loss 0.468
Iteration [449]: lr=1.000e-03
Iteration [449] train aveloss=0.209 aveacc=92.410
Test[449]:Result* Prec@1 85.700	Loss 0.376
Iteration [450]: lr=1.000e-03
Iteration [450] train aveloss=0.209 aveacc=92.170
Test[450]:Result* Prec@1 83.300	Loss 0.431
Iteration [451]: lr=1.000e-03
Iteration [451] train aveloss=0.220 aveacc=91.720
Test[451]:Result* Prec@1 85.200	Loss 0.448
Iteration [452]: lr=1.000e-03
Iteration [452] train aveloss=0.197 aveacc=92.630
Test[452]:Result* Prec@1 83.500	Loss 0.458
Iteration [453]: lr=1.000e-03
Iteration [453] train aveloss=0.193 aveacc=92.840
Test[453]:Result* Prec@1 70.300	Loss 1.682
Iteration [454]: lr=1.000e-03
Iteration [454] train aveloss=0.214 aveacc=91.930
Test[454]:Result* Prec@1 45.700	Loss 9.104
Iteration [455]: lr=1.000e-03
Iteration [455] train aveloss=0.204 aveacc=92.560
Test[455]:Result* Prec@1 87.200	Loss 0.379
Iteration [456]: lr=1.000e-03
Iteration [456] train aveloss=0.214 aveacc=92.200
Test[456]:Result* Prec@1 72.100	Loss 1.003
Iteration [457]: lr=1.000e-03
Iteration [457] train aveloss=0.201 aveacc=92.690
Test[457]:Result* Prec@1 69.600	Loss 1.114
Iteration [458]: lr=1.000e-03
Iteration [458] train aveloss=0.197 aveacc=92.790
Test[458]:Result* Prec@1 87.000	Loss 0.354
Iteration [459]: lr=1.000e-03
Iteration [459] train aveloss=0.205 aveacc=92.490
Test[459]:Result* Prec@1 81.300	Loss 0.487
Iteration [460]: lr=1.000e-03
Iteration [460] train aveloss=0.205 aveacc=92.200
Test[460]:Result* Prec@1 85.400	Loss 0.409
Iteration [461]: lr=1.000e-03
Iteration [461] train aveloss=0.200 aveacc=92.710
Test[461]:Result* Prec@1 79.800	Loss 0.505
Iteration [462]: lr=1.000e-03
Iteration [462] train aveloss=0.213 aveacc=92.260
Test[462]:Result* Prec@1 52.400	Loss 5.328
Iteration [463]: lr=1.000e-03
Iteration [463] train aveloss=0.213 aveacc=92.030
Test[463]:Result* Prec@1 84.800	Loss 0.421
Iteration [464]: lr=1.000e-03
Iteration [464] train aveloss=0.213 aveacc=91.970
Test[464]:Result* Prec@1 84.500	Loss 0.429
Iteration [465]: lr=1.000e-03
Iteration [465] train aveloss=0.208 aveacc=92.220
Test[465]:Result* Prec@1 85.800	Loss 0.381
Iteration [466]: lr=1.000e-03
Iteration [466] train aveloss=0.198 aveacc=92.640
Test[466]:Result* Prec@1 70.900	Loss 1.514
Iteration [467]: lr=1.000e-03
Iteration [467] train aveloss=0.198 aveacc=92.640
Test[467]:Result* Prec@1 44.200	Loss 12.786
Iteration [468]: lr=1.000e-03
Iteration [468] train aveloss=0.195 aveacc=92.660
Test[468]:Result* Prec@1 84.500	Loss 0.436
Iteration [469]: lr=1.000e-03
Iteration [469] train aveloss=0.201 aveacc=92.700
Test[469]:Result* Prec@1 83.900	Loss 0.471
Iteration [470]: lr=1.000e-03
Iteration [470] train aveloss=0.199 aveacc=92.430
Test[470]:Result* Prec@1 83.400	Loss 0.432
Iteration [471]: lr=1.000e-03
Iteration [471] train aveloss=0.209 aveacc=92.250
Test[471]:Result* Prec@1 85.400	Loss 0.405
Iteration [472]: lr=1.000e-03
Iteration [472] train aveloss=0.205 aveacc=92.520
Test[472]:Result* Prec@1 83.900	Loss 0.419
Iteration [473]: lr=1.000e-03
Iteration [473] train aveloss=0.206 aveacc=92.410
Test[473]:Result* Prec@1 87.500	Loss 0.336
Iteration [474]: lr=1.000e-03
Iteration [474] train aveloss=0.197 aveacc=92.720
Test[474]:Result* Prec@1 79.000	Loss 0.543
Iteration [475]: lr=1.000e-03
Iteration [475] train aveloss=0.196 aveacc=92.730
Test[475]:Result* Prec@1 83.600	Loss 0.453
Iteration [476]: lr=1.000e-03
Iteration [476] train aveloss=0.190 aveacc=93.030
Test[476]:Result* Prec@1 78.200	Loss 0.770
Iteration [477]: lr=1.000e-03
Iteration [477] train aveloss=0.198 aveacc=92.640
Test[477]:Result* Prec@1 82.900	Loss 0.561
Iteration [478]: lr=1.000e-03
Iteration [478] train aveloss=0.200 aveacc=92.470
Test[478]:Result* Prec@1 83.000	Loss 0.441
Iteration [479]: lr=1.000e-03
Iteration [479] train aveloss=0.198 aveacc=92.960
Test[479]:Result* Prec@1 85.800	Loss 0.409
Iteration [480]: lr=1.000e-03
Iteration [480] train aveloss=0.194 aveacc=92.870
Test[480]:Result* Prec@1 85.900	Loss 0.427
Iteration [481]: lr=1.000e-03
Iteration [481] train aveloss=0.191 aveacc=92.820
Test[481]:Result* Prec@1 43.800	Loss 8.062
Iteration [482]: lr=1.000e-03
Iteration [482] train aveloss=0.195 aveacc=92.520
Test[482]:Result* Prec@1 78.300	Loss 0.796
Iteration [483]: lr=1.000e-03
Iteration [483] train aveloss=0.200 aveacc=92.540
Test[483]:Result* Prec@1 85.300	Loss 0.443
Iteration [484]: lr=1.000e-03
Iteration [484] train aveloss=0.195 aveacc=92.730
Test[484]:Result* Prec@1 61.700	Loss 2.328
Iteration [485]: lr=1.000e-03
Iteration [485] train aveloss=0.192 aveacc=92.920
Test[485]:Result* Prec@1 86.700	Loss 0.373
Iteration [486]: lr=1.000e-03
Iteration [486] train aveloss=0.191 aveacc=92.870
Test[486]:Result* Prec@1 52.300	Loss 6.176
Iteration [487]: lr=1.000e-03
Iteration [487] train aveloss=0.190 aveacc=93.200
Test[487]:Result* Prec@1 73.800	Loss 1.325
Iteration [488]: lr=1.000e-03
Iteration [488] train aveloss=0.191 aveacc=93.010
Test[488]:Result* Prec@1 83.400	Loss 0.487
Iteration [489]: lr=1.000e-03
Iteration [489] train aveloss=0.190 aveacc=93.090
Test[489]:Result* Prec@1 82.500	Loss 0.449
Iteration [490]: lr=1.000e-03
Iteration [490] train aveloss=0.186 aveacc=93.040
Test[490]:Result* Prec@1 59.600	Loss 3.171
Iteration [491]: lr=1.000e-03
Iteration [491] train aveloss=0.195 aveacc=93.180
Test[491]:Result* Prec@1 84.500	Loss 0.458
Iteration [492]: lr=1.000e-03
Iteration [492] train aveloss=0.185 aveacc=93.220
Test[492]:Result* Prec@1 78.800	Loss 0.722
Iteration [493]: lr=1.000e-03
Iteration [493] train aveloss=0.188 aveacc=93.010
Test[493]:Result* Prec@1 85.900	Loss 0.410
Iteration [494]: lr=1.000e-03
Iteration [494] train aveloss=0.200 aveacc=92.750
Test[494]:Result* Prec@1 82.200	Loss 0.599
Iteration [495]: lr=1.000e-03
Iteration [495] train aveloss=0.183 aveacc=93.160
Test[495]:Result* Prec@1 79.800	Loss 0.614
Iteration [496]: lr=1.000e-03
Iteration [496] train aveloss=0.186 aveacc=93.320
Test[496]:Result* Prec@1 81.000	Loss 0.532
Iteration [497]: lr=1.000e-03
Iteration [497] train aveloss=0.182 aveacc=93.360
Test[497]:Result* Prec@1 73.600	Loss 0.868
Iteration [498]: lr=1.000e-03
Iteration [498] train aveloss=0.193 aveacc=92.830
Test[498]:Result* Prec@1 85.300	Loss 0.429
Iteration [499]: lr=1.000e-03
Iteration [499] train aveloss=0.186 aveacc=93.140
Test[499]:Result* Prec@1 83.200	Loss 0.550
In [17]:
# once the training is over, stop the fillers
iotrain.stop()
iovalid.stop()

Observations from training

For the first 30 epochs (150 iterations), the training goes well: the average losses for the training data (blue) and validation data (red) drop steadily and track each other.

After epoch 30, the training loss keeps dropping, but the training and validation losses separate: the validation loss stops improving and becomes very noisy. Both are hallmarks of overtraining.
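
One quick way to see this is to plot the two loss curves against iteration number. Below is a minimal sketch, assuming the per-iteration average losses printed in the log above were also accumulated into two Python lists, train_losses and valid_losses (hypothetical names):

# Minimal sketch: plot train vs. validation loss to spot overtraining.
# train_losses and valid_losses are hypothetical lists assumed to hold
# the per-iteration average losses from the training loop above.
import matplotlib.pyplot as plt

iterations = range(len(train_losses))
plt.figure()
plt.plot(iterations, train_losses, 'b-', label='train')
plt.plot(iterations, valid_losses, 'r-', label='valid')
plt.axvline(150, color='k', linestyle='--', label='iteration 150 (epoch 30)')
plt.xlabel('iteration')
plt.ylabel('average loss')
plt.legend()
plt.show()

Where the red curve flattens out and starts to scatter while the blue curve keeps falling is where overtraining sets in.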

Looking at the standard output, the validation accuracy gets stuck at about 85%. This is expected with our training data: the remaining ~15% of events are images whose labels are inaccurate. For example, a proton interacts with a nucleus and produces a spray of photons, or a muon decays early into an electron. Refer to the blog post about version 0.1.0 of the open training data. This means we have probably hit the accuracy limit.

This is why we saved a checkpoint every 50 iterations. You'll find checkpoint.Xth.tar files in the folder where this notebook is located. We can use the model saved at epoch 30 (i.e., checkpoint.150th.tar). In a subsequent post, we'll look at the performance of that model.
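
As a preview, a checkpoint like this can be reloaded for evaluation. Below is a minimal sketch; the dictionary keys ('state_dict', 'epoch') are assumptions about what the training loop stored, so adjust them to match your checkpoint, and model is assumed to be the ResNet-18 instance built earlier in the notebook:

# Minimal sketch: reload the epoch-30 checkpoint for evaluation.
# The key names below are assumptions; match them to whatever the
# training loop actually saved into the checkpoint dictionary.
import torch

checkpoint = torch.load("checkpoint.150th.tar")
model.load_state_dict(checkpoint["state_dict"])
model.eval()  # switch batch norm/dropout layers to evaluation mode
print "loaded checkpoint from iteration", checkpoint.get("epoch", "unknown")

Calling model.eval() before running on the validation or test set matters here, since ResNet-18 contains batch normalization layers whose behavior differs between training and evaluation.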