# Evaluators

## Classification

### classification_error

paddle.v2.evaluator.classification_error(*args, **xargs)

Classification error evaluator. It prints the error rate for classification.

The classification error is:

$classification\_error = \frac{NumOfWrongPredicts}{NumOfAllSamples}$

The simple usage is:

eval = evaluator.classification_error(input=prob, label=lbl)

Parameters:

• name (basestring) – Evaluator name.
• input (paddle.v2.config_base.Layer) – Input layer name. The output prediction of the network.
• label (basestring) – Label layer name.
• weight (paddle.v2.config_base.Layer) – Weight layer name. It should be a matrix of size [sample_num, 1]. Each sample's weight multiplies its contribution to both NumOfWrongPredicts and NumOfAllSamples, so a weight matrix of all ones is equivalent to setting no weight. The larger the weight, the more important the sample.
• top_k (int) – The number k in the top-k error rate.
• threshold (float) – The classification threshold.

Returns: None.
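To make the formula and the roles of weight and top_k concrete, here is a minimal pure-Python sketch of a weighted top-k error rate. It does not call PaddlePaddle: the function name `weighted_topk_error` and its list-based inputs are assumptions made for this illustration, and the threshold parameter is not modelled.

```python
def weighted_topk_error(probs, labels, weights=None, top_k=1):
    """Weighted top-k error rate over a batch (illustrative sketch only).

    probs   -- list of per-class score lists, one per sample
    labels  -- list of integer class ids
    weights -- optional per-sample weights; None means every weight is 1
    """
    if weights is None:
        weights = [1.0] * len(labels)
    wrong = 0.0
    total = 0.0
    for scores, label, w in zip(probs, labels, weights):
        # Indices of the top_k highest-scoring classes for this sample.
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label not in order[:top_k]:
            wrong += w   # weighted NumOfWrongPredicts
        total += w       # weighted NumOfAllSamples
    return wrong / total


probs = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.2, 0.3, 0.5]]
labels = [0, 2, 2]
print(weighted_topk_error(probs, labels))                     # top-1 error
print(weighted_topk_error(probs, labels, top_k=2))            # top-2 error
print(weighted_topk_error(probs, labels, weights=[1, 3, 1]))  # second sample weighted 3x
```

Only the second sample is misclassified at top-1, so raising its weight raises the error rate, while top_k=2 accepts its second-best guess.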

### auc

paddle.v2.evaluator.auc(*args, **xargs)

AUC evaluator for binary classification.

The simple usage:

eval = evaluator.auc(input, label)

Parameters:

• name (None|basestring) – Evaluator name.
• input (paddle.v2.config_base.Layer) – Input layer name. The output prediction of the network.
• label (None|basestring) – Label layer name.
• weight (paddle.v2.config_base.Layer) – Weight layer name. It should be a matrix of size [sample_num, 1].
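For reference, the quantity being estimated can be sketched in plain Python using the pairwise definition of AUC: the fraction of (positive, negative) pairs in which the positive sample scores higher. This is an illustration of the metric only; the evaluator's internal algorithm may differ, and the function name `pairwise_auc` is hypothetical.

```python
def pairwise_auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (illustrative sketch).

    scores -- predicted probability of the positive class, one per sample
    labels -- 1 for positive samples, 0 for negative samples
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


scores = [0.9, 0.8, 0.3, 0.6]
labels = [1, 0, 0, 1]
print(pairwise_auc(scores, labels))
```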

### ctc_error

paddle.v2.evaluator.ctc_error(*args, **xargs)

This evaluator calculates the sequence-to-sequence edit distance.

The simple usage is:

eval = evaluator.ctc_error(input=input, label=lbl)

Parameters:

• name (None|basestring) – Evaluator name.
• input (paddle.v2.config_base.Layer) – Input layer. Should be the same as the input for ctc.
• label (paddle.v2.config_base.Layer) – Input label, which is a data layer. Should be the same as the label for ctc.
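The edit distance itself can be sketched as the classic Levenshtein dynamic program below. This is a generic illustration, not the evaluator's implementation: it assumes the CTC output has already been decoded into a token sequence, and the function name `edit_distance` is hypothetical.

```python
def edit_distance(hyp, ref):
    """Levenshtein distance between two sequences (illustrative sketch)."""
    # prev[j] holds the distance between the processed prefix of hyp and ref[:j].
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))
print(edit_distance([1, 2, 3], [1, 3]))
```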

### chunk

paddle.v2.evaluator.chunk(*args, **xargs)

The chunk evaluator is used to evaluate segment labelling accuracy for a sequence. It calculates the precision, recall and F1 scores for chunk detection.

To use the chunk evaluator, several concepts need to be clarified first.

• Chunk type is the type of the whole chunk; a chunk consists of one or several words. (For example, in NER, ORG for organization name, PER for person name, etc.)
• Tag type indicates the position of a word in a chunk. (B for begin, I for inside, E for end, S for single)

We can name a label by combining the tag type and the chunk type (e.g. B-ORG for the beginning of an organization name).

The construction of label dictionary should obey the following rules:

• Use one of the listed labelling schemes. These schemes differ in how they indicate chunk boundaries.

Scheme    Description
plain     Use the same label for the whole chunk.
IOB       Two labels for chunk type X: B-X for chunk beginning and I-X for chunk inside.
IOE       Two labels for chunk type X: E-X for chunk ending and I-X for chunk inside.
IOBES     Four labels for chunk type X: B-X for chunk beginning, I-X for chunk inside, E-X for chunk end and S-X for single-word chunk.


To make this clear, consider an NER example. Assume there are three named entity types, ORG, PER and LOC, which are called ‘chunk types’ here. If the ‘IOB’ scheme is used, the label set is extended to B-ORG, I-ORG, B-PER, I-PER, B-LOC, I-LOC and O, in which B-ORG marks the beginning of an ORG chunk and I-ORG marks the inside of an ORG chunk. The prefixes added to the chunk types are called ‘tag types’ here, and in this scheme there are two of them, B and I. Of course, the training data should be labeled accordingly.

• Mapping is done correctly by the listed equations and assigning protocol.

The following equations extract the tag type and the chunk type from a label.

tagType = label % numTagType
chunkType = label / numTagType
otherChunkType = numChunkTypes


The following table shows the mapping rule between tagType and tag type in each scheme.

Scheme Begin Inside End   Single
plain  0     -      -     -
IOB    0     1      -     -
IOE    -     0      1     -
IOBES  0     1      2     3


Continuing the NER example, the label dict should look like this to satisfy the above equations:

B-ORG  0
I-ORG  1
B-PER  2
I-PER  3
B-LOC  4
I-LOC  5
O      6


In this example, chunkType has three values: 0 for ORG, 1 for PER and 2 for LOC. Because the scheme is “IOB”, tagType has two values: 0 for B and 1 for I. We will use I-LOC to explain the above mapping rules in detail. The label id of I-LOC is 5, so tagType = 5 % 2 = 1 and chunkType = 5 / 2 = 2 (integer division), which means I-LOC is part of the NER chunk LOC and its tag is I.
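The mapping equations can be checked against the NER label dict with a few lines of Python. The helper name `decode_label` is hypothetical; the arithmetic is taken directly from the equations above, with `/` read as integer division.

```python
def decode_label(label, num_tag_types):
    """Split a label id into (tagType, chunkType) using the equations above."""
    tag_type = label % num_tag_types
    chunk_type = label // num_tag_types  # integer division
    return tag_type, chunk_type


NUM_TAG_TYPES = 2    # IOB scheme: B = 0, I = 1
NUM_CHUNK_TYPES = 3  # ORG = 0, PER = 1, LOC = 2; "other" maps to chunkType 3

label_dict = {"B-ORG": 0, "I-ORG": 1, "B-PER": 2, "I-PER": 3,
              "B-LOC": 4, "I-LOC": 5, "O": 6}

for name, label_id in label_dict.items():
    print(name, decode_label(label_id, NUM_TAG_TYPES))
```

Note that the O label decodes to chunkType 3, which equals numChunkTypes and therefore matches otherChunkType.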

The simple usage is:

eval = evaluator.chunk(input, label, chunk_scheme, num_chunk_types)

Parameters:

• input (paddle.v2.config_base.Layer) – The input layers.
• label (paddle.v2.config_base.Layer) – An input layer containing the ground truth label.
• chunk_scheme (basestring) – The labelling scheme. Required. It is one of “IOB”, “IOE”, “IOBES”, “plain”.
• num_chunk_types – The number of chunk types, not counting “other”.
• name (basestring|None) – The evaluator name. Optional.
• excluded_chunk_types (list of integer|None) – Chunks of these types are not considered.
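As a rough sketch of what chunk-level precision, recall and F1 mean, the following pure-Python code extracts chunks from IOB-coded label id sequences and compares predicted chunks with ground-truth chunks. It is not the evaluator's implementation: the function names are hypothetical, only the IOB scheme is handled, excluded_chunk_types is not modelled, and an I tag with no preceding B is leniently treated as starting a new chunk.

```python
def iob_chunks(label_ids, num_chunk_types):
    """Return the set of (start, end, chunkType) chunks in an IOB-coded sequence."""
    chunks = set()
    start = None
    current = None  # chunk type of the chunk being built, if any
    for pos, label in enumerate(label_ids):
        tag, ctype = label % 2, label // 2  # IOB has two tag types: B = 0, I = 1
        is_other = ctype == num_chunk_types
        # A chunk ends at "other", at any B tag, or when the chunk type changes.
        if current is not None and (is_other or tag == 0 or ctype != current):
            chunks.add((start, pos - 1, current))
            current = None
        if not is_other and current is None:
            start, current = pos, ctype
    if current is not None:
        chunks.add((start, len(label_ids) - 1, current))
    return chunks


def chunk_f1(pred_ids, gold_ids, num_chunk_types):
    """Chunk-level (precision, recall, F1); a chunk is correct only on exact match."""
    pred = iob_chunks(pred_ids, num_chunk_types)
    gold = iob_chunks(gold_ids, num_chunk_types)
    correct = len(pred & gold)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


# Label ids follow the NER label dict: B-ORG=0, I-ORG=1, B-PER=2, ..., O=6.
gold = [0, 1, 6, 2, 6, 4, 5]  # B-ORG I-ORG O B-PER O B-LOC I-LOC
pred = [0, 6, 6, 2, 6, 4, 5]  # the ORG chunk is cut short
print(chunk_f1(pred, gold, num_chunk_types=3))
```

The truncated ORG chunk counts as wrong even though its first word is labeled correctly, because chunks are compared as whole spans.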

### precision_recall

paddle.v2.evaluator.precision_recall(*args, **xargs)

An evaluator to calculate precision, recall and F1-score. It is suited to tasks with multiple labels.

• If positive_label=-1, it will print the average precision, recall and F1-score of all labels.
• If positive_label is specified, it will print the precision, recall and F1-score of this label.

The simple usage:

eval = evaluator.precision_recall(input, label)

Parameters:

• name (None|basestring) – Evaluator name.
• input (paddle.v2.config_base.Layer) – Input layer name. The output prediction of the network.
• label (paddle.v2.config_base.Layer) – Label layer name.
• positive_label (paddle.v2.config_base.Layer) – The input label layer.
• weight (paddle.v2.config_base.Layer) – Weight layer name. It should be a matrix of size [sample_num, 1]. (TODO: explanation)
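The two modes of positive_label can be illustrated with a small pure-Python sketch. The function name is hypothetical, the inputs are plain lists of class ids rather than layers, and the unweighted mean over labels used for positive_label=-1 is an assumption; the evaluator may average differently.

```python
def precision_recall_f1(predictions, labels, positive_label=-1):
    """Return (precision, recall, F1) for one label, or the mean over all labels."""

    def stats(c):
        pairs = list(zip(predictions, labels))
        tp = sum(1 for p, y in pairs if p == c and y == c)
        fp = sum(1 for p, y in pairs if p == c and y != c)
        fn = sum(1 for p, y in pairs if p != c and y == c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        return precision, recall, f1

    if positive_label != -1:
        return stats(positive_label)
    # Assumption: an unweighted (macro) average over every label seen.
    classes = sorted(set(labels) | set(predictions))
    per_class = [stats(c) for c in classes]
    return tuple(sum(s[i] for s in per_class) / len(per_class) for i in range(3))


predictions = [0, 1, 1, 2, 2, 2]
labels = [0, 1, 2, 2, 2, 1]
print(precision_recall_f1(predictions, labels, positive_label=1))
print(precision_recall_f1(predictions, labels))
```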

## Rank

### pnpair

paddle.v2.evaluator.pnpair(*args, **xargs)

Positive-negative pair rate evaluator for ranking tasks such as learning to rank. This evaluator must be given at least three layers.

The simple usage:

eval = evaluator.pnpair(input, label, query_id)

Parameters:

• input (paddle.v2.config_base.Layer) – Input layer name. The output prediction of the network.
• label (paddle.v2.config_base.Layer) – Label layer name.
• query_id (paddle.v2.config_base.Layer) – Query_id layer name. Query_id indicates which query each sample belongs to. Its shape should be the same as the output of the label layer.
• weight (paddle.v2.config_base.Layer) – Weight layer name. It should be a matrix of size [sample_num, 1], which indicates the weight of each sample. The default weight of a sample is 1 if the weight layer is None. The pair weight is the mean of the two samples’ weights.
• name (None|basestring) – Evaluator name.
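The idea of counting positive and negative pairs within each query can be sketched as follows. This is an illustration under stated assumptions rather than the evaluator's code: the function name is hypothetical, pairs with equal labels are skipped, and pairs with tied scores are ignored. The pair weight is the mean of the two sample weights, as described above.

```python
from collections import defaultdict
from itertools import combinations


def pn_pair_counts(scores, labels, query_ids, weights=None):
    """Return weighted counts of (correctly ordered, wrongly ordered) pairs."""
    if weights is None:
        weights = [1.0] * len(scores)  # default sample weight is 1
    by_query = defaultdict(list)
    for s, y, q, w in zip(scores, labels, query_ids, weights):
        by_query[q].append((s, y, w))

    pos = neg = 0.0
    for samples in by_query.values():
        # Only samples that belong to the same query are compared.
        for (s1, y1, w1), (s2, y2, w2) in combinations(samples, 2):
            if y1 == y2:
                continue  # equal labels give no ordering to check
            pair_weight = (w1 + w2) / 2.0
            agreement = (s1 - s2) * (y1 - y2)
            if agreement > 0:
                pos += pair_weight  # scores ordered like the labels
            elif agreement < 0:
                neg += pair_weight  # scores ordered against the labels
    return pos, neg


scores = [0.9, 0.2, 0.5, 0.4, 0.7]
labels = [1, 0, 0, 1, 0]
query_ids = [1, 1, 1, 2, 2]
print(pn_pair_counts(scores, labels, query_ids))
```

The ratio of the first count to the second gives a positive-negative pair rate for the batch.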

## Utils

### sum

paddle.v2.evaluator.sum(*args, **xargs)

An Evaluator to sum the result of input.

The simple usage:

eval = evaluator.sum(input)

Parameters:

• name (None|basestring) – Evaluator name.
• input (paddle.v2.config_base.Layer) – Input layer name.
• weight (paddle.v2.config_base.Layer) – Weight layer name. It should be a matrix of size [sample_num, 1]. (TODO: explanation)

### column_sum

paddle.v2.evaluator.column_sum(*args, **xargs)

This Evaluator is used to sum the last column of input.

The simple usage is:

eval = evaluator.column_sum(input)

Parameters:

• name (None|basestring) – Evaluator name.
• input (paddle.v2.config_base.Layer) – Input layer name.

## Print

### classification_error_printer

paddle.v2.evaluator.classification_error_printer(*args, **xargs)

This Evaluator is used to print the classification error of each sample.

The simple usage is:

eval = evaluator.classification_error_printer(input)

Parameters:

• input (paddle.v2.config_base.Layer) – Input layer.
• label (paddle.v2.config_base.Layer) – Input label layer.
• name (None|basestring) – Evaluator name.

### gradient_printer

paddle.v2.evaluator.gradient_printer(*args, **xargs)

This Evaluator is used to print the gradient of input layers. It contains one or more input layers.

The simple usage is:

eval = evaluator.gradient_printer(input)

Parameters:

• input (paddle.v2.config_base.Layer|list) – One or more input layers.
• name (None|basestring) – Evaluator name.

### maxid_printer

paddle.v2.evaluator.maxid_printer(*args, **xargs)

This evaluator is used to print the top k values, and their indexes, of each row of the input layers. It contains one or more input layers. k is specified by num_results.

The simple usage is:

eval = evaluator.maxid_printer(input)

Parameters:

• input (paddle.v2.config_base.Layer|list) – Input layer name.
• num_results (int) – The number of top values to print (k). It is 1 by default.
• name (None|basestring) – Evaluator name.
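What "top k values and their indexes of each row" means can be sketched in plain Python. The function name `top_k_per_row` and the list-of-lists input are assumptions for illustration; the evaluator's actual printed format is not reproduced here.

```python
def top_k_per_row(matrix, num_results=1):
    """For each row, return the num_results largest values as (index, value) pairs."""
    result = []
    for row in matrix:
        # Column indices sorted by descending value, truncated to the top k.
        order = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        result.append([(i, row[i]) for i in order[:num_results]])
    return result


output = [[0.1, 0.7, 0.2],
          [0.5, 0.3, 0.2]]
for row in top_k_per_row(output, num_results=2):
    print(row)
```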

### maxframe_printer

paddle.v2.evaluator.maxframe_printer(*args, **xargs)

This evaluator is used to print the top k frames of each input layer. The input layers should contain sequence information. k is specified by num_results. It contains one or more input layers.

Note

The width of each frame is 1.

The simple usage is:

eval = evaluator.maxframe_printer(input)

Parameters:

• input (paddle.v2.config_base.Layer|list) – Input layer name.
• name (None|basestring) – Evaluator name.

### seqtext_printer

paddle.v2.evaluator.seqtext_printer(*args, **xargs)

The sequence text printer prints text according to an index matrix and a dictionary. There can be multiple inputs to this layer:

1. If there is no id_input, the input must be a matrix containing the sequence of indices.

2. If there is an id_input, it should contain ids, which are interpreted as sample ids.

The output format will be:

1. Sequence without sub-sequence, with probability.
id      prob space_seperated_tokens_from_dictionary_according_to_seq

2. Sequence without sub-sequence, without probability.
id      space_seperated_tokens_from_dictionary_according_to_seq

3. Sequence with sub-sequence, without probability.
id      space_seperated_tokens_from_dictionary_according_to_sub_seq
space_seperated_tokens_from_dictionary_according_to_sub_seq
...


Typically, the SequenceTextPrinter layer takes the output of maxid, or of a RecurrentGroup with maxid (when generating), as its input.

The simple usage is:

eval = evaluator.seqtext_printer(input=maxid,
                                 id_input=sample_id,
                                 dict_file=dict_file,
                                 result_file=result_file)

Parameters:

• input (paddle.v2.config_base.Layer|list) – Input layer name.
• result_file (basestring) – Path of the file to store the generated results.
• id_input (paddle.v2.config_base.Layer) – Index of the input sequence; the specified index will be printed in the generated results. This is an optional parameter.
• dict_file (basestring) – Path of the dictionary. This is an optional parameter. Every line is a word in the dictionary, with (line number - 1) as the word index. If this parameter is set to None or to an empty string, only word indexes are printed in the generated results.
• delimited (bool) – Whether to use a space to separate output tokens. Default is True. No space is added if set to False.
• name (None|basestring) – Evaluator name.

Returns: The seq_text_printer that prints the generated sequence to a file.

Return type: evaluator
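The interplay of dict_file, delimited and the output format can be sketched in plain Python. This is only an approximation of the behaviour described above: the function name `format_sequence` is hypothetical, the dictionary is passed as a list of lines instead of a file path, and the exact separators (a tab after the id, a space after the probability) are assumptions.

```python
def format_sequence(sample_id, indices, dict_lines=None, delimited=True, prob=None):
    """Format one sequence without sub-sequences as a single result line."""
    if dict_lines:
        # Word index i corresponds to line number i + 1 of the dictionary.
        tokens = [dict_lines[i] for i in indices]
    else:
        # No dictionary: only the word indexes are printed.
        tokens = [str(i) for i in indices]
    text = (" " if delimited else "").join(tokens)
    line = str(sample_id) + "\t"
    if prob is not None:
        line += "{} ".format(prob)
    return line + text


dictionary = ["hello", "world", "<e>"]  # stands in for the lines of dict_file
print(format_sequence(7, [0, 1, 2], dictionary))
print(format_sequence(7, [0, 1, 2], dictionary, prob=0.5))
print(format_sequence(7, [0, 1, 2]))                    # no dictionary
print(format_sequence(7, [0, 1, 2], dictionary, delimited=False))
```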

### value_printer

paddle.v2.evaluator.value_printer(*args, **xargs)

This Evaluator is used to print the values of input layers. It contains one or more input layers.

The simple usage is:

eval = evaluator.value_printer(input)

Parameters:

• input (paddle.v2.config_base.Layer|list) – One or more input layers.
• name (None|basestring) – Evaluator name.

## Detection

### detection_map

paddle.v2.evaluator.detection_map(*args, **xargs)

Detection mAP evaluator. It prints the mean Average Precision (mAP) for detection.

Based on the output of the detection_output layer, the detection mAP evaluator counts the true-positive and false-positive bounding boxes and integrates them to obtain the mAP.

The simple usage is:

eval = evaluator.detection_map(input=det_output, label=lbl)

Parameters:

• input (paddle.v2.config_base.Layer) – Input layer.
• label (paddle.v2.config_base.Layer) – Label layer.
• overlap_threshold (float) – The bounding box overlap threshold for a true positive.
• background_id (int) – The background class index.
• evaluate_difficult (bool) – Whether to evaluate difficult ground truths.
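To show how true and false positives turn into an average precision, here is a minimal pure-Python sketch for a single class in a single image. It is not the evaluator's implementation: the function names, the (xmin, ymin, xmax, ymax) box layout, the greedy matching and the rectangle-rule integration of the precision-recall curve are all assumptions, and background_id and evaluate_difficult are not modelled. The mAP would be the mean of such per-class values.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, gt_boxes, overlap_threshold=0.5):
    """AP for one class; detections is a list of (score, box) pairs."""
    if not gt_boxes:
        return 0.0
    matched = set()  # ground-truth boxes already claimed by a detection
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    # Visit detections from the most to the least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best, best_iou = None, 0.0
        for idx, gt in enumerate(gt_boxes):
            overlap = iou(box, gt)
            if idx not in matched and overlap > best_iou:
                best, best_iou = idx, overlap
        if best is not None and best_iou >= overlap_threshold:
            matched.add(best)
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / len(gt_boxes)
        ap += precision * (recall - prev_recall)  # area added at this step
        prev_recall = recall
    return ap


gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(0.9, (0, 0, 10, 10)),    # exact match
              (0.8, (50, 50, 60, 60)),  # overlaps nothing: false positive
              (0.7, (20, 20, 30, 31))]  # close match to the second box
print(average_precision(detections, gt_boxes))
```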