fluid.metrics

Accuracy

class paddle.fluid.metrics.Accuracy(name=None)

Accumulate the accuracy from minibatches and compute the average accuracy for every pass. https://en.wikipedia.org/wiki/Accuracy_and_precision

Parameters:name (str) – the metric name

Examples

label = fluid.layers.data(name="label", shape=[1], dtype="int32")
data = fluid.layers.data(name="data", shape=[32, 32], dtype="float32")
pred = fluid.layers.fc(input=data, size=1000, act="tanh")
minibatch_accuracy = fluid.layers.accuracy(pred, label)
accuracy_evaluator = fluid.metrics.Accuracy()
for pass_id in range(PASSES):  # 'pass' is a Python keyword, so use pass_id
    accuracy_evaluator.reset()
    for data in train_reader():
        batch_size = len(data)  # number of samples in this minibatch
        loss, acc = exe.run(fetch_list=[cost, minibatch_accuracy])
        accuracy_evaluator.update(value=acc, weight=batch_size)
    numpy_acc = accuracy_evaluator.eval()
update(value, weight)

Update minibatch states.

Parameters:
  • value (float|numpy.array) – accuracy of one minibatch.
  • weight (int|float) – batch size.
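
update(value, weight) and eval() combine into a weighted average. A minimal sketch of these semantics (assuming eval() returns the weight-averaged accuracy accumulated since the last reset()):

import paddle.fluid as fluid

acc = fluid.metrics.Accuracy()
acc.reset()
# Two minibatches with accuracies 0.8 and 0.6 and batch sizes 64 and 32.
acc.update(value=0.8, weight=64)
acc.update(value=0.6, weight=32)
print(acc.eval())  # (0.8 * 64 + 0.6 * 32) / 96 ≈ 0.733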

Auc

class paddle.fluid.metrics.Auc(name, curve='ROC', num_thresholds=4095)

The Auc metric is designed for binary classification. Refer to https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve. Note that this metric computes its value natively in Python; if speed is a concern, please use fluid.layers.auc instead.

The auc function creates four local variables, true_positives, true_negatives, false_positives and false_negatives, that are used to compute the AUC. To discretize the AUC curve, a linearly spaced set of thresholds is used to compute pairs of recall and precision values. The area under the ROC curve is then computed from the recall values against the false positive rate, while the area under the PR curve is computed from the precision values against the recall.
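
To make the discretization concrete, here is an illustrative NumPy sketch of the thresholding idea for the ROC variant (roc_auc_sketch is a hypothetical helper, not the fluid implementation):

import numpy as np

def roc_auc_sketch(preds, labels, num_thresholds=200):
    # Linearly spaced thresholds over [0, 1].
    thresholds = np.linspace(0.0, 1.0, num_thresholds)
    pos, neg = labels == 1, labels == 0
    tpr, fpr = [], []
    for t in thresholds:
        predicted_pos = preds >= t
        tpr.append(np.sum(predicted_pos & pos) / float(max(np.sum(pos), 1)))  # recall
        fpr.append(np.sum(predicted_pos & neg) / float(max(np.sum(neg), 1)))
    order = np.argsort(fpr)  # integrate TPR over increasing FPR
    return np.trapz(np.array(tpr)[order], np.array(fpr)[order])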

Parameters:
  • name – metric name
  • curve – Specifies the name of the curve to be computed: ‘ROC’ [default] or ‘PR’ for the Precision-Recall curve.
  • num_thresholds – The number of thresholds used to discretize the curve, 4095 by default.

NOTE: only the ROC curve type is implemented in Python at present.

Examples

pred = fluid.layers.fc(input=data, size=1000, act="tanh")
metric = fluid.metrics.Auc()
for data in train_reader():
    loss_v, preds_v, labels_v = exe.run(fetch_list=[cost, preds, labels])
    metric.update(preds_v, labels_v)
    numpy_auc = metric.eval()

ChunkEvaluator

class paddle.fluid.metrics.ChunkEvaluator(name=None)

Accumulate the counter numbers output by chunk_eval from mini-batches and compute the precision, recall, and F1-score using the accumulated counters. For some basics of chunking, please refer to “Chunking with Support Vector Machines” (https://aclanthology.info/pdf/N/N01/N01-1025.pdf). ChunkEvaluator computes the precision, recall, and F1-score of chunk detection, and supports IOB, IOE, IOBES and IO (also known as plain) tagging schemes.

Examples

label = fluid.layers.data(name="label", shape=[1], dtype="int32")
data = fluid.layers.data(name="data", shape=[32, 32], dtype="float32")
pred = fluid.layers.fc(input=data, size=1000, act="tanh")
precision, recall, f1_score, num_infer_chunks, num_label_chunks, num_correct_chunks = fluid.layers.chunk_eval(
    input=pred,
    label=label,
    chunk_scheme="IOB",
    num_chunk_types=NUM_CHUNK_TYPES)  # number of chunk types in your tagging scheme
metric = fluid.metrics.ChunkEvaluator()
for data in train_reader():
    loss_v, infer_v, label_v, correct_v = exe.run(
        fetch_list=[cost, num_infer_chunks, num_label_chunks, num_correct_chunks])
    metric.update(infer_v, label_v, correct_v)
    numpy_precision, numpy_recall, numpy_f1 = metric.eval()
update(num_infer_chunks, num_label_chunks, num_correct_chunks)

Update the states based on the layers.chunk_eval() outputs.

Parameters:
  • num_infer_chunks (int|numpy.array) – The number of chunks in the inference on the given mini-batch.
  • num_label_chunks (int|numpy.array) – The number of chunks in the label on the given mini-batch.
  • num_correct_chunks (int|numpy.array) – The number of chunks both in the inference and the label on the given mini-batch.
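
For intuition, the three counters determine precision, recall and F1 directly; a tiny worked example (the counts are made up):

num_infer_chunks, num_label_chunks, num_correct_chunks = 120, 100, 90
precision = num_correct_chunks / float(num_infer_chunks)   # 0.75
recall = num_correct_chunks / float(num_label_chunks)      # 0.90
f1 = 2 * precision * recall / (precision + recall)         # ≈ 0.818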

CompositeMetric

class paddle.fluid.metrics.CompositeMetric(name=None)

Compose multiple metrics in one instance, for example, merge F1, accuracy, and recall into one metric.

Examples

label = fluid.layers.data(name="label", shape=[1], dtype="int32")
data = fluid.layers.data(name="data", shape=[32, 32], dtype="float32")
pred = fluid.layers.fc(input=data, size=1000, act="tanh")
comp = fluid.metrics.CompositeMetric()
precision = fluid.metrics.Precision()
recall = fluid.metrics.Recall()
comp.add_metric(precision)
comp.add_metric(recall)
for pass_id in range(PASSES):
    comp.reset()
    for data in train_reader():
        loss_v, preds_v, labels_v = exe.run(fetch_list=[cost, preds, labels])
        comp.update(preds=preds_v, labels=labels_v)
    numpy_precision, numpy_recall = comp.eval()
add_metric(metric)

Add one metric instance to the CompositeMetric.

Parameters:metric – an instance of MetricBase.
update(preds, labels)

Update every metric in sequence.

Parameters:
  • preds (numpy.array) – the predictions of the current minibatch
  • labels (numpy.array) – the labels of the current minibatch; if the labels are one-hot or soft labels, you should customize the corresponding update rule.
eval()

Evaluate every metric in sequence.

Returns:a list of metric values in Python.
Return type:list(float|numpy.array)
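
Putting it together, a minimal end-to-end sketch with plain NumPy inputs (the 0/1 predictions and labels are made up; eval() returns one value per metric in add_metric() order):

import numpy as np
import paddle.fluid as fluid

comp = fluid.metrics.CompositeMetric()
comp.add_metric(fluid.metrics.Precision())
comp.add_metric(fluid.metrics.Recall())
comp.reset()
preds = np.array([[1], [0], [1], [0]])   # binarized predictions
labels = np.array([[1], [0], [0], [1]])
comp.update(preds=preds, labels=labels)
precision_val, recall_val = comp.eval()  # values in add_metric() order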

DetectionMAP

class paddle.fluid.metrics.DetectionMAP(input, gt_label, gt_box, gt_difficult=None, class_num=None, background_label=0, overlap_threshold=0.5, evaluate_difficult=True, ap_version='integral')

Calculate the detection mean average precision (mAP).

The general steps are as follows:

  1. Calculate the true positives and false positives according to the input detections and labels.
  2. Calculate the mAP value; two versions are supported: ‘11 point’ and ‘integral’.

Please get more information from the following articles:
https://sanchom.wordpress.com/tag/average-precision/
https://arxiv.org/abs/1512.02325
Parameters:
  • input (Variable) – The detection results, which is a LoDTensor with shape [M, 6]. The layout is [label, confidence, xmin, ymin, xmax, ymax].
  • gt_label (Variable) – The ground truth label index, which is a LoDTensor with shape [N, 1].
  • gt_box (Variable) – The ground truth bounding box (bbox), which is a LoDTensor with shape [N, 4]. The layout is [xmin, ymin, xmax, ymax].
  • gt_difficult (Variable|None) – Whether this ground truth is a difficult bounding box (bbox), which can be a LoDTensor [N, 1] or not set. If None, it means all the ground truth labels are not difficult bboxes.
  • class_num (int) – The class number.
  • background_label (int) – The index of the background label; the background label will be ignored. If set to -1, then all categories will be considered, 0 by default.
  • overlap_threshold (float) – The threshold for deciding true/false positives, 0.5 by default.
  • evaluate_difficult (bool) – Whether to consider difficult ground truth for evaluation, True by default. This argument does not work when gt_difficult is None.
  • ap_version (string) – The average precision calculation method; it must be ‘integral’ or ‘11point’. Please check https://sanchom.wordpress.com/tag/average-precision/ for details, and see the sketch after this list.
    - 11point: the 11-point interpolated average precision.
    - integral: the natural integral of the precision-recall curve.
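
The difference between the two ap_version settings can be seen in a schematic NumPy sketch over made-up (recall, precision) pairs (illustrative only, not the operator's exact implementation):

import numpy as np

recalls = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
precisions = np.array([1.0, 0.9, 0.8, 0.7, 0.5, 0.4])

# 'integral': natural integral of the precision-recall curve.
ap_integral = np.sum(np.diff(recalls) * precisions[1:])

# '11point': mean of the interpolated precision at recalls 0.0, 0.1, ..., 1.0,
# where interpolated precision at recall t is the max precision with recall >= t.
ap_11point = np.mean([precisions[recalls >= t].max() if np.any(recalls >= t) else 0.0
                      for t in np.linspace(0.0, 1.0, 11)])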

Examples

exe = fluid.Executor(place)
map_evaluator = fluid.metrics.DetectionMAP(input,
    gt_label, gt_box, gt_difficult)
cur_map, accum_map = map_evaluator.get_map_var()
fetch = [cost, cur_map, accum_map]
for epoch in range(PASS_NUM):
    map_evaluator.reset(exe)
    for data in batches:
        loss, cur_map_v, accum_map_v = exe.run(fetch_list=fetch)

In the above example:

‘cur_map_v’ is the mAP of the current mini-batch; ‘accum_map_v’ is the accumulative mAP of one pass.

get_map_var()

Returns: the mAP variable of the current mini-batch and the accumulative mAP variable across mini-batches.
reset(executor, reset_program=None)

Reset the metric states at the beginning of each pass or at a user-specified batch.

Parameters:
  • executor (Executor) – an executor for executing the reset_program.
  • reset_program (Program|None) – a single Program for the reset process. If None, a Program will be created.

EditDistance

class paddle.fluid.metrics.EditDistance(name)

Edit distance is a way of quantifying how dissimilar two strings (e.g., words) are to one another by counting the minimum number of operations required to transform one string into the other. Refer to https://en.wikipedia.org/wiki/Edit_distance

Accumulate edit distance sum and sequence number from mini-batches and compute the average edit_distance and instance error of all batches.

Parameters:name – the metrics name

Examples

distances, seq_num = fluid.layers.edit_distance(input, label)
distance_evaluator = fluid.metrics.EditDistance()
for epoch in range(PASS_NUM):
    distance_evaluator.reset()
    for data in batches:
        loss, distances_v, seq_num_v = exe.run(fetch_list=[cost, distances, seq_num])
        distance_evaluator.update(distances_v, seq_num_v)
    distance, instance_error = distance_evaluator.eval()

In the above example, ‘distance’ is the average edit distance in a pass and ‘instance_error’ is the instance error rate in a pass.
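
A minimal sketch of the update/eval semantics with plain NumPy input (assuming eval() averages the distances and counts sequences with nonzero distance as instance errors):

import numpy as np
import paddle.fluid as fluid

evaluator = fluid.metrics.EditDistance(name="ed")
evaluator.reset()
evaluator.update(np.array([[0.0], [2.0], [1.0]]), seq_num=3)
avg_distance, instance_error = evaluator.eval()
print(avg_distance)    # (0 + 2 + 1) / 3 = 1.0
print(instance_error)  # 2 of 3 sequences have nonzero distance, ≈ 0.667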

MetricBase

class paddle.fluid.metrics.MetricBase(name)

Base class for all metrics. MetricBase defines a group of interfaces for model evaluation methods. Metrics accumulate states between consecutive minibatches: at every minibatch, use the update interface to add the current minibatch value to the global states, and use eval to compute the accumulated metric value from the last reset() or from scratch. If you need a custom metric, please inherit from MetricBase and provide a custom implementation.

Parameters:name (str) – The name of the metric instance, such as “accuracy”. It is needed if you want to distinguish different metrics in a model.
reset()

reset() clears the states of the metric. By default, the states are the members whose names do not have an underscore (_) prefix; reset() sets them to their initial states. If you violate this implicit naming rule, please also customize the reset interface.

get_config()

Get the metric and its current states. The states are the members whose names do not have an underscore (“_”) prefix.

Parameters:None
Returns:a dict of metric and states
Return type:dict
update(preds, labels)

Update the metric states at every minibatch. The minibatch metric can be computed in pure Python or via a C++ operator.

Parameters:
  • preds (numpy.array) – the predictions of the current minibatch
  • labels (numpy.array) – the labels of the current minibatch; if the labels are one-hot or soft labels, you should customize the corresponding update rule.
eval()

Evaluate the current metric based on the accumulated states.

Returns:the metric value computed in Python.
Return type:float|list(float)|numpy.array
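
As a sketch of how one might subclass MetricBase (the MeanAbsoluteError metric below is hypothetical; its state members avoid the underscore prefix so that the default reset() restores them):

import numpy as np
import paddle.fluid as fluid

class MeanAbsoluteError(fluid.metrics.MetricBase):
    def __init__(self, name=None):
        super(MeanAbsoluteError, self).__init__(name)
        self.total_error = 0.0
        self.total_count = 0

    def update(self, preds, labels):
        # Accumulate the absolute error and sample count of this minibatch.
        self.total_error += float(np.sum(np.abs(preds - labels)))
        self.total_count += len(labels)

    def eval(self):
        return self.total_error / self.total_count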

Precision

class paddle.fluid.metrics.Precision(name=None)

Precision (also called positive predictive value) is the fraction of relevant instances among the retrieved instances. https://en.wikipedia.org/wiki/Evaluation_of_binary_classifiers

Note: Precision differs from Accuracy in binary classifiers. accuracy = true positives / total instances; precision = true positives / all instances predicted positive.
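
A quick worked example of the difference, with made-up confusion counts:

tp, fp, tn, fn = 8, 2, 85, 5        # hypothetical binary confusion counts
accuracy = (tp + tn) / 100.0        # 0.93: correct over all instances
precision = tp / float(tp + fp)     # 0.80: correct over predicted positives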

Examples


metric = fluid.metrics.Precision()
for pass_id in range(PASSES):
    metric.reset()
    for data in train_reader():
        loss_v, preds_v, labels_v = exe.run(fetch_list=[cost, preds, labels])
        metric.update(preds=preds_v, labels=labels_v)
    numpy_precision = metric.eval()

Recall

class paddle.fluid.metrics.Recall(name=None)

Recall (also known as sensitivity) is the fraction of relevant instances that have been retrieved over the total amount of relevant instances.

https://en.wikipedia.org/wiki/Precision_and_recall

Examples


metric = fluid.metrics.Recall()
for pass_id in range(PASSES):
    metric.reset()
    for data in train_reader():
        loss_v, preds_v, labels_v = exe.run(fetch_list=[cost, preds, labels])
        metric.update(preds=preds_v, labels=labels_v)
    numpy_recall = metric.eval()