fluid

Program

class paddle.fluid.Program

Python Program. Beneath it is a ProgramDesc, which is used to create the C++ Program. A Program is a self-contained, programming-language-like container. It has at least one Block; when control flow ops such as conditional_block or while_op are included, it contains nested blocks. Please refer to framework.proto for details.

Notes: we have default_startup_program and default_main_program by default, and the pair shares the parameters. The default_startup_program runs only once to initialize parameters, while default_main_program runs in every mini-batch and adjusts the weights.

Returns:An empty program.

Examples

>>> main_program = fluid.Program()
>>> startup_program = fluid.Program()
>>> with fluid.program_guard(main_program=main_program, startup_program=startup_program):
>>>     fluid.layers.data(name="x", shape=[-1, 784], dtype='float32')
>>>     fluid.layers.data(name="y", shape=[-1, 1], dtype='int32')
>>>     fluid.layers.fc(name="fc", shape=[10], dtype='float32', act="relu")
op_role

The operator role, in an enum {Forward, Backward, Optimize}.

Notes: this is a low-level API. It is used only by ParallelExecutor to duplicate or schedule operators to devices.

For example, the forward operators should be executed on every device. The backward operators should be executed on every device, and the parameter gradients produced by the backward operators (use op_role_var to get these variables) should be merged to one device. The optimization operators should be executed on only one device, which then broadcasts the optimization result, i.e., the new parameters, to every other device.

set_op_role

The operator role, in an enum {Forward, Backward, Optimize}.

Notes: this is a low-level API. It is used only by ParallelExecutor to duplicate or schedule operators to devices.

For example, the forward operators should be executed on every device. The backward operators should be executed on every device, and the parameter gradients produced by the backward operators (use op_role_var to get these variables) should be merged to one device. The optimization operators should be executed on only one device, which then broadcasts the optimization result, i.e., the new parameters, to every other device.

op_role_var

The auxiliary variables for op_role property.

See Also: Program.op_role's documentation for details.

Notes: This is a very low-level API. Users should not use it directly.

set_op_role_var

The auxiliary variables for op_role property.

See Also: Program.op_role's documentation for details.

Notes: This is a very low-level API. Users should not use it directly.

optimized_guard(*args, **kwds)

A with guard to set Optimization OpRole and OpRoleVar automatically.

Notes: This is a very low level API. Users should not use it directly.

Parameters:param_and_grads (list) – The variables (names) to be optimized.

Examples

>>> p, g = backward(...)
>>> with program.optimized_guard([p,g]):
>>>     p = p - 0.001 * g
to_string(throw_on_error, with_details=False)

Get the debug string.

Parameters:
  • throw_on_error (bool) – raise a ValueError when any required field is not set.
  • with_details (bool) – True if more details about variables and parameters, e.g., trainable and optimize_attr, need to be printed.
Returns:The debug string.
Return type:str
Raises:ValueError – If any required field is not set and throw_on_error is True.
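
For illustration only (this example is not part of the original reference), the debug string of a small program can be printed like this:

>>> import paddle.fluid as fluid
>>> prog = fluid.Program()
>>> with fluid.program_guard(prog):
>>>     fluid.layers.data(name="x", shape=[-1, 784], dtype='float32')
>>> print(prog.to_string(throw_on_error=True, with_details=False))
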
get_desc()

Get the pointer to the C++ side ProgramDesc object. The C++ object is exposed by pybind.

Notes: This is a very low level API. Users should not use this API directly.

clone(for_test=False)

Create a new, duplicated program.

Some operators, e.g., batch_norm, behave differently between training and testing. They have an attribute, is_test, to control this behaviour. This method will change the is_test attribute of them to True when for_test=True.

  • Set for_test to False when we want to clone the program for training.
  • Set for_test to True when we want to clone the program for testing.

Notes: This API does NOT prune any operator. Please use clone(for_test=True) before backward and optimization, e.g.:

>>> test_program = fluid.default_main_program().clone(for_test=True)
>>> optimizer = fluid.optimizer.Momentum(learning_rate=0.01, momentum=0.9)
>>> optimizer.minimize(loss)
Parameters:for_test (bool) – True if the is_test attributes of operators should be changed to True.
Returns:The new, duplicated Program object.
Return type:Program

Examples

  1. To clone a test program, the sample code is:
>>> import paddle.fluid as fluid
>>> train_program = fluid.Program()
>>> startup_program = fluid.Program()
>>> with fluid.program_guard(train_program, startup_program):
>>>     img = fluid.layers.data(name='image', shape=[784])
>>>     hidden = fluid.layers.fc(input=img, size=200, act='relu')
>>>     hidden = fluid.layers.dropout(hidden, dropout_prob=0.5)
>>>     loss = fluid.layers.cross_entropy(
>>>                 input=fluid.layers.fc(hidden, size=10, act='softmax'),
>>>                 label=fluid.layers.data(name='label', shape=[1], dtype='int64'))
>>>
>>> test_program = train_program.clone(for_test=True)
>>>
>>> sgd = fluid.optimizer.SGD(learning_rate=1e-3)
>>> with fluid.program_guard(train_program, startup_program):
>>>     sgd.minimize(loss)

2. The clone method can be avoided if you create the training program and the testing program separately.

>>> import paddle.fluid as fluid
>>>
>>> def network(is_test):
>>>     img = fluid.layers.data(name='image', shape=[784])
>>>     hidden = fluid.layers.fc(input=img, size=200, act='relu')
>>>     hidden = fluid.layers.dropout(hidden, dropout_prob=0.5, is_test=is_test)
>>>     loss = fluid.layers.cross_entropy(
>>>                 input=fluid.layers.fc(hidden, size=10, act='softmax'),
>>>                 label=fluid.layers.data(name='label', shape=[1], dtype='int64'))
>>>     return loss
>>>
>>> train_program = fluid.Program()
>>> startup_program = fluid.Program()
>>> test_program = fluid.Program()
>>>
>>> with fluid.program_guard(train_program, startup_program):
>>>     with fluid.unique_name.guard():
>>>         loss = network(is_test=False)
>>>         sgd = fluid.optimizer.SGD(learning_rate=1e-3)
>>>         sgd.minimize(loss)
>>>
>>> # the test startup program is not used.
>>> with fluid.program_guard(test_program, fluid.Program()):
>>>     with fluid.unique_name.guard():
>>>         loss = network(is_test=True)

The two code snippets above will generate the same programs.

prune(targets)

Prune operators and variables which are not needed to generate targets.

Notes: This is a very low level API. Users should not use this API directly. This API is in flux and not stable.

Parameters:targets (list|Variable|Operator) – A list of variables or operators; everything not needed to generate them will be pruned.
Returns:A new, pruned program.
Return type:Program
inference_optimize(export_for_deployment=True)

This method will create a new program and do the following adjustments on it:

  1. Remove all reader variables and their creator ops if they exist.
  2. Remove the read_op if it exists.
  3. Change the is_test attribute of operators to True. All the Parameter information will be lost.

Parameters:export_for_deployment (bool) – remove the read ops that are added by py_reader, for use with the C++ inference library.

Notes: This API is a very low level API. Use Program.clone(for_test=True) instead.

Returns:The new program.
Return type:Program
static parse_from_string(binary_str)

Deserialize a program desc from protobuf binary string.

Notes: All information about parameters will be lost after serialization and deserialization.

Parameters:binary_str (str) – The binary protobuf string.
Returns:A deserialized program desc.
Return type:Program
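
A sketch of a serialize/deserialize round trip, assuming that Program.desc exposes a serialize_to_string() method (as used internally by fluid.io):

>>> import paddle.fluid as fluid
>>> prog = fluid.Program()
>>> with fluid.program_guard(prog):
>>>     fluid.layers.data(name="x", shape=[-1, 784], dtype='float32')
>>> binary_str = prog.desc.serialize_to_string()
>>> restored = fluid.Program.parse_from_string(binary_str)
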
num_blocks

The number of blocks in this program.

random_seed

The default random seed for random operators in Program. Zero means get the random seed from random device.

Notes: It must be set before the operators have been added.
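
A small sketch (not part of the original reference) showing the seed being set before any operators are added:

>>> import paddle.fluid as fluid
>>> prog = fluid.Program()
>>> prog.random_seed = 90          # set before adding operators
>>> with fluid.program_guard(prog):
>>>     x = fluid.layers.data(name="x", shape=[-1, 32], dtype='float32')
>>>     y = fluid.layers.dropout(x, dropout_prob=0.5)
>>> print(prog.random_seed)        # 90
>>> print(prog.num_blocks)         # 1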

global_block()

Get the first block of this program.

block(index)

Get the block at the given index of this program.

Parameters:index (int) – The index of the block to get.
Returns:The block at the given index.
Return type:Block
current_block()

Get the current block. The current block is the block to which operators are appended.

create_block(parent_idx=None)

Create a new block with the given parent_idx and change the current block to the new block.

Parameters:parent_idx (int) – The parent block index.
Returns:The new block.
Return type:Block
rollback()

Exit a code block, i.e., roll back to the parent block.

Returns:None
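
The block-related methods above can be combined as in the following sketch (illustrative only; the Block.idx attribute is assumed to hold the block index):

>>> import paddle.fluid as fluid
>>> prog = fluid.Program()
>>> print(prog.current_block().idx)   # 0, the global block
>>> new_block = prog.create_block()   # the current block is now the new block
>>> print(prog.current_block().idx)   # 1
>>> prog.rollback()                   # back to the parent block
>>> print(prog.current_block().idx)   # 0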

copy_data_info_from(other)

Copy the information of data variables from the other program.

Notes: This is a very low level API. Users should not invoke it directly.

Parameters:other (Program) – Other program
Returns:None
list_vars()

Get all variables from this Program. An iterable object is returned.

Returns:The generator will yield every variable in this program.
Return type:iterable
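
For example (illustrative only), all variable names of the default main program can be listed like this:

>>> import paddle.fluid as fluid
>>> img = fluid.layers.data(name='image', shape=[784])
>>> hidden = fluid.layers.fc(input=img, size=200)
>>> for var in fluid.default_main_program().list_vars():
>>>     print(var.name)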

Operator

class paddle.fluid.Operator(block, desc, type=None, inputs=None, outputs=None, attrs=None)

In Fluid, all operations are represented by Operators, and an Operator is regarded as a built-in instruction of a Block. Users can use these built-in instructions to describe their neural networks.

Parameters:
  • block (Block) – The block that contains the current operator.
  • desc (core.OpDesc) – The protobuf description of the Operator.
  • type (str) – The type of operator. Default None.
  • inputs (dict) – The inputs of this Operator. It is a dictionary: for every element, the key is the input parameter name, and the value is a list of variables. Default None.
  • outputs (dict) – The outputs of this Operator. It is a dictionary: for every element, the key is the output parameter name, and the value is a list of variables. Default None.
  • attrs (dict) – The attributes of this Operator. It is a dictionary: for every element, the key is the attribute name, and the value is the attribute value. The attribute type should be the same as the type registered on the C++ side. Default None.
Returns:

The initialized Operator.

Return type:

Operator

Raises:

ValueError – If the passed inputs, outputs and attrs don’t match the Operator’s specification registered on the C++ side.

Notes

The constructor of operator should not be invoked directly. Use Block.append_op or Block._prepend_op instead.

Examples

cur_program = Program()
cur_block = cur_program.current_block()
# var1 += var2 + var3
cur_block.append_op(type="sum",
                    inputs={"X": [var1, var2, var3]},
                    outputs={"Out": [var1]})
to_string(throw_on_error)

Get the debug string.

Parameters:throw_on_error (bool) – Whether to raise exception if self is not initialized.
Returns:The debug string.
Return type:str
input(name)

Get the input arguments according to the input parameter name.

Parameters:name (str) – The input parameter name.
Returns:The list of argument names associated with the given parameter name.
Return type:list
rename_input(old_name, new_name)

Rename the input old_name to new_name.

Parameters:
  • old_name (str) – The old name of the Operator’s input.
  • new_name (str) – The new name of the Operator’s input.
Returns:

None

rename_output(old_name, new_name)

Rename the output old_name to new_name.

Parameters:
  • old_name (str) – The old name of the Operator’s output.
  • new_name (str) – The new name of the Operator’s output.
Returns:

None

output(name)

Get output arguments by the output parameter name.

Parameters:name (str) – The output parameter name.
Returns:return the list of argument names associated with the specific parameter name.
Return type:list
has_attr(name)

Whether this Operator has the attribute with name or not.

Parameters:name (str) – the attribute name.
Returns:True if has this attribute.
Return type:bool
attr_type(name)

Get the type of attribute by attribute’s name.

Parameters:name (str) – the attribute name.
Returns:the attribute type.
Return type:core.AttrType
set_attr(name, val)

Set the value of attribute by attribute’s name.

Parameters:
  • name (str) – the attribute name.
  • val (bool|int|str|float|list) – the value of the attribute.
Raises:

ValueError – If the type of value doesn’t match with desc.attr_type(name).

attr(name)

Get the attribute by name.

Parameters:name (str) – the attribute name.
Returns:The attribute value. The return value can be any valid attribute type.
Return type:bool|int|str|float|list
block_attr_id(name)

Get the block attribute’s id by name.

Parameters:name (str) – the attribute name.
Returns:the block index.
Return type:int
block_attr(name)

Get the block attribute by name.

Parameters:name (str) – the attribute name.
Returns:the block attribute.
Return type:block
blocks_attr(name)

Get the blocks attribute by name.

Parameters:name (str) – the attribute name.
Returns:list of the blocks attribute.
Return type:list
blocks_attr_ids(name)

Get the blocks attribute’s ids by name.

Parameters:name (str) – the attribute name.
Returns:list of the blocks ids.
Return type:list
all_attrs()

Get the attribute dict.

Returns:The Operator’s attribute dict, name->attr.
Return type:dict
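
A sketch (not part of the original reference) that walks the operators of a block and uses the query methods above; the attribute name 'use_mkldnn' is only an illustration and may not be present on every operator:

import paddle.fluid as fluid

prog = fluid.Program()
with fluid.program_guard(prog):
    x = fluid.layers.data(name='x', shape=[13], dtype='float32')
    y = fluid.layers.fc(input=x, size=1)

for op in prog.global_block().ops:
    # type, input_names and output_names describe the operator itself
    print(op.type, op.input_names, op.output_names)
    if op.has_attr('use_mkldnn'):
        print('use_mkldnn =', op.attr('use_mkldnn'))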

Parameter

class paddle.fluid.Parameter(block, shape, dtype, **kwargs)

Parameter is derived from Variable. A parameter is a persistable Variable, and will be updated by optimizers after each iteration. The training of a neural network is essentially the updating of its parameters.

Compared to a general Variable, a Parameter has several member variables of its own:

Parameters:
  • trainable (bool) – True if the parameter needs to be updated after iterations.
  • optimize_attr (map) – Parameter attributes related to optimizing. Currently, it only contains 'learning_rate'. Default: {'learning_rate': 1.0}
  • regularizer (WeightDecayRegularizer) – The regularizer which will be applied to the parameter. Default: None
  • gradient_clip_attr (BaseGradientClipAttr) – The gradient clipping strategy which will be applied to the parameter. Default: None
  • do_model_average (bool) – True if the model average strategy will be applied to this parameter.
to_string(throw_on_error, with_details=False)

Get the debug string.

Parameters:
  • throw_on_error (bool) – raise an exception when self is not initialized and throw_on_error is True.
  • with_details (bool) – more details about variables and parameters (e.g. trainable, optimize_attr, ...) will be printed when with_details is True.

Returns (str): The debug string.
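
A sketch (illustrative only, assuming Block.all_parameters() is available) that inspects the parameters created by a layer:

import paddle.fluid as fluid

prog = fluid.Program()
with fluid.program_guard(prog):
    x = fluid.layers.data(name='x', shape=[13], dtype='float32')
    y = fluid.layers.fc(input=x, size=1)

for param in prog.global_block().all_parameters():
    print(param.name, param.shape, param.trainable)
    print(param.to_string(throw_on_error=True, with_details=True))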

default_startup_program

paddle.fluid.default_startup_program()

Get default/global startup program.

The layer functions in fluid.layers create parameters, readers, and NCCL handles as global variables. The startup_program initializes them by running the operators in the startup program. The layer functions append these initialization operators to the startup program.

This method will return the default or the current startup program. Users can use fluid.program_guard to switch program.

Returns:startup program
Return type:Program

default_main_program

paddle.fluid.default_main_program()

Get default/global main program. The main program is used for training or testing.

All layer functions in fluid.layers will append operators and variables to the default_main_program.

The default_main_program is the default program in a lot of APIs. For example, the Executor.run() will execute the default_main_program when the program is not specified.

Returns:main program
Return type:Program
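
For example (illustrative only), layers created without a program_guard are appended to the default main program:

>>> import paddle.fluid as fluid
>>> x = fluid.layers.data(name='x', shape=[13], dtype='float32')
>>> y = fluid.layers.fc(input=x, size=1)
>>> prog = fluid.default_main_program()
>>> print([var.name for var in prog.list_vars()])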

program_guard

paddle.fluid.program_guard(*args, **kwds)

Switch the global main program and startup program using the Python with statement. Layer functions in the with block will append operators and variables to the new main program.

Examples

>>> import paddle.fluid as fluid
>>> main_program = fluid.Program()
>>> startup_program = fluid.Program()
>>> with fluid.program_guard(main_program, startup_program):
>>>     data = fluid.layers.data(...)
>>>     hidden = fluid.layers.fc(...)

Notes: A temporary Program can be used if the user does not need to construct either the startup program or the main program.

Examples

>>> import paddle.fluid as fluid
>>> main_program = fluid.Program()
>>> # does not care about startup program. Just pass a temporary value.
>>> with fluid.program_guard(main_program, fluid.Program()):
>>>     data = ...
Parameters:
  • main_program (Program) – New main program inside with statement.
  • startup_program (Program) – New startup program inside with statement. None means do not change startup program.

get_var

paddle.fluid.get_var(name, program=None)

Get a variable by name from the global block of a program.

Parameters:
  • name (str) – name of the variable
  • program (Program|None) – the program object. If None, default_global_program() will be used.
Returns:

Variable
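
A small sketch (not part of the original reference):

>>> import paddle.fluid as fluid
>>> x = fluid.layers.data(name='x', shape=[13], dtype='float32')
>>> same_x = fluid.get_var('x')    # looked up in the default program
>>> assert same_x.name == x.name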

Executor

class paddle.fluid.Executor(place)

An Executor in Python; it only supports single-GPU execution. For multiple cards, please refer to ParallelExecutor. The Python executor takes a program and adds feed operators and fetch operators to this program according to the feed map and fetch_list. The feed map provides input data for the program. fetch_list provides the variables (or names) that the user wants to get after the program runs. Note: the executor will run all operators in the program, not only the operators that the fetch_list depends on. It stores the global variables in the global scope and creates a local scope for the temporary variables. The contents of the local scope are discarded after every mini-batch forward/backward pass finishes, but the global scope variables persist across different runs. All ops in the program run in sequence.

Parameters:place (core.CPUPlace|core.CUDAPlace(n)) – indicates on which device the executor runs.

Note: For debugging complicated networks on parallel GPUs, you can test them on the single-device executor. They have exactly the same arguments and are expected to produce the same results.

close()

Close this executor.

You can no longer use this executor after calling this method. For distributed training, this method frees the resources on the PServers related to the current Trainer.

Example

>>> cpu = core.CPUPlace()
>>> exe = Executor(cpu)
>>> ...
>>> exe.close()
run(program=None, feed=None, fetch_list=None, feed_var_name='feed', fetch_var_name='fetch', scope=None, return_numpy=True, use_program_cache=False)

Run a program with this Executor. Feed data via the feed map, fetch results via fetch_list. The Python executor takes a program and adds feed operators and fetch operators to this program according to the feed map and fetch_list. The feed map provides input data for the program. fetch_list provides the variables (or names) that the user wants to get after the program runs.

Note: the executor will run all operators in the program, not only the operators that the fetch_list depends on.

Parameters:
  • program (Program) – the program that needs to run; if not provided, default_main_program will be used.
  • feed (dict) – feed variable map, e.g. {"image": ImageData, "label": LabelData}
  • fetch_list (list) – a list of variables or variable names that the user wants to get; run will return them according to this list.
  • feed_var_name (str) – the name of the input variable of the feed operator.
  • fetch_var_name (str) – the name of the output variable of the fetch operator.
  • scope (Scope) – the scope used to run this program; you can switch it to a different scope. Default is global_scope.
  • return_numpy (bool) – whether to convert the fetched tensors to numpy.
  • use_program_cache (bool) – set use_program_cache to True if the program has not changed compared to the last step.
Returns:

fetch result according to fetch_list.

Return type:

list(numpy.array)

Examples

>>> import numpy
>>> import paddle.fluid as fluid
>>>
>>> data = fluid.layers.data(name='X', shape=[1], dtype='float32')
>>> hidden = fluid.layers.fc(input=data, size=10)
>>> out = fluid.layers.create_tensor(dtype='float32')
>>> fluid.layers.assign(hidden, out)
>>> loss = fluid.layers.mean(out)
>>> adam = fluid.optimizer.Adam()
>>> adam.minimize(loss)
>>> cpu = fluid.CPUPlace()
>>> exe = fluid.Executor(cpu)
>>> exe.run(fluid.default_startup_program())
>>> x = numpy.random.random(size=(10, 1)).astype('float32')
>>> outs = exe.run(
>>>     feed={'X': x},
>>>     fetch_list=[loss.name])

global_scope

paddle.fluid.global_scope()

Get the global/default scope instance. Many APIs use global_scope as their default value, e.g., Executor.run.

Returns:The global/default scope instance.
Return type:Scope
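
A sketch of reading a parameter's value out of the global scope after the startup program has run; 'fc_0.w_0' is a hypothetical parameter name, and get_tensor() is assumed to be available on the fetched variable:

>>> import numpy
>>> import paddle.fluid as fluid
>>> # build a network containing an fc layer first, then:
>>> exe = fluid.Executor(fluid.CPUPlace())
>>> exe.run(fluid.default_startup_program())
>>> w = numpy.array(fluid.global_scope().find_var('fc_0.w_0').get_tensor())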

scope_guard

paddle.fluid.scope_guard(*args, **kwds)

Change the global/default scope instance using the Python with statement. All variables created at runtime will be assigned to the new scope.

Examples

>>> import paddle.fluid as fluid
>>> new_scope = fluid.Scope()
>>> with fluid.scope_guard(new_scope):
>>>     ...
Parameters:scope – The new global/default scope.

_switch_scope

paddle.fluid._switch_scope(scope)

Trainer

class paddle.fluid.Trainer(train_func, optimizer_func, param_path=None, place=None, parallel=False, checkpoint_config=None)

A trainer wraps MultiGPU/MultiNode training loops and can be used to train a simple neural network easily.

This API takes a train_func. A train_func is a function that returns the loss as its first return value. The rest of the return values can be fetched via EndStepEvent.metrics.

This API also takes an optimizer_func that will return an optimizer instance.

For example, to train an MLP on the MNIST dataset, the sample program is:

>>> import paddle.fluid as fluid
>>>
>>> def mlp(image, layer_sizes=[200, 100], activation="relu", num_classes=10):
>>>     hidden = image
>>>     for layer_size in layer_sizes:
>>>         hidden = fluid.layers.fc(input=hidden, size=layer_size, act=activation)
>>>     return fluid.layers.fc(input=hidden, size=num_classes, act="softmax")
>>>
>>> def train_mnist_mlp():
>>>     img = fluid.layers.data(name='image', shape=[784])
>>>     label = fluid.layers.data(name='label', shape=[1], dtype='int64')
>>>     prediction = mlp(img)
>>>     return fluid.layers.mean(fluid.layers.cross_entropy(prediction, label))
>>>
>>> def optimizer():
>>>     return fluid.optimizer.Adam()
>>>
>>> trainer = Trainer(train_func=train_mnist_mlp,
>>>                   optimizer_func=optimizer,
>>>                   place=fluid.CUDAPlace(0),
>>>                   parallel=True)
>>>
>>> def train_callback(event):
>>>     if isinstance(event, fluid.EndStepEvent):
>>>         print "Epoch ID", event.epoch, "Step ID", \
>>>             event.step, "AvgLoss", event.metrics[0]
>>>     elif isinstance(event, fluid.EndEpochEvent):
>>>         trainer.save_params("./model_{0}".format(event.epoch))
>>>
>>> trainer.train(num_epochs=100, event_handler=train_callback)

For more examples, please see api_guide_high_level_api.

Parameters:
  • train_func (callable) – A function which will return loss. The loss must be a scalar tensor.
  • optimizer_func (callable) – A function that returns an Optimizer object.
  • place (CUDAPlace|CPUPlace) – The device place of this trainer. If parallel=True, all CUDA Places will be used if place is a CUDAPlace.
  • parallel (bool) – True if use multiple devices.
  • checkpoint_config (CheckpointConfig) – Configuration about how to save checkpoints.
stop()

Stop training.

train(num_epochs, event_handler, reader=None, feed_order=None)

Start the train loop to train the model.

Parameters:
  • num_epochs (int) – The number of epochs. An epoch will process all data in the reader.
  • event_handler (callable) – The event handler. A function with type (ev:Event)->void.
  • reader (callable) – A reader creator object. See also api_guide_python_reader .
  • feed_order (list) – Feeding order of the reader. If None, the order of definition in the program will be used.
Returns:

None

test(reader, feed_order)

Test the model on the given test data.

Parameters:
  • reader (callable) – The reader that yields test data.
  • feed_order (list) – Feeding order of the reader. If None, the order of definition in the program will be used.
save_params(param_path)

Save all parameters into param_path.

Parameters:param_path (str) – The path to save parameters.
Returns:None

BeginEpochEvent

class paddle.fluid.BeginEpochEvent(epoch_id)

The beginning of a training epoch.

Parameters:epoch_id (int) – The current epoch ID.

EndEpochEvent

class paddle.fluid.EndEpochEvent(epoch_id)

The end of a training epoch.

Parameters:epoch_id (int) – The current epoch ID.

BeginStepEvent

class paddle.fluid.BeginStepEvent(epoch_id, step_id)

The beginning of a training step.

Parameters:
  • epoch_id (int) – The current epoch ID.
  • step_id (int) – The current step ID.
fetch_metrics = None

If fetch_metrics is True, the metrics will be fetched at the EndStepEvent. Default is True.

EndStepEvent

class paddle.fluid.EndStepEvent(epoch_id, step_id, metrics)

The end of a training step.

Parameters:
  • epoch_id (int) – The current epoch ID.
  • step_id (int) – The current step ID.
  • metrics (list) – A list of fetched tensors. The order of this list is the same as the order of the train_func return values.

CheckpointConfig

class paddle.fluid.CheckpointConfig(checkpoint_dir=None, max_num_checkpoints=3, epoch_interval=1, step_interval=10)

Parameter object for save_checkpoint and fluid.Trainer. Used to configure how to save checkpoints.

Parameters:
  • checkpoint_dir (str) – Directory path to save the checkpoints. Default is the current directory.
  • max_num_checkpoints (int) – The maximum number of local checkpoints to keep.
  • epoch_interval (int) – Save a checkpoint every this number of epochs.
  • step_interval (int) – Save a checkpoint every this number of steps.

Examples

>>> config = fluid.CheckpointConfig("./checkpoints")
>>> trainer = fluid.Trainer(train_func=train_program,
>>>                         place=place,
>>>                         optimizer_func=optimizer_func,
>>>                         checkpoint_config=config)
>>> trainer.train(...)

Inferencer

class paddle.fluid.Inferencer(infer_func, param_path, place=None, parallel=False)

Inferencer High Level API.

Parameters:
  • infer_func (Python func) – Infer function that returns the predict Variable.
  • param_path (str) – The path where the inference model is saved by fluid.io.save_params.
  • place (Place) – The place on which to do the inference.
  • parallel (bool) – Use parallel_executor to run the inference; it will use multiple CPUs/GPUs.

Examples

def inference_program():
    x = fluid.layers.data(name='x', shape=[13], dtype='float32')
    y_predict = fluid.layers.fc(input=x, size=1, act=None)
    return y_predict

place = fluid.CPUPlace()
inferencer = fluid.Inferencer(
    infer_func=inference_program, param_path="/tmp/model", place=place)
infer(inputs, return_numpy=True)

Do inference for the given inputs.

Parameters:
  • inputs (map) – a map of {"input_name": input_var} that will be fed into the inference program.
  • return_numpy (bool) – whether to transform the return value into numpy arrays.
Returns:

the predicted values of the inference model for the inputs.

Return type:

Tensor or Numpy

Examples

tensor_x = numpy.random.uniform(0, 10, [batch_size, 13]).astype("float32")
results = inferencer.infer({'x': tensor_x})

DistributeTranspiler

class paddle.fluid.DistributeTranspiler(config=None)

DistributeTranspiler

Convert the fluid program to distributed data-parallelism programs.

The main_program will be transformed to use a remote parameter server to do parameter optimization. And the optimization graph will be put into a parameter server program.

Examples

# Define your model before these codes.
import os

port = os.getenv("PADDLE_PSERVER_PORT", "6174")
pserver_ips = os.getenv("PADDLE_PSERVER_IPS", "")
eplist = []
for ip in pserver_ips.split(","):
     eplist.append(':'.join([ip, port]))
pserver_endpoints = ",".join(eplist)
trainers = int(os.getenv("PADDLE_TRAINERS"))
current_endpoint = os.getenv("PADDLE_CURRENT_IP", "") + ":" + port
trainer_id = int(os.getenv("PADDLE_TRAINER_ID", "0"))
role = os.getenv("PADDLE_TRAINING_ROLE")

t = fluid.DistributeTranspiler()
t.transpile(
     trainer_id, pservers=pserver_endpoints, trainers=trainers)
if role == "PSERVER":
     pserver_program = t.get_pserver_program(current_endpoint)
     pserver_startup_program = t.get_startup_program(current_endpoint,
                                                     pserver_program)
elif role == "TRAINER":
     trainer_program = t.get_trainer_program()
transpile(trainer_id, program=None, pservers='127.0.0.1:6174', trainers=1, sync_mode=True, startup_program=None)

Run the transpiler.

Parameters:
  • trainer_id (int) – ID of the current trainer worker; if you have n workers, the ID ranges from 0 to n-1.
  • program (Program|None) – the program to transpile; default is fluid.default_main_program().
  • pservers (str) – comma-separated ip:port string for the pserver list.
  • trainers (int) – number of trainers in the distributed job.
  • sync_mode (bool) – do synchronous training or not; default is True.
  • startup_program (Program|None) – the startup program to transpile; default is fluid.default_startup_program().
get_trainer_program()

Get transpiled trainer side program.

Returns:trainer side program.
Return type:Program
get_pserver_program(endpoint)

Get parameter server side program.

Parameters:endpoint (str) – current parameter server endpoint.
Returns:the program for current parameter server to run.
Return type:Program
get_pserver_programs(endpoint)

Get pserver side main program and startup program for distributed training.

Parameters:endpoint (str) – current pserver endpoint.
Returns:(main_program, startup_program), of type “Program”
Return type:tuple
get_startup_program(endpoint, pserver_program=None, startup_program=None)

Deprecated

Get startup program for current parameter server. Modify operator input variables if there are variables that were split to several blocks.

Parameters:
  • endpoint (str) – current pserver endpoint.
  • pserver_program (Program) – deprecated; call get_pserver_program first.
  • startup_program (Program) – deprecated; should pass startup_program when initializing.
Returns:

parameter server side startup program.

Return type:

Program

InferenceTranspiler

class paddle.fluid.InferenceTranspiler

Convert the fluid program to optimized inference program.

There are several optimizations:

  • fuse convolution and batch normalization
  • fuse batch normalization and relu (MKLDNN only)

Examples:

# As InferenceTranspiler will modify the original program,
# please clone it before use.
inference_transpiler_program = program.clone()
t = fluid.InferenceTranspiler()
t.transpile(inference_transpiler_program, place)
transpile(program, place, scope=None)

Run the transpiler.

Parameters:
  • program (Program) – program to transpile
  • place (Place) – inference place
  • scope (Scope|None) – inference Scope

memory_optimize

paddle.fluid.memory_optimize(input_program, skip_opt_set=None, print_log=False, level=0)

Optimize memory by reusing variable memory.

Note: it does not support sub-blocks nested inside sub-blocks.
Parameters:
  • input_program – Input Program.
  • skip_opt_set – set of variable names to skip during memory optimization.
  • print_log – whether to print the debug log.
  • level – If level=0, memory is reused only when variable shapes are completely equal.
Returns:None
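
A minimal sketch of how it is typically invoked, after the network and the optimizer have been built on the default main program:

import paddle.fluid as fluid

# ... build the network and call optimizer.minimize(loss) first ...
fluid.memory_optimize(fluid.default_main_program(),
                      print_log=False,
                      level=0)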

release_memory

paddle.fluid.release_memory(input_program, skip_opt_set=None)

Modify the input program and insert delete_op to drop unused variables early. The modification will be performed in place.

Notes: This is an experimental API and could be removed in the next few releases. Users should not use this API.

Parameters:input_program (Program) – The program into which delete_op will be inserted.

DistributeTranspilerConfig

class paddle.fluid.DistributeTranspilerConfig

  • slice_var_up (bool) – Whether to do Tensor slicing for pservers. Default is True.
  • split_method (PSDispatcher) – RoundRobin or HashName can be used; try to choose the best method to balance the load across pservers.
  • min_block_size (int) – Minimum number of split elements in a block. According to https://github.com/PaddlePaddle/Paddle/issues/8638#issuecomment-369912156, bandwidth is used efficiently when the data size is larger than 2MB. If you want to change it, please make sure you understand the slice_variable function.
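
A sketch of passing a config to DistributeTranspiler; the attribute values are illustrative only, and trainer_id, pserver_endpoints and trainers are assumed to be set up as in the DistributeTranspiler example above:

config = fluid.DistributeTranspilerConfig()
config.slice_var_up = False
config.min_block_size = 1048576   # illustrative value

t = fluid.DistributeTranspiler(config=config)
t.transpile(trainer_id, pservers=pserver_endpoints, trainers=trainers)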

ParallelExecutor

class paddle.fluid.ParallelExecutor(use_cuda, loss_name=None, main_program=None, share_vars_from=None, exec_strategy=None, build_strategy=None, num_trainers=1, trainer_id=0, **kwargs)

ParallelExecutor can run program in parallel.

Parameters:
  • use_cuda (bool) – Whether to use CUDA or not.
  • loss_name (str) – The loss name, which must be set for training. Default None.
  • main_program (Program) – The program that needs to run; if not provided, default_main_program will be used. Default None.
  • share_vars_from (ParallelExecutor) – If provided, it will share variables from the specified ParallelExecutor. Default None.
  • num_trainers (int) – If greater than 1, NCCL will be initialized with multiple ranks of nodes; each node should have the same number of GPUs. Distributed training will then be enabled. Default 1.
  • trainer_id (int) – Must be used together with num_trainers. trainer_id is the "rank" of the current node, starting from 0. Default 0.
Returns:

The initialized ParallelExecutor object.

Return type:

ParallelExecutor

Raises:

TypeError – If share_vars_from is provided but is not a ParallelExecutor object.

Examples

train_exe = fluid.ParallelExecutor(use_cuda=True, loss_name=loss.name)
test_exe = fluid.ParallelExecutor(use_cuda=True,
                                  main_program=test_program,
                                  share_vars_from=train_exe)

train_loss, = train_exe.run([loss.name], feed=feed_dict)
test_loss, = test_exe.run([loss.name], feed=feed_dict)
run(fetch_list, feed=None, feed_dict=None, return_numpy=True)

Run a parallel executor with fetch_list.

The feed parameter can be a dict or a list. If feed is a dict, the feed data will be split across multiple devices. If feed is a list, we assume the data has already been split into multiple devices, and each element in the list will be copied directly to a device.

For example, if the feed is a dict:

>>> exe = ParallelExecutor()
>>> # the image will be split across devices. If there are two devices,
>>> # each device will process an image with shape (24, 1, 28, 28)
>>> exe.run(feed={'image': numpy.random.random(size=(48, 1, 28, 28))})

For example, if the feed is a list:

>>> exe = ParallelExecutor()
>>> # each device will process each element in the list.
>>> # the 1st device will process an image with shape (48, 1, 28, 28)
>>> # the 2nd device will process an image with shape (32, 1, 28, 28)
>>> #
>>> # you can use exe.device_count to get the device number.
>>> exe.run(feed=[{"image": numpy.random.random(size=(48, 1, 28, 28))},
>>>               {"image": numpy.random.random(size=(32, 1, 28, 28))},
>>>              ])
Parameters:
  • fetch_list (list) – The fetched variable names
  • feed (list|dict|None) – The feed variables. If the feed is a dict, tensors in that dict will be split across the devices. If the feed is a list, each element of the list will be copied to a device. Default None.
  • feed_dict – Alias for the feed parameter, kept for backward compatibility. This parameter has been deprecated. Default None.
  • return_numpy (bool) – Whether to convert the fetched tensors to numpy. Default: True.
Returns:

The fetched result list.

Return type:

List

Raises:

ValueError – If the feed is a list but its length is not equal to the number of active places, or if any of its elements is not a dict.

Notes

  1. If the feed's type is dict, the number of samples fed to the ParallelExecutor must be larger than the number of active places. Otherwise, it will throw an exception from the C++ side. Special attention should be paid to whether the last batch of the dataset is larger than the number of active places.
  2. If there is more than one active place, the fetch result for each variable is a list, and each element of this list is the result from the respective active place.

Examples

pe = fluid.ParallelExecutor(use_cuda=use_cuda,
                            loss_name=avg_cost.name,
                            main_program=fluid.default_main_program())
loss = pe.run(feed=feeder.feed(cur_batch),
              fetch_list=[avg_cost.name])

ExecutionStrategy

class paddle.fluid.ExecutionStrategy

BuildStrategy

class paddle.fluid.BuildStrategy
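
A sketch of configuring a ParallelExecutor with both strategies; the attributes shown (num_threads, allow_op_delay, reduce_strategy) are assumed to be exposed by these pybind classes, and loss is assumed to come from a previously built network:

exec_strategy = fluid.ExecutionStrategy()
exec_strategy.num_threads = 4
exec_strategy.allow_op_delay = False

build_strategy = fluid.BuildStrategy()
build_strategy.reduce_strategy = fluid.BuildStrategy.ReduceStrategy.AllReduce

pe = fluid.ParallelExecutor(use_cuda=True,
                            loss_name=loss.name,
                            exec_strategy=exec_strategy,
                            build_strategy=build_strategy)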

create_lod_tensor

paddle.fluid.create_lod_tensor(data, recursive_seq_lens, place)

Create a lod tensor from a numpy array, a list, or an existing lod tensor.

Create a lod tensor by doing the following:

  1. Check that the length-based level of detail (LoD) also known as recursive_sequence_lengths of the input is valid.
  2. Convert recursive_sequence_lengths to a offset-based LoD.
  3. Copy the data from a numpy array, a list or a existing lod tensor to CPU or GPU device (based on input place).
  4. Set the level of detail (LoD) using the offset-based LoD.

Examples

Suppose we want a LoDTensor to hold data for sequences of words, where each word is represented by an integer, and we want to create a LoDTensor to represent two sentences, one of 2 words and one of 3 words.

Then data can be a numpy array of integers with shape (5, 1). recursive_seq_lens will be [[2, 3]], indicating the length (number of words) of each sentence. This length-based recursive_seq_lens [[2, 3]] will be converted to the offset-based LoD [[0, 2, 5]] inside the function call.

Please refer to api_guide_low_level_lod_tensor for more details regarding LoD.

Parameters:
  • data (numpy.ndarray|list|LoDTensor) – a numpy array or a LoDTensor or a list holding the data to be copied.
  • recursive_seq_lens (list) – a list of lists indicating the length-based level of detail info specified by the user.
  • place (Place) – CPU or GPU place indicating where the data in the new LoDTensor will be stored.
Returns:

A fluid LoDTensor object with tensor data and recursive_seq_lens info.
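
The two-sentence example above, written out as code (illustrative only):

import numpy
import paddle.fluid as fluid

# 5 words in total, split into sentences of 2 and 3 words
data = numpy.array([[1], [2], [3], [4], [5]]).astype('int64')
tensor = fluid.create_lod_tensor(data, [[2, 3]], fluid.CPUPlace())
print(tensor.recursive_sequence_lengths())   # [[2, 3]]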

create_random_int_lodtensor

paddle.fluid.create_random_int_lodtensor(recursive_seq_lens, base_shape, place, low, high)

Create a LoDTensor containing random integers.

This function is frequently used in the book examples. So we revised it based on the new create_lod_tensor API and put it here in the lod_tensor module to simplify the code.

The function does the following:

  1. Calculate the overall shape of the LoDTensor based on the length-based recursive_seq_lens input and the shape of the basic element in base_shape.
  2. Create a numpy array of this shape.
  3. Create the LoDTensor using create_lod_tensor API.

Suppose we want a LoDTensor to hold data for sequences of words, where each word is represented by an integer, and we want to create a LoDTensor to represent two sentences, one of 2 words and one of 3 words. Then base_shape is [1], and the input length-based recursive_seq_lens is [[2, 3]]. The overall shape of the LoDTensor would then be [5, 1], holding 5 words for the two sentences.

Parameters:
  • recursive_seq_lens (list) – a list of lists indicating the length-based level of detail info specified by the user.
  • base_shape (list) – the shape of the basic element to be held by the LoDTensor.
  • place (Place) – CPU or GPU place indicating where the data in the new LoDTensor will be stored.
  • low (int) – the lower bound of the random integers.
  • high (int) – the upper bound of the random integers.
Returns:

A fluid LoDTensor object with tensor data and recursive_seq_lens info.
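
The sentence example above written out as code (illustrative only): two sequences of lengths 2 and 3, each element a random integer in [0, 9]:

import paddle.fluid as fluid

tensor = fluid.create_random_int_lodtensor(recursive_seq_lens=[[2, 3]],
                                           base_shape=[1],
                                           place=fluid.CPUPlace(),
                                           low=0,
                                           high=9)
print(tensor.recursive_sequence_lengths())   # [[2, 3]]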

LoDTensor

class paddle.fluid.LoDTensor
has_valid_recursive_sequence_lengths(self: paddle.fluid.core.LoDTensor) → bool
lod(self: paddle.fluid.core.LoDTensor) → List[List[int]]
recursive_sequence_lengths(self: paddle.fluid.core.LoDTensor) → List[List[int]]
set_lod(self: paddle.fluid.core.LoDTensor, arg0: List[List[int]]) → None
set_recursive_sequence_lengths(self: paddle.fluid.core.LoDTensor, arg0: List[List[int]]) → None
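
A sketch (illustrative only) of the length-based and offset-based views of the same LoD, using a tensor created with create_lod_tensor:

import numpy
import paddle.fluid as fluid

t = fluid.create_lod_tensor(numpy.zeros((5, 1)).astype('float32'),
                            [[2, 3]], fluid.CPUPlace())
print(t.recursive_sequence_lengths())            # [[2, 3]]  (length-based)
print(t.lod())                                   # [[0, 2, 5]] (offset-based)
print(t.has_valid_recursive_sequence_lengths())  # True
t.set_recursive_sequence_lengths([[1, 4]])
print(t.lod())                                   # [[0, 1, 5]]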

LoDTensorArray

class paddle.fluid.LoDTensorArray
append(self: paddle.fluid.core.LoDTensorArray, arg0: paddle.fluid.core.LoDTensor) → None

CPUPlace

class paddle.fluid.CPUPlace

CUDAPlace

class paddle.fluid.CUDAPlace

CUDAPinnedPlace

class paddle.fluid.CUDAPinnedPlace
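
For example (illustrative only), a place object is created once and passed to an Executor or DataFeeder:

import paddle.fluid as fluid

cpu = fluid.CPUPlace()
gpu0 = fluid.CUDAPlace(0)          # requires a CUDA-enabled build
pinned = fluid.CUDAPinnedPlace()
exe = fluid.Executor(cpu)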

Tensor

paddle.fluid.Tensor

alias of LoDTensor

ParamAttr

class paddle.fluid.ParamAttr(name=None, initializer=None, learning_rate=1.0, regularizer=None, trainable=True, gradient_clip=None, do_model_average=False)

Parameter attributes object. To fine-tune the network training process, users can set a parameter's attributes to control training details, such as the learning rate, regularization, trainability, do_model_average and the method used to initialize the parameter.

Parameters:
  • name (str) – The parameter’s name. Default None.
  • initializer (Initializer) – The method used to initialize this parameter. Default None.
  • learning_rate (float) – The parameter’s learning rate. The learning rate when optimize is \(global\_lr * parameter\_lr * scheduler\_factor\). Default 1.0.
  • regularizer (WeightDecayRegularizer) – Regularization factor. Default None.
  • trainable (bool) – Whether this parameter is trainable. Default True.
  • gradient_clip (BaseGradientClipAttr) – The method to clip this parameter’s gradient. Default None.
  • do_model_average (bool) – Whether this parameter should do model average. Default False.

Examples

w_param_attrs = fluid.ParamAttr(name="fc_weight",
                                learning_rate=0.5,
                                regularizer=fluid.L2Decay(1.0),
                                trainable=True)
y_predict = fluid.layers.fc(input=x, size=10, param_attr=w_param_attrs)

WeightNormParamAttr

class paddle.fluid.WeightNormParamAttr(dim=None, **kwargs)

Used for weight Norm. Weight Norm is a reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction. Weight Norm has been implemented as discussed in this paper: Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks.

Parameters:
  • dim (int) – The dimension along which to compute the norm. Default None.
  • kwargs – Any field in ParamAttr. Default None.

Examples

data = fluid.layers.data(name="data", shape=[3, 32, 32], dtype="float32")
fc = fluid.layers.fc(input=data,
                     size=1000,
                     param_attr=WeightNormParamAttr(
                          dim=None,
                          name='weight_norm_param'))

DataFeeder

class paddle.fluid.DataFeeder(feed_list, place, program=None)

DataFeeder converts the data that returned by a reader into a data structure that can feed into Executor and ParallelExecutor. The reader usually returns a list of mini-batch data entries. Each data entry in the list is one sample. Each sample is a list or a tuple with one feature or multiple features.

The simple usage shows below:

place = fluid.CPUPlace()
img = fluid.layers.data(name='image', shape=[1, 28, 28])
label = fluid.layers.data(name='label', shape=[1], dtype='int64')
feeder = fluid.DataFeeder([img, label], fluid.CPUPlace())
result = feeder.feed([([0] * 784, [9]), ([1] * 784, [1])])

If you want to feed data to the GPU side separately in advance when you use multiple GPUs to train a model, you can use the decorate_reader function.

place=fluid.CUDAPlace(0)
feeder = fluid.DataFeeder(place=place, feed_list=[data, label])
reader = feeder.decorate_reader(
    paddle.batch(flowers.train(), batch_size=16))
Parameters:
  • feed_list (list) – The Variables or Variable names that will be fed into the model.
  • place (Place) – place indicates whether to feed data into the CPU or GPU; if you want to feed data into the GPU, please use fluid.CUDAPlace(i) (i represents the GPU id), and if you want to feed data into the CPU, please use fluid.CPUPlace().
  • program (Program) – The Program that the data will be fed into; if program is None, it will use default_main_program(). Default None.
Raises:

ValueError – If some Variable is not in this Program.

Examples

# ...
place = fluid.CPUPlace()
feed_list = [
    main_program.global_block().var(var_name) for var_name in feed_vars_name
] # feed_vars_name is a list of variables' name.
feeder = fluid.DataFeeder(feed_list, place)
for data in reader():
    outs = exe.run(program=main_program,
                   feed=feeder.feed(data))
feed(iterable)

According to feed_list and iterable, this converts the input into a data structure that can be fed into Executor and ParallelExecutor.

Parameters:iterable (list|tuple) – the input data.
Returns:the result of conversion.
Return type:dict
feed_parallel(iterable, num_places=None)

Takes multiple mini-batches. Each mini-batch will be fed to a separate device in advance.

Parameters:
  • iterable (list|tuple) – the input data.
  • num_places (int) – the number of devices. Default None.
Returns:

the result of conversion.

Return type:

dict

Notes

The number of devices and the number of mini-batches must be the same.

decorate_reader(reader, multi_devices, num_places=None, drop_last=True)

Convert the data returned by the reader into multiple mini-batches, so that each mini-batch can be fed to one device.

Parameters:
  • reader (fun) – the reader that returns the input data.
  • multi_devices (bool) – whether to feed the data to multiple devices.
  • num_places (int) – the number of devices. Default None.
  • drop_last (bool) – whether to drop the last batch if it cannot fill all devices. Default True.
Returns:

the decorated reader.

Return type:

generator

Raises:
  • ValueError – If drop_last is False and the data batch cannot fit the devices.

Scope

class paddle.fluid.Scope
drop_kids(self: paddle.fluid.core.Scope) → None
find_var(self: paddle.fluid.core.Scope, arg0: unicode) → paddle.fluid.core.Variable
new_scope(self: paddle.fluid.core.Scope) → paddle.fluid.core.Scope
var(self: paddle.fluid.core.Scope, arg0: unicode) → paddle.fluid.core.Variable
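
A sketch (illustrative only) of creating a variable inside a scope and giving it a tensor value; get_tensor() on the returned variable and Tensor.set() are assumed to be available:

import numpy
import paddle.fluid as fluid

scope = fluid.Scope()
place = fluid.CPUPlace()
with fluid.scope_guard(scope):
    var = scope.var("my_var")                     # create (or get) the variable
    var.get_tensor().set(numpy.ones((2, 2), dtype='float32'), place)
    print(scope.find_var("my_var") is not None)   # True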