Single-node training


To perform single-node training in PaddlePaddle Fluid, first read Prepare Data and Set up Simple Model. After completing Set up Simple Model, you will have two fluid.Program objects: startup_program and main_program. By default, you can use fluid.default_startup_program() and fluid.default_main_program() to get the global fluid.Program objects.

For example:

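import paddle.fluid as fluid

# Input layers; a batch dimension is added automatically.
image = fluid.layers.data(name="image", shape=[784])
label = fluid.layers.data(name="label", shape=[1], dtype="int64")
hidden = fluid.layers.fc(input=image, size=100, act='relu')
prediction = fluid.layers.fc(input=hidden, size=10, act='softmax')
loss = fluid.layers.mean(
    fluid.layers.cross_entropy(input=prediction, label=label))

sgd = fluid.optimizer.SGD(learning_rate=0.001)
sgd.minimize(loss)

# At this point fluid.default_startup_program() and
# fluid.default_main_program() have been constructed.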

Once the model is configured, fluid.default_startup_program() and fluid.default_main_program() are fully constructed.

Initialize Parameters

Random Initialization of Parameters

After the model is configured, the parameter initialization operations are written into fluid.default_startup_program(). Running this program with a fluid.Executor performs the random initialization of the parameters in the global scope, fluid.global_scope(). For example:

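exe = fluid.Executor(fluid.CUDAPlace(0))
# Run the startup program once to initialize the parameters.
exe.run(program=fluid.default_startup_program())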

Note that in multi-GPU training, the parameters should first be initialized on GPU 0; fluid.ParallelExecutor then distributes them to the other graphics cards.

Load Predefined Parameters

In neural network training, predefined models are often loaded so that training can continue from them. For how to load predefined parameters, please refer to Save, Load Models or Variables & Incremental Learning.

Single-card Training

Single-card training is performed by calling run() on a fluid.Executor to execute the training fluid.Program. At runtime, pass input data with run(feed=...) and retrieve persistable data with run(fetch_list=...). For example:

loss = fluid.layers.mean(...)

exe = fluid.Executor(...)
# result is a list with one numpy array per entry in fetch_list
result = exe.run(feed={"image": ..., "label": ...}, fetch_list=[loss])


  1. For the data types supported by feed, please refer to the article Transfer Train Data to Executor.
  2. The return value of Executor.run() is the list of values of the variables named in fetch_list=[...]. Each fetched Variable must be persistable. fetch_list accepts either a list of Variables or a list of variable names.
  3. If the fetched data contains sequence information, you can call exe.run(return_numpy=False, ...) to get a fluid.LoDTensor directly, and then access the sequence information stored in it.
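As a rough illustration of the feed argument, the dictionary for the model above might be built from NumPy arrays as follows. This is only a sketch under stated assumptions: the helper name make_feed, the random data, and the batch size are hypothetical, and the int64 label dtype assumes a cross-entropy style loss over 10 classes. Only the keys "image" and "label" and the per-sample shapes [784] and [1] come from the example.

```python
import numpy as np

def make_feed(batch_size, seed=0):
    """Hypothetical helper: build a feed dict keyed by data layer name."""
    rng = np.random.RandomState(seed)
    # One row per sample; the leading dimension is the batch size.
    image = rng.rand(batch_size, 784).astype("float32")
    # Integer class ids in [0, 10), stored as a column vector.
    label = rng.randint(0, 10, size=(batch_size, 1)).astype("int64")
    return {"image": image, "label": label}

feed = make_feed(batch_size=32)
# The dict would then be passed as exe.run(feed=feed, fetch_list=[loss]).
```

The keys must match the names given to the data layers, and each array carries the batch dimension in front of the declared shape.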

Multi-card Training

In multi-card training, you can use fluid.ParallelExecutor to run the training fluid.Program. For example:

train_exe = fluid.ParallelExecutor(use_cuda=True, loss_name=loss.name,
                             main_program=fluid.default_main_program())
train_exe.run(fetch_list=[loss.name], feed={...})


  1. The ParallelExecutor constructor takes the fluid.Program to run, which cannot be changed at runtime. The default value is fluid.default_main_program().
  2. ParallelExecutor must be told explicitly whether to use CUDA for training. In GPU mode, all visible graphics cards are occupied. Users can set the CUDA_VISIBLE_DEVICES environment variable to choose which graphics cards are used.
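To make the CUDA_VISIBLE_DEVICES setting concrete, the visible cards can be restricted from inside Python before the framework initializes CUDA. This is a minimal sketch, assuming a machine with at least two GPUs; the IDs "0,1" and the helper visible_gpu_ids are illustrative only and are not part of Paddle. The helper also assumes the variable holds integer indices rather than device UUIDs.

```python
import os

# Assumption: GPUs 0 and 1 exist. Set this before the framework touches CUDA,
# since the variable is read when the CUDA runtime is initialized.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

def visible_gpu_ids():
    """Hypothetical helper: integer GPU indices from CUDA_VISIBLE_DEVICES, or None if unset."""
    raw = os.environ.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # no restriction configured
    return [int(tok) for tok in raw.split(",") if tok.strip()]

print(visible_gpu_ids())
```

The same restriction can be applied from the shell by prefixing the launch command with CUDA_VISIBLE_DEVICES=0,1, which avoids any ordering concerns inside the script.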