optimizer

SGD

paddle.fluid.optimizer.SGD

alias of SGDOptimizer

Momentum

paddle.fluid.optimizer.Momentum

alias of MomentumOptimizer

Adagrad

paddle.fluid.optimizer.Adagrad

alias of AdagradOptimizer

Adam

paddle.fluid.optimizer.Adam

alias of AdamOptimizer

Adamax

paddle.fluid.optimizer.Adamax

alias of AdamaxOptimizer

DecayedAdagrad

paddle.fluid.optimizer.DecayedAdagrad

alias of DecayedAdagradOptimizer

SGDOptimizer

class paddle.fluid.optimizer.SGDOptimizer(learning_rate, **kwargs)

Simple SGD optimizer without any state.
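
A minimal usage sketch (not part of the original reference; it assumes the fluid static-graph layers API to build a small regression loss):

import paddle.fluid as fluid

# Assumed setup: a tiny linear-regression network that yields a mean loss.
x = fluid.layers.data(name='x', shape=[13], dtype='float32')
y = fluid.layers.data(name='y', shape=[1], dtype='float32')
pred = fluid.layers.fc(input=x, size=1)
avg_cost = fluid.layers.mean(fluid.layers.square_error_cost(input=pred, label=y))

# Plain SGD keeps no per-parameter state; it only needs a learning rate.
sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.01)
_, params_grads = sgd_optimizer.minimize(avg_cost)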

MomentumOptimizer

class paddle.fluid.optimizer.MomentumOptimizer(learning_rate, momentum, use_nesterov=False, **kwargs)

Simple Momentum optimizer with a per-parameter velocity state. Set use_nesterov=True to enable Nesterov momentum.
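
A brief construction sketch (hypothetical hyperparameter values; the stand-in loss exists only so the snippet runs on its own):

import paddle.fluid as fluid

# Stand-in scalar "loss" for demonstration only.
x = fluid.layers.data(name='x', shape=[4], dtype='float32')
loss = fluid.layers.mean(fluid.layers.fc(input=x, size=1))

# One velocity accumulator per parameter; use_nesterov=True enables Nesterov momentum.
momentum_optimizer = fluid.optimizer.Momentum(
    learning_rate=0.01, momentum=0.9, use_nesterov=True)
_, params_grads = momentum_optimizer.minimize(loss)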

AdagradOptimizer

class paddle.fluid.optimizer.AdagradOptimizer(learning_rate, epsilon=1e-06, **kwargs)

Simple Adagrad optimizer with an accumulated squared-gradient (moment) state per parameter.
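
A hedged sketch in the same style (hypothetical values, stand-in loss as above):

import paddle.fluid as fluid

# Stand-in scalar "loss" for demonstration only.
x = fluid.layers.data(name='x', shape=[4], dtype='float32')
loss = fluid.layers.mean(fluid.layers.fc(input=x, size=1))

# Adagrad accumulates squared gradients per parameter; epsilon guards the division.
adagrad_optimizer = fluid.optimizer.Adagrad(learning_rate=0.01, epsilon=1e-6)
_, params_grads = adagrad_optimizer.minimize(loss)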

AdamOptimizer

class paddle.fluid.optimizer.AdamOptimizer(learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08, **kwargs)

Implements the Adam optimizer, which keeps exponentially decayed estimates of the first and second moments of the gradients (controlled by beta1 and beta2).
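
A construction sketch using the default hyperparameters from the signature above (stand-in loss for self-containedness):

import paddle.fluid as fluid

# Stand-in scalar "loss" for demonstration only.
x = fluid.layers.data(name='x', shape=[4], dtype='float32')
loss = fluid.layers.mean(fluid.layers.fc(input=x, size=1))

# beta1/beta2 control the decay of the first- and second-moment estimates.
adam_optimizer = fluid.optimizer.Adam(
    learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-8)
_, params_grads = adam_optimizer.minimize(loss)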

AdamaxOptimizer

class paddle.fluid.optimizer.AdamaxOptimizer(learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08, **kwargs)

Implements the Adamax optimizer, a variant of Adam that replaces the second-moment estimate with an infinity-norm estimate.
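
Another hedged sketch, constructed the same way as the Adam example above:

import paddle.fluid as fluid

# Stand-in scalar "loss" for demonstration only.
x = fluid.layers.data(name='x', shape=[4], dtype='float32')
loss = fluid.layers.mean(fluid.layers.fc(input=x, size=1))

# Same hyperparameters as Adam; the second moment is replaced by an infinity-norm estimate.
adamax_optimizer = fluid.optimizer.Adamax(
    learning_rate=0.001, beta1=0.9, beta2=0.999)
_, params_grads = adamax_optimizer.minimize(loss)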

DecayedAdagradOptimizer

class paddle.fluid.optimizer.DecayedAdagradOptimizer(learning_rate, decay=0.95, epsilon=1e-06, **kwargs)

Simple Decayed Adagrad optimizer whose accumulated squared-gradient (moment) state decays at rate decay rather than growing without bound.
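
A brief sketch (hypothetical values, stand-in loss as in the examples above):

import paddle.fluid as fluid

# Stand-in scalar "loss" for demonstration only.
x = fluid.layers.data(name='x', shape=[4], dtype='float32')
loss = fluid.layers.mean(fluid.layers.fc(input=x, size=1))

# `decay` controls how fast the moment state forgets old squared gradients.
decayed_adagrad = fluid.optimizer.DecayedAdagrad(
    learning_rate=0.01, decay=0.95, epsilon=1e-6)
_, params_grads = decayed_adagrad.minimize(loss)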

Adadelta

class paddle.fluid.optimizer.AdadeltaOptimizer(learning_rate, epsilon=1e-06, rho=0.95, **kwargs)

Simple Adadelta optimizer with an average squared gradient state and an average squared update state. For details, refer to ADADELTA: An Adaptive Learning Rate Method.

\[\begin{split}E(g_t^2) &= \rho \, E(g_{t-1}^2) + (1-\rho) \, g_t^2 \\
learning\_rate &= \sqrt{\frac{E(dx_{t-1}^2) + \epsilon}{E(g_t^2) + \epsilon}} \\
E(dx_t^2) &= \rho \, E(dx_{t-1}^2) + (1-\rho) \, (-g_t \cdot learning\_rate)^2\end{split}\]
Parameters:
  • learning_rate (float) – global learning rate
  • rho (float) – decay rate rho used in the equations above
  • epsilon (float) – small constant added to the denominators for numerical stability

Examples

import paddle.fluid as fluid

# `cost` is assumed to be a loss variable built earlier in the program.
optimizer = fluid.optimizer.Adadelta(
    learning_rate=0.0003, epsilon=1.0e-6, rho=0.95)
_, params_grads = optimizer.minimize(cost)