Learning Rate Scheduler
Scheduler Factory
mindcv.scheduler.scheduler_factory.create_scheduler(steps_per_epoch, scheduler='constant', lr=0.01, min_lr=1e-06, warmup_epochs=3, warmup_factor=0.0, decay_epochs=10, decay_rate=0.9, milestones=None, num_epochs=200, num_cycles=1, cycle_decay=1.0, lr_epoch_stair=False)
Creates a learning rate scheduler by name.
PARAMETER | TYPE | DESCRIPTION |
---|---|---|
steps_per_epoch | int | Number of steps per epoch. |
scheduler | str | Scheduler name: 'constant', 'cosine_decay', 'step_decay', 'exponential_decay', 'polynomial_decay', 'multi_step_decay'. Default: 'constant'. |
lr | float | Learning rate value. Default: 0.01. |
min_lr | float | Lower LR bound for 'cosine_decay' schedulers. Default: 1e-6. |
warmup_epochs | int | Number of epochs to warm up the LR, if the scheduler supports it. Default: 3. |
warmup_factor | float | The warmup phase linearly increases the LR; the beginning factor is warmup_factor, i.e. the warmup starts from warmup_factor * lr and ends at lr. Default: 0.0. |
decay_epochs | int | For 'cosine_decay' schedulers, the number of epochs over which the LR decays to min_lr. Default: 10. |
decay_rate | float | LR decay rate. Default: 0.9. |
milestones | list | List of epoch milestones for the 'multi_step_decay' scheduler. Must be increasing. Default: None. |
num_epochs | int | Number of total epochs. Default: 200. |
num_cycles | int | Number of cycles for cosine decay and cyclic schedulers. Default: 1. |
cycle_decay | float | Decay rate of the maximum LR in each cosine cycle. Default: 1.0. |
lr_epoch_stair | bool | If True, the LR is updated at the beginning of each epoch and stays constant for every batch within that epoch; otherwise, the LR is updated at every step. Default: False. |
Source code in mindcv/scheduler/scheduler_factory.py
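As a quick usage illustration, here is a minimal sketch (not taken from the library's documentation) of building a warmup + cosine-decay schedule and handing it to a MindSpore optimizer. The tiny `nn.Dense` network and the step/epoch counts are placeholders, and passing the returned schedule directly as `learning_rate` assumes the factory output is in a form MindSpore optimizers accept (a per-step list of learning rates or a LearningRateSchedule).

```python
# Minimal sketch: warmup + cosine decay schedule fed to a MindSpore optimizer.
# The network and step/epoch counts below are placeholders for illustration.
import mindspore.nn as nn
from mindcv.scheduler.scheduler_factory import create_scheduler

steps_per_epoch = 100   # typically dataset.get_dataset_size()
num_epochs = 90

lr_schedule = create_scheduler(
    steps_per_epoch,
    scheduler="cosine_decay",
    lr=0.1,
    min_lr=1e-6,
    warmup_epochs=3,     # linear warmup for the first 3 epochs
    decay_epochs=87,     # cosine decay over the remaining epochs
    num_epochs=num_epochs,
)

network = nn.Dense(10, 2)  # stand-in for a real model
# Assumed: the returned schedule can be passed as `learning_rate` to the optimizer.
optimizer = nn.Momentum(network.trainable_params(), learning_rate=lr_schedule, momentum=0.9)
```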
mindcv.scheduler.dynamic_lr
Meta learning rate scheduler.
This module implements exactly the same learning rate schedulers as native PyTorch; see [torch.optim.lr_scheduler](https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate).
At present, only `constant_lr`, `linear_lr`, `polynomial_lr`, `exponential_lr`, `step_lr`, `multi_step_lr`, `cosine_annealing_lr`, `cosine_annealing_warm_restarts_lr`, `one_cycle_lr` and `cyclic_lr` are implemented. The number, names and usage of the positional arguments are exactly the same as in native PyTorch.
However, because each scheduler must explicitly return the learning rate at every step, three additional keyword arguments are introduced:
- `lr`: the base learning rate that would be passed when creating the optimizer in PyTorch.
- `steps_per_epoch`: the number of steps (iterations) in each epoch.
- `epochs`: the number of epochs. Together with `steps_per_epoch`, it determines the length of the returned list of learning rates.
Of all the schedulers, `one_cycle_lr` and `cyclic_lr` need only the two keyword arguments other than `lr`, since the optimizer's `lr` argument has no effect in PyTorch when these two schedulers are used.
Most schedulers in PyTorch are coarse-grained: the learning rate is constant within a single epoch. For these non-stepwise schedulers, fine-grained variants are also provided in which the learning rate changes within a single epoch; the names of these variants contain the keyword `refined`. The implemented fine-grained variants are `linear_refined_lr`, `polynomial_refined_lr`, etc.
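To make the keyword-argument convention concrete, here is a small sketch. It assumes that `step_lr` takes the same positional arguments as `torch.optim.lr_scheduler.StepLR` (`step_size`, `gamma`) and returns a flat list of per-step learning rates, as described above; the concrete numbers are illustrative only.

```python
# Sketch of the extra keyword arguments (lr, steps_per_epoch, epochs).
# Assumption: step_lr mirrors torch.optim.lr_scheduler.StepLR(step_size, gamma).
from mindcv.scheduler.dynamic_lr import step_lr

lrs = step_lr(
    3,                  # step_size: decay the LR every 3 epochs ...
    0.5,                # gamma: ... by a factor of 0.5
    lr=0.1,             # base LR that would be given to the optimizer in torch
    steps_per_epoch=4,  # iterations per epoch
    epochs=6,           # total epochs
)

print(len(lrs))   # 4 * 6 = 24 explicit per-step learning rates
print(lrs[:4])    # first epoch: the base LR repeated for each of its 4 steps
```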
mindcv.scheduler.dynamic_lr.cosine_decay_lr(decay_epochs, eta_min, *, eta_max, steps_per_epoch, epochs, num_cycles=1, cycle_decay=1.0)
Cosine decay learning rate schedule; the learning rate is updated once every epoch and stays constant within an epoch.
Source code in mindcv/scheduler/dynamic_lr.py
mindcv.scheduler.dynamic_lr.cosine_decay_refined_lr(decay_epochs, eta_min, *, eta_max, steps_per_epoch, epochs, num_cycles=1, cycle_decay=1.0)
Cosine decay learning rate schedule; the learning rate is updated at every step (fine-grained, refined variant).
Source code in mindcv/scheduler/dynamic_lr.py
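The difference between the two variants is easiest to see from the returned values. A small sketch, assuming both functions return a flat list of `steps_per_epoch * epochs` learning rates as described in the module docstring:

```python
# Contrast the epoch-wise (stair) and step-wise (refined) cosine decay variants.
from mindcv.scheduler.dynamic_lr import cosine_decay_lr, cosine_decay_refined_lr

common = dict(eta_max=0.1, steps_per_epoch=3, epochs=4)

# decay_epochs=4, eta_min=0.0
stair = cosine_decay_lr(4, 0.0, **common)
refined = cosine_decay_refined_lr(4, 0.0, **common)

print(stair[:3])    # constant within the first epoch (updated once per epoch)
print(refined[:3])  # already decreasing within the first epoch (updated every step)
```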
mindcv.scheduler.dynamic_lr.cyclic_lr(base_lr, max_lr, step_size_up=2000, step_size_down=None, mode='triangular', gamma=1.0, scale_fn=None, scale_mode='cycle', *, steps_per_epoch, epochs)
Cyclic learning rate scheduler, based on ["Cyclical Learning Rates for Training Neural Networks"](https://arxiv.org/abs/1506.01186).
PARAMETER | TYPE | DESCRIPTION |
---|---|---|
base_lr | float | Lower learning rate boundary of each cycle. |
max_lr | float | Upper learning rate boundary of each cycle. |
step_size_up | int | Number of steps in the increasing half of each cycle. Default: 2000. |
step_size_down | int | Number of steps in the decreasing half of each cycle. If step_size_down is None, it is set to step_size_up. Default: None. |
mode | str | One of {'triangular', 'triangular2', 'exp_range'}. Ignored if scale_fn is not None. Default: 'triangular'. |
gamma | float | Constant used in the 'exp_range' scaling function: gamma**(cycle_iterations). Default: 1.0. |
scale_fn | callable | Custom scaling policy defined by a single-argument lambda function. If it is not None, 'mode' is ignored. Default: None. |
scale_mode | str | One of {'cycle', 'iterations'}. Determines whether scale_fn is evaluated on the cycle number or on cycle iterations. Default: 'cycle'. |
steps_per_epoch | int | Number of steps per epoch. |
epochs | int | Number of total epochs. |
Source code in mindcv/scheduler/dynamic_lr.py
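A small sketch of the triangular mode, assuming the function returns `steps_per_epoch * epochs` per-step learning rates; the boundaries and cycle lengths are illustrative only.

```python
# Triangular cyclic schedule: rise for 5 steps, fall for 5 steps, then repeat.
from mindcv.scheduler.dynamic_lr import cyclic_lr

lrs = cyclic_lr(
    0.001,              # base_lr: lower boundary of each cycle
    0.01,               # max_lr: upper boundary of each cycle
    step_size_up=5,
    step_size_down=5,
    mode="triangular",
    steps_per_epoch=10,
    epochs=2,
)

print(len(lrs))            # 10 * 2 = 20 per-step learning rates
print(min(lrs), max(lrs))  # values oscillate between base_lr and max_lr
```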
mindcv.scheduler.dynamic_lr.one_cycle_lr(max_lr, pct_start=0.3, anneal_strategy='cos', div_factor=25.0, final_div_factor=10000.0, three_phase=False, *, steps_per_epoch, epochs)
OneCycle learning rate scheduler, based on ["Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates"](https://arxiv.org/abs/1708.07120).
PARAMETER | TYPE | DESCRIPTION |
---|---|---|
max_lr | float | Upper learning rate boundary of the cycle. |
pct_start | float | Percentage of the cycle's steps spent increasing the learning rate. Default: 0.3. |
anneal_strategy | str | Annealing strategy: 'cos' for cosine annealing, 'linear' for linear annealing. Default: 'cos'. |
div_factor | float | Determines the initial learning rate via initial_lr = max_lr / div_factor. Default: 25.0. |
final_div_factor | float | Determines the minimum learning rate at the end via min_lr = initial_lr / final_div_factor. Default: 10000.0. |
three_phase | bool | If True, the learning rate is updated in three phases according to final_div_factor; otherwise, it is updated in two phases. Default: False. |
steps_per_epoch | int | Number of steps per epoch. |
epochs | int | Number of total epochs. |
Source code in mindcv/scheduler/dynamic_lr.py
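A small sketch of a two-phase one-cycle schedule, again assuming a flat list of per-step learning rates is returned; the values are illustrative.

```python
# One-cycle schedule: warm up to max_lr over 30% of the steps, then anneal down.
from mindcv.scheduler.dynamic_lr import one_cycle_lr

lrs = one_cycle_lr(
    0.1,                    # max_lr: peak learning rate of the cycle
    pct_start=0.3,          # 30% of the steps are spent increasing the LR
    anneal_strategy="cos",
    div_factor=25.0,        # initial_lr = max_lr / 25
    final_div_factor=1e4,   # min_lr = initial_lr / 1e4
    steps_per_epoch=10,
    epochs=5,
)

print(len(lrs))          # 10 * 5 = 50 per-step learning rates
print(lrs[0], max(lrs))  # starts near max_lr / div_factor and peaks around max_lr
```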