HuberLossOptimizerIntegration
public static void OptimizerIntegrationGuide()
Language: C#
Documentation: Using HuberLoss with different optimizers
Pattern 1: With SGD Optimizer
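var optimizer = new SgdOptimizer(learningRate: 0.01f, momentum: 0.9f);
var lossFn = new HuberLoss(delta: 1.0f, gpuOps: backend as IGpuMathOps);
// batches: placeholder for whatever yields (prediction, target) pairs in your training loop
foreach (var (pred, target) in batches)
{
    float loss = await lossFn.ComputeLossAsync(pred, target);
    var grads = await lossFn.ComputeGradientAsync(pred, target);
    // If grads is the gradient with respect to pred, backpropagate it through the model first
    await optimizer.ApplyGradientsAsync(weights, grads);
}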
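As a point of comparison for Pattern 1, the following is a minimal CPU sketch of an SGD step with momentum. It is not the SgdOptimizer implementation: the class name SgdMomentumReference is hypothetical, and the update convention (velocity = momentum * velocity + grad, then weight -= learningRate * velocity) is an assumption, since the source does not say which momentum variant SgdOptimizer uses.

```csharp
using System;

// Hypothetical CPU reference; not part of the library described above.
public sealed class SgdMomentumReference
{
    private readonly float _learningRate;
    private readonly float _momentum;
    private float[] _velocity = Array.Empty<float>();

    public SgdMomentumReference(float learningRate = 0.01f, float momentum = 0.9f)
    {
        _learningRate = learningRate;
        _momentum = momentum;
    }

    // Updates weights in place using one gradient per weight.
    public void Step(float[] weights, float[] grads)
    {
        if (weights.Length != grads.Length)
            throw new ArgumentException("weights and grads must have the same length.");

        // Allocate the velocity buffer on first use (zero-initialized).
        if (_velocity.Length != weights.Length)
            _velocity = new float[weights.Length];

        for (int i = 0; i < weights.Length; i++)
        {
            _velocity[i] = _momentum * _velocity[i] + grads[i]; // Mul, Add
            weights[i] -= _learningRate * _velocity[i];         // Mul, Add
        }
    }
}
```

A reference like this can be useful for checking GPU results on small inputs, provided the GPU optimizer follows the same convention.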
Pattern 2: With AdamW Optimizer (Recommended)
var optimizer = new AdamWOptimizer(
    learningRate: 0.001f,
    weightDecay: 0.01f,
    gpuOps: backend as IGpuMathOps
);
var lossFn = new HuberLoss(delta: 1.0f, gpuOps: backend as IGpuMathOps);
// batches: placeholder for whatever yields (prediction, target) pairs in your training loop
foreach (var (pred, target) in batches)
{
    float loss = await lossFn.ComputeLossAsync(pred, target);
    var grads = await lossFn.ComputeGradientAsync(pred, target);
    await optimizer.ApplyGradientsAsync(weights, grads);
}
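To make the AdamW step in Pattern 2 concrete, here is a minimal CPU sketch of one update with decoupled weight decay. It is not the AdamWOptimizer implementation: the class name AdamWReference is hypothetical, and the defaults beta1 = 0.9, beta2 = 0.999, and eps = 1e-8 are assumptions taken from the common AdamW formulation rather than from the source.

```csharp
using System;

// Hypothetical CPU reference; not part of the library described above.
public sealed class AdamWReference
{
    private readonly float _lr, _weightDecay, _beta1, _beta2, _eps;
    private float[] _m = Array.Empty<float>(); // first-moment estimate
    private float[] _v = Array.Empty<float>(); // second-moment estimate
    private int _step;

    public AdamWReference(float learningRate = 0.001f, float weightDecay = 0.01f,
                          float beta1 = 0.9f, float beta2 = 0.999f, float eps = 1e-8f)
    {
        _lr = learningRate;
        _weightDecay = weightDecay;
        _beta1 = beta1;
        _beta2 = beta2;
        _eps = eps;
    }

    // Updates weights in place using one gradient per weight.
    public void Step(float[] weights, float[] grads)
    {
        if (weights.Length != grads.Length)
            throw new ArgumentException("weights and grads must have the same length.");

        // Allocate moment buffers on first use (zero-initialized).
        if (_m.Length != weights.Length)
        {
            _m = new float[weights.Length];
            _v = new float[weights.Length];
            _step = 0;
        }

        _step++;
        float correction1 = 1f - MathF.Pow(_beta1, _step); // bias correction for m
        float correction2 = 1f - MathF.Pow(_beta2, _step); // bias correction for v

        for (int i = 0; i < weights.Length; i++)
        {
            _m[i] = _beta1 * _m[i] + (1f - _beta1) * grads[i];
            _v[i] = _beta2 * _v[i] + (1f - _beta2) * grads[i] * grads[i];
            float mHat = _m[i] / correction1;
            float vHat = _v[i] / correction2;

            // Decoupled weight decay: applied to the weight itself, not added to the gradient.
            weights[i] -= _lr * _weightDecay * weights[i];
            weights[i] -= _lr * mHat / (MathF.Sqrt(vHat) + _eps);
        }
    }
}
```

The decay term is kept separate from the adaptive term so that the effective regularization does not scale with the per-parameter step size, which is the main difference from Adam with L2 regularization.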
All Operations Stay on GPU:
Loss computation: All GPU ops (Abs, Where, Sum)
Gradient computation: All GPU ops (Abs, Where, Mul)
Optimizer step: All GPU ops (Mul, Add, Div)
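The element-wise operations listed above can be illustrated with a small CPU reference for the Huber loss and its gradient. This is a sketch of the standard Huber definition, not the library's GPU kernel: the class name HuberReference is hypothetical, and mean reduction over elements is an assumption, since the source only mentions a Sum op.

```csharp
using System;

// Hypothetical CPU reference; not part of the library described above.
public static class HuberReference
{
    // Mean Huber loss: quadratic for |r| <= delta, linear beyond it.
    public static float Loss(float[] pred, float[] target, float delta = 1.0f)
    {
        Validate(pred, target);
        double sum = 0.0;
        for (int i = 0; i < pred.Length; i++)
        {
            float r = pred[i] - target[i];
            float a = Math.Abs(r);                // Abs
            sum += a <= delta                     // Where
                ? 0.5f * r * r
                : delta * (a - 0.5f * delta);
        }
        return (float)(sum / pred.Length);        // Sum, then mean (assumed reduction)
    }

    // Gradient of the mean loss with respect to each prediction.
    public static float[] Gradient(float[] pred, float[] target, float delta = 1.0f)
    {
        Validate(pred, target);
        var grad = new float[pred.Length];
        for (int i = 0; i < pred.Length; i++)
        {
            float r = pred[i] - target[i];
            float g = Math.Abs(r) <= delta        // Abs, Where
                ? r
                : delta * Math.Sign(r);           // Mul
            grad[i] = g / pred.Length;
        }
        return grad;
    }

    private static void Validate(float[] pred, float[] target)
    {
        if (pred.Length != target.Length)
            throw new ArgumentException("pred and target must have the same length.");
        if (pred.Length == 0)
            throw new ArgumentException("inputs must be non-empty.");
    }
}
```

Note that the gradient here is taken with respect to the predictions. Reaching weight gradients still requires propagating it back through whatever model produced pred.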
No CPU Marshaling:
Tensor data is never transferred to the CPU during training
Gradients are computed directly on the GPU
Weights are updated in place on the GPU
Only the scalar loss value is returned to the CPU for monitoring