Automatic Differentiation by OverLoading in C++
The Basic Idea
⇓ Operator overloading (C++)
VectorfunctioninC/C++: F:IRn→IRm:x→y=F(x)
Internal representation of F (≡tape)
Interpretation ⇓
Reverse mode
yj =yj(x0,x1,…,xj) ⇓
∂yj = ∂yj−i ∂xi ∂x0
= Aj−i (x0,x1,…,xj−i) =⇒ Gradients (adjoints)
Forward mode
x(t) = xjtj
y(t)=yjtj +Otd+1
=⇒ Directional derivatives
∂y0 = ∂y1 = ∂y2 ∂x0 ∂x1 ∂x2 ∂y1 = ∂y2 ∂x0 ∂x1
y0 = F (x0)
y1 =F′(x0)x1
y2 = F′ (x0)x2 + 21F′′ (x0)x1x1
y3 =F′(x0)x3+F′′(x0)x1x2+61F′′′(x0)x1x1x1
=A0 =F′(x0)
=A1 =F′′(x0)x1
= A2 = F′′ (x0)x2 + 1F′′′ (x0)x1x1
=A3 =F′′(x0)x3+F′′′(x0)x1x2+1F(4)(x0)x1x1x1 6
= ∂y3 ∂x3 = ∂y3 ∂x2 ∂y2 = ∂y3
∂x1 ∂y3 ∂x0
ADOL-C Automatic Differentiation by OverLoading in C++
Application
Operator overloading concept ⇒ Code modification
• Inclusion of appropriate ADOL-C headers
• Retyping of all involved variables to active data type adouble • Marking active section to be “taped” (trace on/trace off)
• Specification of independent and dependent variables (<<=/>>=) • Specification of differentiation task(s)
• Recompilation and Linking with ADOL-C library libad.a
#include
adouble foo ( adouble x )
{ adouble tmp;
tmp = log(x);
return 3.0*tmp*tmp + 2.0;
int main (int argc, char* argv[])
double x[2], y;
adouble ax[2], ay;
x[0]=0.3; x[1]=2.3;
trace_on(1);
ax[0]<<=x[0]; ax[1]<<=x[1];
ay=ax[0]*sin(ax[1])+ foo(ax[1]);
trace_off();
double g[2];
gradient(1,2,x,g);
x[0]+=0.1; x[1]+=0.3;
gradient(1,2,x,g);
// inlusion of ADOL-C headers
// some activated function
// main program or other procedure
// declaration of active variables
// starting active section
// marking independent variables
// function evaluation
// marking dependend variables
// ending active section
// application of ADOL-C routine
// application at different argument
ADOL-C Automatic Differentiation by OverLoading in C++
Drivers for Optimization and Nonlinear Equations (C/C++)
min f (x) , x
F(x)=0m, function(tag,m,n,x[n],y[m])
gradient(tag,n,x[n],g[n])
hessian(tag,n,x[n],H[n][n])
f : IRn → IR F :IRn →IRm
jacobian(tag,m,n,x[n],J[m][n])
vec jac(tag,m,n,repeat?,x[n],u[m],z[n]) jac vec(tag,m,n,x[n],v[n],z[m])
hess vec(tag,n,x[n],v[n],z[n])
lagra hess vec(tag,m,n,x[n],v[n],u[m],h[n])
jac solv(tag,n,x[n],b[n],sparse?,mode?)
Example: Solution of F (x) = 0 by Newton’s method
F (x0) ∇f (x0)
F ′ (x0) uT F ′ (x0) F ′ (x0) v
∇2f (x0) v uT F ′′ (x0) v
F ′ (x0) w = b
double x[n], r[n];
initialize(x);
function(ftag,n,n,x,r);
while (norm(r) > EPSILON)
{ jac_solv(ftag,n,x,r,0,2);
for (i=0; i
• higher-order vector forward; computes y0 = F (x0), Y1 = F ′ (x0) X1, . . . , where x = x0, X = [X1,X2,…,Xd] and y = y0, Y = [Y1,Y2,…,Yd]
hos forward(tag,m,n,d,keep,x[n],X[n][d],y[m],Y[m][d])
hov forward(tag,m,n,d,p,x[n],X[n][p][d],y[m],Y[m][p][d])
Programming Help
Automatic Differentiation by OverLoading in C++
fos reverse(tag,m,n,u[m],z[n])
Reverse Mode (C/C++)
first-order scalar reverse; computes zT = uT F ′ (x)
after calling zos forward, fos forward, or hos forward with keep = 1
first-order vector reverse; computes Z = UF′ (x)
after calling zos forward, fos forward, or hos forward with keep = 1
higher-order scalar reverse; computes the adjoints z0T = uT F ′ (x0 ) = uT A0 , z1T = uTF′′ (x0)x1 = uTA1, …, where Z = [z0,z1,…,zd]
after calling fos forward or hos forward with keep = d + 1 > 1
higher-order vector reverse; computes the adjoints Z0 = U F ′ (x0 ) = U A0 , Z1 = UF′′ (x0)x1 = UA1, …, where Z = [Z0,Z1,…,Zd]
after calling fos forward or hos forward with keep = d + 1 > 1 optional nonzero pattern nz (⇒ manual)
fov reverse(tag,m,n,q,U[q][m],Z[q][n])
hos reverse(tag,m,n,d,u[m],Z[n][d+1])
hov reverse(tag,m,n,d,q,U[q][m],Z[q][n][d+1],nz[q][n])
double x[n], y[m], **I, **J;
I=myallocI2(m);
J=myalloc2(m,n);
initialize(x);
zos_forward(ftag,m,n,1,x,y);
fos_reverse(ftag,m,n,m,I,J);
// allocation of identity matrix
// allocation of Jacobian matrix
// setting up the argument x
// computing the Jacobian by
// reverse mode of AD
ADOL-C Automatic Differentiation by OverLoading in C++
Low-level Differentiation Routines Forward Mode (C++ interfaces)
forward(tag,m,n,d,keep,X[n][d+1],Y[m][d+1]) hos, fos, zos
forward(tag,m=1,n,d,keep,X[n][d+1],Y[d+1])
forward(tag,m,n,d=0,keep,x[n],y[m])
forward(tag,m,n,keep,x[n],y[m])
forward(tag,m,n,p,x[n],X[n][p],y[m],Y[m][p])
forward(tag,m,n,d,p,x[n],X[n][p][d],
y[m],Y[m][p][d])
Reverse Mode (C++ interfaces)
reverse(tag,m,n,d,u[m],Z[n][d+1])
forward(tag,m=1,n,d,u,Z[n][d+1])
reverse(tag,m,n,d=0,u[m],z[n])
reverse(tag,m=1,n,d=0,u,z[n])
hos, fos, zos zos
zos fov hov
reverse(tag,m,n,d,q,U[q][m],Z[q][n][d+1],nz[q][n]) hov reverse(tag,m=1,n,d,q,U[q],Z[q][n][d+1],nz[q][n]) hov reverse(tag,m=1,n,d,Z[m][n][d+1],nz[m][n])(U=Im) hov
reverse(tag,m,n,d=0,q,U[q][m],Z[q][n]) fov reverse(tag,m,n,q,U[q][m],Z[q][n] fov reverse(tag,m=1,n,d=0,q,U[q],Z[q][n]) fov
ADOL-C Automatic Differentiation by OverLoading in C++
Drivers for Ordinary Differential Equations (C/C++)
ODE: x′ (t) = y(t) = F (x(t)), x(0) = x0
recursive forward computation of xd +1,…,xd from x0,…,xd (by xi+1 = 1 yi) old old 1+i
application with dold = 0 delivers truncated Taylor series d0 xjtj at base point x0
reverse computation of Aj = ∂yj , j = 0, . . . , d after calling forodec with degree d ∂x0
optional nonzero pattern nz (⇒ manual)
accumulation of total derivatives Bj = dxj , j = 0, . . . , d from the partial derivatives dx0
Aj = ∂yj , j = 0,…,d after calling hov reverse ∂x0
optional nonzero pattern nz (⇒ manual)
Special C++ interfaces can be found in file SRC/DRIVERS/odedrivers.h!
double x[n], **I, **X, ***A, ***B;
I=myallocI2(n); // allocation of identity matrix
X=myalloc2(n,5); // allocation of matrix X
A=myalloc3(n,n,4); B=myalloc3(n,n,4); // allocation of tensors A and B
initialize(X);
forodec(ftag,n,1.0,0,4,X);
hov_reverse(ftag,n,n,3,n,I,A,NULL);
accodec(ftag,n,1.0,3,A,B,NULL);
forodec(tag,n,tau,dold,d,X[n][d+1])
C++: Example:
hov reverse(tag,n,n,d-1,n,I[n][n],A[n][n][d],nz[n][n])
accodec(n,tau,d-1,A[n][n][d],B[n][n][d],nz[n][n])
// setting up the argument x_0
// compute x_1,…,x_4
// compute A_0,…,A_3
// accumulate B_0,…,B_3
浙大学霸代写 加微信 cstutorcs
Automatic Differentiation by OverLoading in C++
ADOL-C provides
• Low-level differentiation routines (forward/reverse) • Easy-to-use driver routines for
– the solution of optimization problems and nonlinear equations – the integration of ordinary differential equations
– the evaluation of higher derivative tensors (⇒ manual)
• Derivatives of implicit and inverse functions (⇒ manual) • Forward and backward dependence analysis (⇒ manual)
Recent developments
• Efficient detection of Jacobian/Hessian sparsity structure
• Exploitation of Jacobian/Hessian sparsity by matrix compression • Integration of checkpointing routines
• Exploitation of fixpoint iterations
• Differentiation of OpenMP parallel programs
Future developments
• Internal optimizations to reduce storage needed for reverse mode • Recovery of structure for internal function representation
• Differentiation of MPI parallel programs
Contact/Resources
• WWW: http://www.coin-or.org/projects/ADOL-C.xml