Prev Next

exp_eps: Second Order Reverse Sweep

Purpose
In general, a second order reverse sweep is given the first order expansion for all of the variables in an operation sequence. Given a choice of a particular variable, it computes the derivative, of that variables first order expansion coefficient, with respect to all of the independent variables.

Mathematical Form
Suppose that we use the algorithm exp_eps.hpp to compute exp_eps(xepsilon) with x is equal to .5 and epsilon is equal to .2. For this case, the mathematical function for the operation sequence corresponding to exp_eps is  \[
f ( x , \varepsilon ) =   1 + x + x^2 / 2  
\] 
The corresponding derivative of the partial derivative with respect to  x is  \[
\begin{array}{rcl}
\Dpow{2}{x} f ( x , \varepsilon ) & = &  1
\\
\partial_\varepsilon \partial_x f ( x , \varepsilon ) & = &  0
\end{array}
\] 


epsilon
Since  \varepsilon is an independent variable, it could included as an argument to all of the  f_j functions below. The result would be that all the partials with respect to  \varepsilon would be zero and hence we drop it to simplify the presentation.

f_7
In reverse mode we choose one dependent variable and compute its derivative with respect to all the independent variables. For our example, we chose the value returned by exp_eps.hpp which is  v_7 . We begin with the function  f_7 where  v_7 is both an argument and the value of the function; i.e.,  \[
\begin{array}{rcl}
f_7 \left( 
     v_1^{(0)} , v_1^{(1)} , \ldots , v_7^{(0)} , v_7^{(1)} 
\right) 
& = & v_7^{(1)} 
\\
\D{f_7}{v_7^{(1)}} & = & 1
\end{array}
\] 
All the other partial derivatives of  f_7 are zero.

Index 7: f_6
The last operation has index 7,  \[
\begin{array}{rcl}
     v_7^{(0)} & = &   v_4^{(0)} + v_6^{(0)}  
     \\
     v_7^{(1)} & = &   v_4^{(1)} + v_6^{(1)}  
\end{array}
\] 
We define the function  f_6 \left( v_1^{(0)} , \ldots , v_6^{(1)} \right)  as equal to  f_7 except that  v_7^{(0)} and  v_7^{(1)} are eliminated using this operation; i.e.  \[
f_6  = 
f_7 \left[ v_1^{(0)} , \ldots , v_6^{(1)} , 
     v_7^{(0)} \left( v_4^{(0)} , v_6^{(0)} \right)  ,
     v_7^{(1)} \left( v_4^{(1)} , v_6^{(1)} \right)  
\right]
\] 
It follows that  \[
\begin{array}{rcll}
\D{f_6}{v_4^{(1)}} 
& = & \D{f_7}{v_4^{(1)}} + 
     \D{f_7}{v_7^{(1)}} * \D{v_7^{(1)}}{v_4^{(1)}} 
& = 1
\\
\D{f_6}{v_6^{(1)}} 
& = & \D{f_7}{v_6^{(1)}} + 
     \D{f_7}{v_7^{(1)}} * \D{v_7^{(1)}}{v_6^{(1)}} 
& = 1
\end{array}
\] 
All the other partial derivatives of  f_6 are zero.

Index 6: f_5
The previous operation has index 6,  \[
\begin{array}{rcl}
     v_6^{(0)} & = & v_5^{(0)} / 2
     \\
     v_6^{(1)} & = & v_5^{(1)} / 2
\end{array}
\] 
We define the function  f_5 \left( v_1^{(0)} , \ldots , v_5^{(1)} \right)  as equal to  f_6  except that  v_6^{(0)} and  v_6^{(1)} are eliminated using this operation; i.e.  \[
f_5 = 
f_6 \left[ v_1^{(0)} , \ldots , v_5^{(1)} , 
     v_6^{(0)} \left( v_5^{(0)} \right) ,
     v_6^{(1)} \left( v_5^{(1)} \right) 
\right]
\] 
It follows that  \[
\begin{array}{rcll}
\D{f_5}{v_4^{(1)}} 
& = & \D{f_6}{v_4^{(1)}} 
& = 1
\\
\D{f_5}{v_5^{(1)}} 
& = & \D{f_6}{v_5} + 
     \D{f_6}{v_6^{(1)}} * \D{v_6^{(1)}}{v_5^{(1)}} 
& = 0.5
\end{array}
\] 
All the other partial derivatives of  f_5 are zero.

Index 5: f_4
The previous operation has index 5,  \[
\begin{array}{rcl}
     v_5^{(0)} & = & v_3^{(0)} * v_1^{(0)}
     \\
     v_5^{(1)} & = & v_3^{(1)} * v_1^{(0)} + v_3^{(0)} * v_1^{(1)}
\end{array}
\] 
We define the function  f_4 \left( v_1^{(0)} , \ldots , v_4^{(1)} \right)  as equal to  f_5  except that  v_5^{(0)} and  v_5^{(1)} are eliminated using this operation; i.e.  \[
f_4 = 
f_5 \left[  v_1^{(0)} , \ldots , v_4^{(1)} ,
     v_5^{(0)} \left( v_1^{(0)}, v_3^{(0)} \right) ,
     v_5^{(1)} \left( v_1^{(0)}, v_1^{(1)}, v_3^{(0)} , v_3^{(1)} \right) ,
\right]
\] 
Given the information from the forward sweep, we have  v_1^{(0)} =  0.5 ,  v_3^{(0)} =  0.5 ,  v_1^{(1)} =  1 ,  v_3^{(1)} =  1 , and the fact that the partial of  f_5 with respect to  v_5^{(0)} is zero, we have  \[
\begin{array}{rcll}
\D{f_4}{v_1^{(0)}}
& = & \D{f_5}{v_1^{(0)}}
  +   \D{f_5}{v_5^{(1)}} * \D{v_5^{(1)}}{v_1^{(0)}} 
& = 0.5
\\
\D{f_4}{v_1^{(1)}}
& = & \D{f_5}{v_1^{(1)}}
  +   \D{f_5}{v_5^{(1)}} * \D{v_5^{(1)}}{v_1^{(1)}} 
& = 0.25
\\
\D{f_4}{v_3^{(0)}}
& = & \D{f_5}{v_3^{(0)}}
  +   \D{f_5}{v_5^{(1)}} * \D{v_5^{(1)}}{v_3^{(0)}} 
& = 0.5
\\
\D{f_4}{v_3^{(1)}}
& = & \D{f_3}{v_1^{(1)}}
  +   \D{f_5}{v_5^{(1)}} * \D{v_5^{(1)}}{v_3^{(1)}} 
& = 0.25
\\
\D{f_4}{v_4^{(1)}}
& = & \D{f_5}{v_4^{(1)}}
& = 1
\end{array}
\] 
All the other partial derivatives of  f_5 are zero.

Index 4: f_3
The previous operation has index 4,  \[
\begin{array}{rcl}
     v_4^{(0)} = 1 + v_3^{(0)}
     \\
     v_4^{(1)} = v_3^{(1)}
\end{array}
\] 
We define the function  f_3 \left( v_1^{(0)} , \ldots , v_3^{(1)} \right)  as equal to  f_4  except that  v_4^{(0)} and  v_4^{(1)} are eliminated using this operation; i.e.  \[
f_3 = 
f_4 \left[ v_1^{(0)} , \ldots , v_3^{(1)} ,
     v_4^{(0)} \left( v_3^{(0)} \right) ,
     v_4^{(1)} \left( v_3^{(1)} \right)  
\right]
\] 
It follows that  \[
\begin{array}{rcll}
\D{f_3}{v_1^{(0)}} 
& = & \D{f_4}{v_1^{(0)}}
& =  0.5
\\
\D{f_3}{v_1^{(1)}} 
& = & \D{f_4}{v_1^{(1)}}
& =  0.25
\\
\D{f_3}{v_2^{(0)}} 
& = & \D{f_4}{v_2^{(0)}}  
& = 0
\\
\D{f_3}{v_2^{(1)}} 
& = & \D{f_4}{v_2^{(1)}}  
& = 0
\\
\D{f_3}{v_3^{(0)}} 
& = & \D{f_4}{v_3^{(0)}} 
  +   \D{f_4}{v_4^{(0)}} * \D{v_4^{(0)}}{v_3^{(0)}} 
& = 0.5
\\
\D{f_3}{v_3^{(1)}} 
& = & \D{f_4}{v_3^{(1)}} 
  +   \D{f_4}{v_4^{(1)}} * \D{v_4^{(1)}}{v_3^{(1)}} 
& = 1.25
\end{array}
\] 


Index 3: f_2
The previous operation has index 3,  \[
\begin{array}{rcl}
     v_3^{(0)} & = & v_2^{(0)} / 1
     \\
     v_3^{(1)} & = & v_2^{(1)} / 1
\end{array}
\] 
We define the function  f_2 \left( v_1^{(0)} , \ldots , v_2^{(1)} \right)  as equal to  f_3  except that  v_3^{(0)} and  v_3^{(1)} are eliminated using this operation; i.e.  \[
f_2 = 
f_3 \left[ v_1^{(0)} , \ldots , v_2^{(1)} ,
     v_3^{(0)} \left( v_2^{(0)} \right) ,
     v_3^{(1)} \left( v_2^{(1)} \right)
\right]
\] 
It follows that  \[
\begin{array}{rcll}
\D{f_2}{v_1^{(0)}} 
& = & \D{f_3}{v_1^{(0)}}
& =  0.5
\\
\D{f_2}{v_1^{(1)}} 
& = & \D{f_3}{v_1^{(1)}}
& =  0.25
\\
\D{f_2}{v_2^{(0)}} 
& = & \D{f_3}{v_2^{(0)}}  
  +   \D{f_3}{v_3^{(0)}} * \D{v_3^{(0)}}{v_2^{(0)}}
& = 0.5
\\
\D{f_2}{v_2^{(1)}} 
& = & \D{f_3}{v_2^{(1)}}  
  +   \D{f_3}{v_3^{(1)}} * \D{v_3^{(1)}}{v_2^{(0)}}
& = 1.25
\end{array}
\] 


Index 2: f_1
The previous operation has index 1,  \[
\begin{array}{rcl}
     v_2^{(0)} & = & 1 * v_1^{(0)}
     \\
     v_2^{(1)} & = & 1 * v_1^{(1)}
\end{array}
\] 
We define the function  f_1 \left( v_1^{(0)} , v_1^{(1)} \right)  as equal to  f_2  except that  v_2^{(0)} and  v_2^{(1)} are eliminated using this operation; i.e.  \[
f_1 = 
f_2 \left[  v_1^{(0)} , v_1^{(1)} , 
     v_2^{(0)} \left( v_1^{(0)} \right)  ,
     v_2^{(1)} \left( v_1^{(1)} \right)  
\right]
\] 
It follows that  \[
\begin{array}{rcll}
\D{f_1}{v_1^{(0)}} 
& = & \D{f_2}{v_1^{(0)}}
  +   \D{f_2}{v_2^{(0)}} * \D{v_2^{(0)}}{v_1^{(0)}}
& =  1
\\
\D{f_1}{v_1^{(1)}} 
& = & \D{f_2}{v_1^{(1)}}
  +   \D{f_2}{v_2^{(1)}} * \D{v_2^{(1)}}{v_1^{(1)}}
& = 1.5
\end{array}
\] 
Note that  v_1 is equal to  x , so the second partial derivative of exp_eps(xepsilon) at x equal to .5 and epsilon equal .2 is  \[
\Dpow{2}{x} v_7^{(0)}
= \D{v_7^{(1)}}{x} 
= \D{f_1}{v_1^{(0)}}
= 1
\] 
There is a theorem about algorithmic differentiation that explains why the other partial of  f_1 is equal to the first partial of exp_eps(xepsilon) with respect to  x .

Verification
The file exp_eps_rev2.cpp contains a routine that verifies the values computed above. It returns true for success and false for failure. It only tests the partial derivatives of  f_j that might not be equal to the corresponding partials of  f_{j+1} ; i.e., the other partials of  f_j must be equal to the corresponding partials of  f_{j+1} .

Exercises
  1. Consider the case where  x = .1 and we first preform a zero order forward mode sweep for the operation sequence used above (in reverse order). What are the results of a first order reverse mode sweep; i.e., what are the corresponding values for  \D{f_j}{v_k} for all  j, k such that  \D{f_j}{v_k} \neq 0 .
  2. Create a modified version of exp_eps_rev2.cpp that verifies the values you obtained for the previous exercise. Also create and run a main program that reports the result of calling the modified version of exp_eps_rev2.cpp .

Input File: introduction/exp_apx/exp_eps.omh