This package contains several two- and three-dimensional ragged array structures (triangular matrices and arrays of triangular matrices).
The intended domain of application is the Libor process.
To simulate the time step T_t -> T_{t+1} of the forward Libors we need the matrix CV[t] with entries

CV[t]_{ij} = int_{T_t}^{T_{t+1}} sigma_i(s) sigma_j(s) rho_{ij} ds

and its Cholesky root L[t]. These matrices are path independent, so path simulation can be sped up by storing them in the LiborProcess object. Since L[t] is lower triangular and CV[t] is symmetric, we have the option of storing only the lower triangular half of these matrices.
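The Cholesky root can be computed directly on the lower triangular storage scheme. The following is a minimal sketch, assuming ragged arrays that hold only the lower triangular half (row i has entries 0..i); the class and method names are illustrative and not part of this package:

```java
public class CholeskySketch {

    // Compute the lower triangular Cholesky root L with C = L * L^T.
    // C and L are stored as ragged arrays holding only the lower
    // triangular half: row i has the i+1 entries C[i][0..i].
    public static double[][] choleskyRoot(double[][] C) {
        int n = C.length;
        double[][] L = new double[n][];
        for (int i = 0; i < n; i++) {
            L[i] = new double[i + 1];
            for (int j = 0; j <= i; j++) {
                double sum = C[i][j];
                for (int k = 0; k < j; k++) sum -= L[i][k] * L[j][k];
                L[i][j] = (i == j) ? Math.sqrt(sum) : sum / L[j][j];
            }
        }
        return L;
    }

    public static void main(String[] args) {
        // C = [[4,2],[2,3]] stored by its lower triangular half.
        double[][] C = { {4.0}, {2.0, 3.0} };
        double[][] L = choleskyRoot(C);
        System.out.println(L[0][0] + " " + L[1][0] + " " + L[1][1]);
    }
}
```

Note that the row-by-row computation touches only entries that are actually stored, so no square matrix with zeroes is ever allocated.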
However, in the simulation of the time steps the natural index range for the matrix entries is not zero based. Moreover, matrix multiplication could be used to formulate the time step in vector form. A path simulation traverses the sequence of these matrices in a three-dimensional loop and is consequently computationally intensive. Speed is thus of the utmost importance.
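To illustrate the role of L[t] in such a vector-form time step: the Cholesky root turns a vector Z of independent standard normal draws into correlated driving increments V = L[t] Z. The sketch below shows only this piece; drift terms and the actual Libor dynamics are omitted, and all names are illustrative assumptions, not the package's API:

```java
import java.util.Random;

public class TimeStepSketch {

    // Correlated driving increments V = L * Z, where L is the lower
    // triangular Cholesky root of CV[t] (stored as a ragged array of
    // rows of lengths 1..n) and Z holds independent standard normals.
    public static double[] correlatedIncrements(double[][] L, double[] Z) {
        int n = L.length;
        double[] V = new double[n];
        for (int i = 0; i < n; i++) {               // row by row: follows memory layout
            double sum = 0.0;
            for (int j = 0; j <= i; j++) sum += L[i][j] * Z[j];
            V[i] = sum;
        }
        return V;
    }

    public static void main(String[] args) {
        double[][] L = { {2.0}, {1.0, Math.sqrt(2.0)} };
        Random rng = new Random(42);
        double[] Z = { rng.nextGaussian(), rng.nextGaussian() };
        double[] V = correlatedIncrements(L, Z);
        System.out.println(V[0] + " " + V[1]);
    }
}
```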
The desire to store the sequence of these matrices in a custom data structure (rather than a raw Java array) stems from several considerations:
The finding of (naive) experiments ({@link Examples.Array.ArrayTiming}) is that raw Java arrays provide the best performance. Contiguous memory allocation has very little beneficial effect, and even awkward index arithmetic such as in the class {@link ArrayClasses.UTRContiguousArray} does not impose a significant penalty, but accessor function calls add significant overhead. Quite possibly accessor functions obscure the way in which loops traverse memory, so that compiler optimizations are no longer possible.
Other schemes, such as storing the base address (reference) of the subarray entered by a loop before the loop is entered, thereby reducing index calculations inside the loop, also have no beneficial effect and can even be harmful.
However, reducing the size of the data has a significant effect. For example, if triangular matrices are stored as square matrices padded with zeroes (approximately doubling the size), execution slows down by a factor of nearly two. This happens even if only the nonzero entries are accessed and both the smaller and the larger structure are far too large to fit into the L2 cache.
The upshot is that raw Java arrays and straightforward loops provide the fastest performance, and the size of these arrays should be kept as small as the problem allows.
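For illustration, the index arithmetic involved in contiguous triangular storage looks roughly as follows. This is a sketch of the general idea only, not the actual code of {@link ArrayClasses.UTRContiguousArray}; the class and method names are made up:

```java
public class LTRPacked {

    private final double[] data;   // row-major lower triangle, length n(n+1)/2
    private final int n;

    public LTRPacked(int n) {
        this.n = n;
        this.data = new double[n * (n + 1) / 2];
    }

    // Entry (i,j) with 0 <= j <= i < n sits at offset i(i+1)/2 + j:
    // rows 0..i-1 occupy 1 + 2 + ... + i = i(i+1)/2 slots before row i.
    public double get(int i, int j) { return data[i * (i + 1) / 2 + j]; }

    public void set(int i, int j, double x) { data[i * (i + 1) / 2 + j] = x; }

    public static void main(String[] args) {
        LTRPacked A = new LTRPacked(3);
        A.set(2, 1, 7.0);                  // offset 2*3/2 + 1 = 4
        System.out.println(A.get(2, 1));   // prints 7.0
    }
}
```

The flat array halves the memory footprint relative to padded square storage, but as the experiments above indicate, the extra multiplications in the index computation cost little; it is the accessor calls themselves that carry the overhead.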
Consequently the structures {@link ArrayClasses.LTRMatrixArray} and {@link ArrayClasses.UTRMatrixArray} have been chosen to implement the above mentioned matrix sequences in the LiborProcess (see {@link Libor.LiborProcess.FactorLoading}).
The classes {@link ArrayClasses.LowerTriangularArray} and
{@link ArrayClasses.UpperTriangularArray}
are also used in the {@link Libor.LiborProcess}.
The other classes are here merely for speed experiments.
Loops should follow the layout of arrays in memory: two-dimensional Java arrays A which are allocated with repeated new are laid out in memory row by row. Make sure each loop traverses the array in this manner. In other words, assume A was allocated as

double[][] A = new double[n][];
for(int i = 0; i < n; i++) A[i] = new double[n];

Then loop over it row by row,

for(int i = 0; i < n; i++)
    for(int j = 0; j < n; j++){ ...A[i][j]... }

and avoid the column-by-column traversal

for(int j = 0; j < n; j++)
    for(int i = 0; i < n; i++){ ...A[i][j]... }

even if this corresponds to the natural order of the problem. Just think about how the second loop jumps around in memory and how blocks will be moved back and forth between main memory and the cache with minimal use of the data in each block.
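The effect can be observed with a small timing sketch. Absolute and relative timings depend on the JVM, the array size, and the cache sizes, so the numbers printed are illustrative only:

```java
public class TraversalOrder {

    // Row-by-row traversal: follows the memory layout of the ragged array.
    static double rowMajorSum(double[][] A) {
        double s = 0.0;
        for (int i = 0; i < A.length; i++)
            for (int j = 0; j < A[i].length; j++) s += A[i][j];
        return s;
    }

    // Column-by-column traversal: jumps from row to row on every access.
    static double columnMajorSum(double[][] A) {
        double s = 0.0;
        int n = A.length;
        for (int j = 0; j < n; j++)
            for (int i = 0; i < n; i++) s += A[i][j];
        return s;
    }

    public static void main(String[] args) {
        int n = 2000;
        double[][] A = new double[n][];
        for (int i = 0; i < n; i++) A[i] = new double[n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) A[i][j] = i + j;

        long t0 = System.nanoTime();
        double rowSum = rowMajorSum(A);
        long t1 = System.nanoTime();
        double colSum = columnMajorSum(A);
        long t2 = System.nanoTime();

        System.out.println("row-major:    " + (t1 - t0) / 1e6 + " ms");
        System.out.println("column-major: " + (t2 - t1) / 1e6 + " ms");
        System.out.println("sums agree: " + (rowSum == colSum));
    }
}
```

Both loops compute the same sum; only the order in which memory is touched differs, which is exactly what the cache argument above is about.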