Adam Florence
Research
Fast Guass Transform
C Implementation

Van Loan and I have submitted a paper to the SIAM Journal on Matrix Analysis and Applications. The paper presents numerical results which were generated using the code on this page.

There is no warranty of any kind on this code or its documentation. All code is copyrighted (c) 2000, 2001 by Adam Florence.

Download the code as a zipped file. The zip file contains: README.txt, fgt.c, fgt.h, fgt.m, error1.m, timing.c, and timing.h. The README.txt more or less duplicates the instructions here.

The code can be run in three ways: as a stand-alone program, included in another program, or called from Matlab.

  1. To run the code stand-alone, compile fgt.c, telling the compiler to define FGT_STAND_ALONE. For example, using GCC, you would type
    gcc -lm -DFGT_STAND_ALONE fgt.c
    
    (The -lm flag includes the math library.) When the code is run, it will set up a small number of sources and targets, and compute the FGT.

  2. To include the code in another program, no macros need to be defined. The code exports one typedef and two functions:
    typedef unsigned long int flops;
    
    /************************************************************************
    fgt_prelim
    
    Given the user's error tolerance epsilon, either:
    1. Determines the values of p and r to achieve that maximum error.
       Also determines a bound on the relative error in each element
       of the output. In this case, returns 0.
    2. If the user's requested error tolerance is too tight, determines good
       values of p and r to produce a small error. Determines a bound on the
       relative error. In this case, returns 1.
    
    Input:  double delta   : standard deviation, 0 < delta < 1/4
            int d          : number of dimensions, >= 1
            double epsilon : desired error bound, 0 < epsilon < 1
    Output: double* R      : no-name parameter
            int* S         : number of boxes in each direction, >= 1
            int* p         : number of terms kept, >= 2
            int* r         : interaction radius, >= 0
            double* error  : error bound for |E_H| + |E_T| + |E_r|
            flops* fl      : updated flop count
    Return: 0 if the error tolerance epsilon can be met, 1 if not
    ************************************************************************/
    int fgt_prelim(double delta, int d, double epsilon, double* R, int* S,
      int* p, int* r, double* error, flops* fl);
    
    /************************************************************************
    Fast Gauss Transform
    
    Perform the FGT. See my paper for a description of the algorithm.
    
    For both the sources and the targets, point i subscript j must be at
    (i * d + j).
    
    Consumes exactly
        (M+N) ( (3 d + 2) p^d + 6 d + 1 )
      + 2 p^d S^d (2r+1)^d + d (2r+1)^d + d p^d
      + S^d ( p^d + ceil( p^d / 2) + d + 1)
      + 20 r p + 2 r + 10 p + 5
    flops.
    
    Input:  double* s    : array of N sources, each in d dimensions
            double* q    : array of length N, magnitudes of sources
            double* t    : array of M targets, each in d dimensions
            double delta : standard deviation, 0 < delta < 1/4
            int d        : number of dimensions, >= 1
            double R     : no-name parameter
            int S        : number of boxes in each direction, >= 1
            int p        : number of terms kept in expansions, >= 2
            int r        : interaction radius, >= 0
            int N        : number of sources, >= 1
            int M        : number of targets, >= 1
    Output: double* y    : array of length M, approximate FGT
            flops* fl    : updated flop count
    Return: int          : always returns 0
    ************************************************************************/
    int fgt(double *s, double *q, double *t, double delta, int d, double R,
      int S, int p, int r, int N, int M, double* y, flops* fl);
    
    See the paper for definitions of these parameters. To compute a FGT, first call fgt_prelim to determine the parameters R, S, p, r and the error bound error. Then, call fgt to compute the transform.

    See timing.c for an example of how to set up the sources and targets, and compute a FGT.

  3. To call the code from Matlab, you must first compile it using the Matlab command mex. (For more information about the Matlab compiler, see the on-line Matlab documentation.) Make sure that the Matlab files matrix.h and mex.h are in your include path. Then type:
    mex -DFGT_MATLAB fgt.c
    
    mex produces a file called fgt.dll, which can be called directly from Matlab. See error1.m for an example.

    If you type "help fgt" in Matlab, the contents of fgt.m will be displayed.

The file timing.c was used to generate the numerical results in the paper. It is a good example of how to call the FGT code from another program.

If you have any questions about the code or these instructions, please e-mail me.


Back to my FGT page.

Back to my home page.

Last updated 13 August 2001.