问题
I am creating a ruby wrapper for the fftw3 library for the Scientific Ruby Foundation which uses nmatrix objects instead of regular ruby arrays.
I have a curious problem in returning the transformed array in that I am not sure how to do this so I can check the transform has been computed correctly against octave or (something like this) in my specs
I have an idea that I might be best to cast the output array out
which is an fftw_complex
type to a VALUE to pass it to the nmatrix object before returning but I am not sure whether I should be using a wisdom and getting the values from that with fftw.
Here is the method and the link to the spec output on travis-ci
static VALUE
fftw_r2c_one(VALUE self, VALUE nmatrix)
{
VALUE cNMatrix = rb_define_class("NMatrix", rb_cObject);
fftw_plan plan;
VALUE shape = rb_funcall(nmatrix, rb_intern("shape"), 0);
const int size = NUM2INT(rb_funcall(cNMatrix, rb_intern("size"), 1, shape));
double* in = ALLOC_N(double, size);
for (int i = 0; i < size; i++)
{
in[i] = NUM2DBL(rb_funcall(nmatrix, rb_intern("[]"), 1, INT2FIX(i)));
printf("IN[%d]: in[%.2f] \n", i, in[i]);
}
fftw_complex* out = (fftw_complex *) fftw_malloc(sizeof(fftw_complex) * size + 1);
plan = fftw_plan_dft_r2c(1,&size, in, out, FFTW_ESTIMATE);
fftw_execute(plan);
fftw_destroy_plan(plan);
xfree(in);
fftw_free(out);
return nmatrix;
}
Feel free to clone the repo from github and have a play about, if you like.
Note: I am pretty new to fftw3
and have not used C
(or ruby
) much, before starting this project. I had got more used to java
, python
and javascript
to date so haven't quite got my head around lower level concepts like memory management but am getting the with this project. Please bear that in mind in your answers, and try to see that they are clear for someone and who up to recently has mainly got used to an object orientated approach up to now by avoiding jargon (or taking care to point it out) as that would really help.
Thank you.
回答1:
I got some advice from Colin Fuller and after some pointers from him I came up with this solution:
VALUE fftw_complex_to_nm_complex(fftw_complex* in) {
double real = ((double (*)) in)[1];
double imag = ((double (*)) in)[2];
VALUE mKernel = rb_define_module("Kernel");
return rb_funcall(mKernel,
rb_intern("Complex"),
2,
rb_float_new(real),
rb_float_new(imag));
}
/**
fftw_r2c
@param self
@param nmatrix
@return nmatrix
With FFTW_ESTIMATE as a flag in the plan,
the input and and output are not overwritten at runtime
The plan will use a heuristic approach to picking plans
rather than take measurements
*/
static VALUE
fftw_r2c_one(VALUE self, VALUE nmatrix)
{
/**
Define and initialise the NMatrix class:
The initialisation rb_define_class will
just retrieve the NMatrix class that already exists
or define a new class altogether if it does not
find NMatrix. */
VALUE cNMatrix = rb_define_class("NMatrix", rb_cObject);
fftw_plan plan;
const int rank = rb_iv_set(self, "@rank", 1);
// shape is a ruby array, e.g. [2, 2] for a 2x2 matrix
VALUE shape = rb_funcall(nmatrix, rb_intern("shape"), 0);
// size is the number of elements stored for a matrix with dimensions = shape
const int size = NUM2INT(rb_funcall(cNMatrix, rb_intern("size"), 1, shape));
double* in = ALLOC_N(double, size);
fftw_complex* out = (fftw_complex *) fftw_malloc(sizeof(fftw_complex) * size * size);
for (int i = 0; i < size; i++)
{
in[i] = NUM2DBL(rb_funcall(nmatrix, rb_intern("[]"), 1, INT2FIX(i)));;
}
plan = fftw_plan_dft_r2c(1,&size, in, out, FFTW_ESTIMATE);
fftw_execute(plan);
for (int i = 0; i < 2; i++)
{
rb_funcall(nmatrix, rb_intern("[]="), 2, INT2FIX(i), fftw_complex_to_nm_complex(out + i));
}
// INFO: http://www.fftw.org/doc/New_002darray-Execute-Functions.html#New_002darray-Execute-Functions
fftw_destroy_plan(plan);
xfree(in);
fftw_free(out);
return nmatrix;
}
The only problem which remains it getting the specs to recognise the output types which I am looking at solving in the ruby core Complex API
回答2:
If you want to see any performance benefit from using FFTW then you'll need to re-factor this code so that plan generation is performed only once for a given FFT size, since plan generation is quite costly, while executing the plan is where the performance gains come from.
You could either
a) have two entry points - an initialisation routine which generates the plan and then a main entry point which executes the plan
b) use a memorization technique so that you only generate the plan once, the first time you are called for a given FFT dimension, and then you cache the plan for subsequent re-use.
The advantage of b) is that it is a cleaner implementation with a single entry point; the disadvantage being that it breaks if you call the function with dimensions that change frequently.
来源:https://stackoverflow.com/questions/25411643/return-transformed-nmatrix-array-with-fftw3