We generate graphs for huge datasets. We are talking 4096 samples per second, and 10 minutes per graph. A simple calculation gives 4096 * 60 * 10 = 2457600 samples per line graph.
This makes we render about 25M samples in a single screen.
No you don't, not unless you've got a really, really large screen. Given that the screen resolution is probably more like 1,000 to 2,000 pixels across, you really ought to consider decimating the data before you graph it. Graphing a hundred lines at 1,000 points per line probably won't be much of a problem, performance-wise.
I'd like to comment on your assertion that you cannot omit samples, on the back of tgamblin's answer.
You should think of the data that you're drawing to the screen as a sampling problem. You're talking about 2.4M points of data, and you're trying to draw that to a screen that is only a few thousand points across (at least I am assuming that it is, since you're worried about 30 fps refresh rates).
So that means that for every pixel on the x axis you're rendering on the order of 1,000 points that you don't need to. Even if you do go down the path of utilising your GPU (e.g. through the use of OpenGL), that is still a great deal of work that the GPU needs to do for lines that aren't going to be visible.
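As a rough check of that figure, here is the arithmetic written out. The sample rate and duration come from the question; the 2,000 pixel plot width is only my assumption.

```python
SAMPLE_RATE_HZ = 4096      # from the question
DURATION_S = 10 * 60       # 10 minutes per graph, from the question
PLOT_WIDTH_PX = 2000       # assumed width of the plot area

samples_per_line = SAMPLE_RATE_HZ * DURATION_S
samples_per_pixel = samples_per_line / PLOT_WIDTH_PX

print(samples_per_line)    # 2457600
print(samples_per_pixel)   # 1228.8
```

So each pixel column has to represent a bit over a thousand samples of a single line.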
A technique that I have used for presenting sample data is to generate a set of data that is a subset of the whole set, just for rendering. For a given pixel on the x axis (i.e. a given x axis screen coordinate) you need to render an absolute maximum of 4 points - that is the minimum y, maximum y, leftmost y and rightmost y. That will render all of the information that can be usefully rendered. You can still see the minima and maxima, and you retain the relationship to the neighbouring pixels.
With this in mind, you can work out the number of samples that will fall into the same pixel in the x axis (think of them as data "bins"). Within a given bin, you can then determine the particular samples for maxima, minima etc.
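A minimal sketch of that binning, in plain Python for clarity rather than speed. The function name `decimate_for_display` and the way bin edges are computed are my own choices, not something from a particular library.

```python
def decimate_for_display(samples, width_px):
    """Keep at most 4 points per pixel column: first, min, max and last.

    Returns (index, value) pairs in index order, so they can be drawn
    as one polyline. Hypothetical helper, for illustration only.
    """
    n = len(samples)
    if n == 0 or width_px <= 0:
        return []
    points = []
    for px in range(width_px):
        # Half-open range of sample indices that fall into this column.
        start = px * n // width_px
        end = (px + 1) * n // width_px
        if start >= end:
            continue  # more pixels than samples: nothing lands here
        lo = min(range(start, end), key=samples.__getitem__)
        hi = max(range(start, end), key=samples.__getitem__)
        # A set drops duplicates, e.g. when the first sample is also the min.
        keep = sorted({start, lo, hi, end - 1})
        points.extend((i, samples[i]) for i in keep)
    return points


demo = [0, 5, -3, 2, 1, 1, 9, 0]
print(decimate_for_display(demo, 2))
# [(0, 0), (1, 5), (2, -3), (3, 2), (4, 1), (6, 9), (7, 0)]
```

The output never exceeds 4 * width_px points, however long the input is, and every spike in the raw data still shows up as a min or max in its column.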
To reiterate, this is only a subset that is used for display, and it is only valid until the display parameters change. For example, if the user scrolls the graph or zooms, you need to recalculate the render subset.
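One way to handle that is to cache the render subset together with the view parameters it was built for, and rebuild only when they change. A rough sketch follows; `RenderSubset` and `min_max_per_bin` are hypothetical names, and the reduction is a simplified min/max per pixel column.

```python
def min_max_per_bin(samples, width_px):
    """Simplified reduction: one (min, max) pair per non-empty pixel column."""
    n = len(samples)
    out = []
    for px in range(width_px):
        chunk = samples[px * n // width_px:(px + 1) * n // width_px]
        if chunk:
            out.append((min(chunk), max(chunk)))
    return out


class RenderSubset:
    """Caches the reduced data for the current view (illustrative only)."""

    def __init__(self, samples, reduce_fn):
        self._samples = samples
        self._reduce = reduce_fn
        self._key = None       # (first, last, width_px) of the cached view
        self._points = None
        self.rebuilds = 0      # counts how often the subset was recomputed

    def points(self, first, last, width_px):
        key = (first, last, width_px)
        if key != self._key:   # scroll, zoom or resize: rebuild the subset
            self._points = self._reduce(self._samples[first:last], width_px)
            self._key = key
            self.rebuilds += 1
        return self._points


data = list(range(1000))
view = RenderSubset(data, min_max_per_bin)
view.points(0, 1000, 10)   # first draw: computes the subset
view.points(0, 1000, 10)   # same view: served from the cache
view.points(0, 500, 10)    # zoomed in: recomputed
print(view.rebuilds)       # 2
```

The full dataset is never modified; only the small per-view copy is thrown away and rebuilt.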
You can do this if you are using OpenGL, but since OpenGL uses a normalised coordinate system (and you're interested in real-world screen coordinates) you will have to work a little harder to accurately determine your data bins. This will be easier without using OpenGL, but then you don't get the full benefit of your graphics hardware.
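The extra work is mostly a matter of converting between pixel columns and normalised device coordinates, which run from -1 to 1 across the viewport. A small sketch of that conversion, assuming the viewport starts at x = 0 and spans the whole plot width; the function names are made up for illustration.

```python
import math


def pixel_to_ndc(px, width_px):
    """Normalised x of the centre of pixel column px."""
    return (px + 0.5) / width_px * 2.0 - 1.0


def ndc_to_pixel(x_ndc, width_px):
    """Pixel column that contains the normalised coordinate x_ndc."""
    px = math.floor((x_ndc + 1.0) / 2.0 * width_px)
    return min(max(px, 0), width_px - 1)  # x_ndc = 1.0 belongs to the last column


def sample_to_pixel(i, first, last, width_px):
    """Pixel column (bin) of sample i when samples [first, last) fill the viewport."""
    return (i - first) * width_px // (last - first)


print(ndc_to_pixel(-1.0, 2000))                    # 0
print(ndc_to_pixel(1.0, 2000))                     # 1999
print(sample_to_pixel(2457599, 0, 2457600, 2000))  # 1999
```

With these in hand you can bin in pixel space and only convert the few surviving points to normalised coordinates for drawing.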
Wrap the library in a gentler, kinder 2D library with the Z and rotations all set to 0.
-Adam
You don't need to eliminate points from your actual dataset, but you can surely incrementally refine it when the user zooms in. It does you no good to render 25 million points to a single screen when the user can't possibly process all that data. I would recommend that you take a look at both the VTK library and the VTK user guide, as there's some invaluable information in there on ways to visualize large datasets.
Thank you very much. This is exactly what I was looking for. It seems VTK uses hardware to offload this kind of rendering, too. Btw, I guess you mean valuable ;). Second, the user does get information out of the example I gave. Although it is not really concise, the overview of the data can really be pure gold for the scientist. It is not about processing all the data for the user, it is about getting valuable information out of the rendering. Users seem to do this, even in the very 'zoomed out' representation of the dataset.
Any more suggestions?
A really popular toolkit for scientific visualization is VTK, and I think it suits your needs:
It's a high-level API, so you won't have to use OpenGL (VTK is built on top of OpenGL). There are interfaces for C++, Python, Java, and Tcl. I think this would keep your codebase pretty clean.
You can import all kinds of datasets into VTK (there are tons of examples from medical imaging to financial data).
VTK is pretty fast, and you can distribute VTK graphics pipelines across multiple machines if you want to do very large visualizations.
Regarding:
This makes we render about 25M samples in a single screen.
[...]
As this is scientific data, we cannot omit any samples. Seriously, this is not an option. Do not even start thinking about it.
You can render large datasets in VTK by sampling and by using LOD models. That is, you'd have a model where you see a lower-resolution version from far out, but if you zoom in you would see a higher-resolution version. This is how a lot of large dataset rendering is done.
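To make the LOD idea concrete, here is a generic sketch in plain Python. This is not VTK's API; `build_lod_pyramid` and `pick_level` are hypothetical helpers, and merging adjacent (min, max) pairs is just one possible way to build the coarser levels.

```python
def build_lod_pyramid(samples, min_len=1024):
    """Level 0 holds (min, max) per raw sample; each further level
    merges adjacent pairs, halving the resolution."""
    level = [(s, s) for s in samples]
    levels = [level]
    while len(level) > min_len:
        merged = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            merged.append((min(p[0] for p in pair), max(p[1] for p in pair)))
        level = merged
        levels.append(level)
    return levels


def pick_level(levels, visible_fraction, width_px):
    """Coarsest level that still has at least one entry per pixel column
    for the visible part of the data; falls back to the raw level."""
    for level in reversed(levels):
        if len(level) * visible_fraction >= width_px:
            return level
    return levels[0]


levels = build_lod_pyramid(list(range(16)), min_len=4)
print([len(lv) for lv in levels])           # [16, 8, 4]
print(len(pick_level(levels, 1.0, 4)))      # 4  (zoomed out: coarsest level)
print(len(pick_level(levels, 0.5, 4)))      # 8  (zoomed in: finer level)
```

Because every level keeps the min and max of the samples it covers, no extreme value disappears when zoomed out; the raw samples are still there at level 0 when the user zooms all the way in.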
Not sure if this is helpful, but could you use time as a dimension? i.e. one frame is one z? That might make things clearer, perhaps? Then perhaps you could effectively be applying deltas to build up the image (i.e. along the z axis)?