So, these are my reasons for doing data processing in gnuplot. Of course, in most cases, I do that using an external script, and piping it from the gnuplot prompt. But that would deprive windows users from the pleasures of doing something fancier:)

By the way, if you look at some of the demo scripts of gnuplot 4.3, you will realise that the developers of gnuplot have already parted with the idea of "one task - one tool".

The kind of data processing we will do today requires a "recursive" access to data points, in particular, I will demonstrate how we can calculate the derivative and the integral of a series of data points. Calculating the derivative and integral of an analytic function is trivial, even if you do not have the analytical answer: for the derivative, you can simply plot f(x+dx)-f(x), where dx is just some pre-defined number, and for the integral you can apply a recursive scheme, detailed on the gnuplot demo web site. But they work on known functions, and we would like to calculate something based on "measured" data. For this, I will take the weekly oil price from this year, which you can download from the link. Let us suppose that we plot the oil price, but we also want to highlight the time intervals where the price is dropping. This requires us to calculate the derivative at each point (In this particular case, we could get away just by comparing the value and the next, but we have got to get access to the values anyway, and we would get it by fitting a linear function, whose linear coefficient gives the derivative. So, for better of worse, I will just stick with the derivative.)

For a starter, let us see our two scripts! As I have already pointed out, we will have to walk through the values, one by one, so we will need a 'for' loop, which means two scripts: a "master", and one that we call repeatedly. I have discussed this method at great length in my recent posts.

reset unset key A=0; sa=1; sb=0; integ=0.0 f(x) = a*x+b g(x) = (x>0?GPVAL_Y_MIN:GPVAL_Y_MAX) set xlabel 'Time [weeks]' set ylabel 'Crude oil price [$]' p 'oil.dat' u 0:8 xmax = GPVAL_DATA_X_MAX set print 'oild.dat' append l 'derint_r.gnu' set print plot 'oild.dat' u 1:(g($2)) w filledcurve lc rgb "#eeeeee", 'oil.dat' u 0:8 w l

and our 'for' loop, that we call 'derint_r.gnu'

a=1.0; b=40.0 fit [A:A+1] f(x) 'oil.dat' u 0:8 via a, b integ = integ+(a*A+b+a*(A+1)+b)/2.0 if(sa*a<0 a="" b="" integ="" print="" sa="a;" sb="b<br">if(sa*a>0) print A, a, b, integ; sa=a; sb=b A=A+1 if(A<xmax) reread

The first part of the main script should be clear: we set up the style of our figure, and do a dummy plot to learn something about our data file, 'oil.dat'. f(x) is our linear function, while g(x) is just a helper, whose role will become evident a bit later. Having set up the plot, we call 'derint_r.gnu' 'xmax' times. So, what is in 'derint_r.gnu'? We simply fit our linear function over a range that contains only two points, and we step through the whole file by moving those two points by one in each call to the loop. Then we calculate the integral by adding the area of the next trapezium, and then depending on whether the sign of the derivative, 'a', is different to the previous one, we print out the results once (if the signs are identical), or twice (if they are different). This distinction is necessary because we want to plot rectangles where the derivative is negative. After we are done with the data processing, we close our new file by setting the print, and plot the results, to get the following figure

Note that in the plotting, we called the function g(x), which takes on the value of the minimum of the yrange, if the argument is negative (the price is dropping), while returns the maximum of the yrange otherwise. Of course, there are many things that we could do to improve the image, but I wanted only to demonstrate how we can do the data processing part. You can dress up the image as you like, and for that you can also draw on many a post of mine.

A great blog you have here. Can you suggest a simple method just for highlighting a desired data range? For example, xrange is 0-100 and yrange is 0-20, and I just want to highlight the chart area between x = 20 and 40. Thanks very much.

ReplyDeleteHello Ben,

ReplyDeleteI am not quite sure of whether I understand your intentions, but let me have a go at it! So, if you want to plot everything, and then highlight (i.e., plot it in a different colour) those values that fall in the range 20-40, then you can do the following:

f(x) = (x < 20 ? x : (x > 40 ? x : 1/0))

g(x) = (x > 20 ? (x < 40 ? x : 1/0) : 1/0)

plot 'foo' u (f($1)):2 with lines lt 1, '' u (g($1)) with lines lt 2

If, however, you simply want to have a different background colour for that range, then you can either follow the recipe given in the post (with some obvious modifications), or you can use the rectange object as

set object 1 rectangle from graph 20,0 to graph 40,100 behind fillcolor rgb "#ff0000"

which will place a red rectangle behind your plot.

I hope this helps!

Cheers,

Zoltán

How might you plot a 3d object but only show the extreme edges as lines? For instance, how would you show just a "V" when plotting an upside down cone given by: splot sqrt(x**2+y**2)*tan(45*pi/180)

ReplyDeleteI don't think it can be done automatically. I guess, one has just got to know the parametrisation of the surface, and then restrict it in a way that it gives you the cross-section.

ReplyDeleteCheers,

Zoltán

Thanks very much for your help.

ReplyDeleteBen

Dear Gnuplotter,

ReplyDeleteI have a problem in gnuplote to load an external .dat file and to draw it graphically.

I am working on my thesis with simulator called peersim. the problem is that when i run my simulator in eclipse so compile successfully and i have make a .dat file to which simulation result automatically gone.

I have tried different command to load that file and draw graph but it gives error. I can mail that file to whether the data contained in it is excutable or not. I am working from two months but got nothing.

Plz help me in this regards.