A puzzle

I am always interested in some of the subtle effects that coding can have on not just the behavior of code, but also its performance. Recently, I stumbled across some benchmarking code – I cannot recall how I actually found it. It was not written in C, so I translated it.

In the process, I realized that a literal translation might not be ideal …

The code is to do with Mandelbrot sets. My first C translation looked like this:

int mandelbrot(float a)
{
float b;
int iterations = 50;
int i;
b = a;
for (i=1; i<=iterations; i++)
{
if (abs(a) > 2)
return (i - 1);
a = pow(a, 2) + b;
}
return (iterations);
}

I then saw a couple of optimizations that I could make:

#define ITERATIONS 50
int mandelbrot(float a)
{
float b;
int i;
b = a;
for (i=1; i<=ITERATIONS; i++)
{
if (abs(a) > 2)
return (i - 1);
a = a * a + b;
}
return (ITERATIONS);
}

I made it clear that iterations was a constant. I could have simply qualified it with const, but I chose to use #define. Call me old fashioned if you like. Both would most likely produce the same result. A good compiler may well optimize this without my help.

Since the code was translated from a language that included an exponentiation operator, using the pow() library function seemed logical, as C has no such operator. As a is simply squared, multiplication seemed better. Again, a smart compiler may well have addressed this.

I then saw another possible enhancement that might be effective:

#define ITERATIONS 50
int mandelbrot(float a)
{
float b;
int i;
b = a;
for (i=1; i<=ITERATIONS; i++)
{
if (abs(b) > 2)
return (i - 1);
b = b * b + a;
}
return (ITERATIONS);
}

Does this improve matters? If so, why? Please let me know your thoughts by email or comment. Sorry, no prizes, but I will follow up this posting with my further thoughts sometime soon.

Post Author

Posted October 22nd, 2012, by

Post Tags

, , ,

Post Comments

6 Comments

About The Colin Walls Blog

This blog is a discussion of embedded software matters - news, comment, technical issues and ideas, along with other passing thoughts about anything that happens to be on my mind. The Colin Walls Blog

Comments

6 comments on this post | ↓ Add Your Own

Commented on 22 October 2012 at 11:57
By Peter Bushell

Can’t see why the second change would make any difference, because I would expect a compiler to treat a parameter and a local variable in the same way – probably keeping both in registers, in this case.

I would add a third “optimisation”, but this is just a tidying exercise, really, as I don’t suppose this would make any difference, either: just do the b=a assignment at the declaration of b.

Commented on 22 October 2012 at 13:55
By Colin Walls

Good input Peter.

Firstly, for many CPU architectures, parameter passing in registers is not a possibility, except for static functions, where the definition and calls are all seen at once. This is even more true for older compilers. The optimisation of auto variables to registers, even without the register keyword being used, is more common.

Your proposed further change is, I suppose neater to look at. However, as an assignment of an auto variable on definition is only shorthand [as assignment will need some code, unlike with a static], I actually prefer to keep them separate. Just personal taste.

Commented on 25 October 2012 at 14:48
By Peter Bushell

Following your reply to mine, I had a further thought. If there is a floating-point coprocessor present, then maybe it does its calculations using an internal stack (think of Reverse Polish Notation – RPN). If so, it may be advantageous to write

const float b = a;

then work with a as the “accumulating” variable, as originally.

The compiler, knowing that b is constant (for the current function invocation) has the opportunity to push it onto the FP stack just once, before the loop is entered, and to pop it just once before returning. Otherwise, it is possible that the supposedly variable b will be pushed and popped for every loop iteration.

Commented on 25 October 2012 at 14:56
By Colin Walls

Peter: That makes sense, if indeed a coprocessor does have a stack that can be used in that way [i.e. for passing FP variables]. As, for me, RPN is the only obvious way to do arithmetic, I’d like to think that this is correct.

Commented on 4 November 2012 at 09:47
By Grygorii Tertychnyi

It’s not about optimization, but
shouldn’t it be fabsf() instead of abs() ?

Commented on 5 November 2012 at 10:01
By Colin Walls

You are quite correct Grygorii. Well spotted.

Add Your Comment

You must be logged in to post a comment.

Archives