# The floating point argument

I am on vacation and, all being well, by the time this posting goes live, I will be sunning myself on a Greek island. A couple of weeks ago, I posted a blog about the use of floating point. My colleague Brooks Moses [who did a guest blog post a while back] made a comment on that posting, pointing out that I had over-simplified my example. I am always happy to get such feedback.

It transpired that Brooks had some bigger issues with what I had to say, so I was pleased to offer him an opportunity to have his say. Over to you, Brooks, and, waiter, bring me another beer …

I was surprised to see Colin’s conclusion in a recent posting that “floating point should only be used if it is essential and only after every creative way to do the calculations using integers has been investigated and eliminated.” My background is in computational fluid dynamics, and there floating-point arithmetic is our bread and butter. Even now that I’m working in high-performance embedded computing, almost everything I do uses floating point.

In both of those fields, performance is king. So, if Colin’s right, why aren’t we using integers? There are several reasons:

**Integer instructions aren’t faster on many processors**. This has been true in the world of high-performance computing for a while; it’s recently become true in the middle range of embedded systems. These days you can get a hardware floating-point unit in something as small as an ARM Cortex-M4 microcontroller; most processors powerful enough to do a significant amount of computation will have one.

For example, consider the ARM Cortex-M4 and the Freescale e600 Power Architecture cores. Like most modern processors, these are both pipelined architectures. The processor can start an integer or floating-point instruction (almost) every cycle if it doesn’t have to wait for an input, but it may take multiple cycles to produce the output. On the M4, floating-point instructions have a latency of one additional cycle, which means that if the next instruction needs their output, the processor will stall for a cycle to catch up — but a good compiler can easily arrange the instructions so this is rare. On the e600, floating-point instructions have a latency of 5 cycles, but 32-bit integer multiplication is also slow, so in many cases floating point still comes out ahead.

**Fixed-point multiplication is complicated**. Since we usually care about numbers with fractional parts, we would have to use fixed-point arithmetic to do computations with integers. The idea is simple: consider the integers to be scaled by a constant factor. (For example, an integer value of 34 would represent 34/32768, if our scale factor is 2^15.) This doesn’t affect addition, but it means that for multiplication we need to compute all the high and low bits of the product and then right-shift the answer. This takes several instructions, whereas with floating point we can do the whole multiplication in one instruction.

**And, often, instruction speed isn’t even the bottleneck**. Doing a lot of computation means processing a lot of data, and that data has to come from somewhere. In many of the programs I work with, the limiting factor is the speed of moving data into and out of memory — especially when the data set doesn’t fit into the CPU cache. When that’s the case, 32 bits to move is 32 bits to move, regardless of whether the data in it is an integer or a floating-point value.

On the other hand… There was a similar debate, years ago, in computational fluid dynamics: Why use single-precision floating-point values, when most processors of the day were just as fast in double precision? So most software used double-precision … up until memory speeds were the bottleneck, and it started to matter that double-precision values were twice as large. Nowadays, if your algorithm can be written to use only 16-bit integers, the same argument applies.

And then there’s the idea of using things like Xilinx’s Zynq — with an ARM CPU and a large FPGA — for this sort of number crunching. Floating-point arithmetic really is slow and painful on an FPGA, so we high-performance programmers will have to start learning to use integers after all if we want to use a system like that.

So the real answer to whether to use floating point or integers? It all depends on your hardware and what you need to do with it.

## Post Author

Posted September 10th, 2012, by **Colin Walls**

## Post Tags

ARM, Cortex, e600, embedded software, floating point, FPGA, Freescale, M4, Xilinx, Zynq

## One comment on this post

Commented on 29 June 2016 at 22:14

By Lance Harvie

Colin’s argument holds in low-cost, high-volume embedded systems, where the penalty for using floating point is clock cycles and memory use. Sure, in high-performance embedded systems, when cost is not a consideration, floating point may be the better option. Horses for courses.