Calculating the next power of 2-keithnwsuaf-ChinaUnix博客

A fairly common problem in systems programming is to locate the "next highest power of 2," since many data structures are much more efficient that way. Memory pages on a 32-bit OS are usually 4096 bytes long (2¹²), and textures on graphics cards usually need to have power-of-2 dimensions.

So, suppose I have some object of a certain size (947, for example), and I need to find the smallest power-of-2 that's large enough to hold my object (in this case, 2¹⁰ or 1024). We'll start with a fairly simple algorithm:

int val = 947, powof2 = 1;
// Double powof2 until >= val
while( powof2 < val ) powof2 <<= 1;

This will work, and if you're in a hurry, this is probably what will get used. If you're just calculating the size for a texture, then this is unlikely to be performance-critical code, so it may not be worth your time to optimize this. This might become a bottleneck if you're writing a memory manager, however; how can you do this more quickly?

This code has a few weaknesses. Firstly, it uses a loop, which means that the code has to run a test and then split in two possible directions after every iteration. Conditional branches of this sort aren't a huge deal on a modern processor, but they're still slower than avoiding branches altogether. Secondly, the loop can iterate quite a few times--up to 31, in fact. Thirdly, if the value stored in 'val' is over 2³¹, and if you're using unsigned integers, then this can become an infinite loop. Hardly desirable!

It turns out that there's a fairly slick algorithm to solve this problem. The basic idea is to take your number (947 again: 0000 0011 1011 0011), and use bit operations to convert it to a solid string of 1's with the same number of digits (in this case, 0000 0011 1111 1111). Then, add 1, and you'll have the next power of 2 (0000 0100 0000 0000, or 1024). To convert any number into a solid string of 1's, we'll use right-shift and or repeatedly:

val = (val >> 1) | val;
val = (val >> 2) | val;
val = (val >> 4) | val;
val = (val >> 8) | val;
val = (val >> 16) | val;

Why do we do it five times? Well, that's because val is a 32-bit integer, and every time you iterate, you're doubling the number of digits you fill. After five steps, you've filled 2⁵ digits, and you're done. If you're on a 64-bit architecture (2⁶) you'll need to repeat 6 times. The steps on 947 look like:

0000 0011 1011 0011
0000 0011 1111 1011
0000 0011 1111 1111
... with the remaining steps staying at this value.

There's one small problem with this algorithm: it doesn't work properly for the powers of two themselves. It will give the next highest power of 2. This is pretty easy to fix, though: Just subtract one from the value before you begin, and you'll get the right answer in every case. Here's the final algorithm:

val = ... // Get input
val--;
val = (val >> 1) | val;
val = (val >> 2) | val;
val = (val >> 4) | val;
val = (val >> 8) | val;
val = (val >> 16) | val;
val++; // Val is now the next highest power of 2.

One final thought: What does it do on zeroes? It turns out that if you pass the algorithm a zero or any value greater than 2³¹, it will return zero (try it if you're not sure why). Unless you're certain you will never pass either of these values, you should handle having zero as a return value.

That's all for today. Happy bit-twiddling!