Fast CIDR calculation: is this network containing that ip? (Is it in yet? - from Ali`G Indahouse)
I try to implement in c fast calculating for cidr for checking.
The cidr format is very easy to read but a little bit complicated for a calculating: what is the upper limit and how can we calculate the containing if the network is bigger than /24?
The big idea (whoa, I think this is older than the dirt) : we can convert the ip and the netmask to unsigned longs and now we can calculate with it. First of all the we splited the cidr into five char: four address pieces and one mask.
uint32_t base;
uint32_t range;
uint32_t upperlimit;
base = 16777216*ip1 + 65536*ip2 + 256*ip3 + ip4;
range=pow(2,(32-mask))-1;
upperlimit=base+range;
Great. And I hope that CPU containing mathematical coprocessor because the pow function and the multiplication and the lot of unvisible conversation are hungry.
Of course have better way.
First of all, we splited the cidr into five unsigned char integer: four address pieces and one mask.
And how can we ignore the multiplication and the pow function?
The solution: with the union and the bit rotating:
typedef struct ipv4address {
        unsigned char ip[4];
} ipv4address;
typedef union Uaddr {
    ipv4address ipv4;
    uint32_t Lipv4;
} Uaddr;
typedef struct Scidr {
    Uaddr addr;
    Uaddr upaddr;
    unsigned char mask;
} Scidr;
ipv4address : ip address
union Uaddr is one tricky thing but this could be architecture dependent because the structure of the uint32_t could be different! in this case the first char is the lowest part of the ip address!
In this case the 192.168.0.0/23 looks like this:
Scidr addr;
Scidr addr2;
addr.mask=23;
addr.addr.ipv4.ip[0]=0;
addr.addr.ipv4.ip[1]=0;
addr.addr.ipv4.ip[2]=168;
addr.addr.ipv4.ip[3]=192;
Here we replace the pow function with array:
rangeshift[32] = 0x00;
rangeshift[31] = 0x01;
rangeshift[30] = 0x03;
rangeshift[29] = 0x07;
rangeshift[28] = 0x0f;
.
.
rangeshift[0] = 0xffffffff;
addr.upaddr.Lipv4 =Â addr.addr.Lipv4 + rangeshift[addr.mask];
This is working fine and everyone could read it and understand it.Â
Here we replace the pow function with some tricky step:
The memory access for the cpu isn`t to expensive, if have cache but have more faster which isn`t request to much memory access for the calculation:
addr.upaddr.Lipv4 = addr.addr.Lipv4;
if (addr.mask != 32) {
       uint32_t range = 0xffffffff;
       range >>= cmask;
       addr.upaddr.Lipv4 += range;
}
Could be faster? I think yes!
addr.upaddr.Lipv4 = addr.addr.Lipv4;
unsigned char cmask = addr.mask; //I think in this case in assembly could be faster the code:
if (cmask ^ 32) {
       uint32_t range = ~0x00;
       range >>= cmask;
       addr.upaddr.Lipv4 += range;
}
cmask: I will use twice: one byte wide xor and one shift operand. the cmask could be one register in the cpu in execution time.
When I try to find byte wide solution I found one test about this silly question: the value of the cmask is egual with the bit size of the cmask. Here is the if expression:
if ((cmask >> cmask ) != cmask) {
Why need the if expression because the bit rotating fill the range with zeros from left? And now here is one twilight zone of the c coding:Â
https://msdn.microsoft.com/en-us/library/f96c63ed.aspx
"The result of a shift operation is undefined if the second operand is negative, or if the right operand is greater than or equal to the width in bits of the promoted left operand."
And what happened if we try it? Try it, the result could be funny if you imagine the bits in the variable as a part of some train which run into the tunnel.... almost :D (I tried in gnu c 4.9 and online visual c++ compiler: http://webcompiler.cloudapp.net/ the result is same
#include <iostream>
using namespace std;
int main(){
 unsigned long a = 0xffffffff;
 for (int i = 0; i <= 80; ++i ) {
   cout << i << " : " << (a>>i) << endl;
 }
 return (0);
}
I think the gcc and the visual c++ make an modulo like division with the second operand: the variable (32 bit integer) A >> 48 is equal A >> 16 .... Really! Just try it....
Why faster this like this code: uint32_t range = ~0xffffffff; ? Because this at least  6 byte of code (if i remember exactly) and the first one is just 4 byte and no memory access. Maybe this could be faster:
uint32_t range = 0x00;
--range;
I don`t like the memory access for variable if I can replace with register based operations. If some variable in the code block used often I think better if we load once into local variables: maybe the compiler make one cpu register from the variable. And in this case I use bitwise operations with it....
Finally here is one sample code:
#include <stdio.h>
#include <stdint.h>
typedef struct ipv4address {
        unsigned char ip[4];
} ipv4address;
typedef union Uaddr {
    ipv4address ipv4;
    uint32_t Lipv4;
} Uaddr;
typedef struct Scidr {
    Uaddr addr;
    Uaddr upaddr;
    unsigned char mask;
} Scidr;
int main() {
    Scidr addr;
    Scidr addr2;
    addr.mask=23;
    addr.addr.ipv4.ip[0]=0;
    addr.addr.ipv4.ip[1]=0;
    addr.addr.ipv4.ip[2]=168;
    addr.addr.ipv4.ip[3]=192;
    printf("Ipv4 network begining: %u.%u.%u.%u ::\n", addr.addr.ipv4.ip[3], addr.addr.ipv4.ip[2], addr.addr.ipv4.ip[1], addr.addr.ipv4.ip[0]);
    printf("Ipv4 as num: %lu ::\n", addr.addr.Lipv4);
    addr.upaddr.Lipv4 = addr.addr.Lipv4;
    unsigned char cmask = addr.mask;
    if ((cmask ^ 32 )) {
        uint32_t range = ~0x00;
        range >>= cmask;
        addr.upaddr.Lipv4 += range;
    }
    printf("Ipv4 upper limit as num: %lu ::\n", addr.upaddr.Lipv4);
    printf("Ipv4 upper limit as ip %u.%u.%u.%u ::\n", addr.upaddr.ipv4.ip[3], addr.upaddr.ipv4.ip[2], addr.upaddr.ipv4.ip[1], addr.upaddr.ipv4.ip[0]);
//At last some test: Â Â
    addr2.mask=32;                  //This is uncessary but this is requested by CIDR format.
    addr2.addr.ipv4.ip[0]=10;
    addr2.addr.ipv4.ip[1]=1;
    addr2.addr.ipv4.ip[2]=168;
    addr2.addr.ipv4.ip[3]=192;
//See the second address    Â
    printf("Ipv4 address %u.%u.%u.%u ::\n", addr2.addr.ipv4.ip[3], addr2.addr.ipv4.ip[2], addr2.addr.ipv4.ip[1], addr2.addr.ipv4.ip[0]);
//Ugly I know (with more memory access, i think in prod environment try to decrease the memory access opertions if using one variable more times)
    if ( addr.addr.Lipv4 <= addr2.addr.Lipv4) {
        if (addr.upaddr.Lipv4 >= addr2.addr.Lipv4) {
            printf("Contains\n");
        } else {
            printf("Not contains\n");
        }
    } else {
        printf("Not contains\n");
    }