I'm learning C and I can't figure out one of the K&R exercises. The listing:
Exercise 2-3. Write the function htoi(s), which conv
I'll try to explain, despite my rough English :(
My code (it assumes all inputs are correct and avoids defensive programming):
#include <stdio.h>

enum { SZ = 11 };

unsigned int htoi(const char *s);

int main(void)
{
    char buff[SZ]; /* Max 11 chars: 0x XX XX XX XX '\0' (2 + 8 + 1) */

    /* %10s keeps fscanf from writing past the end of buff */
    while (fscanf(stdin, "%10s", buff) == 1)
        printf("%X\n", htoi(buff));
    return 0;
}

unsigned int htoi(const char *s)
{
    unsigned int i, r = 0;

    /* Skip an optional 0x/0X prefix, then fold in one hex digit per pass */
    for (i = (s[1] == 'x' || s[1] == 'X') ? 2 : 0; s[i] != '\0'; i++)
        r = (r << 4) + ((s[i] > '9') ? 0x9 : 0x0) + (s[i] & 0xF);
    return r;
}
OK, first of all, r is initialized to 0. Then, when the for loop starts, we give the index variable i its initial value. We have to check whether the string has the 0x prefix or not; checking position 1 is enough to tell.
Now the index points to the first real digit. On each iteration we shift r four bits to the left, which gains four zero bits: a perfect gap for adding a new hex digit. Example:
Input: 0xBE1234
Is s[1] == 'x' ? true then i = 2;
r = 0;
iter 1: r = 0x0; r = 0x0; r = 0xB;
iter 2: r = 0xB; r = 0xB0; r = 0xBE;
iter 3: r = 0xBE; r = 0xBE0; r = 0xBE1;
iter 4: r = 0xBE1; r = 0xBE10; r = 0xBE12;
iter 5: r = 0xBE12; r = 0xBE120; r = 0xBE123;
iter 6: r = 0xBE123; r = 0xBE1230; r = 0xBE1234
Maybe this line is a bit complicated:
r = ( r << 4 ) + ( (s[i] > '9') ? 0x9 : 0x0 ) + ( s[i] & 0xF );
First we shift r left by 4 bits, which is the same as multiplying by 16 (and at least as cheap). Then we check whether the character is greater than '9' in ASCII. If it is, we are working with A, B, C, D, E, F or a, b, c, d, e, f. Remember, we assume the input is correct. Now take a look at the ASCII table:
A = 0100 0001 - a = 0110 0001
...
F = 0100 0110 - f = 0110 0110
but we want something like this:
A = 0000 1010 - a = 0000 1010
...
F = 0000 1111 - f = 0000 1111
How do we do it? We clear the 4 most significant bits with the mask s[i] & 0xF:
s[2] == 'B' == 0100 0010
s[2] & 0xF == 0000 0010
and add 9 to get the integer value (only when s[i] is in { 'A'...'F', 'a'...'f' }; the parentheses matter, because in C + binds tighter than &):
(s[2] & 0xF) + 0x9 = 0000 0010 + 0000 1001 = 0000 1011 (0xB)
Finally, we add this to the shifted value of r and assign the result to r. Execution sequence for the second iteration (s[3]):
r == 0xB, s[3] == 'E' == 0100 0101 (start iter 2)
(r << 4) == 0xB0, s[3] == 'E' == 0100 0101 (shift r << 4)
(r << 4) == 0xB0, ((s[3] & 0xF) + 0x9) == 0000 1110 == 0xE (clear most significant bits of s[3] and add 0x9)
r = (r << 4) + ((s[3] & 0xF) + 0x9) == 0xBE == 1011 1110 (add all and assign to r)
What happens if we have a digit character like s[4]?
s[4] == '1' == 0011 0001
s[4] & 0xF == 0000 0001
Shift r four positions, add 0 (nothing), add the result of the bitwise operation s[i] & 0xF and, finally, assign to r.
r == 0xBE, s[4] == '1' == 0011 0001 (start iter 3)
(r << 4) == 0xBE0, s[4] == '1' == 0011 0001 (shift r << 4)
(r << 4) == 0xBE0, ((s[4] & 0xF) + 0x0) == 0000 0001 (clear most significant bits of s[4] and add 0)
r = (r << 4) + (s[4] & 0xF) == 0xBE1 == 1011 1110 0001 (add all and assign)
Remember, we shift by 4 so the digits' bits don't overlap: each new digit is added into the gap of four zero bits at the least significant end.
P.S.: I promise to improve my English so I can explain better, sorry.
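The digit step described above can be pulled out into a small helper, which makes the trick easier to test on its own. This is only a sketch: the names hex_digit_ascii and htoi_sketch are made up for illustration, it assumes ASCII and valid input as the answer does, and unlike the original it also checks that s[0] is '0' before skipping the prefix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not in the answer): value of one hex digit.
   Assumes ASCII and that c is one of 0-9, A-F, a-f; nothing is validated. */
unsigned int hex_digit_ascii(char c)
{
    unsigned int low = (unsigned int)c & 0xFu;   /* keep only the low four bits */
    return low + ((c > '9') ? 9u : 0u);          /* letters: 1..6 becomes 10..15 */
}

/* The same loop as htoi, written with the helper. */
unsigned int htoi_sketch(const char *s)
{
    unsigned int r = 0;
    size_t i;

    assert(s != NULL);   /* debug-only precondition */
    i = (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) ? 2 : 0;
    for (; s[i] != '\0'; i++)
        r = (r << 4) + hex_digit_ascii(s[i]);
    return r;
}
```

For example, hex_digit_ascii('B') gives 11 and htoi_sketch("0xBE1234") gives 0xBE1234, matching the trace above.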
Mitch has the basic idea right, but let's take it in a little more detail.
A hex number is just base 16, which means the digits (from right to left) have the values
digit × 16^0 (i.e., 1)
digit × 16^1 (i.e., 16)
digit × 16^2 (256)
and so on. So, 0xE is 14, for example.
What you'll want is a loop starting at the right end of the string. Let's say the string is s, length(s) is the length of the string. In pseudocode, you want
value = 0
r = 1   // ask yourself "what values does r take as this proceeds?"
for i from length(s)-1 to 0   // ask yourself "why length(s)-1?"
    value = value + (digitval(s[i])*r)
    // get ready for the next digit
    r = r * 16
digitval(char c) needs to be a function that translates a character in "0123456789ABCDEF" into a value between 0 and 15 (inclusive). I'll leave that as an exercise, with one hint: "arrays".
Be careful with one extra issue: since you could have a leading "0" or "0x", you need to make sure you handle those cases.
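If you want to check your own attempt against something, here is one possible C rendering of the pseudocode. It is a sketch, not the only way to do it: digitval follows the "arrays" hint by looking the character up in a digit table, htoi_rtl is a name invented here, and returning 0 for invalid input is an assumption the answer does not specify.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* One possible digitval: the index of c in the digit table is its value.
   Returns -1 for anything that is not a hex digit. */
int digitval(char c)
{
    static const char digits[] = "0123456789ABCDEF";
    const char *p;

    if (c == '\0')
        return -1;   /* strchr would otherwise match the terminator */
    p = strchr(digits, toupper((unsigned char)c));
    return p ? (int)(p - digits) : -1;
}

/* Right-to-left conversion following the pseudocode.
   The loop stops before an optional 0x/0X prefix. */
unsigned long htoi_rtl(const char *s)
{
    size_t start = 0, i;
    unsigned long value = 0, r = 1;   /* r takes the values 1, 16, 256, ... */

    assert(s != NULL);
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))
        start = 2;
    for (i = strlen(s); i > start; --i) {
        int d = digitval(s[i - 1]);
        if (d < 0)
            return 0;   /* assumption: invalid input yields 0 */
        value += (unsigned long)d * r;
        r *= 16;        /* get ready for the next digit */
    }
    return value;
}
```

With this sketch, htoi_rtl("0xE") is 14, as in the example above. A lone "0" needs no special case because it is just the digit zero.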
I'm probably not making a great contribution, there are good answers above. But I'll give it a try.
As others did before me, I'm leaving some functionality for you to implement.
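#define PREFIXED 1 /* any nonzero value works */

int htoi(const char *x)
{
    unsigned int current_position; /* current position is to be defined */
    int prefixed = 0;
    int dec = 0;
    const char *y = x;

    if (x == NULL) return 0; /* nothing to convert */
    /* Is a 0x or 0X prefix present? Check x[0] first so x[1] is never read past the end. */
    if (x[0] == '0' && (x[1] == 'x' || x[1] == 'X')) {
        prefixed = PREFIXED;
    }
    if (prefixed) y += 2; /* Jumps over 0x or 0X */
    while (*y) {
        /* getPos(const char*) and singleHexToDec(const char*, unsigned int) functions to be implemented */
        current_position = getPos(y);
        dec += singleHexToDec(y, current_position);
        ++y; /* move on to the next digit; without this the loop never ends */
    }
    return dec;
}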
Processing the string from left to right is simpler and arguably more readable for those comfortable with math. The strategy is realizing that, for example, 1234 = (((1 x 10) + 2) x 10 + 3) x 10 + 4
In other words, as you process each digit from left to right, multiply the previous total by the base, effectively "moving it left" one position, then add the new digit.
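#include <string.h>

int decFromHexChar(char c); /* helper: value of a single hex digit (signature assumed) */

long decFromHexStr(const char *hexStr)
{
    size_t i;
    size_t len = strlen(hexStr); /* measure once, not on every iteration */
    long decResult = 0; /* Decimal result */

    for (i = 0; i < len; ++i)
    {
        decResult = 16 * decResult + decFromHexChar(hexStr[i]);
    }
    return decResult;
}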
Experienced programmers would probably use a pointer to step through the string instead of treating it as an array:
long decFromHexStr(const char *pHex)
{
    long decResult = 0;

    while (*pHex != '\0')
    {
        decResult = 16 * decResult + decFromHexChar(*pHex++);
    }
    return decResult;
}
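Both versions call decFromHexChar, which this answer does not show. A minimal sketch of what it might look like follows, paired with the pointer version so the two can be compiled together. The signature and the choice to return 0 for a non-hex character are assumptions, not something the answer specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: value of one hex digit.
   Assumption: characters that are not hex digits count as 0.
   The letter ranges rely on A-F and a-f being contiguous, which holds in ASCII. */
int decFromHexChar(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return 0;
}

/* Left-to-right conversion: multiply the running total by 16, add the new digit. */
long decFromHexStr(const char *pHex)
{
    long decResult = 0;

    assert(pHex != NULL);
    while (*pHex != '\0')
        decResult = 16 * decResult + decFromHexChar(*pHex++);
    return decResult;
}
```

For instance, decFromHexStr("1234") yields 0x1234, which is 4660 in decimal. Note that this sketch does not skip a 0x prefix.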
Since you're learning, it's worth studying the coding style and deciding whether you find it helpful or not, so you'll build good habits early.
Have fun!
Yesterday I wrote a function like this. You can see my code below.
#include <limits.h>

/* Converting a hex string to integer, assuming the heading
   0x or 0X has already been removed and pch is not NULL */
int hex_str_to_int(const char* pch) {
    int value = 0;
    int digit = 0;
    for (; *pch; ++pch) {
        if (*pch >= '0' && *pch <= '9') {
            digit = (*pch - '0');
        } else if (*pch >= 'A' && *pch <= 'F') {
            digit = (*pch - 'A' + 10);
        } else if (*pch >= 'a' && *pch <= 'f') {
            digit = (*pch - 'a' + 10);
        } else {
            break;
        }
        /* Check for integer overflow before it happens: signed overflow is
           undefined behavior, so testing the result afterwards is unreliable */
        if (value > (INT_MAX - digit) / 16) {
            return INT_MAX;
        }
        value = value * 16 + digit;
    }
    return value;
}
Here is the testing code:
int main(void) {
    printf("%d %d\n", hex_str_to_int("0"), 0x0);
    printf("%d %d\n", hex_str_to_int("A"), 0xA);
    printf("%d %d\n", hex_str_to_int("10"), 0x10);
    printf("%d %d\n", hex_str_to_int("A1"), 0xA1);
    printf("%d %d\n", hex_str_to_int("AB"), 0xAB);
    printf("%d %d\n", hex_str_to_int("100"), 0x100);
    printf("%d %d\n", hex_str_to_int("1A2"), 0x1A2);
    printf("%d %d\n", hex_str_to_int("10A"), 0x10A);
    printf("%d %d\n", hex_str_to_int("7FFFFFF"), 0x7FFFFFF);
    printf("%d %d\n", hex_str_to_int("7FFFFFF1"), 0x7FFFFFF1);
    printf("%d %d\n", hex_str_to_int("7FFFFFF2"), 0x7FFFFFF2);
    printf("%d %d\n", hex_str_to_int("7FFFFFFE"), 0x7FFFFFFE);
    printf("%d %d\n", hex_str_to_int("7FFFFFFF"), 0x7FFFFFFF);
    /* The next two right-hand expressions overflow int on purpose. That is
       undefined behavior in C; the output below shows them wrapping around. */
    printf("%d %d\n", hex_str_to_int("80000000"), 0x7FFFFFFF + 1);
    printf("%d %d\n", hex_str_to_int("80000001"), 0x7FFFFFFF + 2);
    printf("%d %d\n", hex_str_to_int("10AX"), 0x10A);
    printf("%d %d\n", hex_str_to_int("203!"), 0x203);
    return 0;
}
It outputs the following values:
0 0
10 10
16 16
161 161
171 171
256 256
418 418
266 266
134217727 134217727
2147483633 2147483633
2147483634 2147483634
2147483646 2147483646
2147483647 2147483647
2147483647 -2147483648
2147483647 -2147483647
266 266
515 515
What does a hexadecimal number actually mean? Let's take 15FA. It means
1 * 16^3 + 5 * 16^2 + 15 * 16^1 + 10 * 16^0
Note that A represents ten, B eleven and so on up to F which represents fifteen. Also 16^0 is equal to 1.
So all we need to do is calculate the value of the above expression! The simplest way is probably to do it in this order:
10 * 1
15 * 16
5 * 256 //256 = 16 * 16
1 * 4096 //4096 = 16 * 16 * 16
This can continue further if there are more digits. All you really need is a loop and a few variables.
There is another method of doing it which is explained by factorising the above expression like this:
((1 * 16 + 5) * 16 + 15) * 16 + 10
If you wish, try each of these methods.
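To compare the two methods side by side, here is a small sketch that evaluates a number whose digit values are already known, such as {1, 5, 15, 10} for 15FA. The function names are invented for this example, and turning characters into digit values is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Method 1: right to left, multiplying each digit by a growing power of 16. */
unsigned int eval_powers(const unsigned int *digits, int n)
{
    unsigned int value = 0, power = 1;   /* power is 1, 16, 256, 4096, ... */
    int i;

    assert(digits != NULL);
    for (i = n - 1; i >= 0; --i) {
        value += digits[i] * power;
        power *= 16;
    }
    return value;
}

/* Method 2: left to right, following the factorised expression. */
unsigned int eval_factorised(const unsigned int *digits, int n)
{
    unsigned int value = 0;
    int i;

    assert(digits != NULL);
    for (i = 0; i < n; ++i)
        value = value * 16 + digits[i];
    return value;
}
```

For the digits {1, 5, 15, 10}, both functions return 5626, which is 0x15FA. The second one needs no separate power variable, which is why it is often the shorter loop to write.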
More advanced information:
Basically, computers use base 2 (also called binary) for all their numbers and calculations. Even the string "1A6DC0" is encoded with 1s and 0s, which eventually get displayed on the screen as letters and numbers.
Sometimes you can take advantage of the fact that computers use binary, but usually you don't need to think about it.
For instance, when you do
x = (11 + y) * 6;
you don't need to worry that 11 and 6 will be represented as a series of high and low voltages at some stage. It just works as you expect. Converting from decimal (the number system we use) to binary and back is a simple process that computers can do easily, and so they do this for us automatically to make our work easier.
However, when converting between hexadecimal and binary, there is a shortcut. Since each hex digit corresponds to exactly four binary digits, you can simply convert each hex digit to binary individually, then string them together.
For instance, 15FA would expand like this:
1 -> 0001
5 -> 0101
F -> 1111
A -> 1010
15FA -> 0001 0101 1111 1010
Note that this generally can't be done directly, and usually involves bitwise OR and bit shifts (| and <<). Fun stuff.
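As a rough illustration of how the shift and OR fit together, the sketch below appends four-bit groups to an accumulator one at a time. The names push_nibble and pack_nibbles are made up for this example, and the digit values are assumed to be already in the range 0 to 15.

```c
#include <assert.h>
#include <stddef.h>

/* Make room for four bits at the low end of acc, then drop the nibble in. */
unsigned int push_nibble(unsigned int acc, unsigned int nibble)
{
    assert(nibble <= 0xFu);       /* a hex digit fits in four bits */
    return (acc << 4) | nibble;
}

/* String the nibbles together from most significant to least significant. */
unsigned int pack_nibbles(const unsigned int *nibbles, int n)
{
    unsigned int acc = 0;
    int i;

    assert(nibbles != NULL);
    for (i = 0; i < n; ++i)
        acc = push_nibble(acc, nibbles[i]);
    return acc;
}
```

Packing {0x1, 0x5, 0xF, 0xA} this way gives 0x15FA. Because the four low bits are zero right after the shift, the OR here produces the same result as adding the digit.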