For an assignment, I'm trying to make some code in C that uses only bit manipulation to test if an integer is an ASCII uppercase letter. The letter will be given by its ASCII c
Since OP is stuck on case 0x7fffffff, exclude it by extending the otherwise working solution.
!((~(((x & 32)>>5))<<31))>>31) & !(x ^ 0x7fffffff)
Pedantically, just code as below and let the compiler simplify.
isupper = (!(x ^ 'A')) | (!(x ^ 'B')) | (!(x ^ 'C')) ... (!(x ^ 'Z'));
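The building block here is that !(x ^ k) is 1 exactly when x equals k, so OR-ing 26 such terms covers the alphabet. A minimal sketch, assuming ASCII (where 'A' through 'Z' are contiguous); the names eq and is_upper_chain are made up for illustration, and the loop only stands in for writing out all 26 terms by hand:

```c
/* 1 if x == k, else 0: x ^ k is zero only when every bit matches. */
static int eq(int x, int k)
{
    return !(x ^ k);
}

/* Hypothetical helper: ORs the 26 equality tests together.
   The loop is shorthand for the 26 spelled-out terms. */
static int is_upper_chain(int x)
{
    int result = 0;
    for (int k = 'A'; k <= 'Z'; k++)
        result |= eq(x, k);
    return result;
}
```

With this sketch, is_upper_chain('M') is 1, while is_upper_chain('a') and is_upper_chain(0x7fffffff) are 0.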
You could use unsigned integer division, if that's allowed:
!((x-0x41)/26)
But that's probably not in the spirit of the original question. Consider what happens when you subtract 0x3B from any upper case letter:
A: 0x41 - 0x3B = 0x06
Z: 0x5A - 0x3B = 0x1F
The interesting feature here is that any value initially larger than 0x5A will have one of the high bits set (~0x1F). You can perform the same shifting for moving 'A' down to zero, so anything initially less than 'A' would have the high bits set. In the end a solution requires only subtractions, an or, and some bit-wise ands:
!(((x-0x3B) & ~(0x1F)) || ((x-0x41) & ~(0x1F)))
I believe that does what you want. Given the nature of conditional (short circuit) evaluation in C, this has an implicit conditional embedded in it though. If you want to remove that, minimize the computation, and maximize the obscurity you could do this:
!(((x-0x3B) | (x-0x41)) & ~(0x1F))
or my new personal favorite:
!((('Z'-x) | (x-'A')) & ~(0x1F))
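To make these variants easier to compare, here is a minimal sketch that wraps three of them in functions. The function names are invented for illustration, int is assumed to be 32-bit two's complement, and inputs so close to INT_MIN or INT_MAX that a subtraction would overflow are not handled:

```c
/* Division variant: with unsigned arithmetic, anything below 'A'
   wraps to a huge value, so the quotient is 0 only for 0x41..0x5A. */
static int is_upper_div(unsigned x)
{
    return !((x - 0x41u) / 26u);
}

/* Subtraction variant with the 0x3B offset: for letters both
   differences stay within 0x00..0x1F, so no higher bit is set. */
static int is_upper_3b(int x)
{
    return !(((x - 0x3B) | (x - 0x41)) & ~0x1F);
}

/* Subtraction variant with character constants: one of the two
   differences is negative whenever x is outside 'A'..'Z'. */
static int is_upper_za(int x)
{
    return !((('Z' - x) | (x - 'A')) & ~0x1F);
}
```

For example, is_upper_za('Q') gives 1, while is_upper_za('@') and is_upper_za('[') give 0.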
You can test if an ASCII letter c is upper case by checking its 0x20 bit: it must be 0 for uppercase and 1 for lowercase:
if (!(c & 0x20))
    printf("ASCII letter %c is uppercase\n", c);
but be aware that this test does not work if you don't already know that c is a letter. It would erroneously match '@', '[', '\\', ']', '^' and '_', the control characters 0 to 31, and the ranges 128 to 159 and 192 to 223 of characters with the high bit set, which are not part of ASCII but are valid unsigned char values.
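To illustrate the caveat, a small sketch (the function name is hypothetical, ASCII is assumed) shows the bit test accepting letters and non-letters alike:

```c
/* Tests only the 0x20 case bit; it does not check that c is a letter. */
static int case_bit_clear(int c)
{
    return !(c & 0x20);
}
```

Here case_bit_clear('A') is 1 and case_bit_clear('a') is 0, but case_bit_clear('@'), case_bit_clear('_') and case_bit_clear(200) are 1 as well.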
If you want a single test to verify if c is an uppercase ASCII letter, try:
if ((unsigned)(c - 'A') <= (unsigned)('Z' - 'A'))
    printf("%c is an uppercase ASCII letter\n", c);
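The reason one comparison is enough is that unsigned subtraction wraps around: any value below 'A' becomes a very large unsigned number, so it fails the same <= test that rejects values above 'Z'. A minimal sketch, assuming ASCII and using a made-up function name; it converts to unsigned before subtracting, a small deviation from the line above that avoids signed overflow for extreme inputs:

```c
static int is_upper_range(int c)
{
    /* offset is in 0..25 only when c is in 'A'..'Z';
       smaller c wraps around to a huge unsigned value. */
    unsigned offset = (unsigned)c - 'A';
    return offset <= (unsigned)('Z' - 'A');
}
```

For example, is_upper_range('Q') is 1, and is_upper_range('@'), is_upper_range('[') and is_upper_range(-1) are all 0.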
EDIT: it is unclear what you mean by "I am not allowed to use if statements, or any kind of type casting operations. I must test to see if the number is between the two numbers, including numbers far outside the range of the ASCII code, and return 1 if it is or else 0."
If c is known to be a letter, both !(c & 0x20) and (((c >> 5) & 1) ^ 1) will have value 1 if c is uppercase and 0 if not.
If c can be any integer value, just write the regular comparison (c >= 'A' && c <= 'Z') and the compiler will produce better code than you would by attempting hazardous bit-twiddling tricks.
EDIT again:
Since c can be any integer value and you are only allowed bit manipulations, here is another solution: !((c >> 5) ^ 2) & (0x07fffffeU >> (c & 31)). Below is a program to test this:
#include <stdio.h>
#include <stdlib.h>

static int uppertest(int c) {
    return !((c >> 5) ^ 2) & (0x07fffffeU >> (c & 31));
}

int main(int argc, char *argv[]) {
    for (int i = 1; i < argc; i++) {
        int c = strtol(argv[i], NULL, 0);
        printf("uppertest(%d) -> %d\n", c, uppertest(c));
    }
    return 0;
}
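The constant 0x07fffffeU acts as a 32-entry lookup table: bit n is set when 64 + n is an uppercase letter, and the !((c >> 5) ^ 2) factor restricts c to the range 64..95 so that the table applies. As a rough sketch, assuming ASCII and a hypothetical helper name, the mask can be rebuilt with a loop:

```c
/* Sets bit (c & 31) for every letter 'A'..'Z', i.e. bits 1 through 26. */
static unsigned build_upper_mask(void)
{
    unsigned mask = 0;
    for (int c = 'A'; c <= 'Z'; c++)
        mask |= 1u << (c & 31);
    return mask;
}
```

Under ASCII this yields 0x07fffffe, matching the constant used in uppertest.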
... to see if a letter is uppercase
Simplification:
Let us assume the char ranges [A-Z] and [a-z] differ by the same value, which is a power of 2. So 'B'-'b' equals 'X'-'x', etc.
#define CASE_MASK ('A' ^ 'a')

// Is letter uppercase?
int is_letter_upper(int ch) {
    return (ch & CASE_MASK) == ('A' & CASE_MASK);
}

// Is letter lowercase?
int is_letter_lower(int ch) {
    return (ch & CASE_MASK) == ('a' & CASE_MASK);
}
This works for ASCII and EBCDIC.
A more "bit manipulation" answer
int is_letter_upperBM(int ch) {
    return !((ch & CASE_MASK) ^ ('A' & CASE_MASK));
}
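A short sketch of what CASE_MASK evaluates to and where the check stops being reliable. It assumes an ASCII execution character set, where 'A' is 0x41 and 'a' is 0x61; the function name is made up for illustration:

```c
#define CASE_MASK ('A' ^ 'a')   /* 0x20 in ASCII */

/* Same bit test as is_letter_upperBM, repeated so the block stands alone.
   It inspects only the case bit, so the input must already be a letter. */
static int upper_case_bit(int ch)
{
    return !((ch & CASE_MASK) ^ ('A' & CASE_MASK));
}
```

Here upper_case_bit('Q') is 1 and upper_case_bit('q') is 0, but upper_case_bit('@') is also 1, since only the case bit is inspected.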