[ Curiosity,Experimentation ]

Random stuff from the parallel universe of Ones and Zeroes

Character case conversion using Bitwise operators

Posted by appusajeev on July 12, 2009


I found this somewhere and hope it might come in handy.

ASCII (American Standard Code for Information Interchange) is a commonly used method of representing characters (letters, digits, punctuation, control characters) in memory. In this method, each character is uniquely defined by a 7-bit number (8-bit extensions add some extra characters). For example, the ASCII code for ‘A’ is 65, that of ‘a’ is 97 and that of ‘#’ is 35.

There is a striking similarity between the ASCII values of upper case letters and lower case letters. The binary representations of an upper case letter and its lower case counterpart are identical except for the fifth bit, that is, bit 5 when counting from zero at the least significant bit (the bit worth 32)!

For example, the ASCII code of ‘B’ is 66, which is            1 0 0 0 0 1 0 in binary

and the ASCII code of ‘b’ is 98, which is                     1 1 0 0 0 1 0 in binary.

Notice that there is a change only in the fifth bit and the rest are the same.

The fifth bit is set for a lower case letter and cleared for an upper case letter.
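To see the pattern for more than one letter, here is a small Python sketch. The helper name bits() is made up for this illustration; it simply prints the 7-bit pattern of a character so the two cases can be compared side by side.

```python
def bits(ch):
    """Return the 7-bit binary pattern of an ASCII character as a string."""
    return format(ord(ch), "07b")

for upper in "ABZ":
    lower = upper.lower()
    print(upper, bits(upper), lower, bits(lower))
    # XOR keeps only the bits that differ between the two codes.
    # For an ASCII letter pair that is just bit 5 (value 32).
    assert ord(upper) ^ ord(lower) == 1 << 5
```

The XOR check inside the loop is just a compact way of saying "these two codes differ in bit 5 and nowhere else".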

With that knowledge, bitwise operators can be used to detect whether a character is in upper case or not.

The Python expression to detect the case is (assuming x contains a letter):

ord(x)  &  (1<<5)

If the above expression is non-zero (and therefore true), the character contained in x is a lower case character. ord() is a built-in function which returns the numeric code of a character.

and the corresponding C statement would be :

  x  &  (1<<5)

The advantage of this method is that there is no need to check a range of ASCII values to determine whether the character is upper case or not, provided x is already known to be a letter.
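Wrapped in a function, the test might look like the sketch below. The names CASE_BIT and is_lower() are my own, not from any library, and the function assumes its argument is a single ASCII letter. The last call shows what happens when that assumption is broken.

```python
CASE_BIT = 1 << 5  # 32, the bit that differs between 'A' and 'a'

def is_lower(ch):
    """Return True if bit 5 is set. Only meaningful when ch is an ASCII letter."""
    return bool(ord(ch) & CASE_BIT)

print(is_lower("a"))  # True
print(is_lower("A"))  # False
print(is_lower("#"))  # True, although '#' (35) is not a letter at all
```

So the range check goes away only when something else already guarantees that the input is a letter.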

Now to character conversion. Suppose that a contains a lower case character. To convert it to upper case, just clear the fifth bit (set it to zero). The following Python code illustrates the method:

a=chr(ord(a)&~(1<<5))

and the corresponding C code would be

a = a & ~(1<<5);
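Here is a slightly fuller Python sketch of the conversion in both directions. The function names to_upper() and to_lower() are invented for this example, and both assume the input is a single ASCII letter. Setting the bit with | is the mirror image of clearing it with & ~.

```python
CASE_BIT = 1 << 5  # 32

def to_upper(ch):
    """Clear bit 5: lower case letter -> upper case letter."""
    return chr(ord(ch) & ~CASE_BIT)

def to_lower(ch):
    """Set bit 5: upper case letter -> lower case letter."""
    return chr(ord(ch) | CASE_BIT)

print(to_upper("b"))  # B
print(to_upper("B"))  # B (the bit was already clear, so nothing changes)
print(to_lower("Q"))  # q
```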

This method may sound lame when the easier method of subtracting 32 from the ASCII code (or adding 32 to go from upper case to lower case) exists, but it just demonstrates an application of bitwise operators.
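For what it is worth, the two approaches are not quite interchangeable. The sketch below (function names are hypothetical) compares the arithmetic shortcut, which for lower case to upper case is a subtraction of 32, with clearing the bit. The arithmetic version is only right when the input really is lower case; the bitwise version leaves an upper case letter alone. Neither is safe on characters that are not letters.

```python
def upper_by_subtracting(ch):
    """Arithmetic shortcut: correct only if ch is a lower case letter."""
    return chr(ord(ch) - 32)

def upper_by_clearing(ch):
    """Bitwise version: correct for letters of either case."""
    return chr(ord(ch) & ~(1 << 5))

print(upper_by_subtracting("a"), upper_by_clearing("a"))  # A A
print(upper_by_subtracting("A"), upper_by_clearing("A"))  # ! A
print(upper_by_clearing("{"))  # [ (non-letters get mangled too)
```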


8 Responses to “Character case conversion using Bitwise operators”

  1. neo2904 said

    thats a cool method..

  2. Kain said

    Don’t you mean the SIXTH bit?
    And yeah you can do the same, easier by using a bitwise AND with the value of ‘_’ or 95 (hex 5f)
    Thus y = x & 0x5f
    where x == ‘a’
    y == ‘A’

    also a bitwise exclusive OR with the value of ‘ ‘ (32, or 0x20) will do the inverse function
    Thus y = x ^ 0x20
    where x == ‘A’
    y == ‘a’

  3. Adding 25 to the ASCII code works only if you know the letter is upper case. Setting the bit works whatever the current case of the letter. So the bitwise technique saves a conditional. As long as you don’t mind converting @[\]^_ to `{|}~delete

  4. When you say add 25 to the ASCII code, surely you mean 32 (hex 20), not 25?
