Decoding Google’s First Tweet in Python
Most of you must have read the news that Google has finally jumped onto the Twitter bandwagon. In their trademark style, they chose to announce this in a cryptic way. Their first tweet was essentially this:
I’m 01100110 01100101 01100101 01101100 01101001 01101110 01100111 00100000 01101100 01110101 01100011 01101011 01111001 00001010
I will explain in this post how to crack this simple code with the help of some Python one-liners (Google’s favourite language). If you are a Google aspirant (who isn’t? ;) ), this might help you clear the interview. So pay attention.
To most people it is immediately obvious that this is text encoded in binary. Since each binary word is 8 digits long, it is most probably ASCII with one character per byte. In fact, it is: every value here is below 128, so you can read it with a plain ASCII chart.
But they have made it slightly difficult for you by writing in binary, since most charts only provide a lookup from decimal or hexadecimal numbers to ASCII characters. So how do you convert from binary to decimal? It’s quite simple:
decimal = lambda s: sum(int(j) * pow(2,i) for i,j in enumerate(reversed(s)))
This line defines a function decimal which works much like converting a binary number into decimal by hand. Each digit is multiplied by an increasing power of two, starting from the right, and the products are added together. For example, ‘1010’ is 1 * 8 + 0 * 4 + 1 * 2 + 0 * 1 = 10.
Next, we split the binary part of the tweet string and apply the decimal function to each part:
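The same calculation can also be spelled out as an explicit loop, which may be easier to follow than the one-liner. This is only a sketch; the name decimal_verbose is made up for illustration and does not appear in the original post.

```python
def decimal_verbose(bits):
    """Convert a string of '0'/'1' characters to an integer (illustrative helper)."""
    total = 0
    # Walk the digits from the right; position 0 is the least significant bit.
    for position, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** position
    return total

print(decimal_verbose('1010'))      # 1*8 + 0*4 + 1*2 + 0*1 = 10
print(decimal_verbose('01100110'))  # 64 + 32 + 4 + 2 = 102, the ASCII code for 'f'
```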
tweet = "01100110 01100101 01100101 01101100 01101001 01101110 01100111 00100000 01101100 01110101 01100011 01101011 01111001 00001010"
print(''.join(chr(decimal(s)) for s in tweet.split()))
The result is something that you might have already guessed seeing the first 2 words:
“I’m feeling lucky\n”
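One way to double-check the result is to go in the other direction and encode the decoded text back into binary. The sketch below relies on the built-in format() with the '08b' format spec (available since Python 2.6); the helper name to_binary is hypothetical.

```python
def to_binary(text):
    """Encode each character as an 8-bit binary word, separated by spaces."""
    return ' '.join(format(ord(ch), '08b') for ch in text)

tweet = ("01100110 01100101 01100101 01101100 01101001 01101110 01100111 "
         "00100000 01101100 01110101 01100011 01101011 01111001 00001010")

# Re-encoding the decoded message should reproduce the binary part of the tweet.
encoded = to_binary("feeling lucky\n")
print(encoded == tweet)
```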
Hope you learnt some interesting Python constructs. If there are other ways of decoding this in Python, please comment below.
Try:
decimal("01100110") == int("01100110", 2)
Not a one-liner, but something that I hacked together the other day in Python. I was messing around with a friend and we were sending each other both 7-bit ASCII characters separated by spaces and 8-bit ASCII characters all joined together. So the script can handle both:
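from sys import stdin, stdout

byte = ''
while True:
    buffer = stdin.read(1)  # read one character at a time
    if not buffer:
        break  # end of input
    show = False
    if buffer not in '01':
        show = True  # a separator ends the current word
    else:
        byte += buffer
    # Emit a character at a separator (7-bit words) or once 8 bits are collected
    if (show and byte) or len(byte) == 8:
        stdout.write(chr(int(byte, 2)))
        byte = ''
# Flush a final word that was not followed by a separator
if byte:
    stdout.write(chr(int(byte, 2)))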
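To make the logic of the script above easier to try out, here is a rough sketch of the same idea as a function that takes a string instead of reading from stdin. The name decode_bits is an assumption for illustration; it emits a character whenever a separator ends a run of bits or a run reaches 8 bits.

```python
def decode_bits(text):
    """Decode space-separated 7-bit words or unseparated 8-bit words (hypothetical helper)."""
    chars = []
    byte = ''
    for ch in text:
        if ch in '01':
            byte += ch
        # A non-binary character closes a partial word; 8 bits close a full one.
        if byte and (ch not in '01' or len(byte) == 8):
            chars.append(chr(int(byte, 2)))
            byte = ''
    if byte:  # flush a trailing word that had no separator after it
        chars.append(chr(int(byte, 2)))
    return ''.join(chars)

print(decode_bits('1100110 1100101 1100101'))    # 7-bit words with spaces
print(decode_bits('011001100110010101100101'))   # 8-bit words joined together
```

Both calls should print the same three letters, since the two inputs encode the same characters in the two formats described in the comment.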
try: int("101", 2)
Actually, decimal(s) is builtin: int(s, 2).
Another construct – ‘int’ takes an optional base. In this case, base 2:
That is, your ‘decimal(s)’ is the same as ‘int(s, 2)’
"".join(chr(int(s, 2)) for s in tweet.split())
@Simon Wittber, @ThomasWaldmann, @Georg Brandl: Thanks a lot for the much simpler solution. I forgot about int’s capability to perform conversion to any base. In fact, I was looking for a standard function to do this, but in all the wrong places like binascii.
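As a small illustration of the base argument mentioned here, the character code for 'f' can be parsed from binary, octal, or hexadecimal text with the same built-in. The specific values below are chosen for this example and are not taken from the comments.

```python
# The same character code, written in three different bases.
from_binary = int('01100110', 2)
from_octal = int('146', 8)
from_hex = int('66', 16)

print(from_binary, from_octal, from_hex)  # all three are 102
print(chr(from_binary))                   # f
```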
@buge: Thanks a lot for posting the code. But I am not sure how it works. Will check it out with some test files.
Python 2.6 and 3.0 have the new binary literals:
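The example that accompanied this comment did not survive, so the following is only a guess at what it might have shown: the 0b prefix for binary integer literals and the bin() built-in, both available from Python 2.6 onwards.

```python
# A binary literal is just another way to write an integer.
code = 0b01100110
print(code)        # 102
print(chr(code))   # f

# bin() goes the other way, from an integer to a '0b'-prefixed string.
print(bin(102))    # 0b1100110
```

Note that binary literals only help when the digits are written directly in source code; for digits held in a string, such as the tweet, int(s, 2) is still needed.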
Nice post. Bookmarked!
@Heikki Toivonen:
Brilliant! This was exactly what I was looking for :)