2009-01-26

Endianness

Once again I came across the term Endianness. I’ve never really cared enough or felt I had the general knowledge of the field to understand what endianness really means, but today I finally felt differently. Endianness has an article on Wikipedia and I decided I would read some of it and finally get an understanding of what the term means.

Endianness basically has to do with what comes first. Take the number 128 as an example. The “1” is the most significant digit, not because it is the largest digit (it is actually the smallest one here), but because its position gives it the greatest value: 1 * 100. The same goes for any positional notation in any base. Take the binary number 00100100 for example (the ASCII code for “$” and Bender’s apartment number). The first 0 represents 0 * 128, the second one 0 * 64, then the first 1 represents 1 * 32, and so on. As you can see, the farther we go to the right, the smaller the value of the position. This is called big-endian order, that is, the information with the most significance comes first. Then there is little-endian order, which would quite simply have the number 128 written as 821 while still having the same value.
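To make the positional idea concrete, here is a minimal Python sketch. The function names `big_endian_value` and `little_endian_value` are made up for this illustration; they simply read a string of digits with the most significant digit first or last.

```python
def big_endian_value(digits: str, base: int) -> int:
    """Read a digit string whose most significant digit comes first."""
    value = 0
    for digit in digits:
        # Shift everything read so far one position to the left,
        # then add the new digit in the lowest position.
        value = value * base + int(digit, base)
    return value


def little_endian_value(digits: str, base: int) -> int:
    """Read a digit string whose least significant digit comes first."""
    return big_endian_value(digits[::-1], base)


print(big_endian_value("00100100", 2))       # 36
print(chr(big_endian_value("00100100", 2)))  # $
print(big_endian_value("128", 10))           # 128
print(little_endian_value("821", 10))        # 128
```

The last two lines show the point of the paragraph: “128” read big-endian and “821” read little-endian are the same value, only written in a different order.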

Big-endian and little-endian order can be important to deal with when writing a computer program, especially for applications that communicate over a network and run on different architectures. This is discussed in the Wikipedia article. (Strictly speaking, endianness in computing is usually about the order of whole bytes within a multi-byte value rather than the order of bits inside one byte, but the idea is the same.) Sometimes you get lucky, though, as with our dollar sign: 00100100 is a palindrome, which means it reads the same when written backwards. Most words and numbers are not palindromes.
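As a rough illustration of why this matters in a program, the sketch below uses Python's built-in `int.to_bytes` and `int.from_bytes` to lay out the same number in both byte orders. The value `0x12345678` is just an arbitrary example, not something taken from the Wikipedia article.

```python
value = 0x12345678

# The same 32-bit number, laid out in the two byte orders.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

print(big.hex())     # 12345678
print(little.hex())  # 78563412

# Reading big-endian bytes as if they were little-endian gives a different number.
misread = int.from_bytes(big, byteorder="little")
print(hex(misread))  # 0x78563412

# A single byte, like the dollar sign, looks the same in either byte order.
dollar = 0b00100100
print(dollar.to_bytes(1, "big") == dollar.to_bytes(1, "little"))  # True
```

Two machines that disagree on the order have to agree on one of them before exchanging such values, otherwise one side sees `0x78563412` where the other meant `0x12345678`.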

Now we get to the practical part of endianness in regular writing and reading on a piece of paper. Say you’re reading an article on astronomy and it gives you some astronomical number that has tonnes of digits, and the writer doesn’t use prefixes or scientific notation because let’s say it’s a popular-science magazine and the target readers aren’t used to either of them.

Say the article talks about a distance in space of 819273987123781233 km. That’s fun to read. If you were reading the article to one of your friends, you’d likely take a few seconds to first determine how big that number actually is (millions/billions/trillions/etc.), and then start to slowly traverse the big number. Now, we have long since invented something called the thousands separator, which transforms the huge number into something slightly more readable, but not by much: 819,273,987,123,781,233. The problem is that we don’t see big numbers like this often enough to “see” how big they are immediately. If we see the number “100,000” or even “100000” we might be able to determine its true size much faster, because those numbers are much more common. But not this one.

The number is spoken as: “eight-hundred and nineteen quadrillion two-hundred and seventy-three trillion nine-hundred and eighty-seven billion one-hundred and twenty-three million seven-hundred and eighty-one thousand two-hundred and thirty-three”. Not only does it take long to say, it takes even longer to form the phrase in your head when you only have the digits to start with. What would seemingly make things easier would be to move over to a little-endian notation where the least significant digit comes first, such that we would write 332,187,321,789,372,918 while still representing the same value as before. However, this would force us to say “three, thirty, two-hundred, one thousand, eighty thousand, seven-hundred thousand, three million” and so on. Even if we could start reading and saying the number sooner, this is still as inefficient as the old way, or worse, since we have to say “thousand” and “million” and “billion” and so on for each digit that we come to.
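The digit-by-digit reading can be sketched in a few lines of Python. The helper `little_endian_terms` is a hypothetical name for this illustration; it peels off one digit at a time from the right and gives back the value each digit stands for.

```python
def little_endian_terms(n: int) -> list[int]:
    """Return the value of each digit of n, least significant digit first."""
    terms = []
    place = 1
    while n > 0:
        n, digit = divmod(n, 10)  # split off the last digit
        terms.append(digit * place)
        place *= 10
    return terms


n = 819273987123781233
print(str(n)[::-1])                # 332187321789372918
print(little_endian_terms(n)[:7])  # [3, 30, 200, 1000, 80000, 700000, 3000000]
print(sum(little_endian_terms(n)) == n)  # True
```

The first seven terms match the “three, thirty, two-hundred, one thousand, ...” reading, and they show the problem: every single digit drags its own scale word along.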

This is where I propose the adoption of a more peculiar style that I call little-endian thousands-separated notation, which is based on the fact that we like to group things by thousands, and by powers of one thousand, in our numbering system. The basic idea is to either say the first 332 as “two-hundred and thirty-three” and keep a pure little-endian literal notation, or to use the even more peculiar, but most likely underestimated, notation that looks like this: 233,781,123,987,273,819, which you would read as “two-hundred and thirty-three seven-hundred and eighty-one thousand one-hundred and twenty-three million nine-hundred and eighty-seven billion two-hundred and seventy-three trillion eight-hundred and nineteen quadrillion”. This way, we can compress not only the way we think of the huge number in our head but also the way we say it. We also benefit from being able to start reading and saying the number right away without having to scan more than 3 digits at a time, which would come more and more naturally as this notation becomes more widely adopted. As an added bonus, saying really huge numbers could add excitement, because as it is now, we say the largest part first, thereby ruining the surprise of just how big the number really is.
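For the curious, here is a minimal Python sketch of the proposed notation. All the function names and the `SCALES` list are invented for this example, and the scale words follow the short scale used above (thousand, million, billion, ...). The groups keep their digits in the usual order; only the order of the groups is reversed.

```python
# Short-scale words for each group of three digits, smallest first (an assumption
# for this sketch; it only covers numbers below 10**18).
SCALES = ["", "thousand", "million", "billion", "trillion", "quadrillion"]


def to_little_endian_thousands(n: int) -> str:
    """Format n with its three-digit groups in reverse order."""
    groups = f"{n:,}".split(",")  # e.g. ["819", "273", ..., "233"]
    return ",".join(reversed(groups))


def from_little_endian_thousands(text: str) -> int:
    """Parse the notation back into an ordinary integer."""
    groups = text.split(",")
    return int("".join(reversed(groups)))


def read_aloud(n: int) -> str:
    """Pair each group with its scale word, smallest group first."""
    groups = to_little_endian_thousands(n).split(",")
    if len(groups) > len(SCALES):
        raise ValueError("number too large for this sketch")
    parts = [f"{int(g)} {scale}".strip() for g, scale in zip(groups, SCALES)]
    return ", ".join(parts)


n = 819273987123781233
print(to_little_endian_thousands(n))  # 233,781,123,987,273,819
print(from_little_endian_thousands("233,781,123,987,273,819") == n)  # True
print(read_aloud(n))
# 233, 781 thousand, 123 million, 987 billion, 273 trillion, 819 quadrillion
```

One detail the sketch makes visible: inner groups have to keep their leading zeros (1,005 becomes 005,1), otherwise the number cannot be read back unambiguously.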