Wednesday, 7 October 2015

OverFlow or Flaw?

Have you ever run out of hard disk space? If you say no, you are lying. Every single human being (of our age) has run out of hard disk space. When it happens, you either delete some unnecessary things (you often find a lot of them) to free up space or, if you have the bucks, buy a new hard disk. But what happens when carefully designed protocols and standards run out of space or, we could say, reach their limit? Billions are invested in them and millions of people are using them, so you cannot replace them overnight (and in some cases, ever). Let's look at four such cases.

1. FAT32
Have you ever had that annoying error when you try to copy a large file to your USB stick or memory card? It says 'The file 'xxxxxx' is too large for the destination file system'. And you are like, 'What the heck? The pen drive is 16 gigs in size and the file is just 4.5 or 5 GB.' And what you do next is, as always, get super annoyed and ask an invisible computer overlord, as Joey* did in 'Friends', "Why? Computer, why? Why are you doing this to me?" And yes, I agree that it is the right thing to do. Now what if I told you that you had to go through this mental anguish because the top minds at Microsoft couldn't see ten years into the future? Okay, no one can see the future, but when one is designing a file system, one must estimate and accommodate future needs. The file system most USB sticks use is called FAT32 (File Allocation Table, 32-bit), and it was introduced by Microsoft in 1996 for Windows 95. Because it records each file's size in a 32-bit field, the largest file it can hold is just under 4 GB, and the largest partition size is 2 TB (with the usual 512-byte sectors). Maybe 4 GB was a large file in 1996, but only nine years later a game called GTA San Andreas was larger than 4 GB. So their file system can't accommodate something that ordinary users commonly needed within 10 years. Although this file system has been replaced by NTFS on most Windows-based computers, it is still widely used on USB sticks and memory cards. You can format your USB drive with NTFS, but then there can be compatibility issues with other devices, especially smartphone OTG connections, as I have noticed. There is also exFAT, which solves this problem and will mount on smartphones, but it won't work with Windows XP unless XP has been updated. So suppose you own a 16 GB USB stick that you want to store large files on and use with your phone, and you format it with exFAT. Then at college you plug it into a Windows XP machine that has not been updated since, like, the beginning of time and boom…you are screwed.
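
To see where the 4 GB and 2 TB figures come from, here is a small Python sketch of the arithmetic. It is only an illustration: the 512-byte sector size is an assumption (it is the common default), and the helper name fits_on_fat32 is made up for this example.

```python
# Limits that follow from 32-bit fields in FAT32.
MAX_FILE_SIZE = 2**32 - 1               # largest value a 32-bit size field can hold, in bytes
SECTOR_SIZE = 512                       # assumed sector size in bytes (the common default)
MAX_VOLUME_SIZE = 2**32 * SECTOR_SIZE   # 32-bit sector count times sector size

GIB = 2**30
TIB = 2**40


def fits_on_fat32(file_size_bytes):
    """Hypothetical helper: True if a single file of this size fits the 32-bit size field."""
    return file_size_bytes <= MAX_FILE_SIZE


print(MAX_FILE_SIZE / GIB)             # just under 4
print(MAX_VOLUME_SIZE // TIB)          # 2
print(fits_on_fat32(int(4.5 * GIB)))   # False: the 4.5 GB file from the error message
print(fits_on_fat32(700 * 2**20))      # True: a 700 MB file is fine
```

The free space on the stick never enters the check, which is why a half-empty 16 GB drive still refuses a 4.5 GB file.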

2. IPv4
You won't have seen this problem on your own computer, and if you are not a techie you probably haven't heard about it. But this problem is much bigger than the one we just described. As you may have guessed, it is related to the Internet. IPv4 stands for Internet Protocol version 4, and an IP address is the address assigned to every device and website connected to the Internet. Routers use this address to guide your data to the correct server and, in the same way, to guide the data the server sends back to your device. It looks like a fairly efficient system, so what is the problem? The problem is that we have run out of IPv4 addresses. As 'computerphile' puts it, 'Internet is full'; there is no room for anything new. You may think, "Well, I see a new site popping up every day". This is done with many little tricks, such as recycling IPs and using NAT. In fact, your device's IP is usually not a dedicated one; a new IP is assigned to you by your ISP every time you connect to the Internet. A static IP is a luxury these days. All these problems occur because IPv4 uses a 32-bit address, which can accommodate about 4 billion addresses. And with millions of websites, lots of online services, defense departments running their own servers and a smartphone in every hand, these addresses have run out. Who is to blame for this serious shortage of IP addresses, and for giving techies a topic to debate for years? (The IPv4 vs. IPv6 debate is a decade old and is still hot.) IPv4 was designed in 1981 and first deployed on ARPANET (the predecessor of the Internet) in 1983. Maybe 4 billion was a large number then, with just thousands of computers attached to the network. But the fact that the protocol reached its limit after 30 years is, as I see it, a failure of protocol design and deployment. After all, the world's population was already about 4.5 billion at that time and was growing rapidly.
The protocol designers certainly didn't have a vision of a computer in every hand, or any idea that the protocol would last this long, but a small mistake made back then is still giving a hard time to many organizations and people. The solution to this problem is IPv6, which uses a 128-bit address. That means 2^128 unique addresses, which is gazillions (about 3.4 x 10^38, more than enough to assign each person a billion billion), and they are unlikely to run out. The problem is that there is so much infrastructure that works on IPv4, and so much investment made in it, that IPv4 is unlikely to be phased out any time soon (or ever, as some experts believe). The future of the Internet is a hybrid network.
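
The address counts above are easy to check. This is a rough Python sketch; the world population figure is my own approximate assumption for 2015, not a number from any standard.

```python
import ipaddress

IPV4_ADDRESSES = 2**32      # 32-bit address space
IPV6_ADDRESSES = 2**128     # 128-bit address space
WORLD_POPULATION = 7_300_000_000   # rough 2015 estimate (an assumption)

print(IPV4_ADDRESSES)                          # 4294967296, about 4 billion
print(f"{IPV6_ADDRESSES:.1e}")                 # 3.4e+38
print(IPV4_ADDRESSES < WORLD_POPULATION)       # True: fewer addresses than people

# A "billion billion" is 10**18; IPv6 leaves far more than that per person.
ipv6_per_person = IPV6_ADDRESSES // WORLD_POPULATION
print(ipv6_per_person > 10**18)                # True

# Python's standard library reports the same totals for the "everything" networks.
print(ipaddress.ip_network("0.0.0.0/0").num_addresses == IPV4_ADDRESSES)   # True
print(ipaddress.ip_network("::/0").num_addresses == IPV6_ADDRESSES)        # True
```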

3. Unix Time
One can defend the FAT32 and IPv4 situations by saying that it is very hard to predict how the computing world will grow and what will come next, and that it was even more difficult in those years, when the computing world was growing exponentially in its capabilities (Moore's Law). But this excuse does not work here, as this situation did not require any more prediction of the future than 'I will probably still be alive in 2038', or 'many people alive today will still be alive in 2038', or 'many computers (mostly servers) we are using today will still be in use in 2038'. Have you ever wondered how, even when you completely switch off your computer and turn it on again, it still shows the right time, even if it is not connected to the Internet? This is managed by a clock powered by a tiny electric cell on your motherboard. Unix-style systems represent the time as a counter that is incremented by one every second and has been ticking that way since the start of 1970 (which is 0; negative numbers can be used to denote dates as far back as 1901). The problem is that the signed 32-bit (again 32? What's wrong with 32 bits?) data structure the Unix operating system uses to hold this time runs out on 19 January 2038. When that happens, the counter will wrap around to its most negative value, the clock will jump back to December 1901, and there will be chaos. But what is Unix? I have not used it. I have not seen it. I have never heard of it. Why are we worrying about such an operating system? Many of you will ask that. The bad news is that Unix is the mother of a whole class of modern operating systems called 'Unix-like OS'. So not only Unix but also Mac OS X (which is basically Unix), Linux (a free alternative to Unix), Android and many other operating systems use this 32-bit time stamp, and so does a lot of software on Windows (which is not Unix-like). So we are talking about most of the computers in the world. Second, 2038 seems far away, so why are we worrying about it? We will figure something out by then, right? Wrong. This time stamp is ubiquitous and is used extensively in embedded systems (many of which cannot be reprogrammed).
Furthermore, it is used so widely that changing it to 64 bits would create massive code incompatibility. Lastly, not every program can wait until 2038; programs that work with future dates need to be fixed earlier. For example, a program that uses dates 20 years in the future needs the fix by 2018. The good thing is that experts are worried and are already working on it, but as some experts say, there will still be some systems using 32-bit time even in 2038. The solution to this problem is a 64-bit time stamp, which will cover us through the year 292,277,026,596, which is far longer than the lifetime of the Earth. Think about it: if they had used a 64-bit time stamp (or 48-bit, though it would be odd) from the beginning, there would be no such problem. But at that time, with 8-bit and 16-bit computing, 32 seemed good enough and 64 seemed far too large.
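
Here is a small Python sketch of what the rollover looks like. It only simulates a signed 32-bit counter (Python's own integers do not overflow), and the helper names to_datetime and wrap_int32 are made up for this example. The 64-bit estimate assumes an average year of 365.2425 days.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)   # counter value 0
INT32_MAX = 2**31 - 1                                # largest signed 32-bit value


def to_datetime(seconds):
    """Convert a count of seconds since the epoch into a UTC date."""
    return EPOCH + timedelta(seconds=seconds)


def wrap_int32(value):
    """Reinterpret a number as a signed 32-bit integer, as a 32-bit counter would."""
    return struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))[0]


print(to_datetime(INT32_MAX))      # 2038-01-19 03:14:07+00:00, the last good second

wrapped = wrap_int32(INT32_MAX + 1)
print(wrapped)                     # -2147483648
print(to_datetime(wrapped))        # 1901-12-13 20:45:52+00:00

# Rough estimate of the last year a signed 64-bit counter can reach.
SECONDS_PER_YEAR = 31_556_952      # 365.2425 days, an averaging assumption
last_year_64bit = 1970 + (2**63 - 1) // SECONDS_PER_YEAR
print(last_year_64bit)             # 292277026596
```

One tick past the maximum and the date lands in 1901, which is the same lower limit the negative numbers reach.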

4. Y2K Problem
This problem is even more ridiculous than the previous one. It has already passed and didn't cause much trouble, but it is worth remembering. Many computer systems from the 1980s and earlier used only two digits to store the year part of the date. This meant that the year would reach its limit, '99', in 1999, and 2000 would be stored as 00. You may think, what's the big deal? I always use only two digits for years and never confuse 2015 with 1915 or 1815 or 0015. And I wouldn't be bothered if the computer showed me 00 as the year instead of 2000. But unfortunately a computer can't understand context the way you do; it will treat 00 as a year earlier than 99, in effect confusing 2000 with 1900. When computing the elapsed time between dates, computers subtract dates in a predefined way. If a bank computer is programmed to subtract the date of deposit from the current date to calculate interest, then with the year set to 00 it will subtract 99 (the year of deposit) from 00 (which it thinks is the current year). The result is a negative number. As per the interest formula, you will be paying negative interest, and for 99 years at that! The next thing you know, you are bankrupt. It is ironic that the best minds of the world couldn't see what was inevitable within a decade. What would they have said to the CEO of a multibillion-dollar bank? "Mr. CEO, this is a machine we are selling you for tens of thousands of dollars. It is reliable and will manage accounts worth billions of dollars. And yes, one last thing: it will fail after 15 years, while tearing your entire system apart."
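
The bank example can be sketched in a few lines of Python. Everything here is hypothetical: the function names, the simple interest formula and the numbers are mine, chosen only to show how the subtraction goes wrong.

```python
def years_elapsed_two_digit(deposit_yy, current_yy):
    """Elapsed years when only the last two digits of each year are stored."""
    return current_yy - deposit_yy


def years_elapsed_four_digit(deposit_year, current_year):
    """Elapsed years when the full four-digit year is stored."""
    return current_year - deposit_year


def simple_interest(principal, rate_percent, years):
    """Plain simple interest: principal * rate * time / 100."""
    return principal * rate_percent * years / 100


# A deposit made in 1999, with interest calculated in 2000.
buggy_years = years_elapsed_two_digit(99, 0)            # 2000 is stored as 00
correct_years = years_elapsed_four_digit(1999, 2000)

print(buggy_years)                              # -99
print(correct_years)                            # 1
print(simple_interest(1000, 5, buggy_years))    # -4950.0
print(simple_interest(1000, 5, correct_years))  # 50.0
```

With two digits the customer owes the bank 4950 instead of earning 50, just because the year field rolled over.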

*Joey is a beloved character from the all-time favorite TV series 'Friends'.

Friday, 2 October 2015

Reviving the Blog

It has been quite a while, seven and a half months to be exact, since I last published something on this blog. To put it diplomatically, I took a hiatus from blogging for a while. To be blunt and honest, the blog was dead for 7 months. However, even though I hadn't posted anything on the blog for a long time, it doesn't mean that I haven't written anything during that time. I have been writing in my personal log fairly regularly during these months, as I did before. In fact, only a few selected items from my personal log (which contains anything that comes to my mind) go into my web log (which is this blog). In the coming days I will keep posting some of the writings from that time, as well as anything new I write.