Thoughts on Source Code Packaging

Ever since the computer became a personal computer and came within easy reach of the masses, writing and sharing software has been an essential activity, one that eventually led to the Free Software movement.

In the old days, software was shared by means of paper media, magnetic media and the like. In the nascent days of computer networks, software sharing took the form of bulletin boards, maturing along with the technology into public software repositories generously hosted by big corporations and research institutes. These days we have dedicated software hosting sites such as GitHub and SourceForge, teeming with hundreds of thousands of developers and teams.

Ever since the first software source code was distributed, there has been a need to provide an easy compilation mechanism. Sometimes this came as a set of instructions in a README file, an INSTALL file and so on. That was sufficient at the time, as the platforms hosting this software were limited to a very few operating systems.

With advances in electronics and computer hardware technology, new, powerful and relatively cheaper hardware platforms emerged, backed by equally varied and powerful operating systems and software development tools. To cope with them, the code distribution mechanism had to adapt. It evolved. It tried to offer configurability and scalability. Makefiles and batch files are among the popular results of that evolution.

And then there came a big explosion - of demand. Demand came from every angle and every direction conceivable: in hardware, in software, in development tools and methodologies, in adaptability, in the complexity of software, and in the ease of compiling and deploying software across as wide a range of platforms as possible at as low a cost as possible.

The code distribution mechanism had to adapt and evolve quickly. It adapted again. Along came tools such as Autoconf, configuration mechanisms catering to a wider range of platforms, operating systems and combinations thereof.

One attribute stayed constant throughout this journey of code distribution: the compilation mechanism was always packaged as an integral part of the source code. That was a huge convenience to the developer and builder community. Nobody had to go anywhere looking for instructions and mechanisms to build the source code. Everything was there, everything was readily available. The source code was self-contained.

With the advent of the Internet, software development communities spread across the globe, contributing code round the clock. A need was felt for a swift build-test-modify cycle. With self-contained source code, it was easier for individual developers to complete this cycle and, to some extent, even automate it.

We have now moved into a highly distributed computing era - the era of cloud computing and virtualization. Technologies such as virtualization and continuous integration evolved, enabling developers to utilize hardware resources to the fullest extent possible without compromising on flexibility or cost. The concept of virtual machines revolutionized the way a computer can be used. Options even sleeker than virtual machines, such as containers, came into existence.

While one wheel has covered this much distance, the other wheel - our code distribution and packaging - seems to have got stuck a bit behind. The code is no longer self-contained. Code packages have started fragmenting. Developers wanting to take advantage of VMs and containers face multiple problems. They can no longer find easy compilation scripts along with the code they download. They discover multiple sources offering build mechanisms for the same package, thereby losing standardization and introducing non-uniformity. There is no guarantee that every package will have a build script available for a virtual environment.

Even now, developers of software packages do not always have access to every real or virtual hardware and software platform available in the world, hence it is all the more necessary to rely on the community to obtain build scripts for these virtual environments.

Take the example of Docker, a popular container-based virtualization tool available today, where the developer is presented with a difficult scenario. Developers are trying to adapt software to Docker containers on various platforms, which may include newer platforms. For a pioneer, it is a considerable effort. It is therefore logical and desirable to share the build mechanism - called a Dockerfile - with fellow developers who may be attempting to build the software on a particular platform. And, following the same evolutionary path, it is only logical to have these mechanisms and files available right along with the code.
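
To make the idea of "self-contained" source code a little more concrete, here is a minimal sketch in Python that checks whether a source tree ships its own build artefacts, a Dockerfile included. The checklist in BUILD_ARTEFACTS and the function name missing_artefacts are my own assumptions for illustration; they are not an established packaging standard.

```python
from pathlib import Path

# Hypothetical checklist: the categories and file names below are
# assumptions chosen for illustration, not an agreed standard.
BUILD_ARTEFACTS = {
    "instructions": ["README", "README.md", "INSTALL"],
    "native build": ["Makefile", "configure", "CMakeLists.txt"],
    "container build": ["Dockerfile"],
}


def missing_artefacts(source_root):
    """Return the categories for which the tree ships none of the listed files."""
    root = Path(source_root)
    return [
        category
        for category, names in BUILD_ARTEFACTS.items()
        if not any((root / name).is_file() for name in names)
    ]
```

Calling missing_artefacts on the top-level directory of an unpacked source package returns an empty list when every category is covered. A package that ships a README and a Makefile but no Dockerfile would report only "container build" as missing, which is exactly the gap discussed above.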

However, the current stance of software package owners and communities is shockingly strange and unexpected. Nobody seems willing to package these instructions and artefacts along with the code. The general advice one gets these days is to contribute these files somewhere else. Isn't that a deviation from the normal and logical evolution of the source code packaging scheme? Isn't that going to add inconvenience for developers at large and non-uniformity across packages? Isn't that going to limit the freedom to take a software package to any platform required? Isn't inclusion of this mechanism, say a Dockerfile, going to ease everything?

I think the time has come for Open Source developer communities to ponder these points and to open themselves to the natural and logical evolution of software code packaging. Two heads are better than one, and more are better still! I think now is the time for serious discussion and debate, and to let the best and fittest option emerge, one that is acceptable to all and that makes code compilation and porting easy again.
