Sample Chapter: Coding for Sport

(During Apr-Jun, I beat 25k words of random thoughts out of my keyboard on bootstrapping a developer, only to forget about it very soon. Here is a sample chapter, shared for merciless feedback.)

Programming can be adventurous. Sometimes you are excavating software graveyards to find treasures from the past. Sometimes you are writing code to decode totally non-computational mysteries from centuries ago. Sometimes you are writing code simply because you cannot help it. Those weird characters and constructs somehow attract you, making you write more of them, chase and squash bugs, and, once the waters are fairly calm, turn into an artist who designs logos and an instructor who writes meaningful manuals.

True, as a professional, you should be capable of working on things that do not interest you. But software development in general should be your hobby. If you are someone who writes code for sport, what awaits you in the industry will be less fearsome.

Investigations and Virtual Excavations

People have spent entire lives chasing UFOs, trying to figure out who the Zodiac Killer is, and attempting to prove or disprove the Goldbach Conjecture. Some such attempts are arguably a waste of time, but there is something about the human brain that keeps us attracted to such mysteries and gets us involved in the effort to decode them. If that is inevitable, why not exploit the situation to yield some material benefit as well?

Generally speaking, almost any science project is about decoding the mystery that is the universe, and computers are at the centre of it. But let’s not get that poetic for the time being. Instead, let’s focus on individual and immediate puzzles, preferably ones that are directly computational in nature. Also, if this section looks intimidating instead of motivating, feel free to skip to the rest of this chapter.

Solving many of the historical mysteries requires various kinds of scans and a lot of number crunching, at a pace that only computers can manage. From their inception, computers have been considered code-breaking machines by the military. But they are involved in a lot of civilian code breaking as well. An interesting recent case would be the Vesuvius Challenge. It was launched in March 2023 to gather researchers across the world to read the Herculaneum scrolls that were carbonized in 79 AD when Mount Vesuvius erupted. The scrolls were discovered in 1750 AD and a significant portion of them remain unopened due to their fragility. The goal of the Vesuvius Challenge is to read these scrolls without physically opening them. The foundation for this challenge was laid back in 2015, when researchers successfully read the En-Gedi scroll, a different one from Israel, using X-ray tomography and computer vision.

The grand prize for 2023 was won by a breakthrough submitted that very year (the winners were announced in early 2024), but there is more to be done and the challenge is still on.

If you are interested in working on similar things, you can look up lists of such puzzles, ranging from ancient languages and scripts that nobody has been able to read (including the Harappan script) to cryptic books that remain unbroken (like the extremely mysterious Voynich Manuscript). Remember that something doesn’t have to be ancient in order to be challenging. The fourth encrypted message on the Kryptos sculpture, located right on the grounds of the Central Intelligence Agency in the USA, remains unsolved, although the sculpture was dedicated back in 1990. Various lists of other unsolved codes can be found online.

Since our focus is on learning programming, maybe you should start with codes that are already broken. Get some hints and try to solve the rest using your programming skills. That way you won’t be confronted by two mysteries at the same time: the code itself and programming. It is better to pick extremely simple challenges that are listed online as exercises for beginner cryptanalysts (people who try to break codes).

But what is there to be programmed exactly? One possibility is to write a program that tries to break the code all by itself. Another possibility is to write an interactive program that lets you break the code by trying out various things.

Real-world code-breaking involves figuring out both the encryption scheme and the key used (for example, if the encrypted form of APPLE is DSSOH, the scheme is rotation and the key is 3, since each letter is shifted by three places). For our purpose, it is better if you already know the scheme or a set of possible schemes.
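To make the APPLE example concrete, here is a minimal sketch of the rotation scheme in Python. The function name rotate is only an illustrative choice, and the sketch assumes plain A-Z text; anything else (spaces, digits, punctuation) is passed through unchanged.

```python
import string

ALPHABET = string.ascii_uppercase


def rotate(text: str, key: int) -> str:
    """Shift each letter A-Z forward by `key` places, wrapping around at Z."""
    shifted = ALPHABET[key % 26:] + ALPHABET[:key % 26]
    return text.upper().translate(str.maketrans(ALPHABET, shifted))


print(rotate("APPLE", 3))   # DSSOH
print(rotate("DSSOH", -3))  # APPLE
```

Decryption is the same operation with the key negated, which is why knowing the scheme leaves only the key to be found.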

The effectiveness of your program comes down to whether it can detect when the code gets broken, and whether it can reduce the number of key combinations tried instead of performing sheer brute force (which might never finish). For this, you have to incorporate dictionaries and heuristics like frequency analysis (for example, the character that appears the most in a reasonably long coded English text is likely to be the substitute for e, because that is the English letter with the highest frequency).
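Both ideas can be sketched for the simple rotation scheme. The word list below is a tiny made-up one that merely stands in for a real dictionary, and names such as crack and looks_like_english are illustrative assumptions, not a standard API. The sketch tries keys in the order suggested by letter frequency (assuming the most frequent cipher letter stands for E, then the next most frequent, and so on) and stops as soon as enough words are recognized.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase
# Hypothetical, tiny word list for the demo; a real attempt would load a full dictionary.
COMMON_WORDS = {"THE", "AND", "YOU", "ME", "MEET", "NEAR", "GREEN", "TREE"}


def rotate(text, key):
    """Shift every letter by `key` places (negative keys shift backwards)."""
    shifted = ALPHABET[key % 26:] + ALPHABET[:key % 26]
    return text.upper().translate(str.maketrans(ALPHABET, shifted))


def looks_like_english(candidate, threshold=0.5):
    """Detect success: at least `threshold` of the words are in the word list."""
    words = candidate.split()
    hits = sum(word in COMMON_WORDS for word in words)
    return bool(words) and hits / len(words) >= threshold


def crack(ciphertext):
    """Try keys in the order suggested by letter frequency instead of 0..25."""
    letters = [c for c in ciphertext.upper() if c in ALPHABET]
    ranked = Counter(letters).most_common()
    for attempts, (letter, _count) in enumerate(ranked, start=1):
        key = (ord(letter) - ord("E")) % 26  # assume this letter stands for E
        candidate = rotate(ciphertext, -key)
        if looks_like_english(candidate):
            return key, candidate, attempts
    return None  # no guess passed the dictionary check


secret = rotate("MEET ME NEAR THE GREEN TREE", 3)
print(secret)
print(crack(secret))  # (3, 'MEET ME NEAR THE GREEN TREE', 1)
```

Here the very first guess succeeds because E dominates the sample sentence. With short or unusual texts the heuristic can easily be wrong, and if the plaintext contains no E at all this sketch gives up, so a fallback to trying every key would be a sensible extension.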

If you want to try something real-world, maybe you should look into the Zodiac letters.

A series of murders terrorized America in the 1960s, claimed to have been committed by a person who referred to himself as “the Zodiac”. He kept sending the media cryptic messages containing explanations and threats for years. One of them even contains the sentence “My name is...” Both law enforcement and civilians have been trying to decode these messages for years, cracking some almost instantly and some decades later, with some still remaining unbroken.

I remember having tried to break an already broken Zodiac letter with the help of programming just for fun. If I remember correctly, I learned what the scheme was, didn’t get the key or read the decipherment, and went on writing a script that would make trial-and-error easier. Something even a beginner can write in half an hour.
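A trial-and-error helper of that kind could look something like the sketch below. It is not the script described above, just one possible shape for it: the function name apply_guesses and the sample ciphertext are made up for illustration, and the cipher is assumed to be a substitution in which each symbol stands for one letter.

```python
def apply_guesses(ciphertext: str, guesses: dict[str, str]) -> str:
    """Replace guessed symbols with plaintext letters; show the rest as '_'."""
    return "".join(
        c if c.isspace() else guesses.get(c, "_")
        for c in ciphertext
    )


ciphertext = "XQQFW"  # made-up symbols, not taken from any real letter
guesses = {"Q": "P"}  # first hunch: Q stands for P
print(apply_guesses(ciphertext, guesses))  # _PP__

guesses.update({"X": "A", "F": "L", "W": "E"})
print(apply_guesses(ciphertext, guesses))  # APPLE
```

Wrapping this in a loop that reads new guesses from the keyboard and reprints the text after each one turns it into the interactive kind of program mentioned earlier.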

It is important that you don’t read the decrypted forms or even the keys of these letters before your attempts. Just gather the info that is enough to limit your programming efforts to a reasonable level (like knowing it is a homophonic substitution cipher, in which one letter can be represented by several symbols). That way it’ll give you goosebumps when those messages unravel on your screen, words written by a serial killer decades earlier, words that sent shivers through the spines of a generation.

The interesting thing about these kinds of challenges is that there is no trust issue. You don’t have to employ proven algorithms or hesitate to tweak things that should’ve worked. Trial-and-error is perfectly okay because once you come up with a solution, you and others can easily verify it manually (unless it is about something really short like the name of the Zodiac Killer). This means you can incorporate machine learning as well, in case you are a person who stays away from it because it sometimes feels like black magic.

Start With a Real Project

It’s now common for newcomers to start by stuffing their workstation with a lot of development tools and following some beginner-friendly tutorial on the hottest languages and frameworks found online. While this works, one problem with this approach is that the learner often doesn’t know why they’re using a particular language or framework. Yes, the tutorials could be giving some reasons, but those could be biased, and the learner wouldn’t even have Hello World level experience in any other technology to make such comparisons themselves.

The second problem is that the learner doesn’t realize that there could be simpler and leaner ways to accomplish something. That’s how we end up with programs meant for trivial tasks that still weigh hundreds of megabytes.

Finally, either the learner learns some concepts and “patterns” alone, or gets some hands-on experience but only in terms of silly, clichéd and unrealistic examples like a To-Do app that doesn’t store anything or a mock chat front-end that doesn’t involve any real communication machinery. Even with end-to-end examples, the learner is just re-typing and redoing what the tutor has already done.

Real-world software development involves a lot of skills like navigating through several kinds of documentation, reading the theory and implementing stuff that you have no idea how to implement, and finally, causing, tracing, and fixing bugs.

What is the solution to all these shortcomings? Start with a project of your own instead of starting with a language or framework. Let that project be something you’ve always wanted to use yourself, maybe something fresh or an alternative to similar things that exist but felt unsatisfactory to you. That way you’ll face more challenges but it’ll still be more fun.

Whenever I’ve started with a project, I’ve ended up learning a new language, library, protocol, or something like that. Whenever I’ve started learning a language for the sake of just learning it, I’ve never reached anywhere.

Maybe this was the software architect in me talking. Maybe the method I just recommended would cause you to miss some trending keywords in your CV because your passion project didn’t require Django, ReactJS or whatever companies are looking for. So I can suggest a variant: come up with a bunch of personal projects, think about the implementation details of each project, and pick the one that would require what’s trending out there.

Develop for Yourself

It’s common for engineers to look back and realize that many of the things they’d created were for themselves. Sometimes these were solutions to unique and immediate problems, sometimes just reinventions of the wheel for the pleasure of using one’s own tools.

When you’ve been programming for a while, you’ll automatically have similar stories to tell, although not necessarily legendary ones. I have a custom static website generator which is technically a collection of hacked-up scripts and programs. The current iteration of the generator matured by 2020 or so, and I’ve never had to look back. All I have to do to rebuild the site after adding a file is run the make command. It’ll only touch the pages that actually need updating. After that, I run make tryupload to perform a dry run to see which files on the remote (server) will be affected, and if everything looks good, I’ll run make upload to reflect the changes on the server. Updating a website has never been easier.

But this doesn’t mean I have the world’s best static site generator. It works for me perfectly, that’s all. It’d take a lot of effort to generalize it. (By the way, how I changed the architecture of my site generator is interesting. It involves the reversal of the thought process, which is explained in section 7.5.)

If you are looking for ideas to develop something for yourself, here’s one: develop a script to help declutter your files. It can do anything from simply listing out rarely used files to sorting out all the receipts in your Downloads directory. Finding unused files is only a matter of checking the last accessed time of each file. Sorting receipts is an extremely useful task that sounds sophisticated, but the solution can be developed easily as a wrapper to some pdf-to-text programs and Optical Character Recognition utilities that are capable of reading text from images.
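The first half of that idea, listing rarely used files, can be sketched with nothing but the Python standard library. The function name rarely_used and the one-year default are arbitrary choices for illustration. One assumption to keep in mind: many systems update the last accessed time lazily or not at all (for example, with the relatime or noatime mount options on Linux), so the numbers are only a rough hint.

```python
import os
import tempfile
import time
from pathlib import Path


def rarely_used(root, days=365):
    """Return (path, idle_days) pairs for files not accessed in `days` days."""
    now = time.time()
    results = []
    for path in Path(root).rglob("*"):
        try:
            if not path.is_file():
                continue
            idle_days = int((now - path.stat().st_atime) // 86400)
        except OSError:
            continue  # unreadable or vanished file; skip it
        if idle_days >= days:
            results.append((path, idle_days))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Demo on a throwaway directory so that nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "old_receipt.pdf"
    old.write_text("dummy")
    two_years_ago = time.time() - 2 * 365 * 86400
    os.utime(old, (two_years_ago, two_years_ago))  # fake an old access time
    (Path(tmp) / "fresh.txt").write_text("dummy")
    print([p.name for p, _ in rarely_used(tmp)])  # ['old_receipt.pdf']
```

Pointing the function at a real directory, such as Path.home() / "Downloads", gives a list to review by hand; deleting anything automatically is deliberately left out of the sketch.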

Be Your Own Client

When you develop for yourself, some of the work can be extremely custom, like the scripts you write to process a custom markup language that you invented to make your book writing or blogging easier, or the scripts you write to automate the monitoring and management of your homelab and a couple of cloud instances that you own. Some of it can feel like a fully developed product, but can still be custom and require a lot of effort to generalize.

Even in cases like this, you’ll already be coding for sport. But the key to growth is to act like a client and annoy yourself with feature requests and modifications. That way you’ll improve the usability, extensibility, documentation, and the overall quality of your projects. Also, you’ll learn to write code in a way that is capable of tolerating ever-changing requirements.

But when you know a solution that you’re developing could be useful to others and it isn’t that difficult to develop it into a generalized product, never miss the opportunity. Become an average user and keep asking for improvements until you get real users.

TeX is a typesetting system created by computer scientist Donald Knuth in the 70s. It has become the de facto standard for preparing documents, papers, and books in science and engineering, and still holds that position more than four decades after its inception¹. If you have not used TeX yet, think of it as a geek-friendly replacement for word processors like LibreOffice Writer and Microsoft Word. Although there are some IDEs, TeX at its heart is a code-based system, like HTML. You can generate PDF and many other kinds of output from TeX documents (including HTML). What makes TeX unique is its capability to deal with and lay out complicated things like mathematical equations and its ability to automate everything without getting in your way.

The most interesting thing is, TeX was originally written by Knuth to typeset his phenomenal work The Art of Computer Programming, because he was dissatisfied with the existing typesetting system at some point. He didn’t hesitate to spend years working on it and finally release it into the public domain.

Things don’t have to be that legendary. Even your silly projects can have some impact or at least help you in your journey as a learner. I wanted to be able to type Unicode Malayalam back in my school days. The Inscript keyboard layout, a standard which lets you type in a dozen Indian scripts, was well supported by GNU/Linux, but I had no practice in it. Inscript isn’t phonetic, so it takes a while to get used to. Unable to find a phonetic typing tool for Malayalam like the ones that were available for Windows, I decided to develop one on my own. I could do that without too much trouble in Python and GTK. It was called Parayumpole (meaning “like you speak”).

The interesting thing is, not long after it was developed, I started getting good at Inscript. But things didn’t end there. I started porting it and extending it. Extending the program made me learn some other Indian scripts. I wrote an in-browser version of it using JavaScript, which eventually got me into Web development. Someone else and I ported it to Firefox OS² simultaneously. Extensions for Firefox and Chrome were released which haven’t been updated since. An updated Web version and desktop version are still available, which some people still use.

A more interesting one would be an extremely minimal digital painting application I developed in 2023, which started out as a technology demonstration for a source-to-source compiler that I am working on.

Watch and Record Your Project Grow

Frequently checking the source line count of the project that I’m working on at the moment has always been my secret pleasure. Sometimes it’s like I’m sitting for another coding session just to see my project grow in the most superficial manner. Surprisingly, a significant decrease in the code size is also pleasing when you are doing refactoring. I think it’s all about the delta. You add a thousand lines, you feel happy. You shed a thousand lines, you feel happy. You take more effort and do both in the same session, it feels like you’ve accomplished nothing.

Do yourself a favour and record some coding sessions and test runs. Sure, they’ll capture the baby steps of your project one at a time, bringing back sweet memories later like any journal. But there’s another important purpose. There’s no guarantee that you’ll be able to build and run your project a couple of years from now, thanks to ever-evolving operating systems, libraries, and hardware. Even virtual machines will be of no use if your app has network interaction, because the services and protocols you depend on won’t be there forever. Screen recordings are the only way you can preserve your work in terms of how it looks and feels.

Meta Activities: the Good and the Bad

There is a certain joy in non-programmatic activities like designing the logo, writing the manual, and releasing a package. You start believing that your product is important. Sometimes you start a project just to go through these kinds of activities, reflecting the child in you that wrote a pseudomanual for an imaginary television set after flipping through the pages of a real one. Although you are spending hours and hours on the meta-stuff instead of the core, such activities and the joy they offer can keep you motivated, while improving the product’s overall reach, usability, and quality. A mediocre program with a great manual is more helpful than a great program with no manual, for instance.

However, as a beginner developer who sincerely wants to be the best, there is another way you’d spend a lot of time: researching and debating less important and subjective matters. You spend days surfing through forums and blog posts to decide which language or framework to pick up first. You spend a million keystrokes typing comments explaining why tabs are better than spaces or emacs is superior to vim. It’ll be too late by the time you realize you still can’t write anything beyond a Hello World in any language.

Web of Inspiration

Passionate programming isn’t just about making learning more interesting or getting small things done. Once you are into it, you simply cannot stop doing it. Sometimes such projects grow out of hand. There are examples all around, especially in the Free/Open Source world.

Fabrice Bellard has several projects listed on his homepage bellard.org. The list includes curious ones like QuickJS, a “small but complete” JavaScript engine, and TinyGL, a small subset of OpenGL. But buried in that list lie QEMU, the popular emulation/virtualization solution, and FFmpeg, something more than a video conversion engine, which we have all used directly or indirectly. These started out as his solo projects and went on to become highly regarded community projects.

Andreas Kling started SerenityOS as a solo project in 2018, to fill the void after a rehabilitation program. He used to work at Apple and Nokia on the WebKit engine before that. SerenityOS soon became his full-time activity, and took off as a community project. GitHub shows that by May 2024, the project had received contributions from more than a thousand people.

Serenity’s 1st anniversary page says it had to be simple in order to be compatible with its own browser, which means the OS had a working browser capable of parsing and rendering HTML pages by the end of its very first year.

An interesting thing about the project is its side product, the Ladybird Web browser. It has been spun out as a cross-platform project that has received substantial funding and has a couple of people hired to work on it. The browser is developed from scratch, which is impressive considering the amount of heavy lifting a modern browser is expected to do. But you’ll understand its importance only when you realize that many of the mainstream browsers out there, including Microsoft Edge and Brave, are derivatives of Chromium. Most of the rest use the same engines used by Chromium and Firefox. It wouldn’t be an exaggeration to say the world doesn’t know what a different browser would feel like.

On a different plane of existence, there is TempleOS, a limited, isolated, yet extremely curious operating system. It was developed by Terry A. Davis, who is sadly no more. The project features its own C dialect HolyC and a flight simulator.

Ad-hoc Entertainment

Sometimes software development is extremely boring because all you’re doing is calling a bunch of API endpoints offered by an external service. Sometimes it requires a lot of thinking and hard work. In either case, it can be stressful when you have to fix obscure bugs and deal with broken things that are not under your control while the deadline is right around the corner.

The real thing that keeps you motivated in such situations is, of course, the end goal: what the final product means to you and to society. However, there are some ad-hoc resorts as well.

The computer science book market is dominated by textbooks and tutorial books, but those aren’t the only ones that you can find. A popular example is The Pragmatic Programmer by David Thomas and Andrew Hunt. The book is meant to help readers master the art and craft of programming and software development, and is written in a way that interleaves advice, stories, and examples, without limiting the discussion to any particular programming language or methodology. The book is said to have popularized interesting concepts like rubber duck debugging, a technique in which a programmer tries to debug a program by patiently explaining the code in natural language (to a rubber duck in case nobody is around). The reasoning behind the technique is that you can often find the solution to a problem all by yourself if you try to explain the situation thoroughly.

Joel Spolsky has been writing about the technical and managerial aspects of software on his site joelonsoftware.com. The site says it has over a thousand articles to date. Some of them have been compiled into a printed book with the same title as the site. His credentials might boost your interest in reading the book or the blog: he used to work at Microsoft and went on to co-create popular and useful things including StackOverflow.com.

Books that inspire you to program need not be programming-related ones at all. The list can include introductory books on cryptography, combinatorics, and even linguistics. One such book would be The Code Book by Simon Singh, which interleaves the history of code making, code breaking, and the role of computers.

I seriously doubt there is any Internet-using programming enthusiast who’s never come across at least one strip of Randall Munroe’s stick-figure comics that cover various topics from programming to physics, that geeks can easily relate to. In case you happen not to know the name, it is xkcd, and it doesn’t stand for anything.

In xkcd.com/1323, the artist claims to have discovered a way to get computer scientists to listen to any boring story. The trick? Just name the characters Alice, Bob, and Eve, which happen to be the standard placeholders in the description of cryptographic protocols.

xkcd strips are available on xkcd.com, and are free to use for non-commercial purposes. Munroe used to work for NASA, and another interesting work of his is the book What If?, which tries to answer “important questions you never thought to ask.”

I do not want to get into subreddits that are full of programming memes, because someone might have picked up this book as their first step to stop the endless scrolling and start some actual work.

Speaking of working, you can look into esoteric programming languages, which have weird syntax, strange paradigms, or unreasonable constraints. There are interesting competitions like the International Obfuscated C Code Contest as well, in which you are expected to write C programs in the most obscure manner (exactly the opposite of what software engineering textbooks would teach you). Fabrice Bellard, who is mentioned elsewhere in this chapter, is a winner of the said contest.

Innovations From the Past

The scene "It’s a UNIX system! I know this!" from Jurassic Park (1993) is iconic. In the scene, one of the lead characters is trying to bring various systems in the park back online, which is done by navigating through the filesystem and activating certain files. The most interesting thing about this scene is the file browser shown on screen, which happens to be 3D. Files are presented as solid blocks with interconnecting lines to denote directory structure. Navigating with the mouse makes the visuals move as if the user is flying by in a helicopter.

A 3D file navigator is definitely overkill, but we’re used to the made-up on-screen graphics in Hollywood hacking scenes, right? Turns out the file manager featured in the movie was a real one. It’s called fsn (“fusion”), a program from Silicon Graphics made for their IRIX operating system.

Silicon Graphics was just trying to showcase their amazing graphics capabilities with this program. Although it’s overkill in terms of purpose, it makes us wonder what else was there in the stone ages of software, overkill or not.

Let’s limit the discussion to consumer-oriented aspects like user interface and communication, which will be relatable to everyone.

When you check the early GUI demos from Xerox Palo Alto Research Center recorded in the 70s, you’ll be surprised to see how user-friendly they look compared to the early versions of Microsoft Windows released years later. To quote the YouTube channel of Computer History Museum, “years ahead of its time, the 1972 Xerox Alto featured Ethernet networking, a full page display, a mouse, laser printing, e-mail, and a windows-based user interface.” Machines from that era even had drag-and-drop and motion graphics, and used them in a meaningful way.

Considering developers are also users (of programming language tools and IDEs), maybe we should look into the user-friendliness of developer solutions as well. Visual Basic from the past and no-code solutions from the present are considered inferior by many experts for one reason or another, but looking at such beginner-friendly solutions and their demos might sometimes make us wonder whether development has lost its charm at some point.

Consumer software used to be way more artistic in the 2000s. System-wide theming was a thing. It is still there, but nobody cares. The GNOME desktop environment doesn’t even have custom theming enabled out-of-the-box. All you can do is choose between Dark and Light schemes.

While system-wide themes are nothing unfamiliar, application skins should feel totally strange to those who started using computers in the late 2010s. Skins let application windows take arbitrary shapes and colours. Although there were exotic and futuristic skins, one popular appearance for media player applications was the look of real media player appliances. Maybe that’s one reason skins aren’t popular now–everything has become software in our lives, and the look of real hardware doesn’t look real anymore.

Extreme artistic minimalism can be seen today in icons, music, etc. Look how three-dimensional and realistic the icons from the 2000s were compared to today’s flat and less-detailed designs. Check out the quantity and quality of music present in Windows XP from 2001. It was there from the installation wizard to the tour and test files.

This is not to say that that’s how things should’ve stayed. Most of that was the result of computers being new and exciting. Creators couldn’t help creating, companies couldn’t stop showing off, users couldn’t resist the temptation. Things have moved on.

Also, those of us who are nostalgic should remember that much of it felt annoying back then. Custom system-wide themes always broke at least one application, so GNOME not supporting them is maybe a good thing.

However, there is no doubt that we should revisit the enthusiasm that developers had back then. Not only for inspiration, but because we might find ways to bring back some of those ideas without having the issues their previous implementations had.

That was about creativity in presentation; what about creativity in coding? Programmers from the past decades had to be very clever and creative in order to get around the limitations hardware had back then. Some of this is covered in chapter 7.