Thursday, July 07, 2005

The game begins

I hereby declare war on the unintuitive, the hard to understand, the cryptic. There are many things in life that are deservedly complicated. All efforts to simplify them end up sacrificing some of their essence. But there are many things that do not need to be complicated, yet are. Law. Programming. Teaching. Mathematics. Simplifying these things occupies my mind, and it is what I will blog about most. Please comment where you feel my writing is less accessible.

Lately I have been thinking about what the ideal programming language would look like. Computers only understand 1s and 0s. Software developers are human, preferring their native tongues to bits. But human languages are slippery. You can write one sentence and have three people interpret it in three different ways. The same word can possess different meanings in different contexts. Although powerful, natural languages are ill-suited for telling a device that requires precise instructions what to do. There is only one way the computer will read the 1s and 0s.

Programming languages are a compromise. Programming directly in 0s and 1s requires intimate knowledge of a computer's inner workings and is inefficient. For the reasons stated above, English is equally inappropriate. Programming languages usually consist of a few core words, most often English, that enable a programmer to write out a series of instructions for the computer without having to go into the gritty detail that 0s and 1s require. Unlike their English equivalents, these words have well-defined meanings. These words are then translated by another piece of software into 1s and 0s that the computer can understand.

Different programming languages take different approaches and are designed with different goals. C was developed when fast computer hardware was prohibitively expensive, so it asks programmers to meticulously manage memory by hand with the goal of maximum efficiency. Java was designed to run seamlessly on different types of computers with no code changes, and to enhance productivity by representing units of code as objects. Python has been engineered to be highly dynamic, and less strict about setting limits on what the programmer can and can't do. Every language has its advantages and shortcomings.

There are a lot of programming languages out there. Surely every Computer Science undergraduate at one point or another thinks about what the "perfect" programming language would resemble. Half of them have the hubris to implement it. Fewer than 1% create a language that receives widespread adoption. Hacker Paul Graham has some interesting ideas about what makes a programming language popular. The short version: the more effective and easy to use, the more popular the programming language becomes.

Graham is a fan of the language Lisp. One of the key sources of Lisp's power is a feature called macros. Macros are made possible by an interesting peculiarity of Lisp: the code you write is almost identical to how the 0s and 1s translator interprets your code. In other languages, the code you write is broken down into various structures that look nothing like what you wrote, before finally turning into 1s and 0s. In Lisp, there is a unity.
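
To make that unity a little more concrete, here is a rough sketch in Python rather than Lisp. It is only an imitation under stated assumptions: the program is written as nested lists, and the names `evaluate`, `double`, and `OPS` are made up for this illustration and are not part of any real Lisp. The point is that the program is ordinary data, so a small function can build new code out of old code, which is the flavor of what macros do.

```python
import operator

# Hypothetical mini-evaluator: a program is a nested list, Lisp-style.
# Only two-argument + and * are supported in this sketch.
OPS = {"+": operator.add, "*": operator.mul}

def evaluate(expr):
    """Evaluate a nested-list expression such as ["+", 1, ["*", 2, 3]]."""
    if not isinstance(expr, list):
        return expr  # a bare number evaluates to itself
    op, *args = expr
    return OPS[op](*(evaluate(arg) for arg in args))

def double(expr):
    """Macro-like helper: takes code in, hands back new code that doubles it."""
    return ["*", 2, expr]

program = ["+", 1, ["*", 2, 3]]
print(evaluate(program))          # 7
print(evaluate(double(program)))  # 14
```

Notice that `double` never runs the program it is given. It only rearranges it, and the rearranged list is itself a program.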

Python has received praise for being very easy to read compared to other programming languages. Authors of computer science textbooks often create a "pseudocode" programming language for their examples, a fantastical language that doesn't have any of the quirks that encumber real languages, so that students can focus on whatever technique is being taught instead of the details of implementing the technique in one particular language. Python, more than any other language I have encountered, can pass for pseudocode. The brain simply needs to perform less translation when reading it. In Python, there is a glimmer of a unity between what the code appears to do and what it actually does.
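
As an illustration, here is binary search, a staple of algorithms textbooks, written in real Python. The function and variable names are my own choices for this sketch, but the shape is close to what many textbooks would print as pseudocode.

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:
        middle = (low + high) // 2
        if items[middle] == target:
            return middle
        if items[middle] < target:
            low = middle + 1   # target can only be in the upper half
        else:
            high = middle - 1  # target can only be in the lower half
    return -1

print(binary_search([1, 3, 5, 7, 9], 7))  # 3
```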

In the ideal language, I imagine more unities. Unity between code and interpretation, between commands and intent, and between pedagogy and practice. On this last point I think languages have failed to make much headway. Algorithms, the instruction recipes programmers create to solve specific problems, are often explained to students in terms of graphs, but I have yet to see a language that includes graphs as a built-in datatype. Manuals for object-oriented programming routinely use the term "inheritance" to describe one type of object having the qualities of another type of object, but in Java you use the keyword "extends" instead of the much more pedagogically obvious "inherits."
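
To show the gap between pedagogy and practice, here is a minimal sketch of how a graph usually ends up being represented in Python: assembled by hand from a dictionary, with the traversal written out separately. The node names and the `breadth_first` function are assumptions invented for this example. Nothing here is a graph type provided by the language itself.

```python
from collections import deque

# A graph built by hand: each node maps to the list of its neighbours.
graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}

def breadth_first(graph, start):
    """Return the nodes in the order a breadth-first traversal visits them."""
    visited = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)
    return visited

print(breadth_first(graph, "A"))  # ['A', 'B', 'C', 'D']
```

The textbook draws circles and arrows. The code sees only a dictionary of lists, and the reader has to do the translation.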

I am currently surveying languages, looking for unities, among other things. I believe that more unities will lead to increased intuitiveness and easier programming. Everyone benefits.