Warning: x = x + 1 May Be Hazardous to Your Brain

By Dan McCreary
November 16, 2008 | Comments: 5


If you look at introductory computer science classes today, about 90% of them teach beginning programmers a language like Java or C++. My daughter is taking a Java class at her high school in the hope of getting Advanced Placement credit at a college.

But most mainstream introductory courses were designed with the assumption that your computer has only a single CPU. The fact is that by the time my daughter graduates from college five or six years from now, the average desktop computer may have 100 CPUs. So my question is: what language should we be teaching this new generation of computer scientists? My answer: I hope that they at least get some exposure to functional languages before their ability to learn new languages calcifies.

We all know that the best time to teach someone a foreign language is when they are young. When people are young, the sections of their brain that perform linguistic processing are still rich with unused neural connections. As we age, the learning process destroys unused neural connections, and learning radically new concepts gets harder. As the saying goes, you can't teach an old dog new tricks.

Google found that when they hired new college graduates, the single-CPU mentality was rampant in the cognitive styles of many talented software engineers. When these engineers were asked to write software to take advantage of 10,000 CPUs, they were sometimes too focused on imperative programming that did not scale well. Google embarked on an ambitious effort to teach functional programming skills so that engineers could take advantage of its now-famous MapReduce algorithm.

Today there is much discussion about how the next generation of software engineers can make the leap from imperative to functional programming, and whether the current imperative languages can or should be modified to meet the complexity of the 100-CPU era. One of the new contenders I would like you to consider is the XQuery language. Many people don't realize that we already have a mature near-functional language supported by IBM, Oracle and Microsoft. And although most XQuery developers are not using XQuery as a pure functional language today, XQuery has all the hooks we need to call higher-order functions, and some implementations already make this possible.

At the core of functional languages is the fact that all "variables" are immutable. Once they are set, they can never be changed. Technically, in the highly mathematical world of lambda calculus, they are called symbols. The construct x = x + 1 is not permitted because it changes the value of x. The novice programmer may feel that not being able to change a variable is a burden. But it turns out that this makes things much easier for people trying to get your code to run fast on 100 CPUs: the "no side effect" nature of functional programs allows you to easily fork off 100 separate threads that will all work simply and elegantly together.
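To make the contrast concrete, here is a minimal sketch in Python, chosen only for illustration. The names `increment` and `parallel_increment` are hypothetical and do not come from any library. Instead of rebinding x, a pure function returns a new value, so the calls can be handed to a pool of workers without any locking.

```python
from concurrent.futures import ThreadPoolExecutor


def increment(x: int) -> int:
    """Pure function: returns a new value and never modifies its input."""
    return x + 1


def parallel_increment(values: tuple, workers: int = 4) -> tuple:
    """Apply increment to every value using a pool of worker threads.

    Because increment has no side effects, the calls may run in any order
    without locks; pool.map() still returns results in input order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return tuple(pool.map(increment, values))


original = (1, 2, 3, 4)                  # a tuple cannot be changed in place
result = parallel_increment(original)    # a new tuple; original is untouched
```

Note that threads in CPython will not speed up CPU-bound work like this. The sketch only shows that side-effect-free functions need no coordination between workers.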

I do admit that when I first started writing XSLT programs I felt that not being able to change XSLT variables was a bit like having one hand tied behind my back. But I found that getting used to symbols rather than variables is really not that difficult, though it does require you to rethink some problems in new ways.

One of the phrases that I use in my classes is "All Computation is Transformation." Almost any calculation can be thought of as taking XML trees in and creating XML trees as output. Every language compiler takes a high-level language as input and transforms it into a low-level set of machine instructions.

Getting used to rethinking your applications as a series of small transformations is not always something you can do on your own. I have spent many hours at whiteboards showing imperative-centric developers how to rethink their problems in terms of transformation. Getting at least one person on each development team who has a few years' experience with XSLT or other functional languages is a good way to jump-start your projects.
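As a rough illustration of "a series of small transformations," the Python sketch below chains two tree-in, tree-out steps. The element names (`people`, `person`, `name`) and the function names are assumptions made up for this example. In practice each step might be an XSLT stylesheet or an XQuery function.

```python
import copy
import xml.etree.ElementTree as ET
from functools import reduce


def uppercase_names(tree):
    """Return a new tree whose <name> elements have upper-cased text."""
    out = copy.deepcopy(tree)            # never touch the input tree
    for name in out.iter("name"):
        name.text = (name.text or "").upper()
    return out


def add_count(tree):
    """Return a new tree with a count attribute on the root element."""
    out = copy.deepcopy(tree)
    out.set("count", str(len(out.findall("person"))))
    return out


def pipeline(tree, steps):
    """Feed the tree through each transformation in turn."""
    return reduce(lambda current, step: step(current), steps, tree)


source = ET.fromstring(
    "<people><person><name>ada</name></person>"
    "<person><name>alan</name></person></people>"
)
result = pipeline(source, [uppercase_names, add_count])
output = ET.tostring(result, encoding="unicode")
```

Each step returns a fresh tree and leaves its input alone, so steps can be reordered, tested in isolation, or rerun when the input changes.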

The job of developing software is really to transform requirements into working programs. Along the way we need to be able to reuse the semantics and models we have already created in prior projects. If the requirements change, we would ideally like to just rerun the transformation process to create new working programs. This is the theory behind model-driven architecture and model-driven development. The challenge that many model-driven approaches have is that their models are not usually stored in easy-to-transform XML. They are often locked deep in a UML diagram with little hope of the non-programmer doing day-to-day updates of the business rules. Agility is then dependent on how quickly requirements and models can be transformed.

Google search just takes an input string and transforms a representation of the web into a set of hits that rank the pages it thinks you are interested in. Now, most of that transformation is done during their web crawling process, but it is transformation all the same.

If you start to try to visualize your tasks as transforming trees of data you have a good chance of visualizing your design. Learning how to quickly transform that spreadsheet of requirements into XML is a first step. Creating a team of people to quickly transform both data and metadata into new forms is critical to the agility process. And new technologies are coming online all the time to make this process easier in the 100-CPU era.
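The spreadsheet-to-XML step can be sketched in a few lines of Python. The CSV columns (`id`, `priority`, `text`), the element names, and the function `requirements_to_xml` are all hypothetical. A real requirements spreadsheet would have its own layout.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical spreadsheet export; the column names are assumptions.
CSV_TEXT = """id,priority,text
R1,high,Users can search by title
R2,low,Results can be exported
"""


def requirements_to_xml(csv_text):
    """Transform CSV rows into a <requirements> tree, one element per row."""
    root = ET.Element("requirements")
    for row in csv.DictReader(io.StringIO(csv_text)):
        req = ET.SubElement(
            root, "requirement", id=row["id"], priority=row["priority"]
        )
        req.text = row["text"]
    return root


tree = requirements_to_xml(CSV_TEXT)
xml_text = ET.tostring(tree, encoding="unicode")
```

Once the requirements are a tree, they can be fed to further transformations in the same way as any other XML input.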

Lastly, we should note that query languages are not really the same as functional languages. The W3C XQuery specification did not include the ability to pass functions as arguments to functions, one of the hallmarks of functional programming. But this is not a difficult feature to add. Some XQuery implementations such as eXist do support this feature. And I am very optimistic that a future version of the XQuery standard will converge with many of the benefits that users see in using MapReduce algorithms.
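For readers who have not seen functions passed as arguments, here is a small single-process sketch in Python of the map-then-reduce style. It is not Google's MapReduce implementation and not XQuery syntax. The names `apply_to_each`, `count_words` and `merge_counts` are invented for this example.

```python
from collections import Counter
from functools import reduce


def apply_to_each(func, items):
    """Higher-order function: takes another function as an argument."""
    return [func(item) for item in items]


def count_words(line):
    """Map step: count the words in a single line."""
    return Counter(line.lower().split())


def merge_counts(left, right):
    """Reduce step: combine two partial counts into a new Counter."""
    return left + right


lines = ["to be or not to be", "to do is to be"]
partial_counts = apply_to_each(count_words, lines)         # map
totals = reduce(merge_counts, partial_counts, Counter())   # reduce
```

Because `count_words` looks only at its own line, the map step could be spread across many machines, and the partial counts merged afterwards.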



5 Comments

>> If you start to try to visualize your tasks as
>> transforming trees of data you have a good chance of
>> visualizing your design.

When I started out as a COBOL programmer we used the Structured Design Method. This consisted of rendering our input and output files as Jackson structures (sequences, iterations, selections), merging the resulting structures into a single structure and appending program tasks to the leaf nodes. This gave us the design for our programs.

Cheers, APC

I have experimented with tree programming in XML in an open source project named tXs (http://sourceforge.net/projects/txs). A tree program contains commands (or orders) such as "transform this sub-tree", "send an e-mail" or "call a web service", and each command's result is another sub-tree that replaces the command. The engine has to execute commands in bottom-up order. It's a way to generate conditional instructions just with XSLT transformations. It is much more powerful than pipelines or linear lists of transformations.

Tree programs can be executed on parallel architectures without modifications.

An example: the command "txs:load" will load a CSV file as an XML subtree, the command "txs:transform" will then transform each record into commands for sending personalized e-mail with both XHTML and text contents, and, finally, each "txs:sendmail" command will extract its own e-mail subtree and send the corresponding e-mail.


The best flash games wouldn't exist without changing variables, because we set them to zero, and have them change. Without variables, say goodbye to Madness Accelerant, Linerider, Linerunner, Bomboozle, etc.

In short, I disagree.

@anonymous flash programmer:
You clearly do not know anything about "serious" programming.
You can do everything even without variables, by cleverly using function calls. AS (ActionScript), the interpreted language of Flash, similar to JavaScript (and to ECMAScript, of course), is indeed a functional language, with features like closures that you obviously ignore.
