From LiteratePrograms
Jump to: navigation, search

I have published a book, that didn't sell very much and is now out of print, but now seems to be a bit of a collector's item:

Dunlavey, "Building Better Applications: A Theory of Efficient Software Development", International Thompson Publications, 1994, ISBN 0-442-01740-5

I'm wondering if this would be a good site to put up some pages about that material, in onesies and twosies.

It shares some of the same goals as Literate Programming, but has a different emphasis, which I will try to summarize:

- as wonderful as Donald Knuth is, his emphasis seems to be primarily academic computer science, so he is concerned with finely crafted (small) algorithms, and lucid presentation of them, to be understandable by student-level programmers.

- my background originally was in Artificial Intelligence, which is a hotbed of ideas that push the boundaries of computer science. Many current ideas today can be traced back to work in A.I., like the simple notion of a computer having stack instructions.

- other than a short stint in college teaching, my experience has been shaped by many years in industry and consulting, giving an appreciation for the complexity, richness, and novelty that can be found in the problems faced by the real world.

- in my book, I took the approach that for any measure you want to minimize, there is a theoretical minimum, and it is worthwhile seeing how close you can come to it. For example, minimizing execution time, or minimizing bugs. I assume that to do these things may require the reader/programmer to stretch their mind a little, or maybe a lot. If I were to confine myself to concepts familiar to an entry-level programmer, or even a C.S. professor, I might not be able to get closer to minimizing these interesting properties. So, in a sense, the book takes an aggressive approach.

- with respect to Literate Programming, I assume the goal is not simply to make a program understandable, but to make it map so well onto the reader's understanding that if he/she wanted to change it for some altered purpose, they could do it with minimum probability of introducing errors. I.e. I assume readability is a subordinate goal to modifiability.

- in finance, they say "nothing ventured, nothing gained". When a person writes a program, she is investing time, in expectation of the long-term reward of having the program save her time overall. If this were not so, there would be no need to move beyond hand-calculation. When a person learns a language, he is investing time, in the expectation that it will save more time in the long run by being able to implement programs quickly. If this were not so, there would be no need to move beyond machine or assembler language. Thus, I assume that minimizing execution time or bugs may entail a little or a lot of effort on the part of programmers, and I don't expect programs written this way to necessarily be readable without that effort having been made.

So I took all these ideas and tried to look at them from the viewpoint of information theory, because a program taking input and generating output is like an information channel, and when a person writes a program, the part of their mind that is doing that is like an information channel, and information theory talks about how to achieve optimum information transfer in the presence of noise. This resulted in a set of ideas that all interlock with each other:


[edit] speed

Techniques for achieving optimum performance. Profiling is fine for academic programs, but for big, ugly, industrial-strength software, you need deep sampling, and oddly enough, very few people know it. In multiple bouts of optimizing, often you find that to achieve ultimate performance, you need some of the techniques talked about below, and oddly enough, maintainability improves.

[edit] redundancy

Whenever a high-level change to anything requires more than one low-level change to accomplish it, you have redundancy. This goes for run-time data structures as well as source code of programs. In error-correcting code, minimal redundancy allows errors to be detected/corrected, if used with a suitable mechanism for doing that. If a code has a lot of redundancy, and there is no mechanism, the redundancy only makes errors more likely, so it is harmful. Balancing these two is tricky. Recognizing redundancy, both in data and source code, is an essential skill, and it is important to have ways to either minimize it or deal with it effectively. In the realm of data management, this leads to differencing algorithms. In the realm of dynamic user interfaces, this leads to the technique of differential execution.

[edit] language

Minimizing redundancy of source code requires being able to create the proper language for expressing ideas of interest. This may only mean being able to augment your favorite base language with data objects, functions, macros, etc. so that the functionality can be easily expressed. At the other extreme, it may mean being able to design your own syntax, write a parser and code generator for it, with maybe a novel control structure, and implement it that way. It is important to know how to choose sensibly between these extremes. To do this it is useful to know the techniques of lookahead programming and how to write simple parsers.

[edit] data

Data exists to store information for a length of time and be read by programs to do something. The specific way the information is encoded may vary widely, from bits, bytes, and pointers at one extreme to actually representing the information as a program at the other. The latter (which is not OOP) is known by the name of partial evaluation, and is very useful not only for maximizing performance but maintainability. When redundant data must be managed, it may be useful to be able to write simple garbage collectors.

[edit] method

How to tie all this stuff together. I had to call this something, so I called it the Linguistic Method.