Mitchell, R. W. (1964). LISP 2 Specifications Proposal. Stanford Artificial Intelligence Laboratory Memo No. 21, Stanford, California.

Moon, David A. (1974). MACLISP Reference Manual. Project MAC Technical Report, MIT, Cambridge, Massachusetts.

Moses, Joel (1970). The Function of FUNCTION in LISP, or Why the FUNARG Problem Should Be Called the Environment Problem. MIT Artificial Intelligence Memo 199, Cambridge, Massachusetts.

Newell, A., and Shaw, J. C. (1957). Programming the logic theory machine. Proceedings of the 1957 Western Joint Computer Conference, IRE, pp. 230-240.

Rulifson, J., et al. (1968). QA4: A language for writing problem-solving programs. Proceedings IFIP 1968 Congress, TA-2, pp. 111-115.

Sussman, G., Winograd, T., and Charniak, E. (1970). Micro-Planner Reference Manual. AI Memo 203, MIT Artificial Intelligence Laboratory, Cambridge, Massachusetts.

Teitelman, Warren (1975). INTERLISP: Interlisp Reference Manual. Xerox PARC Technical Report, Palo Alto, California.

Weisman, Clark (1967). LISP 1.5 Primer. Dickenson Press, Belmont, California.

Many reports and memoranda of the MIT and Stanford Artificial Intelligence Laboratories have dealt with various aspects of LISP and higher level systems built on LISP.

TRANSCRIPT OF PRESENTATION

BARBARA LISKOV: Welcome to the session on LISP. John McCarthy is going to talk about LISP. At the time of the development of LISP, he was an Assistant Professor in Communications Science at MIT. Previously he had received his Ph.D. from Princeton in Mathematics in 1951, and then had worked as Assistant Professor of Mathematics at Dartmouth. His interests at the time were in AI, time sharing, and theory of computation, especially program verification. Today, John McCarthy is Professor of Computer Science at Stanford University, and I believe that his interests are still the same.

JOHN MCCARTHY: Well, what I have today I think will be maybe curious rather than instructive.
Now that you know about this history business, you could entitle it something like "The Opinions of the Late John McCarthy." I found writing it a great strain, and I look forward to the Conference on History of Programming Languages that will be held 20 years from now, when none of the people who were involved in the particular languages will be allowed to give papers, and when one of us, so to speak, dodders up to the microphone and says something, people will listen politely and continue on with what they were discussing previously.

First, there was this questionnaire [see "Questions sent to Authors" in Appendix B, ed.], and the questionnaire had various questions on the administrative history of the project which can be summed up in "Whoever said you could do that?" And so I would like to answer these questions with regard to LISP: No one ever said we could do it. As far as that part of LISP after September of 1958, when people actually began to code things, the way it took place is that Marvin Minsky and I asked Jerry Wiesner, who was the head of the Research Laboratory of Electronics, if we could have an Artificial Intelligence project. [Jerome B. Wiesner is now President of MIT.] And he said, "Yes, what do you need?" And we said, "We need two programmers, a secretary, and a keypunch." And he said, "Well, how about taking also six graduate students?" And we said, "Oh, yes, graduate students, too." The Research Laboratory of Electronics had agreed to support six graduate students of mathematics, and they didn't know quite what to do with them, so they ended up as graduate students on the artificial intelligence project. We did artificial intelligence, and it seemed reasonable to do the LISP language at that time, so we did it.
I should say that was when the country still had some basic trust in the judgment of scientists as to what science was likely to be useful, so Wiesner had this large grant or contract with the Armed Services jointly, and so there never was any proposal that said, "This is what we would like to do. Please let us do it." So that's a little different from the way things are now, and it had certain advantages. Maybe now there are too many people wanting to do things. I don't know.

What I would like to do is mainly base this talk on the different features of LISP and how they got that way. Because LISP forms a kind of neat whole, one might think that it was planned that way, but it really wasn't. It sort of grew and then, one might say, snapped into some kind of local maximum, or maybe a local minimum, in the space of programming languages.

I'd like to have the first slide [Frame 1]. What we have here is a LISP program in three different notations. The top one is the notation that I like to use for publication these days. As you see, it's the shortest. It's a function for computing a new list whose elements are alternate elements of the original list. So here's an example of the results of ALT, and let me read this definition. "If the list is null, or if cdr of the list is null (and cdr of the list is the list with the first element deleted), then you just take the list back." So a list of 0 or 1 element is changed into itself. Otherwise, you take the first element of the list, the car of the list, and put it on the front of a list formed by taking the function ALT applied to cddr of the list, which is the list which is gotten by deleting the first two elements. So that's a simple example of a recursive definition of this kind of thing. The second is the internal form of ALT as you would feed it to MACLISP, which is one of the current popular variants of LISP. The third is approximately a 1958 form of publication notation.
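The ALT recursion McCarthy reads off the slide can be illustrated in modern Python (a sketch for readers unfamiliar with the LISP notations; this is not any historical code):

```python
def alt(x):
    """Return a new list containing every other element of x."""
    # A list of zero or one elements is returned unchanged
    # (the null / null-cdr test in the frame).
    if not x or not x[1:]:
        return x
    # Otherwise cons the first element onto ALT of the list
    # with its first two elements removed (cddr).
    return [x[0]] + alt(x[2:])

print(alt(["A", "B", "C", "D", "E"]))  # ['A', 'C', 'E']
```

The structure mirrors the frame exactly: a base case for short lists, then cons of car onto a recursive call on cddr.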
Now, I should say that this is a publication notation in that there never was a compiler that would read this notation. In the first place, when LISP got started there weren't any brackets on the keypunch, and we needed to distinguish brackets from parentheses, and there was a sort of lot of dithering about how to translate from the notation that one would really like to something that could be really keypunched. While various things were done, none of them became very popular, and instead, this kind of notation became popular, with some minor variants at the beginning about the first part of it, which I won't go into. But this part from the cond on is the form that it took early in 1959.

Now, LISP is characterized by a fair number of features, and I want to describe where they came from. The first one is computing with symbolic expressions rather than numbers. That came about because there was a goal to do artificial intelligence research, and the idea that facts about the world had to be represented as sentences in some kind of a formalism. Also there was a desire to do purely algebraic computations like differentiation and integration.

alt x ← if n x ∨ n d x then x else a x . alt dd x

(DEFUN ALT (X)
  (COND ((OR (NULL X) (NULL (CDR X))) X)
        (T (CONS (CAR X) (ALT (CDDR X))))))

alt[x] = [null[x] ∨ null[cdr[x]] → x; T → cons[car[x]; alt[cddr[x]]]]

alt[(A B C D E)] = (A C E)

Frame 1

The next is representation of symbolic expressions and other information by list structure in the memory of the computer. I should have a slide on list structure, but I don't. I will have to assume that you know what it looks like. This idea came from Newell, Simon, and Shaw, and was represented in their various languages called IPL (1 through N).
And they talked about some variant of this, I guess it must have been the first one, at the Dartmouth Summer Research Conference on Artificial Intelligence in 1956, and they described its various virtues; that is, the representation o f information by lists, which meant that the size of the information was variable; the ability to add and delete elements in the middle, which in fact LISP rarely uses; and the idea of recursive subroutines with a programmed stack. The language IPL itself was a machine language. If you think about the average of those languages that John Backus showed this morning, it sort of looked like one o f those. So while it had list operations as elementary, it didn't have any way of combining them algebraically. Representation of information in external media, mostly by multilevel lists and sometimes by S-expressions, came along naturally, later. Selector and constructor operations expressed as functions and composition of functions as a tool for forming more complex functions; well, this idea in some s e n s e - - t h e idea of using an algebraic language came from F O R T R A N . F O R T R A N didn't exist at the time. This was 1956. So there was the F O R T R A N Project. Then the question came as to what were the appropriate operations for operating on list structure. And the kind of list structure that was meant was something whereby a computer word was divided into two halves. One half was, so to speak, the data half, and the second half was the pointer half that pointed to the next word. That is, that gave the address of the next word in the list structure, and so forth. And when you wanted to extend lists, then you got new words from a free storage list, and somehow things got back on the free storage list when you didn't need them anymore, usually by erasure. The data item of a list could itself be a pointer to a list, and therefore a list as a whole was represented by the location of its lead element. 
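The two-half word McCarthy describes can be sketched in modern terms. This is a hypothetical illustration, not period code: each cell pairs a data half (car) with a pointer half (cdr), and a whole list is named by the location of its lead cell:

```python
# One cell: (data half, pointer half). NIL marks the end of a chain.
NIL = None

def cons(a, d):
    """Allocate a new cell holding data a and pointer d."""
    return (a, d)

def car(cell):
    """Select the data half of a cell."""
    return cell[0]

def cdr(cell):
    """Select the pointer half of a cell (the rest of the list)."""
    return cell[1]

# The list (A B C) as a chain of cells, named by its first cell.
lst = cons("A", cons("B", cons("C", NIL)))
print(car(lst))        # A
print(car(cdr(lst)))   # B
```

The data half of a cell may itself point to another chain, which is how nested lists and S-expressions are represented.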
This really was the Newell-Simon-Shaw idea, and the idea of using this algebraically was apparent once one considered trying to get something that had the notational advantage of FORTRAN. What the operations were gave some trouble. The most obvious thing was extracting parts of a word, and so that was used. And then it became clear that what you wanted to do was think of a register in memory as a function of its location; so there was a function called "cwr": Contents of Word in Register number so-and-so. So if you had a 15-bit quantity, that is, representing an address, as we were thinking about it for FORTRAN, then CWR of that would be the 36-bit quantity which was the contents of the word in that. And then you could take it apart with these extractors, and it sort of gradually developed that, well, what you were really interested in were not these operations separately, but certain compositions of them. So one had "contents of the address part of register number . . . ," and writing it out or saying it out in that long way, "contents of the address part of register number . . . ," is somewhat reminiscent of the mental agony that was involved in thinking that that was the thing that you really wanted. So it has this sort of monument which still exists in the car. And then this ancient computer, the IBM 704, which was, in fact, at the time an aspiration rather than a fact (for Dartmouth, that is), had this other part of the word called the decrement part. So cdr came along there.

This much provided a basis for discussing doing list processing in FORTRAN, and one of the results of the Dartmouth Summer Research Conference on Artificial Intelligence was that Nat Rochester at IBM, who had been one of the participants, decided to do some work on AI, and he hired Herbert Gelernter, and they took a problem of Minsky's.
Minsky's idea was that if you were going to do theorem proving the way humans did, then you ought to use examples, and in particular, you should not try to prove things that were false in examples. And the easiest domain in which to do this was elementary plane geometry, where you would use a diagram of some kind. Well, in any case, this involved symbolic computation where suitable list structures would represent geometric figures, and also would represent assertions about these geometric figures, like "the two sides were equal," or something like that. So some kind of list processing was in order, and it was decided to do this in FORTRAN, and so there was this FORTRAN List Processing Language. On that project I served as a consultant. But after a while they began to get good ideas that I'd never even heard of. I would say the major one which is reflected in LISP is the fact that the constructor operation, which was clearly necessary, that takes two 15-bit quantities and puts them in a word so that you can make more list structure, that that should be a function whose value is the location of the new word; and that list structure, or the structure representing algebraic expressions and so forth, could best be done by composing such functions. That was Gelernter's idea. I still had a somewhat purist view of the distinction between functions and operations, and I was being rather stubborn about it, that that was a kind of subroutine that didn't return a value, and he showed that it should return a value. So at that point one had car, cdr, and cons. So we have composition of functions as a tool for forming more complex functions. And of course that's mathematics. Mathematics has always done that, but as far as computing is concerned, LISP owes that to FORTRAN. The next thing which is an innovation is the use of conditional expressions for getting branching into function definitions.
So we want to use conditional expressions, something which is in ALGOL (if p then a else b), so the value of it depends on the test. And that came up through an inconvenience in FORTRAN. I was writing a chess program in FORTRAN, and I kept having these things that were conveniently done by conditional expressions: if p then a else b. But what you had to do in FORTRAN was use this FORTRAN IF statement, which was particularly inconvenient because it was difficult to write and difficult to read, because you could never remember if you said IF an expression less than 0, then it goes this way, equals 0, it's that way, and greater than 0 . . . . One was always getting it wrong. So there was a great incentive to try to get away from that. And so I invented this FORTRAN IF function, IF of three arguments, IF(P, A, B), but it had the blemish that in order to work you had to compute P, A, and B before you got started. And in the examples that were used in the chess program, this was merely a minor inefficiency, one of a large number of inefficiencies in the chess program, so it wasn't overly worrisome, but it certainly gave an aspiration to have a conditional expression that would be computed in the sense that if P was true, it would compute A, and would not compute B at all, and so forth.

The recursive use of conditional expressions is a sufficient tool for building computable functions. That in a way is kind of a key idea of LISP. It was not in FLPL. There was a certain opposition to recursion among the FLPL people. I didn't quite understand why, except that FORTRAN didn't do it, and it would be very difficult to. We even considered a project to make FORTRAN do it by means of a kind of trick, a function that would modify the FORTRAN program that called it. But that was in fact never implemented.

The remaining features of this arose in connection with an example [Frame 2]. This example in itself drove a number of the features of LISP, and that was to try to write a function that would differentiate an algebraic expression.

mapcar[u, f] ← if n u then NIL else f[a u] . mapcar[d u, f]

maplist[u, f] ← if n u then NIL else f[u] . maplist[d u, f]

diff[exp, var] ← if at exp then [if exp = var then 1 else 0]
    else if a exp = PLUS then PLUS . mapcar[d exp, λx.diff[x, var]]
    else if a exp = TIMES then PLUS . maplist[d exp, λx. TIMES . maplist[d exp, λy. if x = y then diff[a y, var] else a y]]

diff[(TIMES X (PLUS X Y 3)), X] = (PLUS (TIMES 1 (PLUS X Y 3)) (TIMES X (PLUS 1 0 0)))

Frame 2

The example here down below is that we have the expression which is really x times (x + y + 3), which is to be differentiated with respect to x. And the answer is given down here with no simplification. Notice that we have an inner product there, that TIMES 1 (PLUS X Y 3), which of course, since we have a product one of whose factors is 1, should be simplified; and then we have a sum, one of whose terms is 1 and the other two are 0, which we simplify away, and so forth. But of course, simplification is in fact a much harder problem than differentiation. So we wanted to write that. And I've written it in kind of the modern blackboard notation, or publication notation that's going to be in my book, and so we wanted to differentiate an expression with respect to a variable. So if the expression is atomic, then it's either a variable itself or it's a number. In either case, unless it's the same as the variable with respect to which we are differentiating, the partial derivative had better be 0; otherwise the derivative is 1.
So that gets us out there, and that looked very good, if you compare that with writing a program of some kind, and it seemed that it was at least reasonably close to the informal definition of differentiation. The next thing is: What if it's a sum? Its being a sum is represented by the first element of the list being PLUS, so the test is fine. And if you differentiate a sum, then as everybody who's taken calculus knows, the derivative of a sum is the sum of the derivatives. So obviously it's going to start out being PLUS here. But then came a little problem: how do you talk about the sum of the derivatives? In programming it, there were clearly two kinds of programs that could be written. One is where you have a sum of a fixed number of terms, like just two, where you regard a sum as a binary operation. And then you could write down the formula easily enough. But the other was where you have a sum of an indefinite number of terms, and you'd really like to make that work too. To make that work, what you want to be able to talk about is doing the same operation on all the elements of a list. You want to be able to get a new list whose elements are obtained from the old list by just taking each element and doing a certain operation to it. In order to describe that, one has to have a notation for functions. So one could write this function called mapcar. This says, "Apply the function f to all the elements of the list." If the list is null, then you're going to get a NIL here. Otherwise you are going to apply the function to the first element of the list and put that onto the front of a list which is obtained by doing the same operation again to the rest of the list. So that's mapcar. It wasn't called mapcar then. It was called maplist, but maplist is something different, which I will describe in just a moment. That was fine for that recursive definition of applying a function to everything on the list.
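The difference between the two mapping functions of Frame 2 can be shown in a rough Python sketch (the names mirror the frame; this is not the historical code). mapcar hands f each element of the list, while maplist hands f each successive tail, which effectively lets f see the element's position:

```python
def mapcar(u, f):
    """Apply f to every element of u, collecting the results."""
    if not u:
        return []
    return [f(u[0])] + mapcar(u[1:], f)

def maplist(u, f):
    """Apply f to every tail (sublist) of u, collecting the results."""
    if not u:
        return []
    return [f(u)] + maplist(u[1:], f)

print(mapcar([1, 2, 3], lambda x: x * 10))  # [10, 20, 30]
print(maplist([1, 2, 3], len))              # [3, 2, 1]
```

Because every tail has a different length, comparing tails distinguishes positions even when two elements are equal, which is exactly what the product rule below needs.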
No new ideas were required. But then, how do you write these functions? And so, the way in which to do that was to borrow the lambda notation from Church's lambda calculus. Now, having borrowed this notation, one of the myths concerning LISP that people think up or invent for themselves becomes apparent, and that is that LISP is somehow a realization of the lambda calculus, or that that was the intention. The truth is that I didn't understand the lambda calculus, really. In particular, I didn't understand that you really could do conditional expressions in recursion in some sense in the pure lambda calculus. So it wasn't an attempt to make the lambda calculus practical, although if someone had started out with that intention, he might have ended up with something like LISP.

Then we might go to the very next term, where we want to differentiate a product. And a further complication immediately arises. It's clear that the derivative of a product is a sum of terms, each term of which is a product, and each factor of the inner products corresponds to one term of the original product that you're differentiating. But the thing is either differentiated or not, and it's differentiated or not according to whether the index of the factor is the same as the index of the summand in which it's going to appear. Well, I've said that correctly, but probably too rapidly. But I won't say it again! So, anyway, it turned out that you needed for this purpose to be comparing not the elements of the lists themselves, but the sublists, which effectively compares their indices. Well, in any case, while I was spending a summer at IBM, the summer of 1958, I spent a fair amount of time writing that definition. I don't say it took me all summer to write it, but one might say pretty much. I forget whether I did anything else, but I didn't get any further on this project than just that.
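Frame 2's differentiator can be transcribed into Python for illustration (a sketch under the assumption that expressions are nested lists in prefix form; this is not the original code). Comparing tails rather than elements is what tells the TIMES rule which factor to differentiate in each summand:

```python
def mapcar(u, f):
    # apply f to every element of u
    return [f(x) for x in u]

def maplist(u, f):
    # apply f to every tail of u; tails have distinct lengths,
    # so equal tails imply equal positions
    return [f(u[i:]) for i in range(len(u))]

def diff(exp, var):
    if not isinstance(exp, list):          # atomic: a variable or a number
        return 1 if exp == var else 0
    if exp[0] == "PLUS":                   # derivative of a sum
        return ["PLUS"] + mapcar(exp[1:], lambda x: diff(x, var))
    if exp[0] == "TIMES":                  # product rule: one summand per factor;
        return ["PLUS"] + maplist(         # differentiate the factor whose tail
            exp[1:],                       # matches the outer tail
            lambda x: ["TIMES"] + maplist(
                exp[1:],
                lambda y: diff(y[0], var) if x == y else y[0]))

e = ["TIMES", "X", ["PLUS", "X", "Y", 3]]
# In LISP notation the result reads
# (PLUS (TIMES 1 (PLUS X Y 3)) (TIMES X (PLUS 1 0 0))),
# unsimplified, exactly as in Frame 2.
print(diff(e, "X"))
```

As in the talk, no simplification is attempted: products by 1 and sums of 0s are left in place.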
And so, in the fall of 1958, when the Artificial Intelligence Project started at MIT, LISP existed to this extent, in that we had the idea of this kind of a language. Then there are some other ideas that are in LISP. One of them is the representation of LISP programs as LISP data, and the fact that that could be done was always clear, but the idea that this had some high priority came about in a fairly peculiar way. The original way in which we started to implement LISP is that we had the intention of producing a compiler. But the FORTRAN people were expressing their shock of having spent 30 man-years on FORTRAN, which seemed like a lot at that time, and it didn't seem like these graduate students would hold still long enough that we could get 30 man-years of work out of them. So that seemed fairly formidable, and anyway, we wanted to do it right. So what was going on in the fall of 1958 was a lot of hand-coding of LISP functions into the assembly language for the IBM 704, and adopting a number of conventions. For example, one of the conventions that was adopted, different from IPL, was to use a linear stack rather than use a LISP-type list as a stack. So there were saving and unsaving routines for handling variables, and input and output was written. Then I decided to write a paper about this, which was to be entitled "Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I." Now, there was to have been a Part II, and Part II was to contain some applications to algebraic computation, because I felt that we should really justify this by doing some integration and differentiation and algebraic simplification and so forth. Well, other people did that, and the Part II never appeared. But one of the contentions made in Part I was that this style of recursive functions was a neater way of doing basic computability theory than Turing machines.
In order to illustrate that contention, what was done was to write a function that played the role of the universal Turing machine; that is, a universal Turing machine in Turing machine theory is a machine that, when it is given the description of another machine on its tape, and also given space to represent blank tape, will carry out the computation that the other machine would have done. In LISP, it is the function eval, one version of which is given in the LISP Micromanual that is included in the paper [referring to the language summary, ed.], but that function, which was intended for kind of theoretical purposes, almost for rhetorical theoretical purposes, required deciding on a representation of LISP functions as LISP data, which is approximately the one given there in the second expression. And I must say, that one was done without a great deal of thought, and in particular the cond is deeper in parentheses than is necessary. Now, when Steve Russell saw that one, he said, "Why don't I program it, and we'll have an interpreter for LISP?" And he did, and we did. That was a little bit surprising, somehow, that an interpreter suddenly appeared.

There are a couple of other things: garbage collection as a means of handling the erasure problem, and also the fact that LISP is a reasonably good command language in a time-sharing environment. I just want to finish off with one remark, and that is the question "How come LISP has survived this long?" because most of the languages that were created in that period are long since dead. People who have tried to make better languages than LISP to do the same thing have in some sense succeeded, but I've figured out why it survived so long, and that is that one of the things that everybody, including me, thought was a vice of LISP was, in fact, a virtue. So, therefore, some of the attempts to improve it went in the wrong direction.
And the thing that was a virtue when everybody thought it was a vice was all these parentheses; that is, the fact that the primary way of programming in LISP is as LISP data, and this has made LISP very good for big systems in which programs produce more program, and things of this kind. Well, I've used up my time without getting really beyond April 1959. It has actually had some additional history in the last 19 years that I don't have time to go into.

TRANSCRIPT OF DISCUSSANT'S REMARKS

BARBARA LISKOV: Our discussant on LISP is Paul Abrahams, who is one of those six graduate students that McCarthy had to take on in order to get his project going. He was a graduate student at MIT during the early development of LISP and worked on the original LISP system. His doctoral dissertation, "Machine Verification of Mathematical Proof," was one of the early applications of LISP. Today, Paul Abrahams is an Associate Professor of Computer Science at the Courant Institute of Mathematical Sciences. His interests are in compiler design, programming methodology, and programming languages. In the
