From: droelke@spirit.aud.alcatel.com (Daniel R. Oelke)
Newsgroups: comp.security.misc,comp.security.unix
Subject: History of Sendmail...
Date: 12 Nov 1993 12:42:06 -0600

This is an Interview with Eric Allman - author of sendmail.
It is blatantly taken from ora.com's WWW server.  They
are pushing a new Nutshell book on sendmail.  

I think that this explains some of the history behind sendmail,
and why it is the way it is.

Enjoy,
Dan Oelke


Return to Sender: The sendmail Interview
****************************************

Tim O'Reilly interviews Eric Allman
===================================

Sendmail, the UNIX mail utility, has a reputation for being difficult
and frustrating. As a matter of fact, if you're a system administrator,
there's a good chance that you've cursed its existence once or twice.
Our customers have been requesting that we write a book on the topic,
and we're happy to oblige. The book took a lot longer than we liked.
(We went through three prospective authors before finding someone who
was up to the job; sendmail, our new book, will be published this
November.)

Recently, we had an opportunity to chat with Eric Allman, the creator
of sendmail. Eric was also quite involved in the development of our
book. He reviewed it chapter by chapter as it was being written, and
from start to finish when it was done. Eric even developed a new
version of sendmail, due in part to the questions and probings of Bryan
Costales, the principal author of our book on sendmail.

What follows is a conversation Eric had with Tim O'Reilly at our office
in California.

TIM: You told me once about a conversation you had with Dennis Ritchie
about your mixed feelings regarding your respective creations. Can you
tell us that story?

ERIC: I must have been out to dinner with some people at the Berkeley
UNIX workshop in Colorado. I said I felt very ambivalent about creating
something which I looked at and said "This is a monstrosity by almost
any reasonable definition." Yet, it has been very successful. Although
I look at it now and think, there are many things I would do
differently, there is no doubt that it has helped the world move
forward in some substantial way.

Dennis Ritchie's comment was "I know exactly how you feel!" We all look
back on our work and think, "My goodness, I certainly wish I had
realized what was going to happen." I was one of the people who was
using UNIX when the 5th Edition Manuals came out. They said "The number
of UNIX sites is now at 40 and rising rapidly," and we were one of
those 40! They had no comprehension that it would one day be numbering
in the millions. I had no comprehension that sendmail was going to be
running in millions of places. Instead of giving it away for free, I
just want ten cents from each site! That's not too much to ask, is it?

TIM: Can you tell us a little bit about how sendmail came about? I know
it's old history for you, but it may be of interest to our readers.

ERIC: The exact chronology is a little bit hard for me to get right, so
we'll just make this up as we go along. I was working on the Ingres
Project at Berkeley. This was about the same time that Ernie CoVax, the
11/780, the first 32-bit machine that Berkeley had, came in. I think
they had UUCP of some sort. There were about a dozen UNIX machines
around on campus at that time. Eric Schmidt had written BerkNet, which
was a sort of UUCP style network, only instead of having dial-up lines,
it was connected all the time. It was a store and forward net. You
could do remote execs painfully, but really it was for file transfer
and mail transfer. Then the Arpanet came in, and that had a whole
different set of e-mail standards. The people working in the Arpanet
world at that point were mostly Tenex (later known as Twenex because it
became TOPS20). They used different mailbox formats, different mailers,
etc. So when you sat down on the Ingres machine, you had to use msg to
send Arpanet mail and Berkeley Mail to send to anywhere else, and you
couldn't send one message to both places at once. That didn't last very
long before hooks started to be made to send mail to other networks.
Those were mostly done by Kurt Shoens and other people surrounding him.
This was back in the days of Berkeley where all source code was
publicly writable by anybody at any time, and so if you wanted a new
feature, you put it in. Common courtesy demanded that you ask the
author first, but sometimes common courtesy was less than common.

TIM: When was that?

ERIC: About 1977. This was the Bill Joy period, when everything seemed
to be happening all at once. It was really very exciting. The hooks for
UUCP mail were done in /bin/mail. The hooks for BerkNet were done in
Berkeley Mail and the hooks for Arpanet were done somewhere else
altogether. Networks didn't talk to other networks. This was a clearly
unstable situation. People were starting to hack up different programs,
so that for example, if there was an "@" sign, the mailer would send it
off to the Arpanet. This quickly became unstable because you couldn't
say, "Oh, my address is this." You had to say to somebody, "What mail
program do you use? Oh! With that program my mail address is such and
such." It was far worse than you can possibly imagine. Add to that the
different mailbox formats. Fundamentally, if you were sitting on Ernie,
you couldn't send mail to the Arpanet, but you could receive mail from
the Arpanet. So we had a lot of people who wanted accounts on our
machine, and that started to become unstable because it was only an
11/70. I'm not really telling this very well, but I'm trying to present
this image of chaos.

Somewhere along in here, it became clear that this wasn't going to
work. I started to work on something that would fix the situation. I
spent a tremendous amount of time trying to figure out what to do.
Couldn't figure it out. One day I just said, "Fine, it's become
critical, I'm going to write the ad hoc code, and I'll worry about how
to do it right later." I started writing the ad hoc code. I can
remember where I was sitting, and the way the light came into the room.
I got out the pad of paper, and I started writing. I wrote about two
pages and I said to myself "Oh, that's the way to do it!" That turned
into delivermail, which was the precursor of sendmail. It had
compiled-in configurations, which was okay when you only had a dozen
hosts. It was all dependent on characters. If there's an exclamation
point in the name, do this. If there's an "@" sign in the name do that.
That sort of thing.
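
A minimal sketch of the character-based dispatch Eric describes, not
delivermail's actual code. The function name, the mailer names, the
order in which the characters are checked, and the address
"ucbvax!eric" are all assumptions made for illustration; only the idea
of routing on "!" and "@" comes from the interview.

```python
# Hypothetical sketch of routing on special characters, in the spirit of
# delivermail. Mailer names and check order are assumptions, not history.

def pick_mailer(address: str) -> str:
    """Choose a delivery path by looking for special characters."""
    if "!" in address:
        return "uucp"      # e.g. "ucbvax!eric": hand off to UUCP
    if "@" in address:
        return "arpanet"   # e.g. "user@MIT-XX": hand off to the Arpanet
    return "local"         # no special character: deliver on this machine


for addr in ("ucbvax!eric", "user@MIT-XX", "eric"):
    print(addr, "->", pick_mailer(addr))
```

With the rules compiled in like this, any change to the routing means
rebuilding the program, which is tolerable for a dozen hosts and not
for many more.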

TIM: Sounds like sendmail was an outgrowth of just trying to create
order out of the chaos at that time.

ERIC: I think the really interesting programs are not written by people
who sit around and say, "Oh, let's think of the next new widget we can
build," but they have a real problem that they're trying to solve. UNIX
was in its best state when people were solving real problems that they
had then and there, as opposed to anticipating what somebody's problem
might be somewhere...sometime.

Anyway, the world continued to become more complicated. For example, at
one point, all UUCP links were connected to Ernie CoVax, so if you saw
an address with an exclamation point, you'd just send it off to that
machine. It didn't take very long before there were multiple machines
that had UUCP connections.

Back in the NCP [precursor to TCP/IP] days of the Arpanet, there were a
grand total of 254 possible sites on the network. You addressed
messages to, for example, "user@MIT-XX." It was small enough so that at
one point there was actually an RFC [Request for Comment document]
where they were talking about standardizing the reply codes for e-mail.
One weekend some guy connected to every single host on the network --
the entire network -- and checked to see what happened when he
presented them with certain inputs. A couple of days later he published
this RFC where he said 70% of the hosts do this, so let's standardize
it. Then TCP came along with 32-bit addresses instead of 8-bit
addresses. We came up with domains. We converted formats from RFC 733
to 822 message formats. We changed from mail being sent through the FTP
protocol to having its own protocol, SMTP.

Through all of this period, I was developing sendmail, and running as
fast as I could to keep it up-to-date with this week's version of the
protocols. There were literally new drafts coming out every week. The
configuration file turned out to be a very valuable tool because I
usually had to make minimal code changes, plus a few config file
changes. There were several times when a new draft of RFC 821 (only it
wasn't numbered then) came out, and the next day I had it implemented
and could provide feedback on how thus and such worked. So in some
sense, sendmail really helped the development of those protocols. There
were a bunch of things in the protocols that were simply
unimplementable, and they were going to standardize it. In a lot of the
protocols you find, typically with ANSI and ISO, a lot of things are
designed but never built.  That's why POSIX.1 really does work because
it's standardizing something we have been working with and playing with
for a long time. I mean, there are things in "DOT 1" I don't like, but
at least we know what's in there, whereas some of the "DOT n's" where
"n" is greater than 200, are just pie-in-the-sky magic things. I don't
think standards of that sort can really survive.

TIM: It's often been said that when you originally implemented the
format of the configuration file, you decided to make it easy for the
computer to deal with rather than easy for people to deal with because
you figured the computer had to read it often, whereas people had to
deal with it only once or twice. Is that a fair statement?

ERIC: True story here. My first config file was about fifteen lines
long. I actually came across it a couple of years ago and threw it out.
What an ass! At fifteen lines, who cares what the syntax is? My
philosophy at the time was that when I had to change something, I put
it in as an option. That way, I wouldn't have to go in and muck up the
code again, based on the assumption that something that changed once
was probably going to change twice. That turned out to be relatively
true. One day I printed it out again and it was 4 pages. How did this
happen? I should have gotten wise and said, "Wait! Something's wrong."
I also think people write config files that are unnecessarily obscure.

TIM: I think when people poorly understand something, it's easier to add
to it than go back and take things out because you might break
something.

ERIC: Well, there is an awful lot of stuff that crept in over the
years. I looked at what Berkeley was shipping as I was working on this
latest release and looked it over and said, "I really do not understand
what this is." So I threw it all out and started over. I came up with
something that was about half the size.

TIM: It sounds like a lot of the problems in sendmail syntax are
related to the fact that it grew historically in response to rapidly
changing demands from a lot of corners. It really wasn't something that
was designed from an overall integrated perspective.

ERIC: I am not the sort of person that goes to bed at night thinking,
"Gee, I wonder what I can do to make life difficult for systems
administrators."

TIM: I think a lot of people will be relieved to hear that you have
suffered from your own creation.

ERIC: I've suffered probably more than anyone because I get the
weirdest problems. They show up in my mailbox. It's interesting to note
that sendmail -- because it tends to be so adaptive and so powerful --
has probably perpetuated more bad mail software than anything else
around. For example, at one point I spent some time looking at messages
as they came in off the wire, before sendmail got hold of them. I would
say almost half of the messages going over the Net are in an incorrect
format now, and sendmail fixes them before you ever see them. For
example, if you are using /usr/ucb/mail or mailx or whatever, and you
say

   To: tim eric

with no comma between the names, it just sends it out. Sendmail puts
the comma in. Because sendmail does that, nobody has found it essential
to go in and fix that damn user interface, which is just wrong. By any
measure it's wrong. (Well, that's not quite right. A long time ago --
when we didn't have any interoperability to consider -- we just used
spaces for the separators, just like we do on command lines. But today
we have standards.)
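
The kind of repair described above can be sketched roughly as follows.
This is not sendmail's code; the function name and the rule for when to
leave a header alone are assumptions. It only handles the simple case
from the example, a list of bare names separated by spaces, and backs
off as soon as it sees anything that looks like fuller address syntax.

```python
import re

# Characters that suggest real address syntax (display names, quoting,
# comments, groups). This sketch leaves such headers untouched rather
# than guessing at their structure.
_COMPLEX = re.compile(r'[,"<>():;]')


def add_missing_commas(recipients: str) -> str:
    """Turn 'tim eric' into 'tim, eric'; leave anything complex alone."""
    if _COMPLEX.search(recipients):
        return recipients
    return ", ".join(recipients.split())


print("To: " + add_missing_commas("tim eric"))
```

A header that already contains commas passes through unchanged, so
applying the function twice gives the same result as applying it once.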

TIM: How different is the new version 8 of sendmail?

ERIC: What I really wanted to write was a whole new program tentatively
dubbed Son of Sendmail. I was going to completely restructure the code.
There are some things that need to be done -- for example, inverting
the way the queue is done so instead of processing messages, you
process hosts. You open a connection to a host and send everything
you've got for that host, and so forth. I haven't done any of those
things. But there is a huge list of changes and some new features. A
lot of them are performance enhancements, things like connection
caching. Let's say you have ten messages queued up all to be sent to
the same host, which is actually pretty common if it's a major host
that has been down for a few minutes. The old sendmail opened a
connection, sent the first message, closed the connection, opened the
connection to the same host, sent the second message, then closed it.
Connection caching says open it, send it, send second, send third, and
so forth, and close it when you're all done. That is not the same as
doing it right because all it really does is keep the cache of a small
number of open connections and uses it if it's there. But, in fact, you
can have connections open to multiple hosts at once.
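
The difference connection caching makes can be sketched with a toy
model. Nothing here is taken from the sendmail source: FakeConnection
stands in for an SMTP session and only counts how often a connection is
opened, and the cache size, the least-recently-used eviction rule, and
the host name "big.example" are assumptions made for illustration.

```python
from collections import OrderedDict


class FakeConnection:
    """Stand-in for an SMTP session; counts opens, touches no network."""
    opened = 0

    def __init__(self, host):
        self.host = host
        self.sent = []
        self.closed = False
        FakeConnection.opened += 1

    def send(self, message):
        self.sent.append(message)

    def close(self):
        self.closed = True


class ConnectionCache:
    """Keep at most `limit` connections open; evict the least recently used."""

    def __init__(self, limit=2):
        self.limit = limit
        self._open = OrderedDict()  # host -> connection, oldest first

    def get(self, host):
        conn = self._open.pop(host, None)
        if conn is None:
            if len(self._open) >= self.limit:
                _, oldest = self._open.popitem(last=False)
                oldest.close()
            conn = FakeConnection(host)
        self._open[host] = conn  # (re)insert as most recently used
        return conn

    def close_all(self):
        for conn in self._open.values():
            conn.close()
        self._open.clear()


def deliver(queue, cache):
    """Send each (host, message) pair, reusing a cached connection if any."""
    for host, message in queue:
        cache.get(host).send(message)
    cache.close_all()


# Ten messages queued for one host: the cache opens a single connection.
queue = [("big.example", f"message {i}") for i in range(10)]
deliver(queue, ConnectionCache())
print(FakeConnection.opened)  # 1
```

As Eric notes, this is not the same as processing the queue by host. If
messages for more hosts than the cache can hold are interleaved in the
queue, connections get evicted and reopened.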

I didn't want to make the level of changes in the code that would be
necessary to do it completely right.  Frankly, I wanted something that
would look familiar enough to people so that even if they hated it,
they would still say, yes, I recognize this, and they would be less
afraid to run it.

There are enhancements for new standards. The old sendmail is not RFC
1123 compliant, but the new one is.  I believe that those upgrades are
important. Once again, I have some disagreements with some things in
RFC 1123 but it is still a step forward, and I would like to see
vendors pick up a version of sendmail that will support these things.
Until the major vendors -- Sun, DEC, and HP -- pick up 1123 compliant
mailers, there is no chance that the Net will be 1123 compliant.

TIM: What made you decide to revise sendmail after all these years?

ERIC: There are several reasons. One was simply that we were trying to
put hosts into a CS subdomain at Berkeley. Berkeley has just gotten too
big. It doesn't work any more to have it all as one domain. That
prompted me to look at the code again to put in what is now the "user
database."

At the same time, Bryan Costales was writing the O'Reilly book on
sendmail, and he asked me if I would mind if he wrote it because he
figured that, after all, it was my book. I said something along the
lines of, "Please, be my guest." Bryan started writing the book, but I
agreed that I would review chapters because it is not to anyone's
advantage, least of all mine, to have incorrect information out there.
Believe me, there is a huge amount of misinformation about sendmail.
So, Bryan started passing chapters by me. He's the sort of person who
was really trying everything, just everything. He was trying things no
sane person would ever try. As he asked me to review more chapters of
the book, the changes started to get pretty serious. He'd find
something that didn't work and I'd say, "Yeah, you're right, that's
pretty stupid." So I'd fix it. A huge number of corrections resulted
from Bryan writing the book. It's really his fault.

