by LaFarr Stuart, died July 26, 2021

What makes the 1401 so interesting?
Last Update: 2008 April 26
Recent changes: Corrections about console display of the Wordmark. And more detail about 1401 addressing.
Started 2006-06-25 21:33:44
I believe: Of all the computers ever made, the IBM 1401 was/is the easiest to program. I say that based on the following.

Unlike almost all other computers the 1401 has no data registers. I only know of three other computers that were similar. They were: IBM 1620, Honeywell 200 series, and the RCA 301.
The 1401 and the three mentioned above all used decimal arithmetic. And the 1401, 1620 and RCA 301 had decimal addressing of memory.

Because the 1401 used only three characters for addressing, addresses over 999 were more complicated. The two zone bits over the high-order digit were used in a binary sense for overflows. This allowed for four "overflows," addressing up to 3,999. In a similar way the two zone bits over the low-order digit were used for four more "overflows," allowing addressing up to 15,999. Fortunately, the assemblers took care of this, and there was a hardware instruction to do address arithmetic.
The Honeywell 200s tried to have binary addressing in a decimal machine. It resulted in confusion and lost compatibility with the IBM 1401, which it was marketed to replace.
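The 1401 addressing scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each address character is modeled as a (zone, digit) pair with the zone being a number 0 through 3, the zone over the high-order digit counts thousands, and the zone over the low-order digit counts blocks of 4,000. The function names are hypothetical.

```python
def encode_address(addr):
    """Return three (zone, digit) pairs: hundreds, tens, units.

    Assumption: zone is a 2-bit count (0..3), not an actual card character.
    """
    if not 0 <= addr <= 15999:
        raise ValueError("a 1401 address must be in 0..15999")
    block, rest = divmod(addr, 4000)       # goes in the zone over the units digit
    thousands, rest = divmod(rest, 1000)   # goes in the zone over the hundreds digit
    hundreds, rest = divmod(rest, 100)
    tens, units = divmod(rest, 10)
    return [(thousands, hundreds), (0, tens), (block, units)]


def decode_address(chars):
    """Inverse of encode_address."""
    (h_zone, h), (_, t), (u_zone, u) = chars
    return u_zone * 4000 + h_zone * 1000 + h * 100 + t * 10 + u


print(encode_address(405))     # no zone bits needed below 1000
print(encode_address(15999))   # both zone fields at their maximum
```

Addresses below 1000 come out with all zones zero, which is why the first thousand characters were so simple to address by hand.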

The 1401 was truly a variable precision machine. The word mark was set to denote the high end of any data field. For arithmetic instructions the low end was denoted by the address in the instruction. The high order character and word mark were at the low address of the data area.
It was as easy to do thousand digit arithmetic as two digit arithmetic in a 1401. The first program I wrote for a 1401 evaluated the mathematical constant "e" to 1000 decimal places and printed out the result.
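A rough Python sketch of how a word mark bounds a field during addition may help here. It is a simplification, not the 1401's actual Add logic: it handles unsigned digits only, models memory as a list of integers with a separate set of word-marked addresses, and uses hypothetical helper names.

```python
def store_field(mem, wm, end_addr, digits):
    """Place digits so the low-order digit sits at end_addr; mark the high end."""
    start = end_addr - len(digits) + 1
    for i, ch in enumerate(digits):
        mem[start + i] = int(ch)
    wm.add(start)  # the word mark sits on the high-order digit (lowest address)


def read_field(mem, wm, end_addr):
    """Read a field from its low-order end back to its word mark."""
    out = []
    addr = end_addr
    while True:
        out.append(str(mem[addr]))
        if addr in wm:
            break
        addr -= 1
    return "".join(reversed(out))


def add_fields(mem, wm, a_addr, b_addr):
    """Add the A field into the B field; both addresses name the low-order end."""
    carry = 0
    a_done = False
    while True:
        a_digit = 0 if a_done else mem[a_addr]
        total = mem[b_addr] + a_digit + carry
        mem[b_addr] = total % 10
        carry = total // 10
        if not a_done and a_addr in wm:
            a_done = True          # A field exhausted; keep propagating carries
        if b_addr in wm:
            break                  # the B field's word mark ends the operation
        a_addr -= 1
        b_addr -= 1
    return carry                   # nonzero means the B field overflowed


mem, wm = [0] * 1000, set()
store_field(mem, wm, 405, "0123")
store_field(mem, wm, 915, "009900")
add_fields(mem, wm, 405, 915)
print(read_field(mem, wm, 915))    # 010023
```

Nothing in `add_fields` depends on the field length, so the same loop handles two digits or a thousand.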

The "Op-Code" of every instruction was a single character, and the op-code was denoted by having its word mark set. Instructions were also variable length, and could be from one to thousands of characters long. The previous sentence is technically true, but functionally seven characters were as big as instructions got. However, a frequently used trick to "no-op" an instruction was to simply clear the word mark of its op-code, which often made the preceding instruction over 7 characters long.
Some instructions such as the Branch instructions used only one address, but many used two addresses. The addresses were limited to three characters. This made the addressing in the first 1000 characters very simple. For example to add the data at location 405 to location 915 the actual instruction would be: "A405915". Yes, for most of the instructions the op-code was the English language first letter of the instruction name. A for Add, B for Branch, C for Compare, L for Load, M for Move, S for Subtract. Of course some were not so obvious like: Comma for set word mark; but still very easy to learn.
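As an illustration of how word marks delimit instructions, here is a small Python sketch. It assumes a simplified model in which an instruction runs from one word-marked character up to the next word mark, and it only knows the mnemonics listed above; the function names and the sample program are made up for the example.

```python
NAMES = {"A": "Add", "B": "Branch", "C": "Compare",
         "L": "Load", "M": "Move", "S": "Subtract"}


def split_instructions(text, word_marks):
    """Cut memory into instructions at each word mark (simplified model)."""
    starts = sorted(word_marks)
    return [text[s:e] for s, e in zip(starts, starts[1:])]


def decode(instr):
    """Return (name, A-address, B-address); missing addresses are None."""
    op = instr[0]
    a_addr = instr[1:4] or None
    b_addr = instr[4:7] or None
    return NAMES.get(op, "?"), a_addr, b_addr


program = "A405915S100200B333 "
marks = {0, 7, 14, 18}            # the final mark just terminates the Branch

print(split_instructions(program, marks))
print(decode("A405915"))          # ('Add', '405', '915')

# The "no-op" trick: clear the word mark on the Subtract op-code.
marks.discard(7)
print(split_instructions(program, marks))
```

After the word mark at position 7 is cleared, the Subtract no longer exists as an instruction; its seven characters simply trail behind the Add.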
The 1401 hardware had a built-in memory dump. From the console any block of 100 memory locations could be printed. The characters were printed on one line, and on a second line the word marks were printed as a "1" under the appropriate character.
Typically to debug you would console dump the first 100 characters, then load a dump routine which would dump the rest of memory in the same format as the console dump.
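The two-line dump format is easy to imitate. The Python sketch below is a guess at the layout based only on the description above (characters on one line, a "1" under each word-marked character); the exact spacing on a real 1403 printout is an assumption, and `console_dump` is a hypothetical name.

```python
def console_dump(chars, word_marks, block):
    """Return the two print lines for one 100-character block of memory."""
    start = block * 100
    line1 = chars[start:start + 100]
    line2 = "".join("1" if start + i in word_marks else " "
                    for i in range(len(line1)))
    return line1, line2


memory = "A405915S100200B333".ljust(200)
marks = {0, 7, 14}

for line in console_dump(memory, marks, 0):
    print(line.rstrip())
```

A dump routine for the rest of memory would just call the same function for each successive block.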

On an IBM card there are 12 rows and 80 columns. A character can be punched into any column. The two rows along the top edge of the card are called "zone punches" and they are only used if the character is alphabetic or a punctuation symbol. The bottom ten rows are labeled zero through nine, and for numeric data only a single punch is punched in the appropriate row.

For alphabetic data, the letters A through I are punched with one of the digits 1 through 9; also the top zone punch is punched. The letters J through R are the same except the second zone row is punched. The letters S through Z are punched with one of the digits 2 through 9, and the zero row is also punched. For alphabetic data the zero row acts like a third zone row.
This punching and printing the character at the top is all done by the keypunch.
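The card code just described can be written down as a short Python function. This sketch covers only digits and upper-case letters; the row names "12" and "11" for the top and second zone rows are the conventional labels rather than something stated above, and the function name is hypothetical.

```python
def card_punches(ch):
    """Return the rows punched in one column for a digit or upper-case letter."""
    if len(ch) != 1:
        raise ValueError("expected a single character")
    if ch in "0123456789":
        return (ch,)                                   # one numeric punch
    if "A" <= ch <= "I":
        return ("12", str(ord(ch) - ord("A") + 1))     # top zone + 1..9
    if "J" <= ch <= "R":
        return ("11", str(ord(ch) - ord("J") + 1))     # second zone + 1..9
    if "S" <= ch <= "Z":
        return ("0", str(ord(ch) - ord("S") + 2))      # zero row + 2..9
    raise ValueError("punctuation is not covered by this sketch")


for ch in "IBM1401":
    print(ch, card_punches(ch))
```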

At the time the 1401 was introduced, the business and scientific worlds were both very familiar with the 80 column punched card. Every shop had key-punches, and a variety of other IBM machines. The key-punches at the time were well refined for data entry, and far advanced beyond the only competition, which was either a Teletype or a Flexowriter. Both of these captured the data in punched paper tape.

Punched paper tape was 8-level, and for any amount of data rolls up to about 8 inches in diameter were used. There was no paper tape facility to: re-arrange records, insert records, or have printed on the tape what was punched in the tape.
Correcting errors in paper tape is next to impossible. The entire roll has to be redone or some way broken into two rolls. Compare this to simply pulling a card from a deck, correcting it and replacing it in the deck.
After the tape was read, or just punched, it had to be rewound. This was slow and awkward at best. I never saw it automated. In fairness I should point out that Digital Equipment, on most of their machines, used "fan-folded" tape which did not have to be rewound. But it was limited in size, so it was good only for small jobs and programs.
For business applications, the data could be on many thousands of cards. Not feasible for paper tape. For example: Gas stations were entering the data for a credit card sale onto a card, these cards were sent to the company where literally hundreds of key-punch operators would punch the data printed on the card into the same card. Not feasible with paper tape.

In the 1401 memory is "character addressable" and each location has parity and 7 information bits. Parity is for hardware reliability and is not seen or available to the programmer.

On the 1401 console and in the manuals the character is displayed vertically. The bottom bit is a flag, called a "Word Mark". More about it later. The other six bits are the character, the two top bits are labeled A and B. The lower 4 bits are labeled 8, 4, 2, and 1.
The zone punches of the card map into the A and B bits of the character. And the numeric punches map into the bottom 4 bits. Anyone familiar with the card codes would know the characters in memory. Almost no learning of codes required.
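The mapping from card punches to memory bits can be sketched the same way. The Python below assumes the usual BCD convention that the top zone row sets both B and A, the second zone row sets B, and the zero row (used as a zone) sets A. It deliberately leaves out the digit zero and punctuation, whose memory codes involve special cases not covered here; all names are hypothetical.

```python
ZONE_BITS = {"12": ["B", "A"], "11": ["B"], "0": ["A"], None: []}


def memory_bits(ch):
    """Return the labels of the bits set in memory for 1-9 or A-Z."""
    if len(ch) != 1:
        raise ValueError("expected a single character")
    if ch in "123456789":
        zone, digit = None, int(ch)
    elif "A" <= ch <= "I":
        zone, digit = "12", ord(ch) - ord("A") + 1
    elif "J" <= ch <= "R":
        zone, digit = "11", ord(ch) - ord("J") + 1
    elif "S" <= ch <= "Z":
        zone, digit = "0", ord(ch) - ord("S") + 2
    else:
        raise ValueError("not covered by this sketch")
    bits = list(ZONE_BITS[zone])
    for weight in (8, 4, 2, 1):        # numeric punch becomes plain binary
        if digit & weight:
            bits.append(str(weight))
    return bits


print(memory_bits("A"))   # ['B', 'A', '1']
print(memory_bits("5"))   # ['4', '1']
```

Someone who already knew that "A" is a top-zone punch plus a 1 could read the B, A and 1 bits straight off the console.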

Some may want to argue that the IBM 1620 or the PDP-8 are also candidates for the "easiest to program" computers ever made. I rate them after the 1401 for the following reasons:

The 1620 used memory resident Add and Multiply tables, and these had to be loaded correctly by the software. While this was nearly always done "automatically" by the software, loading and guarding these tables was a significant complication the 1401 did not have.
The PDP-8's op-code was a single octal character. Fundamentally, it only had 8 instructions, which made learning the instruction set very easy. This is a bit of an exaggeration. One of the op-codes did not address memory so there were many variants of this op-code.
The PDP-8 (which was really just a hardware upgrade of the PDP-5) was a 12-bit binary word machine, which made it much more difficult than the 1401 for the typical accountant, who may never have heard of "binary".

A significant complication for the PDP-8 was that a loader had to be toggled in from the front panel to read in a program. On the 1401 no loader was retained in memory. The 1401 had a "Load" button which read in the first card and passed control to it.

Combine the above simplicity with the almost undisputed fact that IBM's Card Reader, Card Punch, and Printer were unmatched for reliability and ease of use. For larger installations, up to 6 of the best magnetic tape drives made were available for the 1401. Is it any wonder that the IBM 1401 was the most popular computer in the world when IBM announced the 360 series? Possibly, there were at the time as many PDP-8s; but they were by and large used for small dedicated applications.

IBM introduced two variable precision machines at about the same time: the 1401 and the 1620. In many ways they were so different that I question whether the designers ever spoke with each other. The 1620 had a "flag bit" but it was not nearly as elegant, or truly variable, as the 1401's "Word Mark". The 1620 did not have variable length instructions, and its handling of character data was sad by comparison to the 1401; but it was designed to be reliable and low cost. It served that purpose fairly well.

It is a sad fact: Variable Precision architectures are today only history. Maybe some day a young designer will re-discover the "word mark" and give the world a truly variable precision machine using binary arithmetic, at least for addressing memory. Using today's fixed word length register machines it is very awkward to do variable precision arithmetic. A little hardware assist would make it so much easier and, I conjecture, faster. I would love to see a machine using a 12-bit byte, each with a word mark and parity bit, and no registers for programmers to worry about.