SWAT team helps Binghamton TechWorks!
IBM 1401 system, and the IBM 1311 of the IBM 1440 system
- A "SWAT" team from the IBM 1401 Restoration effort at the Computer History Museum, Mountain View, CA, flew to Binghamton TechWorks! (at their own expense) to help with the restoration of 1401 system related equipment.
- The "SWAT" team consisted of George Ahearn, Carl Claunch, Frank King and Ignacio Menendez.
- On the TechWorks! web site: "CHM 1401 team consults with TechWorks! VICC team"
Table of Contents:
Activity Reports from:
- Carl Claunch
- Ignacio Menendez
- George Ahearn
On Jun 11, 2017, at 3:53 PM, Carl Claunch wrote:
This mail provides more detail on the 1401 portion of the restoration effort. In addition to Judd and George, we worked with Don McCarty, Bill Green and Will Donzelli on the machine.
The 1401 system had clean power and could be powered on, but the local team had found it was not performing addition correctly. Certain digit sums worked fine, others were wrong. I built a short loop that would set up two fields with our desired input digits, perform an addition and branch to repeat infinitely. This gave us a means of using a scope and tracing the errors.
We saw that the red light on storage lit during the errors, then subsequently when restoring the result field to its input value we got the light on the B register. Further, we tried various digit values and quickly determined that our problem was a hot bit in the 1 position of the result. Any addition whose result was odd would appear to work properly, but an even result would be stored with the extraneous 1 bit on. 2 + 2 = 5
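The hot-bit symptom can be reproduced in a small model. The Python sketch below is only an illustration, not 1401 code. It assumes a digit is held as 8-4-2-1 bits plus one check bit that keeps the count of 1 bits odd, it ignores zone bits and word marks, and it handles only digits 1 through 9. All function names are made up for this example.

```python
def encode_digit(value):
    """Return (numeric_bits, check_bit) for a digit 1-9 under odd parity."""
    if not 1 <= value <= 9:
        raise ValueError("this sketch only models digits 1 through 9")
    ones = bin(value).count("1")
    check = 0 if ones % 2 == 1 else 1  # keep the total number of 1 bits odd
    return value, check


def parity_ok(bits, check):
    """True when the numeric bits plus the check bit hold an odd number of 1s."""
    return (bin(bits).count("1") + check) % 2 == 1


def store_with_hot_one(bits, check):
    """Model the fault: the 1 bit is forced on as the character is written."""
    return bits | 0b0001, check


for a, b in [(2, 2), (1, 2), (3, 3), (4, 3)]:
    bits, check = encode_digit(a + b)
    stored_bits, stored_check = store_with_hot_one(bits, check)
    status = "parity ok" if parity_ok(stored_bits, stored_check) else "parity error"
    print(f"{a} + {b} stored as {stored_bits}: {status}")
```

In this model an odd sum already has its 1 bit on, so the fault changes nothing, while an even sum gains one extra bit and fails the parity check, which matches the red light seen on storage.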
Like Iggy, we had problems with scoping. For example, one of the two probes was so far out of compensation that we had huge overshoots distorting signal shapes. The museum desperately needs some push-on probe tips to insert over the backplane pins, such as the ones we use at CHM. Trying to clip on ordinary probes was vexing. They often fell off and once or twice they shorted and shut down the machine.
We followed the logic to find where the 1 bit was flipping on, beginning at the first gate that produced the Arithmetic 1 value from the qui-binary register inside the arithmetic unit. The adder was creating the proper value, so that 2 + 2 did not have a 1 bit set on. We followed it through the path to where the 1 plane of storage would be inhibited to write 0 or left free to flip to a 1 value.
I remember Ron Williams having commented on the sometimes counterintuitive naming of signals related to storage inhibit lines and we encountered exactly that. The output that drives the memory inhibit line is labeled +U Inhibit 1 which should be true when the core will remain at 0 and false when a 1 is set in the position. However, we saw that the line was at 0V (true) when a value of 1 is desired, thus it really should have been named +U Not Inhibit 1. This had Judd scratching his head in disbelief.
A number of gates are combined in a wired-OR to produce the +U Inhibit 1 level which determines what is written into the 1 plane of storage. For example, one may be gating the B register bit to storage or entering data from an IO device. The relevant gate in this OR was a 2-A0 which is a double negative AND gate. The two AND gates are ORed together to form the output of the 2-A0. Either AND gate's output is true only when both inputs are false.
One of the AND gates combined the 1 bit data entry toggle switch and the Manual Entry toggle switch as the two inputs. Only when the 1 switch is on, producing a -T value, and the Manual Entry is on, producing a -T value, will this AND gate produce its +U (true) output. The other AND gate was fed with the -T Arith 1 value coming from the adder, as one input, and its other input was a pulse labeled -T Arith Digit Inhibit. When the arithmetic unit has a result to store, it sets the -T Arith Digit Inhibit to true (drops it to -6V). If the -T Arith 1 is also true (-6V), then the AND gate drives +U Inhibit 1 to write a 1 into core.
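The gate behavior described above can be written out as a truth function. This is a rough Python sketch under stated assumptions: a -T signal counts as true when it sits near -6V, the -3V threshold is an arbitrary choice for illustration, and the function and parameter names are hypothetical rather than taken from the ALDs.

```python
def t_minus_true(volts):
    """Treat a -T signal as true when it is near -6V (threshold is assumed)."""
    return volts < -3.0


def inhibit_1(switch_1_v, manual_entry_v, arith_1_v, arith_digit_inhibit_v):
    """Return True when the modeled gate would write a 1 into the 1 plane."""
    # First AND leg: the 1 data switch together with Manual Entry.
    manual_leg = t_minus_true(switch_1_v) and t_minus_true(manual_entry_v)
    # Second AND leg: the adder's 1 bit together with the store pulse.
    arith_leg = t_minus_true(arith_1_v) and t_minus_true(arith_digit_inhibit_v)
    # The two legs are ORed to form the gate output.
    return manual_leg or arith_leg


# Store pulse active and adder 1 bit true: a 1 should be written.
print(inhibit_1(6.0, 6.0, -6.0, -6.0))
# Store pulse active but adder 1 bit positive: no 1 should be written.
print(inhibit_1(6.0, 6.0, 6.0, -6.0))
```

The second call corresponds to the 2 + 2 case: with -T Arith 1 at a positive voltage, neither leg of this model is satisfied, so a correctly working gate should not write the 1 bit.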
We saw that the wired-OR fed by this 2-A0 (and other) gates was going positive at the time when an arithmetic result was being stored. However, when we scoped the inputs to the gate the conditions to produce that level did not exist. The -T Arith 1 was positive voltage, which would make the AND condition false.
We swapped the card implementing the 2-A0, yet the problem remained. We looked at all the other gates which fed the big wired-OR, but none of them had a pair of inputs that would produce the output we were seeing. We swapped all those cards as well but the problem remained.
At about this point, the machine became quite ill. The program loop got parity errors on many characters of the instructions, we had parity errors even with manual entry and this was far beyond the hot bit issue affecting arithmetic. It took a while to figure out what was happening, whether the problem was in memory or in registers.
We determined that we were not storing the A bit in the B register, thus the parity became wrong. The wrong value was then written back into storage, so that any pass through the program or data fields stripped out all the A bits. We had to repair this before we could get back to the simpler arithmetic problem.
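Why a single bad latch corrupted everything it touched can be shown with a toy model. The Python sketch below assumes each character is a set of bit names with odd parity, that every read passes through the B register and is then written back, and it ignores word marks. The names and the sample character are illustrative only.

```python
def parity_ok(char_bits):
    """Odd parity: a valid character has an odd number of bits set."""
    return len(char_bits) % 2 == 1


def read_regenerate(memory, address, faulty_b_register=True):
    """Read one character through the B register and write it back."""
    b_register = set(memory[address])
    if faulty_b_register:
        b_register.discard("A")  # the latch for bit A fails to hold its value
    # Whatever the B register holds is what gets rewritten into storage.
    memory[address] = frozenset(b_register)
    return b_register


memory = {0: frozenset({"A", "2", "1"})}   # a valid character: three bits set
print(parity_ok(memory[0]))                # valid before the faulty read
read_regenerate(memory, 0)
print(sorted(memory[0]), parity_ok(memory[0]))
```

After one faulty pass the stored copy itself has lost its A bit, so in this model even a repaired register reads back the damaged character until the location is rewritten.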
Some ALD work and scoping led us to the proper gate in the B register for bit A. We saw that it was bad and replaced it with a spare. The A register was also failing to hold bit A, so another spare went in. The machine was now able to store properly and retain values in the B and A registers, and we could go back to our program loop to debug the arithmetic flaw.
The original hot bit error seemed to be coming from the wired-OR section including the 2-A0 gate but we just couldn't see how it was happening. Card swaps with several good cards of the same type excluded card defects. We used a card extender and started checking a series of increasingly speculative causes since this wasn't making sense to any of us.
We checked the input and output signals on the card itself to exclude socket or finger issues. We checked the voltages to the socket and all the backplane wiring related to the socket. We looked at cards downstream from the wired-OR to see if a bad diode were causing a hidden path.
Suddenly the red lights went away and the value stored in the result field was correct - 2 + 2 is now 4. No more hot bit. It was the end of day and a picnic beckoned. The next morning, it was time to expand our verification of the machine. First step was to create fields for the addition that were more than a single digit long. Immediately the red lights came back.
What we found was that our hot 1 bit became a dead 1 bit. Arithmetic that should produce an odd value wrote a binary 0 into the 1 position and had a parity error. This time, the diagnosis was quicker and obvious.
The -T Arith 1 input to our 2-A0 gate was sitting at ground. T logic levels should be +6V or -6V, never 0V. Time to trace how it got this way. This signal entered our 01A3 swing-gate from a paddle card at D23, pin F. We traced continuity from our 2-A0 gate input down to that pin F and it was good. The paddle card carries signals from swing-gate 01B3 where the arithmetic unit is implemented. We found the corresponding paddle card pin on 01B3 and still had continuity. From there, we checked back from the paddle card to the gate which produced our -T Arith 1 signal.
It was floating at ground right on the output of the C type gate that produced it, but the input pin for the same gate had valid levels. Swapping this card with a spare fixed the problem. Now, arithmetic results had neither a hot 1 nor a dead 1. We must have had a weird analog issue with the failing card that we couldn't see during the hot 1 symptom phase, but the failure progressed to one we could easily spot.
We worked through a number of tests. Different lengths between the A and B fields of the addition. Subtraction instead of addition. Zero and Add worked properly. Branch worked properly, as did Move. We did some Compares and validated that the conditions were set properly. Branch conditions were tested and worked right. Move Zone and Move Digit worked okay. Set and Clear Word Mark instructions worked. We didn't try an Edit nor run through every instruction and permutation but our impression is that the machine is healthy and working properly.
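The tracing approach above amounts to walking the signal path from its source and finding the first point whose level is not a legal T level. A minimal Python sketch follows, assuming a tolerance of 2V around the nominal +6V and -6V. The tolerance, the function names, and the sample voltages are invented for illustration and are not measurements from the machine.

```python
def classify_t_level(volts, tolerance=2.0):
    """Classify a reading as '+6V', '-6V', or 'invalid' (tolerance is assumed)."""
    if abs(volts - 6.0) <= tolerance:
        return "+6V"
    if abs(volts + 6.0) <= tolerance:
        return "-6V"
    return "invalid"


def first_bad_point(measurements):
    """Return the first probe point, in source-to-destination order, with an invalid level."""
    for name, volts in measurements:
        if classify_t_level(volts) == "invalid":
            return name
    return None


# Hypothetical readings listed from the driving gate toward the 2-A0 input.
readings = [
    ("C gate input", -6.0),
    ("C gate output", 0.0),
    ("01B3 paddle card pin", 0.0),
    ("01A3 D23 pin F", 0.0),
    ("2-A0 gate input", 0.0),
]
print(first_bad_point(readings))
```

With a valid input and an invalid output on the same gate, the wiring downstream is cleared and the card holding that gate becomes the suspect.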
The Read instruction worked reliably in triggering the 1402 to read a card. Due to issues with the relays, false Reader Stop conditions and so forth, we couldn't test the Read instruction all the way through to memory at this time.
On Sun, Jun 11, 2017 at 12:16 PM, Ignacio Menendez wrote: (updated June 13, 2017)
Version 3, with corrections from Frank :
Visit to the Binghamton Tech Museum June 7 through June 9, 2017.
We arrived at Tech Museum and were warmly received by Susan Sherwood, after introductions to her team we went right into work.....
George Ahearn, Carl Claunch, and Jud McCarthy solved innumerable 1401 CPU problems and gave them invaluable advice.
Frank King solved problems, gave them invaluable advice for their 1402 Card Reader/Punch, and did close to a full PM.
He also spent some valuable quality time with Don Manning, going through the steps on how to make a new chain, using their magnificent fixtures.
For more details on the above, please contact the other fellows of the CHM Restoration Team that were in attendance.
Don McCarty and Iggy Menendez
Worked on the first 729 NOR Tape to be installed on the 1401
- Biscuit cable connected
- Power 220 A/C connected, with the help of Jack Westemann after repair of 4 bent pins.
- power turned on, verified all voltage and phase for correct 3 phase motor rotation.
- Noted that a resistor on the SMS back panel hung out, with only one end connected; I asked Don for the ALD Logics, which I had emailed to him in PDF format. Unfortunately, these were not available yet, due to their very busy days, and only then were the Tech people able to print them.
In the meantime I went through the PDF file on a laptop, incredibly tedious, time consuming, and finally found where the other end of the resistor should be, and plugged it in.
- To test the phasing I attempted a manual load that did not work; I manually partially loaded the head, powered down and back up, which automatically causes an unload; the tape take up reels worked, rotating in proper direction, but the head motor did not turn at all. Don was ready to swap the power leads, but that was not needed, since the motor rotation was OK.
I moved the head motor plug into the tape take up reel motor socket and verified that the head motor works.
From the PDF ALD I determined that the relay controlling the head motor is not picking.... Could not continue without printed ALDs.
The correct NOR ALD for the Mod 5 was finally printed, but too late on last day to do further trouble isolation. Notified Don McCarty how far I got, and where to continue.
I also recommended that they print the FE Maintenance Manuals for both NOR and old relay type 729s. These contain extremely valuable flow charts, to trouble shoot the tape drive's manual operator Panel actions.
Did not get a chance to check, or even get near the TAU, for all the people busy troubleshooting the 1401. Besides, we do not have a 729 to use it with.
Located the 2 correct part numbers for the belts that are missing, and gave them to Susan. Frank notes that only one belt is needed for punch cb timing, the other is for center pocket drive, which is rarely used on the 1402.
1311 Disk Drive part of the 1440 system
Working with Bill Green and Fred Petras:
The Ready Light turns on, but the drive could not be accessed from 1440....
Bill and Fred had already noted that the actuator hydraulic oil reservoir is about half inch low; they are in the process of procuring more oil.
The oscilloscope that we used, unbeknown to us, was erratic and made us waste a few hours until we realized this....
Don McCarty was kind enough to lend us his personal good scope, he went all the way down to his car to get it.
We could not remove the 1316 pack presently sitting in the 1311, since their only unused pack handle was busted.....
I showed Bill and Fred how to remove a good handle from another pack, by pressing the small hole underneath the handle's long shaft, then removed the 1316 pack from the 1311.
Inspected and cleaned all the heads with the lint free cloth sticks that we brought with us (and left them some for their use). All heads are OK, as is the surface of the pack that we were going to use.
Found that R/W heads are not loaded, so they could not possibly read data from the 1316 packs.
Noted that they do not have a CE Pack available. They may get by OK, until such a time, if and when, they need to make head alignment adjustments.
- scoped the load problem to the 'heads extended' micro-switch. We removed it, and tested both halves with meter, all contacts on both halves test OK.
Luckily, I had brought with me a 2311 FE Maint. Manual, where it explains how to adjust this switch..... the 1311 manual fails to tell the CE how to do this....
After correct adjustment, the actuator moved fast forward and loaded the heads correctly, BUT would not seek back to home position, as expected.
- Further troubleshooting revealed that relay 3, which causes the heads to return back to home, appeared to have NO points that were not making contact.....
The problem that Bill and Fred found with relay 3 was, not that the armature was blocked by anything, but that a wire connecting to an NC contact on the relay was bent back such that it connected to that pole’s NO contact lug also.
Thus, when the relay was activated, the pole’s common broke the connection to the NC contact but then reconnected to the same wire via the NO contact.
Therefore the connection was broken just for the time it took for the armature to pull in....
After straightening the wire out of the way, the relay started working fine.....
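The relay 3 fault is easier to see as a small state table. The Python sketch below is a simplified model, assuming a single pole whose common rests on the NC contact when released and on the NO contact when picked. The state names and the function are hypothetical.

```python
def circuit_closed(armature, nc_wire_touches_no_lug):
    """Return True when the wire on the NC contact is connected to the pole's common.

    armature is 'released', 'in_transit', or 'picked'.
    """
    if armature == "released":
        return True  # common rests on the NC contact
    if armature == "in_transit":
        return False  # common touches neither contact while the armature moves
    if armature == "picked":
        # Common now rests on NO; the wire is connected only if bent onto that lug.
        return nc_wire_touches_no_lug
    raise ValueError(f"unknown armature state: {armature}")


for fault in (True, False):
    states = [circuit_closed(s, fault) for s in ("released", "in_transit", "picked")]
    print("bent wire" if fault else "wire straightened", states)
```

With the bent wire, the circuit in this model opens only while the armature is in transit, so picking the relay never produces the lasting break that the return-to-home logic presumably needs.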
We powered back up, and now, to our dismay, the actuator would not move at all!
Apparently, in the midst of all this, we broke off a wire going to the above mentioned 'heads extended' micro switch.
Fred and Bill re-soldered the wire and VOILA.....
the disk now comes up to speed, heads load, return back to home, and advances correctly to cylinder 0.....
also, the Drive Ready light now turns on at the correct time. (It was coming on too early before, because of the incorrect adjustment of the switch).
Bill now attempted to access the 1311 from the CPU with his program, but experienced a 1440 memory addressing error. They will try to use the 1311 as soon as they fix this 1440 problem.
I mentioned to Bill and Fred that the drive belt is oily. Needs to be replaced if a new one can be procured...
in the meantime, it could be carefully dried with paper towels AND NO SOLVENTS, for the belt has a carbon backing that is used as the discharge path for static build up, that can cause data checks.
This was all that I was able to accomplish on the time available.
The Tech Museum, Susan, et al., were extremely helpful and provided us with what we asked for, as well as lots of food and refreshments.
Now for the fun part.....
They also gave us a tour of their facility and displays, as well as the Endicott City Museum, that holds many original and unique old items, even predating those made by the previous companies that became IBM.
The museum also has many items from the other companies that used to reside in Endicott, such as a Norden bomb sight, Johnson Shoes' products, the IBM produced M3 rifle during WWII, etc.
At the Tech Museum, we also rode the Link Flight Simulator, and were shown the Apollo simulator that was actually used to train the astronauts that travelled to and from the moon; this was extremely interesting, and will be better yet, when they can obtain the Very High intensity lamps required to make this simulator work.
We also had a very nice dinner at the Session Italian Restaurant, with three IBM original design members friends of George....
Byron Rucker, the designer of the TAU-2 like the ones that we have at CHM,
Paul Farbanish, and were entertained by the fantastic anecdotes of Jud McCarthy.
In a perfect world, my best wishes for the Tech museum would be :
- Different Location, Location, Location !
- Nicer, newer environment.
- More oscilloscopes available, both storage, and standard easier to use ones.
- More probes, suitable for attaching to the SMS pins.
- SMS plastic pin overlays, for ease and assurance of pin being scoped.
- Increase availability of spare parts.
- Many, many more, high intensity LED flashlights, perhaps one for each member that is involved in the physical Restoration and maintenance.
I thank all parties for giving me this opportunity to go to the Tech museum, meet the fine people there in person, and do something useful, I hope.
Iggy, this part will fill in your report for the CPU.
On arrival, if the expected result of addition was an even digit, the actual result was the even digit plus one. So, the idea was to find the cause of a hot '1' bit. Most of that day was spent scoping the path between the adder and memory. That included an attempt to swap digits '1' and '2'. Don McCarty provided a homemade card extender but he needed to rewire it with longer leads. After using it, several new problems showed up, all caused by an existing short circuit on the extender. For a while there was low confidence in the extender. Eventually we were able to shift the problem from a hot '1' bit to a hot '2' bit.
The next day circumstances changed in that the adder output was missing a '1' bit and at that time we had no idea why. Debugging was slow because printed ALDs were an incomplete set. If a page was missing we used a Laptop to access Ed Thelen's collection of ALDs from the Connecticut system to read that page. That usually worked.
Using a scope with one trace on the 'inhibit' signal and the other on the 'arith 1' signal it became obvious which logic block was failing. Fortunately there was a spare card and the problem was solved. (See ALD 184.108.40.206) Next multiple digits and various combinations of field length additions were checked out. After that subtractions, then branches, compares, moves, etc.
Next, attention turned to the 1402 reader. Without much effort, a 'Read' command causes a card to pass through and the storage address ends at '81', but there are errors.
One of the unknowns of this 1401 is which features are installed. Other than the obvious, e.g., tape, no one knows.