More symptoms of a dodgy DSP card

Started by phonoplug, October 08, 2012, 10:04:55 PM


phonoplug

I was running a batch of boards yesterday and an occasional issue became significantly more frequent. Previously, every now and then - perhaps once a day - the machine would suddenly lose track of its position. Sometimes it's only a few mm away from where it should be, sometimes more than an inch. It happens at random in the middle of a run. At best you get multiple mis-picks before you stop it; at worst the head hits something it shouldn't, and the machine then very helpfully drops the head (z axis) and slowly locates it again. Usually you have to exit the program with the e-stop pressed, manually put the arm back to its resting position and run the whole startup routine again.

As it happened several times yesterday I tried to work out if the position error was in just a single axis or a combination of two or more. It appeared to be just a y-axis error, and wouldn't you know it, the only original DSP card in my machine is the one for the y-axis. I'll now build a new one and replace it. Hopefully this will fix the problem, and I will report back either way.

If it does fix it, then this seems to be another symptom of a DSP card on the way out, along with the other known problem that they fail to initialise when cold (LED not flashing 3 times when powering up the PC / errors when starting RV Place).

Mike

Have you tried reading the EPROMs from dodgy cards, in case bits are dropping out as they get old?
Seems like a plausible explanation for gradual degradation over time.


phonoplug

Even if the problem is happening several times a day (at which point the system is heading towards being unusable), what's the chance that reading the EEPROM once - or even several times - in a programmer will produce the bad result, when probably less than 1 read in a billion in the machine gives a bad read? I say 1 in a billion because even assuming the DSP reads just one instruction per microsecond, 1 bad read in a billion means one every 1000 seconds, which is just under 17 minutes. And bearing in mind that the issue happens randomly in a run, not at exactly the same place each time, it's a reasonable assumption that it's not related to a particular physical position of the head, which could just possibly relate to a specific part of the code in the EPROM. Besides which, I suspect the DSP runs quite a small program just crunching numbers at a fast rate, and is probably not aware of the bigger picture of what the machine is doing.
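The back-of-envelope arithmetic above can be checked with a few lines of Python. This is only a sketch: the one-read-per-microsecond rate and the 1-in-a-billion figure come from the post, while the 32 KiB device size, the ten programmer passes and the function names are made-up values for illustration, not anything known about the actual DSP card.

```python
import math


def seconds_between_bad_reads(reads_per_second: float, bad_read_probability: float) -> float:
    """Average time between bad reads for a given per-read failure probability."""
    return 1.0 / (reads_per_second * bad_read_probability)


def chance_of_catching_fault(reads: int, bad_read_probability: float) -> float:
    """Probability that at least one of `reads` independent reads comes back bad."""
    # 1 - (1 - p)^n, written with log1p/expm1 so it stays accurate for tiny p.
    return -math.expm1(reads * math.log1p(-bad_read_probability))


# Figures from the post: one read per microsecond, one bad read in a billion.
interval = seconds_between_bad_reads(1e6, 1e-9)
print(f"One bad read every {interval:.0f} s ({interval / 60:.1f} minutes)")

# ASSUMPTION: a 32 KiB device read byte by byte, ten full passes in a programmer.
programmer_reads = 32 * 1024 * 10
print(f"Chance of the programmer seeing a bad read: "
      f"{chance_of_catching_fault(programmer_reads, 1e-9):.2e}")
```

Under these assumptions a plain read-back in a programmer is very unlikely to show anything, which is the point being made: a fault that rare needs a lot of reads, or a more stressful read condition, to show up.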

I'm somewhat dubious of the design of the DSP board PCB anyway. There are several aspects that could give rise to problems over time.

Gopher

Interesting. I never came to a fully satisfying conclusion as to what was causing our head crashes about 2 years ago. At the time there was no suggestion of watching the LEDs, and I failed to notice there were any. In the end we replaced the ARM board and the ISA card (but not the little cards on it) and for the most part the problem went away. However, every now and then it does happen again. After the last cluster of crashes I changed the z-axis speed to slow for everything, suspecting the rack and pinion, and coincidentally or not the problem went away again. I don't think I've ever seen problems loading RVPlace, however, or at least no more crashing than is normal on Win98. It would seem the symptom can be caused by any number of sensors, cables or mechanical wear that make the machine lose track of the head, so anything that helps find the right one is good.
Lucky for it, ours retires in 4-5 weeks. We've yet to decide exactly what its retirement should involve: eBay, sledgehammer, dusty corner....

Mike

Quote from: phonoplug on October 09, 2012, 10:08:42 AM
Even if the problem is happening several times a day (at which point the system is heading towards being unusable), what's the chance that reading the EEPROM once - or even several times - in a programmer will produce the bad result, when probably less than 1 read in a billion in the machine gives a bad read? I say 1 in a billion because even assuming the DSP reads just one instruction per microsecond, 1 bad read in a billion means one every 1000 seconds, which is just under 17 minutes. And bearing in mind that the issue happens randomly in a run, not at exactly the same place each time, it's a reasonable assumption that it's not related to a particular physical position of the head, which could just possibly relate to a specific part of the code in the EPROM. Besides which, I suspect the DSP runs quite a small program just crunching numbers at a fast rate, and is probably not aware of the bigger picture of what the machine is doing.
I should have added: verify over the Vcc limits, which a production programmer should do. If the programmed cell voltage is leaking away, it will start to fail at higher Vcc levels. In the system, the effective symptom is that the device becomes more sensitive to noise on Vcc, becoming increasingly flaky until it finally gives up. Obviously it takes the coincidence of noise occurring at the moment an instruction or data read hits a bad cell.

phonoplug

OK, so I can confirm this was caused by the DSP card. I was running a job the other day, having now got a new DSP card built up and to hand. On the second panel the machine lost its position twice, causing a small head crash. I then replaced the DSP card for the Y axis (the only one that was still an original), completed that panel and placed a further 9 without it happening again. These panels take about half an hour each, so I think that's a pretty fair conclusion.
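For anyone wanting to put a rough number on "pretty fair conclusion", here is a minimal sketch that treats the faults as a Poisson process. That is an assumption, as are the fault rates tried below: the only data point in the thread is two faults on one panel, so the rate of 2 per panel is a guess and the lower rates are included to show how much the answer depends on it. The function name is hypothetical.

```python
import math


def chance_of_clean_run(faults_per_panel: float, panels: float) -> float:
    """Poisson probability of seeing zero faults over `panels` panels
    if the underlying fault rate had NOT changed."""
    return math.exp(-faults_per_panel * panels)


# 9 clean panels after the swap; try a few assumed pre-swap fault rates.
for rate in (2.0, 0.5, 0.2):
    p = chance_of_clean_run(rate, 9)
    print(f"{rate} faults/panel -> chance of 9 clean panels by luck: {p:.3g}")
```

The smaller the result, the less likely it is that the clean run was luck rather than the new card. At the higher assumed rates the chance is tiny; if the true rate were much lower, a longer clean run would be needed to be equally sure.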

nxm

I've a possible alternative answer to this kind of problem, as it happened to my machine. Though I realise you've solved it, this might help someone else.
The driver box died - can't remember exactly what happened to it - and was replaced by a newer reconditioned one. After that, the head would occasionally lose position and would need to be reset. It happened once or twice per panel.
I had a look in the driver box and saw that the power supply was indeed a new one, but it really didn't look beefy enough to provide the surge current you need to start a large stepper motor, i.e. the ones that drive the pickup head. What was happening was that the PSU ran out of juice when starting the steppers, and they'd miss steps, which caused larger and larger errors the longer the machine was running.
I replaced the PSU with a bigger one, and gave it a stuffing big audio electrolytic capacitor as well, on the basis that if the supply couldn't make up for a sudden load, the capacitor would.
It worked a treat.
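As a rough guide to how big "stuffing big" needs to be, the usual reservoir estimate is C = I x t / dV: the capacitance needed to supply a surge current I for a time t while the rail droops by no more than dV, assuming worst case that the capacitor alone carries the surge. The sketch below just evaluates that formula. The 5 A, 10 ms and 2 V figures are purely illustrative and are not measurements from any machine in this thread; the function name is hypothetical.

```python
def reservoir_capacitance_farads(surge_current_a: float,
                                 surge_duration_s: float,
                                 allowed_droop_v: float) -> float:
    """Capacitance needed to hold the rail within `allowed_droop_v`
    while supplying `surge_current_a` for `surge_duration_s` (C = I * t / dV)."""
    if allowed_droop_v <= 0:
        raise ValueError("allowed droop must be positive")
    return surge_current_a * surge_duration_s / allowed_droop_v


# ASSUMPTION: illustrative numbers only (5 A surge, 10 ms, 2 V droop).
c = reservoir_capacitance_farads(5.0, 0.010, 2.0)
print(f"Needed: about {c * 1e6:.0f} uF")
```

Measure the actual start-up current and decide what droop the drivers will tolerate before picking a part, and check the capacitor's voltage and ripple-current ratings as well as its value.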

phonoplug

Interesting... Thanks for that. That's about the only part of the system I haven't taken apart yet. Will investigate.

Mike

For older systems, ageing PSU electrolytics could be worth checking as well.