Mini2440 Freeze

bluscape
I've got three Mini2440s, all running the same software and Qt application,
but one of them freezes. It does not generate a segmentation fault, so
somewhere the Mini is getting stuck and can't get out of it. I do have the
watchdog enabled to help recover from this problem, but I want to resolve
this without relying on the watchdog to "fix" the problem.

I've added debug values all over the code and output these values on the
UART to see where it freezes in the software but the pattern is not
consistent.

I'm starting to think it could be hardware related because two boards are
running fine and the other one doesn't.

I'm using barebox with PTXdist.
Barebox and the kernel are executed from the NAND and the Qt application is
executed from the SD card.

1.) Do you guys have any idea how I can go about resolving this problem?
2.) Could the FLASH have bad sectors? How can I check for this in barebox?
3.) Could the RAM have bad sectors? How can I check for this in barebox?

Any ideas or help will be appreciated.

Thanks

bluscape
Nobody got any ideas?

davef
Can you run any programs on the bad mini2440?

2. As far as I am aware Barebox does a proper BBT setup, so this shouldn't
be necessary. 

Try running the kernel from SD as well?

3. There must be some way for the machine to know, whenever RAM is
accessed, whether it is working or not.

http://computer.howstuffworks.com/ram1.htm

Other functions of the memory controller include a series of tasks that
include identifying the type, speed and amount of memory and checking for
errors.

bluscape
Yeah, everything works fine, except that it freezes after a while. The time
until the freeze varies from right after startup to a few hours, and
sometimes it can even take several days before it freezes.

davef
Is the ethernet involved here?

Juergen Beisert
Can you check whether the amount of SDRAM the kernel "believes" it has
matches the size the system physically provides? Maybe you have a 64 MiB
system and the kernel tries to use 128 MiB. Then it would crash whenever
the memory fills up after a while and the kernel has to use the
non-existent SDRAM above 64 MiB.
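
A minimal sketch of that check, assuming a Python interpreter is available
on the board (otherwise the same parsing can be run on a host against a
captured copy of /proc/meminfo). The function names and the 64 MiB figure
are only illustrative; use the SDRAM size actually fitted on the board.

```python
import re


def mem_total_kib(meminfo_text):
    """Return the MemTotal value (in KiB) from /proc/meminfo-style text."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    return int(match.group(1))


def kernel_overestimates_ram(meminfo_text, physical_mib):
    """True if the kernel reports more RAM than is physically fitted."""
    # MemTotal is normally a little below the physical size, because the
    # kernel image and reserved areas are subtracted. A value above the
    # physical size therefore points to a wrong memory setup.
    return mem_total_kib(meminfo_text) > physical_mib * 1024


def check_board(physical_mib, path="/proc/meminfo"):
    """Read the live meminfo file and run the comparison (hypothetical helper)."""
    with open(path) as f:
        return kernel_overestimates_ram(f.read(), physical_mib)
```

On a 64 MiB board, check_board(64) returning True would suggest that the
kernel was told about more memory than exists, for example through a wrong
mem= argument on the kernel command line.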

davef
Or run the free command every once in a while to see if free memory is
being slowly eaten up.
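
If watching free by hand is impractical, the sampling could be scripted. A
rough sketch, again assuming Python is available on the board; the function
names, sample count and interval are made up for illustration. Note that
MemFree also shrinks as the page cache fills, so a falling value on its own
does not prove a leak.

```python
import re
import time


def mem_free_kib(meminfo_text):
    """Return the MemFree value (in KiB) from /proc/meminfo-style text."""
    match = re.search(r"^MemFree:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemFree line not found")
    return int(match.group(1))


def is_steadily_shrinking(samples, min_drop_kib):
    """True if free memory never rises and drops by at least min_drop_kib."""
    if len(samples) < 2:
        return False
    never_rises = all(b <= a for a, b in zip(samples, samples[1:]))
    return never_rises and (samples[0] - samples[-1]) >= min_drop_kib


def log_free_memory(count, interval_s, path="/proc/meminfo"):
    """Print and collect `count` MemFree samples, `interval_s` seconds apart."""
    samples = []
    for _ in range(count):
        with open(path) as f:
            samples.append(mem_free_kib(f.read()))
        # flush so the line reaches a serial console before a possible freeze
        print(time.strftime("%H:%M:%S"), samples[-1], "KiB free", flush=True)
        time.sleep(interval_s)
    return samples
```

Printing each sample to the serial console means the last value before a
freeze is still visible on the terminal afterwards.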

TheRegnirps
Sure sounds like hardware. Can you swap SD cards without the problem
following the SD? Are the boards mounted? Is it possible that there is some
stress or torque on the bad board so that with heating and expansion, a
cold joint or cracked trace produces a fault? Are you providing power with
the barrel jack or the 4 wire 2mm connector?

Ceiliedgh
Also, if you have a JTAG dongle you could take a memory snapshot when the
kernel is frozen and try to analyze that (although that's certainly not an
easy task).

LINUXX
Hi,
Has anyone already solved this freeze problem?

davef
There are probably many reasons for this to happen.  How often does it
happen?  How much free memory do you have?  Does syslog say anything?  Can
you attach a remote terminal and hope something gets flashed up on the
screen?

USB involved? I had a problem with a mains power relay corrupting a USB
hub.  Sure you are not losing power?