Mini210 boot from NAND

Dave Festing
I have read through a few threads that seem to indicate that there are
problems booting from NAND on the mini210.

Could some kind person summarise the issue as I am trying to make a
decision whether or not to buy a large quantity of these devices.

Thank you,
Dave

Reggie
Hi Dave, I can throw some light on this for you.

There was a perceived issue that showed up as uncorrectable ECC NAND
errors. These were mostly caused by mismatched versions of bootloader and
kernel: slight differences in the way they handle NAND caused the kernel
driver to think good NAND was bad. This is easily remedied by making sure
that the kernel, kernel sources and superboot all come from the same
software/source code DVD, then reburning with the lowformat= yes option in
the friendlyarm.ini file.

That is the only major issue with the nand that I have seen, which turns
out to be a non-issue.

I have seen people mention other uncorrectable ECC NAND errors, but I
believe those are either existing bad blocks or normal wear and tear, and
have to do with how the yaffs2 fs deals with the NAND. When it mounts the
partition it rebuilds the filesystem: yaffs2 scans the whole mtd
partition in 1MB blocks, then scans backwards looking for data,
essentially doing a ton of 8k page reads too, so it will mark out all of
the bad blocks the first time you boot after a flash.
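
To give a feel for how much reading that scan involves, here is a rough
back-of-the-envelope sketch in Python. The 1MB block and 8k page sizes are
the ones quoted above; the 1GB partition size passed in at the end is only
an assumed example, not a figure for any particular board.

```python
BLOCK_SIZE = 1024 * 1024  # 1MB erase block, as quoted in the post
PAGE_SIZE = 8 * 1024      # 8k page, as quoted in the post


def scan_geometry(partition_bytes):
    """Return (blocks, pages_per_block, total_pages) for a partition."""
    blocks = partition_bytes // BLOCK_SIZE
    pages_per_block = BLOCK_SIZE // PAGE_SIZE
    return blocks, pages_per_block, blocks * pages_per_block


# Assumed example: a 1GB partition (hypothetical size).
print(scan_geometry(1024 ** 3))  # (1024, 128, 131072)
```

So a full scan of a partition that size would touch 1024 blocks of 128
pages each, which is why a checkpoint that avoids the scan is worth having.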

It does all of this in about 10 seconds. You can speed this up by creating
a checkpoint: yaffs2 is set up to create a checkpoint when you unmount, or
you can force checkpoint creation by running the 'sync' command at the
command line or from a script.
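
If the script happens to be Python rather than shell, a minimal sketch of
the same thing is below. os.sync() is the standard library wrapper around
the sync system call (Unix only); that this triggers a yaffs2 checkpoint is
taken from the description above, not something the snippet verifies. The
function name force_sync is just an illustrative choice.

```python
import os


def force_sync():
    """Ask the kernel to flush all filesystem buffers (same as 'sync')."""
    os.sync()


force_sync()
```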

I have been testing the NAND on and off over the last few weeks and have
really been putting it through its paces. Any errors I've seen on the
NAND have been caused by me and were easily remedied.

Dave Festing
Reggie,

Thank you very much for the feedback.  I will now remove the highlighted
red areas from my purchase spreadsheet!


Cheers,
Dave

Reggie
you're welcome :)

Juergen Beisert
This SoC comes - like the S3C6410 SoC - with more than one type of error
correction unit. One is for the older SLC NAND (Single Level Cell): it
generates a short ECC and is only able to correct one-bit errors and to
detect two-bit errors.
Modern MLC NANDs (Multi Level Cell) need much stronger protection. If one
cell fails, more than one bit is broken at the same time. So this SoC comes
with a second correction unit with a stronger correction algorithm (Reed
Solomon).
And every piece of software (ROM code, bootloader and the running OS) must
be consistent, using the same correction unit *and* the same checksum layout
*and* the same bad block handling. Can you feel the pain?
For the Reed Solomon correction unit there is still no support in the
mainline Linux kernel. It might be dangerous to rely on vendor patches,
because you are then stuck with their kernel.
I'm still fighting with MLC-NAND support for my Tiny6410. The ROM code of
this CPU destroys the factory bad block markers! m( And both devices (SoC
and NAND device) are from the same vendor...
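
To illustrate what "correct one-bit errors and detect two-bit errors" means
for the SLC-style unit, here is a toy sketch in Python using an extended
Hamming(8,4) code. This is only an assumed stand-in to show the principle:
the real hardware computes its ECC over whole sectors with its own layout,
and the function names here are made up for the example.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming codeword (bit list)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8  # index 0: overall parity, 1..7: Hamming(7,4) positions
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]  # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2            # makes the total parity even
    return code


def decode(code):
    """Return (status, nibble); nibble is None if the error is uncorrectable."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                # XOR of the set bit positions
    parity_broken = sum(code) % 2 == 1
    if syndrome == 0 and not parity_broken:
        status = "ok"
    elif parity_broken:
        code[syndrome] ^= 1                # one flip: syndrome is its index
        status = "corrected"
    else:
        return "uncorrectable", None       # two flips: detect only
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


word = encode(0b1011)
one_flip = list(word)
one_flip[5] ^= 1
two_flips = list(word)
two_flips[2] ^= 1
two_flips[6] ^= 1
print(decode(word), decode(one_flip), decode(two_flips))
```

A single flipped bit is repaired, while two flipped bits can only be
reported. That is the limit that makes this kind of code too weak for MLC,
where one failing cell can corrupt more than one bit.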

TheRegnirps
ARMWorks with FriendlyARM is changing to 1G SLC only for Mini210s (or 256M
if an OEM wants it). This raises the pain threshold to reasonable levels
:-)

Reggie
The mini210S has got 1-, 4-, 8-, 12- and 16-bit ECC; they're using 16-bit
correction on the 210S boards, at least on the 4GB versions.

In general, any NAND driver has the potential to destroy the factory bad
block markers. Looking at the various versions of s3c_nand.c around the
interwebs, it would appear that Samsung doesn't use BBT handling at all, so
it doesn't really care about the factory bad blocks; it will just re-mark
them as bad when it tries to write something to them the first time and
fails. It's potentially annoying if you've got bad blocks but certainly
not fatal, especially given that the yaffs2 fs does this on first boot
after a burn.
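
The "re-mark on first failed write" behaviour can be pictured with a toy
model like the one below. It is a Python simulation only, not code from
s3c_nand.c or yaffs2; ToyNand and write_skipping_bad are hypothetical names
made up for the sketch.

```python
class ToyNand:
    """Simulated NAND whose bad block markers have been lost."""

    def __init__(self, num_blocks, failing_blocks):
        self.num_blocks = num_blocks
        self.failing = set(failing_blocks)  # physically defective blocks
        self.marked_bad = set()             # what the driver knows so far
        self.data = {}

    def write_block(self, block, payload):
        """Return True on success, False if the block is defective."""
        if block in self.failing:
            return False
        self.data[block] = payload
        return True


def write_skipping_bad(nand, start, payload):
    """Write to the first usable block at or after start, marking failures."""
    for block in range(start, nand.num_blocks):
        if block in nand.marked_bad:
            continue                        # already known bad, skip it
        if nand.write_block(block, payload):
            return block
        nand.marked_bad.add(block)          # re-mark after the failed write
    raise IOError("no good block left")


nand = ToyNand(num_blocks=8, failing_blocks={2, 3})
print(write_skipping_bad(nand, 2, b"data"), sorted(nand.marked_bad))
```

The defective blocks end up marked again after one failed write each, which
is the annoying-but-not-fatal outcome described above.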