• Problem With Old Zyxel NSA 221 NASs & Seagate HDs

    From Java Jive@2:250/1 to All on Monday, May 05, 2025 10:24:09
    Having successfully upgraded my two primary QNAP 251+ NASs, I've handed
    down two of the HDs to the Zyxel NSA221s NASs that were my original
    backup solution. The disks are Seagate Iron Wolf 8TB ST8000VN004,
    originally from Amazon ...

    https://www.amazon.co.uk/dp/B07SZVVBBK

    ... but they are giving problems in their new home, or, I should say, housing.

    The problem is the same for each: they spin up too slowly to be found as
    the NSA221 boots, so after a cold boot I have to reboot, so that
    they can be found the second time. Once that is done, they seem to be
    perfectly satisfactory, and, despite the NAS specs saying that they only
    work up to a max of 4TB per disk, I'm actually getting the full
    (nominal) 8TB added to the capacity of the other Toshiba HDs which have
    always been free of such problems.

    I've had similar problems in the past with other Seagate HDs in these
    NASs, which at the time I got around by using a reboot flag in a rather
    convoluted manner. That won't work now, because I have since configured
    the NASs to use the combined disk space of the two disks as one large
    virtual HD volume, which means that I have nowhere to write a reboot
    flag unless both HDs are running and found at boot, in which case I
    wouldn't need to write it anyway. I might be able to get round this by
    using RAM, but it would need some investigation as to what would
    survive a reboot.

    This reboot requirement will be easy to forget and should be avoidable
    by making the system wait longer for the disks to spin up. Setting
    aside the problem of altering firmware (see next para), in a normal
    Linux installation, how normally would one accomplish this? A boot
    parameter? An /etc setting?

    I have some scope for making changes, as in the past I've recompiled the
    firmware to be GNU GPL and both NASs are running the result. Also the
    command run by UBoot is stored in an environment variable, which means
    that I could add a boot parameter to that fairly easily.

    The Zyxel DK to build a GPL firmware dates from the Ubuntu 7 era (!),
    but I was still running it satisfactorily somewhere around Ubuntu 16 or
    18. Also, the NASs each have a serial header which I can use to
    interrupt their boot and change things, though I wouldn't want to be
    doing that on a permanent basis, only for temporary fixes to see if they
    work. There are also OpenWRT versions of the firmware, but having
    obtained already a pretty good result from my own firmware, I haven't
    gone down that road, as it would be too time consuming for such old
    hardware.

    Can anyone suggest how to fix this slow spin-up problem?

    --

    Fake news kills!

    I may be contacted via the contact address given on my website: www.macfh.co.uk

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Jeff Gaines@2:250/1 to All on Monday, May 05, 2025 10:53:06
    On 05/05/2025 in message <vva03q$5ulg$1@dont-email.me> Java Jive wrote:

    Can anyone suggest how to fix this slow spin-up problem?

    Seriously?

    How much is your data worth, more or less than £163.94?

    --
    Jeff Gaines Dorset UK
    Did you know on the Canary Islands there is not one canary?
    And on the Virgin Islands same thing, not one canary.

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: Air Applewood, The Linux Gateway to the UK & Eire (2:250/1@fidonet)
  • From Andy Burns@2:250/1 to All on Monday, May 05, 2025 11:06:25
    Java Jive wrote:

    Seagate Iron Wolf 8TB ST8000VN004
    Can anyone suggest how to fix this slow spin-up problem?
    I see several complaints about that drive being slow to spin-up, the
    manual says 23s (typ) to 30s (max), so I suspect there's nothing you can
    do to the drives themselves.

    in your grub.cfg maybe experiment with boot_delay=xxx values in ms to
    delay all the kernel messages?

  • From Theo@2:250/1 to All on Monday, May 05, 2025 15:13:25
    In uk.comp.os.linux Andy Burns <usenet@andyburns.uk> wrote:
    Java Jive wrote:

    Seagate Iron Wolf 8TB ST8000VN004
    Can anyone suggest how to fix this slow spin-up problem?
    I see several complaints about that drive being slow to spin-up, the
    manual says 23s (typ) to 30s (max), so I suspect there's nothing you can
    do to the drives themselves.

    in your grub.cfg maybe experiment with boot_delay=xxx values in ms to
    delay all the kernel messages?

    boot_delay appears to slow down kernel printouts, which is likely to be a
    bit dependent on how many there are. Instead I'd use rootdelay=30 to delay
    the kernel start by 30 seconds, and then it'll boot at full speed.

    (It'll be in the u-boot command line variable not grub.cfg, since this is a non-x86 device)

    In regular Linux you could write a udev/systemd rule to mount the drive(s)
    when it/they became ready, whenever that might be. This system is too old
    for systemd though.

    Theo

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: University of Cambridge, England (2:250/1@fidonet)
  • From Paul@2:250/1 to All on Monday, May 05, 2025 15:19:33
    On Mon, 5/5/2025 5:24 AM, Java Jive wrote:
    Can anyone suggest how to fix this slow spin-up problem?


    I don't really see a "fix" for this.

    The disk drive has a motor controller. It is a three phase waveform
    generator in a sense. It is somehow programmed for an acceleration profile,
    and that profile leads to a "constrained current draw". When the spindle
    is seized, there is a modulation pattern applied to the waveform generator giving a characteristic sound, but this is mostly useless and too little
    torque is generated to overcome a dry bearing or head stiction.

    For such a slow startup, it could be drawing 12V @ 1A into the motor
    controller chip. The frequency of the waveform generator ramps up, and
    the motor accelerates. Current is drawn to make the motor accelerate
    like that. It looks like the current draw is still open loop (the tops
    of the current draw excursions are not lopped off), so the current
    draw target is mostly maintained by the firmware watching for overload.

    Three phase is used, for controlling torque ripple. The signal coming
    off the head, is supposed to be at a fixed frequency for a zone. The
    disk has zones, and the drive has to compensate for where it is reading
    data. But while it is reading data, they don't want the clock rate
    to vary. And the motor controller attempts to regulate the speed and
    reduce the variation in speed. The servo wedges presumably have a strong
    clock signal, and afford an opportunity to keep the PLL locked. The platter alternates between servo wedges and data sectors, as the heads stay on
    a cylinder.

    1TB drives can accelerate to 7200 RPM in five seconds. Hitachi-HGST-IBM
    drives, are pokey, and can take more than 20 seconds for spinup.
    (No other brands are supposed to be quite as pokey as some of those!
    WD owns the remains of that stream of ownership and owns HGST.)

    The five second drive, likely draws 12V @ 2A for five seconds. The
    pokey drive draws 12V @ 1A for 20 seconds. The assumption is that
    the slow drives are not "boot drives", whereas the fast drives gate
    forward progress, so the power footprint is allowed to be larger.
    At one time, the whole disk drive power envelope was 40W total,
    and drives required laminar airflow over the top to keep the
    temperature down. Modern drives can draw as little as 5 watts at idle
    (not as much current needed to maintain constant platter speed).

    You're not going to change the disk drive firmware. The technical details
    of that are unknown. Even the command line language for the disk
    interpreter is cryptic and mostly unknown by laymen. Some of the language
    was exposed, in a Seagate data recovery procedure for a broken firmware.

    *******

    The PC compatible BIOS, has a timer setting, and the max time is 35 seconds. This is the time constant the NAS should be using at its "BIOS level".
    That is, if it had a BIOS level.

    A drive will NOT respond to an ID command, until up to speed. The re-enforcement for this is simple. The drive won't do anything until
    the motor controller chip says "we are now at 7200RPM". Then the heads
    load onto the platter edge, using the bootstrap HDD controller ROM.
    Then around 2 megabytes of additional firmware is read off the
    platter, as the first read operation. This is loaded into
    controller RAM. The 2 megabytes would contain items such as an
    ATA command parser. Now, when the "ID yourself" command comes in,
    the drive is willing to answer. "Seagate Pokey Drive ABC12345".

    Normally, the PC BIOS waits the 35 seconds, using repeated ID yourself commands, to eventually get an answer.

    *******

    The system response then, to a rotating drive, whether done by
    NAS BIOS or NAS OS, is to wait up to 35 seconds for ID to complete.
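
    The wait-and-retry behaviour described above can be sketched in userspace as a poll with a timeout. This is only an illustration of the pattern, not a fix for this box: it assumes a POSIX shell, the function name wait_for_path is hypothetical, and polling for /dev/sdb does not by itself make the kernel probe the port again.

```shell
#!/bin/sh
# wait_for_path PATH TIMEOUT_SECONDS
# Poll once a second until PATH exists or the timeout expires.
# Returns 0 if PATH appeared, 1 if the timeout ran out.
wait_for_path() {
    path=$1
    remaining=$2
    while [ ! -e "$path" ]; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Hypothetical usage: allow the second disk up to 35 seconds to appear.
# wait_for_path /dev/sdb 35 && echo "sdb is ready"
```

    The 35 second figure is the BIOS timeout quoted above; any value at or beyond the drive's rated spin-up time would serve the same purpose.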

    *******

    The SATA bus has no RESET signal. (The IDE ribbon cable interface
    did have a hardware RESET, which is handy for getting insane
    controller processors to restart. The only way to fix insanity
    on SATA drives, is to cycle the power!)

    This is why your reboot strategy works. When the NAS is told to
    reboot, the drive is up to speed, and the drive is in a sense,
    unaware of the system state. It continues spinning at 7200 RPM,
    the command parser is loaded, the very first "ID yourself" that
    comes in during boot, will be answered. That is why your current
    double-boot method is working.

    Paul

  • From Andy Burns@2:250/1 to All on Monday, May 05, 2025 15:37:51
    Theo wrote:

    boot_delay appears to slow down kernel printouts, which is likely to be a
    bit dependent on how many there are.

    hence "experiment"

    Instead I'd use rootdelay=30 to delay
    the kernel start by 30 seconds, and then it'll boot at full speed.

    I read that as waiting xx seconds before mounting the root fs, but if it
    can't see any disks, will it wait, or bail-out?



  • From Theo@2:250/1 to All on Monday, May 05, 2025 20:02:44
    In uk.comp.os.linux Andy Burns <usenet@andyburns.uk> wrote:
    Theo wrote:

    boot_delay appears to slow down kernel printouts, which is likely to be a bit dependent on how many there are.

    hence "experiment"

    Instead I'd use rootdelay=30 to delay
    the kernel start by 30 seconds, and then it'll boot at full speed.

    I read that as waiting xx seconds before mounting the root fs, but if it can't see any disks, will it wait, or bail-out?

    It'll pause the boot for 30 seconds at the point the root fs is mounted.
    The rootfs is not on HDD, it'll be in flash, so the boot will then proceed
    once the 30s is up - it won't then fail for lack of a rootfs. If the discs still aren't up at the end of 30s (or 120s or whatever number you write
    there) then they will be missing, just as they are at the moment. But if
    they are not reliable enough to start given a large enough timeout then that points to a problem with the discs, rather than just a regular but slow
    spinup.

    (if you did have your rootfs on the HDD you could use 'rootwait' to pause
    until the rootfs volume was ready. But that only applies for the rootfs and not other volumes)

    Theo

  • From Java Jive@2:250/1 to All on Tuesday, May 06, 2025 12:19:15
    Apologies for a slight delay in replying, thanks for all the replies ...

    On 2025-05-05 20:02, Theo wrote:

    In uk.comp.os.linux Andy Burns <usenet@andyburns.uk> wrote:

    Theo wrote:

    Instead I'd use rootdelay=30 to delay
    the kernel start by 30 seconds, and then it'll boot at full speed.

    I read that as waiting xx seconds before mounting the root fs, but if it
    can't see any disks, will it wait, or bail-out?

    It'll pause the boot for 30 seconds at the point the root fs is mounted.
    The rootfs is not on HDD, it'll be in flash, so the boot will then proceed once the 30s is up - it won't then fail for lack of a rootfs. If the discs still aren't up at the end of 30s (or 120s or whatever number you write there) then they will be missing, just as they are at the moment. But if they are not reliable enough to start given a large enough timeout then that points to a problem with the discs, rather than just a regular but slow spinup.

    Don't think that's the problem here, rather Seagate generally just seem
    to be too slow in spinning up for this box's liking.

    (if you did have your rootfs on the HDD you could use 'rootwait' to pause until the rootfs volume was ready. But that only applies for the rootfs and not other volumes)

    I spent yesterday evening trying to remember the name of the program
    that reads and sets UBoot environment variables, and finally found it
    this morning. Here are the current settings:

    ~ # /zyxel/sbin/fw_printenv
    bootcmd=cp 0x411e0000 0x4a000000 0x25c000; bootm 0x41020000
    baudrate=115200
    autoload=n
    netmask=255.255.255.0
    bootfile="uImage"
    MODEL_ID=DC01
    PRODUCT_NAME=NSA-221
    FEATURE_BIT=00
    CONTRY_TYPE=FF
    VENDOR_NAME=ZyXEL Communications Corp.
    ethaddr=50:67:F0:93:49:90
    ipaddr=192.168.1.233
    tftpblocksize=512
    tftprun=cp 0x411e0000 0x4a000000 0x25c000; tftpboot 0x57000000 uImage; bootm 0x57000000
    stdin=serial
    stdout=serial
    stderr=serial
    bootargs=console=ttyS0,115200n8 root=/dev/ram0 rw init=/sbin/init initrd=0x4a000000,4M elevator=cfq mtdparts=physmap-flash.0:128k(uboot),1792k(kernel),1664k(initrd),448k(etc),48k(empty),8k(env1),8k(env2) mem=256M poweroutage=yes
    serverip=[anonymised]

    There is a corresponding /zyxel/sbin/fw_setenv which doesn't have a
    --help parameter, but which nevertheless seems understandable enough:

    ~ # /zyxel/sbin/fw_setenv test "This is a test"
    ~ # /zyxel/sbin/fw_printenv test
    test=This is a test
    ~ # /zyxel/sbin/fw_setenv test
    ~ # /zyxel/sbin/fw_printenv test
    ## Error: "test" not defined

    So I guessed that I needed to try adding 'rootdelay=30' to the bootargs setting ...

    ~ # /zyxel/sbin/fw_setenv bootargs "console=ttyS0,115200n8 root=/dev/ram0 rootdelay=30 rw init=/sbin/init initrd=0x4a000000,4M elevator=cfq mtdparts=physmap-flash.0:128k(uboot),1792k(kernel),1664k(initrd),448k(etc),48k(empty),8k(env1),8k(env2) mem=256M poweroutage=yes"
    ~ # /zyxel/sbin/fw_printenv bootargs
    bootargs=console=ttyS0,115200n8 root=/dev/ram0 rootdelay=30 rw init=/sbin/init initrd=0x4a000000,4M elevator=cfq mtdparts=physmap-flash.0:128k(uboot),1792k(kernel),1664k(initrd),448k(etc),48k(empty),8k(env1),8k(env2) mem=256M poweroutage=yes

    ... but, from the sound, I was suspicious that the drives are only
    powered up when the kernel loads the driver, and then the driver almost
    immediately expects them to be present, which would mean that this ploy
    wouldn't work. However, I was game to try it, as I didn't think it
    could do any harm, but, as I feared, it didn't work. The delay comes
    long after this point, and too late to affect things. Further
    information about the actual point of failure is in the dmesg logs below.

    Thanks for trying anyway.

    What I think is needed is some way of telling the driver to wait longer
    for the drives after spinning them up, or a way of spinning up the
    drives on power on.
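
    The bootargs edit above meant retyping the whole line by hand. A minimal sketch of a helper that adds one parameter to an existing bootargs string, assuming a POSIX shell on the NAS; add_bootarg is a hypothetical name and not part of the Zyxel firmware:

```shell
#!/bin/sh
# add_bootarg "CURRENT_BOOTARGS" "name=value"
# Print the bootargs string with the parameter appended, dropping any
# existing setting of the same name so it never appears twice.
# The body runs in a subshell so that "set -f" (no filename globbing
# while the string is split into words) does not leak to the caller.
add_bootarg() (
    set -f
    current=$1
    param=$2
    name=${param%%=*}
    result=
    for word in $current; do
        case $word in
            "$name"|"$name"=*) ;;                      # drop the old setting
            *) result="$result${result:+ }$word" ;;    # keep everything else
        esac
    done
    printf '%s\n' "$result${result:+ }$param"
)

# Hypothetical usage on the NAS (sed strips the leading "bootargs="):
# old=$(/zyxel/sbin/fw_printenv bootargs | sed 's/^bootargs=//')
# /zyxel/sbin/fw_setenv bootargs "$(add_bootarg "$old" rootdelay=30)"
```

    The new parameter lands at the end of the line rather than next to root=, which should not matter since the kernel does not care about the order of these settings.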

    Appendix:
    ---------

    Two things that I meant to add to my OP yesterday, but forgot, I add now below, in case they prove to be important ...

    1) The Zyxel NSA221 NAS is based on an Oxford Semiconductor NAS
    controller board design, commonly known as Oxnas. There is some
    historical information about the box on the Wayback Machine, for example:

    <https://web.archive.org/web/20150422213204/http://zyxel.nas-central.org/wiki/Category:NSA-221>

    2) In case anybody spots anything important that I'd missed, these are
    the relevant parts of a serial log from two successive boots, first from
    cold where it fails to find the second drive, then from a successful
    reboot when the drive is already spinning:

    ox810sata: OX810 sata core.
    scsi0 : oxnassata
    ata1: SATA max UDMA/133 irq 18
    ata_eh_reset(2207):Sleep 1 sec before any error happens
    ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    ata_dev_read_id(2037): Give 100ms while getting HW ID
    ata1.00: qc timeout (cmd 0xec)
    ox810sata_bmdma_stop - aborting DMA
    ox810sata aborting DMA.
    ox810sata sending sync escapes
    ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
    ata1: failed to recover some devices, retrying in 1 secs
    ata_eh_reset(2207):Sleep 1 sec before any error happens
    ata_wait_after_reset(3500):msleep(6000);
    ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    ata_dev_read_id(2037): Give 100ms while getting HW ID
    ata1.00: qc timeout (cmd 0xec)
    ox810sata_bmdma_stop - aborting DMA
    ox810sata aborting DMA.
    ox810sata sending sync escapes
    ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
    ata1: failed to recover some devices, retrying in 1 secs
    ata_eh_reset(2207):Sleep 1 sec before any error happens
    ata_wait_after_reset(3500):msleep(6000);
    ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    ata_dev_read_id(2037): Give 100ms while getting HW ID
    ata1.00: HPA detected: current 7814037168, native 18446744072933654192
    ata1.00: ATA-8: TOSHIBA HDWE140, FP2A, max UDMA/100
    ata1.00: 7814037168 sectors, multi 16: LBA48 NCQ (depth 0/32)
    ata1.00: Drive reports diagnostics failure. This may indicate a drive
    ata1.00: fault or invalid emulation. Contact drive vendor for information.
    ata_dev_read_id(2037): Give 100ms while getting HW ID
    ata1.00: configured for UDMA/100
    scsi 0:0:0:0: Direct-Access ATA TOSHIBA HDWE140 FP2A PQ: 0 ANSI: 5
    sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
    sd 0:0:0:0: [sda] 7814037168 512-byte hardware sectors (4000787 MB)
    sd 0:0:0:0: [sda] Write Protect is off
    sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
    sd 0:0:0:0: [sda] 7814037168 512-byte hardware sectors (4000787 MB)
    sd 0:0:0:0: [sda] Write Protect is off
    sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    sda: sda1 sda2
    sd 0:0:0:0: [sda] Attached SCSI disk
    sd 0:0:0:0: Attached scsi generic sg0 type 0
    ox810sata: OX810 sata core.
    scsi1 : oxnassata
    ata2: SATA max UDMA/133 irq 18
    ata_eh_reset(2207):Sleep 1 sec before any error happens
    ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    ata_dev_read_id(2037): Give 100ms while getting HW ID
    ata2.00: qc timeout (cmd 0xec)
    ox810sata_bmdma_stop - aborting DMA
    ox810sata aborting DMA.
    ox810sata sending sync escapes
    Port 0 High level registers

    [snip long register dump]

    oxnas_dma_dump_registers() - end
    ox810sata core reset
    ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
    ata2: failed to recover some devices, retrying in 1 secs
    ata_eh_reset(2207):Sleep 1 sec before any error happens
    ata_wait_after_reset(3500):msleep(6000);
    ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    ata_dev_read_id(2037): Give 100ms while getting HW ID
    ata2.00: qc timeout (cmd 0xec)
    ox810sata_bmdma_stop - aborting DMA
    ox810sata aborting DMA.
    ox810sata sending sync escapes
    Port 0 High level registers

    [snip long register dump]

    oxnas_dma_dump_registers() - end
    ox810sata core reset
    ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
    ata2: failed to recover some devices, retrying in 1 secs
    ata_eh_reset(2207):Sleep 1 sec before any error happens
    ata_wait_after_reset(3500):msleep(6000);
    ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    ata_dev_read_id(2037): Give 100ms while getting HW ID
    ata2.00: qc timeout (cmd 0xec)
    ox810sata_bmdma_stop - aborting DMA
    ox810sata aborting DMA.
    ox810sata sending sync escapes
    Port 0 High level registers

    [snip long register dump]

    oxnas_dma_dump_registers() - end
    ox810sata core reset
    ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
    ata2: failed to recover some devices, retrying in 1 secs
    ata_eh_reset(2207):Sleep 1 sec before any error happens
    ata_wait_after_reset(3500):msleep(6000);
    ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)


    This is what it looks like on the reboot, when the Seagate disk is
    already spun up:


    ox810sata: OX810 sata core.
    scsi0 : oxnassata
    ata1: SATA max UDMA/133 irq 18
    ata_eh_reset(2207):Sleep 1 sec before any error happens
    Copy button release
    ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    ata_dev_read_id(2037): Give 100ms while getting HW ID
    ata1.00: HPA detected: current 7814037168, native 18446744072933654192
    ata1.00: ATA-8: TOSHIBA HDWE140, FP2A, max UDMA/100
    ata1.00: 7814037168 sectors, multi 16: LBA48 NCQ (depth 0/32)
    ata_dev_read_id(2037): Give 100ms while getting HW ID
    ata1.00: configured for UDMA/100
    ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xb t4
    ata1: soft resetting link
    ata_eh_reset(2207):Sleep 1 sec before any error happens
    ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    ata_dev_read_id(2037): Give 100ms while getting HW ID
    ata_dev_read_id(2037): Give 100ms while getting HW ID
    ata1.00: configured for UDMA/100
    ata1: EH complete
    scsi 0:0:0:0: Direct-Access ATA TOSHIBA HDWE140 FP2A PQ: 0 ANSI: 5
    sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
    sd 0:0:0:0: [sda] 7814037168 512-byte hardware sectors (4000787 MB)
    sd 0:0:0:0: [sda] Write Protect is off
    sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    sd 0:0:0:0: [sda] Very big device. Trying to use READ CAPACITY(16).
    sd 0:0:0:0: [sda] 7814037168 512-byte hardware sectors (4000787 MB)
    sd 0:0:0:0: [sda] Write Protect is off
    sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    sda: sda1 sda2
    sd 0:0:0:0: [sda] Attached SCSI disk
    sd 0:0:0:0: Attached scsi generic sg0 type 0
    ox810sata: OX810 sata core.
    scsi1 : oxnassata
    ata2: SATA max UDMA/133 irq 18
    ata_eh_reset(2207):Sleep 1 sec before any error happens
    ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    ata_dev_read_id(2037): Give 100ms while getting HW ID
    ata2.00: ATA-11: ST8000VN004-2M2101, SC60, max UDMA/133
    ata2.00: 15628053168 sectors, multi 16: LBA48 NCQ (depth 0/32)
    ata_dev_read_id(2037): Give 100ms while getting HW ID
    ata2.00: configured for UDMA/133
    scsi 1:0:0:0: Direct-Access ATA ST8000VN004-2M21 SC60 PQ: 0 ANSI: 5
    sd 1:0:0:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
    sd 1:0:0:0: [sdb] 15628053168 512-byte hardware sectors (8001563 MB)
    sd 1:0:0:0: [sdb] Write Protect is off
    sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    sd 1:0:0:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
    sd 1:0:0:0: [sdb] 15628053168 512-byte hardware sectors (8001563 MB)
    sd 1:0:0:0: [sdb] Write Protect is off
    sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    sdb: sdb1 sdb2
    sd 1:0:0:0: [sdb] Attached SCSI disk
    sd 1:0:0:0: Attached scsi generic sg1 type 0

  • From Andy Burns@2:250/1 to All on Tuesday, May 06, 2025 12:37:04
    Java Jive wrote:

    from the sound, I was suspicious that the drives are only powered up
    when the kernel loads the driver, and then the driver almost immediately expects them to be present, which would mean that this ploy wouldn't
    work.  However, I was game to try it, as I didn't think it could do any harm, but, as I feared, it didn't work.  The delay comes long after this point, and too late to affect things.

    A wild thought (which I can't test as I no longer run openWRT so no
    u-boot device) add the normal kernel as a crash kernel, let the first
    kernel boot and spin the drives up too late ... then find a way to crash
    it, so the crash kernel starts up after the drives are spinning?

  • From Theo@2:250/1 to All on Tuesday, May 06, 2025 15:07:09
    In uk.comp.os.linux Andy Burns <usenet@andyburns.uk> wrote:
    Java Jive wrote:

    from the sound, I was suspicious that the drives are only powered up
    when the kernel loads the driver, and then the driver almost immediately expects them to be present, which would mean that this ploy wouldn't work.  However, I was game to try it, as I didn't think it could do any harm, but, as I feared, it didn't work.  The delay comes long after this point, and too late to affect things.

    A wild thought (which I can't test as I no longer run openWRT so no
    u-boot device) add the normal kernel as a crash kernel, let the first
    kernel boot and spin the drives up too late ... then find a way to crash
    it, so the crash kernel starts up after the drives are spinning?

    I'm not sure that's an intrinsic feature of uboot - the idea of booting into something different second time around is a feature of OpenWRT I think. I don't know how they do it. None of the other systems I've worked on with u-boot do that.

    However, uboot does have its own boot delay. At the u-boot prompt you would run:

    setenv bootdelay 120

    to delay 120 seconds. In the Zyxel case it appears that would be:

    ~ # /zyxel/sbin/fw_setenv bootdelay 120

    The difference is that this is prior to the kernel booting so the SATA
    driver does not fire up. However, if spinup only happens when the driver begins talking to the drive then this won't help.

    I suppose another option if that happens is to try to talk to the SATA drive
    in uboot, which might commence spinup. Docs: https://github.com/u-boot/u-boot/blob/master/doc/README.sata

    maybe the 'sata info' command is enough to wake the HDD, and even if it
    fails you can then 'sleep 30' or something while the drive spins up, and
    then boot Linux.

    (I'm assuming the Zyxel firmware will let you edit the u-boot command
    script, not just the u-boot environment variables)
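
    The 'sata info' then 'sleep 30' idea above could be tried by prefixing the existing bootcmd. A minimal sketch, assuming a POSIX shell on the NAS; spinup_bootcmd is a hypothetical helper, and whether this particular U-Boot build includes the 'sata' and 'sleep' commands is an untested assumption, so it would be safer to try the commands by hand on the serial console first:

```shell
#!/bin/sh
# spinup_bootcmd "OLD_BOOTCMD" SECONDS
# Print a U-Boot command line that pokes the SATA drive, pauses for
# SECONDS, and then runs the original boot command unchanged.
spinup_bootcmd() {
    printf 'sata info; sleep %d; %s\n' "$2" "$1"
}

# Hypothetical usage on the NAS (sed strips the leading "bootcmd="):
# old=$(/zyxel/sbin/fw_printenv bootcmd | sed 's/^bootcmd=//')
# /zyxel/sbin/fw_setenv bootcmd "$(spinup_bootcmd "$old" 30)"
```

    The commands are joined with ';' as in the existing bootcmd, so the original boot sequence still runs even if 'sata info' reports an error.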

    Theo

  • From Andy Burns@2:250/1 to All on Tuesday, May 06, 2025 15:24:21
    Theo wrote:

    I'm not sure that's an intrinsic feature of uboot - the idea of booting into something different second time around is a feature of OpenWRT I think.

    Crash kernels aren't an OpenWRT thing, just a Linux thing. I'm sure
    years ago they were more generic, but now the feature seems focused just on
    serving as a kdump kernel, probably too memory hungry for a uboot type of device anyway ...


  • From Java Jive@2:250/1 to All on Tuesday, May 06, 2025 23:52:30
    On 2025-05-06 15:07, Theo wrote:

    ~ # /zyxel/sbin/fw_setenv bootdelay 120

    The difference is that this is prior to the kernel booting so the SATA
    driver does not fire up. However, if spinup only happens when the driver begins talking to the drive then this won't help.

    Yes, I tried 20 seconds, so now the message ...

    Hit any key to stop autoboot:

    ... displays for 20 seconds instead of the 3 seconds previously, but
    this did not help because the HDs didn't spin up upon power on, only
    when the driver loaded.

    I suppose another option if that happens is to try to talk to the SATA drive in uboot, which might commence spinup. Docs: https://github.com/u-boot/u-boot/blob/master/doc/README.sata

    maybe the 'sata info' command is enough to wake the HDD, and even if it
    fails you can then 'sleep 30' or something while the drive spins up, and
    then boot Linux.

    (I'm assuming the Zyxel firmware will let you edit the u-boot command
    script, not just the u-boot environment variables)

    I may look into this, though I think the best solution would be to fix
    the short delay in the SATA driver module, and I'm trying to find a
    suitable place in the source files to do that. Meanwhile, a simpler possibility would be to write the autoreboot flag into the UBoot
    environment, because that doesn't need the HDs to be found to provide
    storage, and would survive a reboot. I may try this as a temporary fix,
    until and if I can investigate a better solution.
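A rough sketch of the reboot-flag logic described above, with the flag kept in the U-Boot environment. The flag name `autoreboot_done` is hypothetical, and `fw_printenv` is assumed to be installed alongside the `fw_setenv` shown earlier. The decision is a pure function so it can be checked on any machine; only `apply_action` touches the NAS.

```shell
#!/bin/sh
# Reboot-once logic using a flag stored in the U-Boot environment.

FW_SETENV=${FW_SETENV:-/zyxel/sbin/fw_setenv}        # path from the thread
FW_PRINTENV=${FW_PRINTENV:-/zyxel/sbin/fw_printenv}  # assumed location
FLAG_NAME=autoreboot_done                            # hypothetical name

# flag_is_set: succeed if the flag variable exists in the environment.
flag_is_set() {
    "$FW_PRINTENV" "$FLAG_NAME" >/dev/null 2>&1
}

# decide_action DISKS_FOUND FLAG_SET   (each argument is "yes" or "no")
decide_action() {
    if [ "$1" = yes ]; then
        # Disks are up: drop any stale flag so the next cold boot can retry.
        if [ "$2" = yes ]; then echo clear-flag; else echo continue; fi
    elif [ "$2" = yes ]; then
        # Already rebooted once and still no disks: stop, do not loop.
        echo give-up
    else
        echo set-flag-and-reboot
    fi
}

# apply_action ACTION: carry out the decision on the NAS itself.
apply_action() {
    case $1 in
        set-flag-and-reboot) "$FW_SETENV" "$FLAG_NAME" 1 && reboot ;;
        clear-flag)          "$FW_SETENV" "$FLAG_NAME" ;;  # no value deletes it
        give-up)             echo "disks still missing after one reboot" >&2 ;;
    esac
}
```

The `give-up` branch is there so that a genuinely dead disk cannot cause an endless reboot loop. Each cold boot that needs the workaround would write the environment twice (set, then clear), which is worth bearing in mind on NOR flash.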

    --

    Fake news kills!

    I may be contacted via the contact address given on my website: www.macfh.co.uk


    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Theo@2:250/1 to All on Wednesday, May 07, 2025 14:50:18
    In uk.comp.os.linux Java Jive <java@evij.com.invalid> wrote:
    On 2025-05-06 15:07, Theo wrote:

    ~ # /zyxel/sbin/fw_setenv bootdelay 120

    The difference is that this is prior to the kernel booting so the SATA driver does not fire up. However, if spinup only happens when the driver begins talking to the drive then this won't help.

    Yes, I tried 20 seconds, so now the message ...

    Hit any key to stop autoboot:

    ... displays for 20 seconds instead of the 3 seconds previously, but
    this did not help because the HDs didn't spin up upon power on, only
    when the driver loaded.

    I suppose another option if that happens is to try to talk to the SATA drive
    in uboot, which might commence spinup. Docs: https://github.com/u-boot/u-boot/blob/master/doc/README.sata

    maybe the 'sata info' command is enough to wake the HDD, and even if it fails you can then 'sleep 30' or something while the drive spins up, and then boot Linux.

    (I'm assuming the Zyxel firmware will let you edit the u-boot command script, not just the u-boot environment variables)

    I may look into this, though I think the best solution would be to fix
    the short delay in the SATA driver module, and I'm trying to find a
    suitable place in the source files to do that. Meanwhile, a simpler possibility would be to write the autoreboot flag into the UBoot environment, because that doesn't need the HDs to be found to provide storage, and would survive a reboot. I may try this as a temporary fix, until and if I can investigate a better solution.

    Are you sure this delay isn't just the drive set to spin down when not used? Perhaps they boot in the spun-down state. When you try to access a drive that's spun down, the system will often hang waiting for it to spin up.
    Since the kernel wants to read the partition table stored on the disc I'm
    not surprised if it hangs if the drive isn't spinning, and maybe times out.

    The simplest way to adjust it is with a Windows tool like SeaTools; there's
    now a version for Linux and a bootable USB version too.

    It's also worth checking if there's updated drive firmware, which may also address the problem.

    Theo

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: University of Cambridge, England (2:250/1@fidonet)
  • From Carlos E.R.@2:250/1 to All on Thursday, May 08, 2025 12:57:18
    On 2025-05-07 15:50, Theo wrote:
    Are you sure this delay isn't just the drive set to spin down when not used? Perhaps they boot in the spun-down state. When you try to access a drive that's spun down, the system will often hang waiting for it to spin up.
    Since the kernel wants to read the partition table stored on the disc I'm
    not surprised if it hangs if the drive isn't spinning, and maybe times out.

    The simplest way to adjust it is with a Windows tool like SeaTools; there's
    now a version for Linux and a bootable USB version too.

    Some boxes set that OFF timeout themselves, not in the disks, so that it
    is impossible to modify. At ten minutes of no activity, they power down
    the disks.

    --
    Cheers, Carlos.

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: Air Applewood, The Linux Gateway to the UK & Eire (2:250/1@fidonet)
  • From Java Jive@2:250/1 to All on Thursday, May 08, 2025 13:11:30
    On 2025-05-08 12:57, Carlos E.R. wrote:
    On 2025-05-07 15:50, Theo wrote:
    Are you sure this delay isn't just the drive set to spin down when not used? Perhaps they boot in the spun-down state. When you try to access a drive that's spun down, the system will often hang waiting for it to spin up.
    Since the kernel wants to read the partition table stored on the disc I'm
    not surprised if it hangs if the drive isn't spinning, and maybe times out.

    The simplest way to adjust it is with a Windows tool like SeaTools; there's
    now a version for Linux and a bootable USB version too.

    Some boxes set that OFF timeout themselves, not in the disks, so that it
    is impossible to modify. At ten minutes of no activity, they power down
    the disks.

    It's not the problem here anyway. The disks are set to sleep, but, so
    far at least, that simply means a short delay until the box responds
    over the network and the directory is listed or whatever. That is
    different behaviour from what is happening, or rather not happening, on
    first boot.


    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Java Jive@2:250/1 to All on Saturday, May 31, 2025 16:30:28
    If anyone needs a reminder, the original problem is appended below, this
    new thread/subthread is about my attempts to fix it.

    The firmware for these Zyxel NSA221 NAS boxes is split into three binary
    files and a number of associated checksums and scripts, described in
    comments in an unpacking script as follows ...

    # DATA_0000: header version
    # DATA_0001: firmware version
    # DATA_0002: firmware revision
    # DATA_0101: model number 1
    # DATA_0102: model number 2
    # DATA_0200: core checksum
    # DATA_0201: ZLD checksum
    # DATA_0202: ROM checksum
    # DATA_0203: InitRD checksum
    # DATA_1000: kernel file, uImage
    # DATA_1002: InitRD image, initrd.img.gz
    # DATA_1004: System disk image, sysdisk.img.gz
    # DATA_a000: executable, for some jobs before firmware upgrade
    # DATA_a002: executable, for some jobs after firmware upgrade

    Note that the last two are legacy scripts which with recent builds do
    not actually do anything.

    To create a firmware file, these are packaged up into a single binary
    file, which is then unpacked as above when the firmware is applied. The packing and unpacking are done by shell scripts which call Zyxel
    cut-down versions of a program called CONV723.EXE, which themselves are
    called ram2bin and bin2ram.

    I have the software development kit for the NASs, and several years ago
    built an image which works on both NASs, for which I still have the
    above component files. However, for historical reasons lost in time, I
    cannot now replicate that build from any of the existing or backed-up
    build directories. In fact, rather strangely, none of them now produces
    anything that will boot. Even unpacking the entire SDK afresh into a new
    directory and building an image from that gives an image that doesn't boot!

    So I've been trying a different approach, that of unpacking the working
    build, modifying the initrd, and repacking it, but this too crashes with
    a kernel panic. Even if I simply unpack the initrd, and re-pack it
    UNCHANGED, EXACTLY AS IT WAS BEFORE, even that gives a kernel panic.
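One way to check a repacked image before flashing it, sketched under the assumption that `initrd.img.gz` is a plain gzip of an ext2 image, as the build script suggests. An ext2/ext3 superblock starts 1024 bytes into the image and holds the magic number 0xEF53 at offset 56, stored little-endian, so bytes 1080 and 1081 should read `53 ef`. The function names are illustrative only.

```shell
#!/bin/sh
# ext2_magic FILE: print the two bytes at offset 1080 of an uncompressed
# image as hex. A valid ext2/ext3 superblock has "53ef" there.
ext2_magic() {
    dd if="$1" bs=1 skip=1080 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

# check_initrd_gz FILE.gz: decompress to a temp file, report the
# uncompressed size and the magic bytes, succeed only if the magic is right.
check_initrd_gz() {
    tmp=$(mktemp) || return 1
    if ! gzip -dc "$1" > "$tmp"; then
        echo "$1: gzip stream is damaged or truncated"
        rm -f "$tmp"
        return 1
    fi
    size=$(wc -c < "$tmp" | tr -d ' ')
    magic=$(ext2_magic "$tmp")
    rm -f "$tmp"
    echo "$1: $size bytes uncompressed, magic=$magic"
    [ "$magic" = 53ef ]
}
```

Running `check_initrd_gz` on both the known-good `initrd.img.gz` and the repacked one shows whether the two decompress to the same size and both carry the ext2 magic on the build host. If both pass there, the difference is more likely to arise in how the image is packed or flashed than in the image itself.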

    The packing is done by a script called 'makeras_gpl.sh', the most
    relevant section from which reads as follows:

    # Updates ROM_CHECKSUM in {METADATA}, generate romfile_checksum, zyconf.tgz and zyconf.rom
    ./make_zyconf.sh

    # Updates CORE_CHECKSUM in ${METADATA}, generate core_checksum
    ./make_kernel.sh

    # Update ZLD_CHECKSUM in ${METADATA}, generate sysdisk.img.gz and zld_checksum
    ./make_sysdisk.sh

    # Update INITRD_CHECKSUM in ${METADATA}, generate initrd.img.gz and initrd_checksum
    ./make_initrd.sh

    # pack firmware with BETA version
    ./fw_pack -r ${METADATA} -o tlv.bin
    ./ram2bin -i tlv.bin -o ras.bin -e "${MODELNAME}" -t 4
    mv ras.bin ${fBETA}
    chmod 644 ${fBETA}
    echo " ==> Beta version file ${fBETA} is created. --> ${vBETA}"
    echo " ==> Beta version file ${fBETA} is created. --> ${vBETA}"

    What I would have readers note is that initrd is the last subcomponent
    to be built, so it's difficult to see how rebuilding it separately can
    alter anything else, for example by having a different checksum, because everything else has already been built. The relevant section of 'make_initrd.sh' is as follows:

    echo -e " \033[1;31m>> Enter Critcal Section! DO NOT CTRL+C <<\033[0m"

    mv fs.initrd initrd
    tar -zcf initrd.tar.gz initrd/
    mv initrd fs.initrd

    # Create ext2 image
    mkdir initrd
    dd if=/dev/zero of=initrd.img bs=1k count=8192
    /sbin/mkfs.ext2 -F -v -m0 initrd.img
    sudo mount -o loop initrd.img initrd/
    sudo tar -zvxf initrd.tar.gz initrd
    sudo umount initrd/

    echo -e " \033[1;32m<< Exit Critcal Section! >>\033[0m"

    sudo gzip -9 < initrd.img > initrd.img.gz
    sudo rm -rf initrd
    sudo rm -f initrd.tar.gz
    sudo rm -f initrd.img

    INITRDCHECKSUM=`./ram2bin -i initrd.img.gz -e "${MODELNAME}" -t 4 -q -f`
    sed -i -e "s/^INITRD_CHECKSUM.*/INITRD_CHECKSUM\tvalue\t`echo ${INITRDCHECKSUM}`/g" ${METADATA}

    I've gone through these steps individually a number of times in case I'd
    made mistakes, but even with unchanged initrd files, I've never got past
    the kernel panic, the relevant part of the dmesg log from which reads as follows:

    physmap platform flash device: 00400000 at 41000000
    physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank
    Amd/Fujitsu Extended Query Table at 0x0040
    physmap-flash.0: Swapping erase regions for broken CFI table.
    number of CFI chips: 1
    cfi_cmdset_0002: Disabling erase-suspend-program due to code brokenness.
    7 cmdlinepart partitions found on MTD device physmap-flash.0
    Creating 7 MTD partitions on "physmap-flash.0":
    0x00000000-0x00020000 : "uboot"
    mtd: Giving out device 0 to uboot
    0x00020000-0x001e0000 : "kernel"
    mtd: Giving out device 1 to kernel
    0x001e0000-0x00380000 : "initrd"
    mtd: Giving out device 2 to initrd
    0x00380000-0x003f0000 : "etc"
    mtd: Giving out device 3 to etc
    0x003f0000-0x003fc000 : "empty"
    mtd: Giving out device 4 to empty
    0x003fc000-0x003fe000 : "env1"
    mtd: Giving out device 5 to env1
    0x003fe000-0x00400000 : "env2"
    mtd: Giving out device 6 to env2
    10 Dec 2004 USB 2.0 'Enhanced' Host Controller (EHCI) Driver@e7000000
    Device ID register 42fa05
    oxnas-ehci oxnas-ehci.0: OXNAS EHCI Host Controller
    oxnas-ehci oxnas-ehci.0: new USB bus registered, assigned bus number 1
    oxnas-ehci oxnas-ehci.0: irq 7, io mem 0x00000000
    oxnas-ehci oxnas-ehci.0: USB 0.0 started, EHCI 1.00, driver 10 Dec 2004
    usb usb1: configuration #1 chosen from 1 choice
    hub 1-0:1.0: USB hub found
    hub 1-0:1.0: 3 ports detected
    USB Universal Host Controller Interface driver v3.0
    sl811: driver sl811-hcd, 19 May 2005
    usb 1-1: new high speed USB device using oxnas-ehci and address 2
    In hub_port_init, and number is 0, retry 0, port 1 .....
    usb 1-1: configuration #1 chosen from 1 choice
    hub 1-1:1.0: USB hub found
    hub 1-1:1.0: 4 ports detected
    usb 1-1.2: new high speed USB device using oxnas-ehci and address 3
    In hub_port_init, and number is 1, retry 0, port 2 .....
    usb 1-1.2: configuration #1 chosen from 1 choice
    usbcore: registered new interface driver usblp
    Initializing USB Mass Storage driver...
    scsi2 : SCSI emulation for USB Mass Storage devices
    usbcore: registered new interface driver usb-storage
    USB Mass Storage support registered.
    mice: PS/2 mouse device common for all mice
    i2c /dev entries driver
    pcf8563 0-0051: chip found, driver version 0.4.2
    pcf8563 0-0051: rtc core: registered pcf8563 as rtc0
    OXNAS bit-bash I2C driver initialisation OK
    md: linear personality registered for level -1
    md: raid0 personality registered for level 0
    md: raid1 personality registered for level 1
    TCP cubic registered
    NET: Registered protocol family 1
    NET: Registered protocol family 17
    drivers/rtc/hctosys.c: unable to open rtc device (rtc)
    md: Autodetecting RAID arrays.
    md: Scanned 0 and added 0 devices.
    md: autorun ...
    md: ... autorun DONE.
    RAMDISK: Compressed image found at block 0

    # Above is normal
    # Below is crash

    EXT3-fs: Magic mismatch, very weird !
    List of all partitions:
    0800 3907018584 sda driver: sd
    0801 498688 sda1
    0802 3906518016 sda2
    1f00 128 mtdblock0 (driver?)
    1f01 1792 mtdblock1 (driver?)
    1f02 1664 mtdblock2 (driver?)
    1f03 448 mtdblock3 (driver?)
    1f04 48 mtdblock4 (driver?)
    1f05 8 mtdblock5 (driver?)
    1f06 8 mtdblock6 (driver?)
    No filesystem could mount root, tried: ext3 ext2 vfat fuseblk
    Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(1,0)

    # Below *would* have been a normal continuation for a successful boot

    VFS: Mounted root (ext2 filesystem).
    Freeing init memory: 116K
    MTD_open
    MTD_ioctl
    MTD_read
    MTD_close
    MTD_open
    MTD_ioctl
    MTD_read
    MTD_close
    Mounting file systems...
    MTD_open
    MTD_ioctl
    MTD_read
    MTD_close
    MTD_open
    MTD_ioctl
    MTD_read
    MTD_close
    egiga0: PHY is Realtek RTL8211BGR
    Resetting GMAC
    GMAC reset complete
    ifconfig: bad address 'add'
    Starting udhcpc ...
    INITRD: Trying to mount NAND flash as Root FS.egiga0: PHY is Realtek RTL8211BGR
    egiga0: link down
    ..egiga0: link up, 1000Mbps, full-duplex, not using pause, lpa 0xC1E1
    .scsi 2:0:0:0: Direct-Access ZyXEL USB DISK 2.0 PMAP PQ: 0 ANSI: 0 CCS

    Any ideas?
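The MTD map in the log gives the "initrd" partition as 0x001e0000-0x00380000, which works out to 0x1a0000 = 1703936 bytes, i.e. the 1664 KiB that the partition list shows for mtdblock2. A small sketch for doing that arithmetic and comparing it with a candidate image follows. Whether the flashing step itself checks the size is not stated in the thread, and the function names are illustrative.

```shell
#!/bin/sh
# part_size START END: size in bytes of an MTD partition, given the
# hexadecimal start and end offsets as printed in the kernel log.
part_size() {
    echo $(( $2 - $1 ))
}

# fits_partition FILE START END: report both sizes and succeed only if
# FILE is no larger than the partition.
fits_partition() {
    fsize=$(wc -c < "$1" | tr -d ' ')
    psize=$(part_size "$2" "$3")
    echo "$1: $fsize bytes, partition: $psize bytes"
    [ "$fsize" -le "$psize" ]
}
```

For example, `fits_partition initrd.img.gz 0x001e0000 0x00380000` compares the repacked image with the space available. A newer gzip or mkfs.ext2 on the build host could plausibly produce a larger compressed file than the original toolchain did, so this costs little to rule out.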

    On 2025-05-05 10:24, Java Jive wrote:

    Having successfully upgraded my two primary QNAP 251+ NASs, I've handed
    down two of the HDs to the Zyxel NSA221s NASs that were my original
    backup solution.  The disks are Seagate Iron Wolf 8TB ST8000VN004, originally from Amazon ...

    https://www.amazon.co.uk/dp/B07SZVVBBK

    ... but they are giving problems in their new home, or, I should say, housing.

    The problem is the same for each, they spin up too slowly to be found as
    the NSA221 boots, so after a cold boot I have to reboot, so that then
    they can be found on the reboot.  Once that is done, they seem to be perfectly satisfactory, and, despite the NAS specs saying that they only work up to a max of 4TB per disk, I'm actually getting the full
    (nominal) 8TB added to the capacity of the other Toshiba HDs which have always been free of such problems.

    I've had similar problems in the past with other Seagate HDs in these
    NASs, which at the time I got around by using a reboot flag in a rather convoluted manner, and which now won't work anyway because I have since configured the NASs to use the combined disk space of the two disks as
    one large virtual HD volume, which means that now I have nowhere to
    write a reboot flag unless both HD are running and found at boot, in
    which case I wouldn't need to write it anyway.  I might be able to get round this by using RAM, but it would need some investigation as to what would survive a reboot.

    This reboot requirement will be easy to forget and should be avoidable
    by making the system wait longer for the disks to spin up.  Setting
    aside the problem of altering firmware (see next para), in a normal
    Linux installation, how normally would one accomplish this?  A boot parameter?  An /etc setting?

    I have some scope for making changes, as in the past I've recompiled the firmware to be Gnu GPL and both NASs are running the result.  Also the command run by UBoot is stored in an environment variable, which means
    that I could add a boot parameter to that fairly easily.

    The Zyxel DK to build a GPL firmware dates from the Ubuntu 7 era (!),
    but I was still running it satisfactorily somewhere around Ubuntu 16 or 18.  Also, the NASs each have a serial header which I can use to
    interrupt their boot and change things, though I wouldn't want to be
    doing that on a permanent basis, only for temporary fixes to see if they work.  There are also OpenWRT versions of the firmware, but having
    obtained already a pretty good result from my own firmware, I haven't
    gone down that road, as it would be too time consuming for such old hardware.

    Can anyone suggest how to fix this slow spin-up problem?



    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Paul@2:250/1 to All on Saturday, May 31, 2025 20:55:30
    What does "tune2fs" say about the parameters of the filesystem?

    https://media.geeksforgeeks.org/wp-content/uploads/20230929130854/Image-3.png
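A sketch of how that comparison might be done on the build host, assuming both the working and the rebuilt initrd have been gunzipped to plain ext2 image files. `tune2fs -l` accepts an image file as well as a block device. The choice of fields to keep is a guess at what is most likely to differ between mkfs.ext2 versions, not a definitive list.

```shell
#!/bin/sh
# fs_params: read `tune2fs -l` output on stdin and keep only a handful
# of superblock fields that commonly change with mkfs.ext2 defaults.
fs_params() {
    grep -E '^(Filesystem revision #|Filesystem features|Block size|Inode size|Inode count|Block count):'
}

# compare_images OLD.img NEW.img: compare the selected fields of two
# uncompressed ext2 images. Needs tune2fs from e2fsprogs on the host.
compare_images() {
    old=$(tune2fs -l "$1" | fs_params)
    new=$(tune2fs -l "$2" | fs_params)
    if [ "$old" = "$new" ]; then
        echo "superblock parameters match"
    else
        printf 'working image:\n%s\n\nrebuilt image:\n%s\n' "$old" "$new"
        return 1
    fi
}
```

A difference in "Filesystem features" or "Inode size" between the two would suggest that the newer host's mkfs.ext2 is producing something the NAS's older kernel does not expect.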

    Have you previously removed the drive and mounted it on a technician machine? Maybe some damage was done to it while it was out of the NAS and being probed.

    There's got to be some reason that not even the magic number is correct.

    *******

    NOR flash can have bad bits in it, but that does not happen all that often.
    The most likely place for a failure is in segments which are flashed during
    each boot, and even with the high cycle count NOR flash supports, that sometimes leads to grief.

    The flash load can be segmented, and each chunk has a checksum. It normally isn't possible to capture an image of an entire flash chip and just compare
    it to an entire image held in hand. Validity may only be determinable
    by knowing the start and end address of a chunk and verifying that chunk. Automation
    in the tools would be the preferred way to rule out the flash itself
    as the cause of a corruption. Normally, with flash devices, the loader
    will halt if a portion of what it is loading is defective.
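A minimal sketch of verifying one chunk rather than the whole chip, to be run on the NAS itself. It assumes the partition holds the image file byte for byte from offset 0, with no extra header, which the thread does not confirm. Only the first N bytes of the device are hashed, N being the size of the image, because the partition is normally larger than what was written to it.

```shell
#!/bin/sh
# same_prefix IMAGE DEVICE: succeed if the first N bytes of DEVICE equal
# IMAGE, where N is the size of IMAGE in bytes.
same_prefix() {
    n=$(wc -c < "$1" | tr -d ' ')
    a=$(md5sum < "$1" | cut -d' ' -f1)
    b=$(head -c "$n" "$2" | md5sum | cut -d' ' -f1)
    echo "image=$a flash=$b"
    [ "$a" = "$b" ]
}
```

With the partition map from the kernel log, `same_prefix initrd.img.gz /dev/mtdblock2` would compare the flashed initrd against the file that was packed. A mismatch could mean bad flash, but equally that the assumption about the on-flash layout is wrong.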

    *******

    Processors do not normally go defective. Sometimes bad batches escape
    the factory. And a NAS box is highly unlikely to have been overclocked
    for most of its life.

    As for your firmware kit, I would have frozen the working environment
    at Ubuntu 7, in the hope that I would always have an old machine to run it on. Dragging an unsupported build environment along on an OS that keeps changing
    is kinda asking for trouble.

    Paul

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Java Jive@2:250/1 to All on Sunday, June 01, 2025 12:08:20
    On 2025-05-31 20:55, Paul wrote:

    On Sat, 5/31/2025 11:30 AM, Java Jive wrote:

    If anyone needs a reminder, the original problem is appended below, this new thread/subthread is about my attempts to fix it.

    The firmware for these Zyxel NSA221 NAS boxes is split into three binary files and a numbers of associated checksums and scripts, described in comments in an unpacking script as follows ...

      # DATA_0000: header version
      # DATA_0001: firmware version
      # DATA_0002: firmware revision
      # DATA_0101: model number 1
      # DATA_0102: model number 2
      # DATA_0200: core checksum
      # DATA_0201: ZLD checksum
      # DATA_0202: ROM checksum
      # DATA_0203: InitRD checksum
      # DATA_1000: kernel file, uImage
      # DATA_1002: InitRD image, initrd.img.gz
      # DATA_1004: System disk image, sysdisk.img.gz
      # DATA_a000: executable, for some jobs before firmware upgrade
      # DATA_a002: executable, for some jobs after firmware upgrade

    Note that the last two are legacy scripts which with recent builds do not actually do anything.

    To create a firmware file, these are packaged up into a single binary file, which is then unpacked as above when the firmware is applied.  The packing and unpacking are done by shell scripts which call Zyxel cut-down versions of a program called CONV723.EXE, which themselves are called ram2bin and bin2ram.

    I have the software development kit for the NASs, and several years ago built an image which works on both NASs, for which I still have the above component files, but for historical reasons lost in time cannot now seem to replicate that build from any of the existing or backed up build directories.  In fact, rather strangely, none of them now produce anything that will boot, even unpacking the entire SDK afresh from scratch into a new directory and building an image from that, it too doesn't boot!

    So I've been trying a different approach, that of unpacking the working build, modifying the initrd, and repacking it, but this too crashes with a kernel panic.  Even if I simply unpack the initrd, and re-pack it UNCHANGED, EXACTLY AS IT WAS BEFORE, even that gives a kernel panic.

    The packing is done by a script called 'makeras_gpl.sh', the most relevant section from which reads as follows:

    # Updates ROM_CHECKSUM in {METADATA}, generate romfile_checksum, zyconf.tgz and zyconf.rom
    ./make_zyconf.sh

    # Updates CORE_CHECKSUM in ${METADATA}, generate core_checksum
    ./make_kernel.sh

    # Update ZLD_CHECKSUM in ${METADATA}, generate sysdisk.img.gz and zld_checksum
    ./make_sysdisk.sh

    # Update INITRD_CHECKSUM in ${METADATA}, generate initrd.img.gz and initrd_checksum
    ./make_initrd.sh

    # pack firmware with BETA version
    ./fw_pack -r ${METADATA} -o tlv.bin
    ./ram2bin -i tlv.bin -o ras.bin -e "${MODELNAME}" -t 4
    mv ras.bin ${fBETA}
    chmod 644 ${fBETA}
    echo " ==> Beta version file ${fBETA} is created. --> ${vBETA}"

    What I would have readers note is that initrd is the last subcomponent to be built, so it's difficult to see how rebuilding it separately can alter anything else, for example by having a different checksum, because everything else has already been built.  The relevant section of 'make_initrd.sh' is as follows:

    echo -e " \033[1;31m>> Enter Critcal Section! DO NOT CTRL+C <<\033[0m"

    mv fs.initrd initrd
    tar -zcf initrd.tar.gz initrd/
    mv initrd fs.initrd

    # Create ext2 image
    mkdir initrd
    dd if=/dev/zero of=initrd.img bs=1k count=8192
    /sbin/mkfs.ext2 -F -v -m0 initrd.img
    sudo mount -o loop initrd.img initrd/
    sudo tar -zvxf initrd.tar.gz initrd
    sudo umount initrd/

    echo -e " \033[1;32m<< Exit Critcal Section! >>\033[0m"

    sudo gzip -9 < initrd.img > initrd.img.gz
    sudo rm -rf initrd
    sudo rm -f initrd.tar.gz
    sudo rm -f initrd.img

    INITRDCHECKSUM=`./ram2bin -i initrd.img.gz -e "${MODELNAME}" -t 4 -q -f`
    sed -i -e "s/^INITRD_CHECKSUM.*/INITRD_CHECKSUM\tvalue\t`echo ${INITRDCHECKSUM}`/g" ${METADATA}

    I've gone through these steps individually a number of times in case I'd made mistakes, but even with unchanged initrd files, I've never got past the kernel panic, the relevant part of the dmesg log from which reads as follows:

    physmap platform flash device: 00400000 at 41000000
    physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank
     Amd/Fujitsu Extended Query Table at 0x0040
    physmap-flash.0: Swapping erase regions for broken CFI table.
    number of CFI chips: 1
    cfi_cmdset_0002: Disabling erase-suspend-program due to code brokenness.
    7 cmdlinepart partitions found on MTD device physmap-flash.0
    Creating 7 MTD partitions on "physmap-flash.0":
    0x00000000-0x00020000 : "uboot"
    mtd: Giving out device 0 to uboot
    0x00020000-0x001e0000 : "kernel"
    mtd: Giving out device 1 to kernel
    0x001e0000-0x00380000 : "initrd"
    mtd: Giving out device 2 to initrd
    0x00380000-0x003f0000 : "etc"
    mtd: Giving out device 3 to etc
    0x003f0000-0x003fc000 : "empty"
    mtd: Giving out device 4 to empty
    0x003fc000-0x003fe000 : "env1"
    mtd: Giving out device 5 to env1
    0x003fe000-0x00400000 : "env2"
    mtd: Giving out device 6 to env2
    10 Dec 2004 USB 2.0 'Enhanced' Host Controller (EHCI) Driver@e7000000 Device ID register 42fa05
    oxnas-ehci oxnas-ehci.0: OXNAS EHCI Host Controller
    oxnas-ehci oxnas-ehci.0: new USB bus registered, assigned bus number 1
    oxnas-ehci oxnas-ehci.0: irq 7, io mem 0x00000000
    oxnas-ehci oxnas-ehci.0: USB 0.0 started, EHCI 1.00, driver 10 Dec 2004
    usb usb1: configuration #1 chosen from 1 choice
    hub 1-0:1.0: USB hub found
    hub 1-0:1.0: 3 ports detected
    USB Universal Host Controller Interface driver v3.0
    sl811: driver sl811-hcd, 19 May 2005
    usb 1-1: new high speed USB device using oxnas-ehci and address 2
    In hub_port_init, and number is 0, retry 0, port 1 .....
    usb 1-1: configuration #1 chosen from 1 choice
    hub 1-1:1.0: USB hub found
    hub 1-1:1.0: 4 ports detected
    usb 1-1.2: new high speed USB device using oxnas-ehci and address 3
    In hub_port_init, and number is 1, retry 0, port 2 .....
    usb 1-1.2: configuration #1 chosen from 1 choice
    usbcore: registered new interface driver usblp
    Initializing USB Mass Storage driver...
    scsi2 : SCSI emulation for USB Mass Storage devices
    usbcore: registered new interface driver usb-storage
    USB Mass Storage support registered.
    mice: PS/2 mouse device common for all mice
    i2c /dev entries driver
    pcf8563 0-0051: chip found, driver version 0.4.2
    pcf8563 0-0051: rtc core: registered pcf8563 as rtc0
    OXNAS bit-bash I2C driver initialisation OK
    md: linear personality registered for level -1
    md: raid0 personality registered for level 0
    md: raid1 personality registered for level 1
    TCP cubic registered
    NET: Registered protocol family 1
    NET: Registered protocol family 17
    drivers/rtc/hctosys.c: unable to open rtc device (rtc)
    md: Autodetecting RAID arrays.
    md: Scanned 0 and added 0 devices.
    md: autorun ...
    md: ... autorun DONE.
    RAMDISK: Compressed image found at block 0

    # Above is normal
    # Below is crash

    EXT3-fs: Magic mismatch, very weird !
    List of all partitions:
    0800 3907018584 sda driver: sd
      0801     498688 sda1
      0802 3906518016 sda2
    1f00        128 mtdblock0 (driver?)
    1f01       1792 mtdblock1 (driver?)
    1f02       1664 mtdblock2 (driver?)
    1f03        448 mtdblock3 (driver?)
    1f04         48 mtdblock4 (driver?)
    1f05          8 mtdblock5 (driver?)
    1f06          8 mtdblock6 (driver?)
    No filesystem could mount root, tried:  ext3 ext2 vfat fuseblk
    Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(1,0)

    # Below *would* have been a normal continuation for a successful boot

    VFS: Mounted root (ext2 filesystem).
    Freeing init memory: 116K
    MTD_open
    MTD_ioctl
    MTD_read
    MTD_close
    MTD_open
    MTD_ioctl
    MTD_read
    MTD_close
    Mounting file systems...
    MTD_open
    MTD_ioctl
    MTD_read
    MTD_close
    MTD_open
    MTD_ioctl
    MTD_read
    MTD_close
    egiga0: PHY is Realtek RTL8211BGR
    Resetting GMAC
    GMAC reset complete
    ifconfig: bad address 'add'
    Starting udhcpc ...
    INITRD: Trying to mount NAND flash as Root FS.egiga0: PHY is Realtek RTL8211BGR
    egiga0: link down
    ..egiga0: link up, 1000Mbps, full-duplex, not using pause, lpa 0xC1E1
    .scsi 2:0:0:0: Direct-Access     ZyXEL    USB DISK 2.0     PMAP PQ: 0 ANSI: 0 CCS

    Any ideas?

    What does "tune2fs" say about the parametrics of the filesystem ?

    https://media.geeksforgeeks.org/wp-content/uploads/20230929130854/Image-3.png

    Have you previously removed the drive and mounted it on a technician machine ?
    Maybe some damage was done to it, while it was out of the NAS and being probed.

    There's got to be some reason that not even the magic number is correct.

    I don't think I would be able to run a graphical tool on it. The only
    access I have is via ...

    Web interface
    ssh command-line
    USB and Ethernet

    ... unlike the more modern QNAPs, which each have an HDMI output and a
    remote control that I've never used.

    *******

    NOR flash can have bad bits in it, but that does not happen all that often. The most likely place for a failure is in segments which are flashed during each boot; even with the high cycle count NOR flash supports, that sometimes leads to grief.

    The flash load can be segmented, and each chunk has a checksum. It normally isn't possible to capture an image of an entire flash chip and just compare it to an entire image held in hand. The validity may only be determinable by knowing the start and end address of a chunk and verifying it. Automation in the tools would be the preferred way to determine that the flash itself wasn't causing a corruption. Normally, with flash devices, the loader will halt if a portion of what it is loading is defective.

    Don't think this would explain why only particular builds show the
    kernel panic. The original working build, the initrd of which I'm
    trying to adapt, boots fine. It's only the attempts to customise the
    initrd which crash in a kernel panic.

    *******

    Processors do not normally go defective. Sometimes bad batches escape
    the factory. And a NAS box is highly unlikely to have been overclocked
    for most of its life.

    As for your firmware kit, I would have frozen the working environment at Ubuntu 7, in the hope that I would always have an old machine to run it on. Dragging a build environment along, say an unsupported one, on a dynamic OS, that's kinda asking for trouble.

    Yes, perhaps, but my worst mistake seems to have been not copying the build directory of the working build into a new build directory before continuing development, which seems to have meant that the backup of what was working later got overwritten by backups of what did not. I'm a bit embarrassed and annoyed at my own stupidity there.

    I'm about to try to do a new build from scratch, to see if I can work
    out what is going wrong.

    --

    Fake news kills!

    I may be contacted via the contact address given on my website: www.macfh.co.uk


    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Theo@2:250/1 to All on Monday, June 02, 2025 13:21:34
    In uk.comp.os.linux Java Jive <java@evij.com.invalid> wrote:
    mv fs.initrd initrd
    tar -zcf initrd.tar.gz initrd/
    mv initrd fs.initrd

    # Create ext2 image
    mkdir initrd
    dd if=/dev/zero of=initrd.img bs=1k count=8192
    /sbin/mkfs.ext2 -F -v -m0 initrd.img
    sudo mount -o loop initrd.img initrd/
    sudo tar -zvxf initrd.tar.gz initrd
    sudo umount initrd/

    echo -e " \033[1;32m<< Exit Critcal Section! >>\033[0m"

    sudo gzip -9 < initrd.img > initrd.img.gz
    sudo rm -rf initrd
    sudo rm -f initrd.tar.gz
    sudo rm -f initrd.img

    INITRDCHECKSUM=`./ram2bin -i initrd.img.gz -e "${MODELNAME}" -t 4 -q -f`
    sed -i -e "s/^INITRD_CHECKSUM.*/INITRD_CHECKSUM\tvalue\t`echo ${INITRDCHECKSUM}`/g" ${METADATA}

    RAMDISK: Compressed image found at block 0

    # Above is normal
    # Below is crash

    EXT3-fs: Magic mismatch, very weird !
    List of all partitions:
    0800 3907018584 sda driver: sd
    0801 498688 sda1
    0802 3906518016 sda2
    1f00 128 mtdblock0 (driver?)
    1f01 1792 mtdblock1 (driver?)
    1f02 1664 mtdblock2 (driver?)
    1f03 448 mtdblock3 (driver?)
    1f04 48 mtdblock4 (driver?)
    1f05 8 mtdblock5 (driver?)
    1f06 8 mtdblock6 (driver?)
    No filesystem could mount root, tried: ext3 ext2 vfat fuseblk
    Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(1,0)

    # Below *would* have been a normal continuation for a successful boot

    VFS: Mounted root (ext2 filesystem).

    So you appear to be making an ext2 FS and gzipping it. Do you get the 'RAMDISK: Compressed image found' in the crash scenario?

    Searching on "Magic mismatch, very weird" comes up with some threads. One is hardware failure, the other is about using a non-1k blocksize with (old) mke2fs and a 2007-era ramdisk implementation that doesn't support other than 1k: https://sourceforge.net/p/e2fsprogs/bugs/175/#b0df

    Perhaps you could try -b1024 on the mkfs.ext2 command? Or experiment with other blocksizes?
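
    A quick way to see what block size a given image really has, without depending on any particular e2fsprogs version, is to read the superblock fields directly. The following is only a minimal sketch and not part of the Zyxel scripts: the function names are made up for illustration, and it assumes the standard ext2 layout (superblock at byte 1024, s_log_block_size at offset 24 within it, s_magic 0xEF53 at offset 56) and an uncompressed image, so initrd.img.gz would first need 'gzip -dc' into a temporary file.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from the SDK).

# Print the two magic bytes as stored on disk; a valid ext2/3 image gives "53ef".
ext2_magic() {
    od -A n -t x1 -j 1080 -N 2 "$1" | tr -d ' \n'
}

# Print the block size in bytes: 1024 << s_log_block_size.
# Only the low byte of the 32-bit little-endian field is read,
# which is enough for the small values (0, 1, 2, ...) used in practice.
ext2_block_size() {
    log=$(od -A n -t u1 -j 1048 -N 1 "$1" | tr -d ' \n')
    echo $((1024 << log))
}
```

    If 'ext2_block_size initrd.img' prints 1024, the image matches what the old ramdisk code in the bug report above is said to expect; any larger value would point at the block-size problem.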

    Theo

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: University of Cambridge, England (2:250/1@fidonet)
  • From Java Jive@2:250/1 to All on Monday, June 02, 2025 13:44:45
    On 2025-06-02 13:21, Theo wrote:

    In uk.comp.os.linux Java Jive <java@evij.com.invalid> wrote:

    RAMDISK: Compressed image found at block 0

    # Above is normal
    # Below is crash

    EXT3-fs: Magic mismatch, very weird !
    List of all partitions:
    0800 3907018584 sda driver: sd
    0801 498688 sda1
    0802 3906518016 sda2
    1f00 128 mtdblock0 (driver?)
    1f01 1792 mtdblock1 (driver?)
    1f02 1664 mtdblock2 (driver?)
    1f03 448 mtdblock3 (driver?)
    1f04 48 mtdblock4 (driver?)
    1f05 8 mtdblock5 (driver?)
    1f06 8 mtdblock6 (driver?)
    No filesystem could mount root, tried: ext3 ext2 vfat fuseblk
    Kernel panic - not syncing: VFS: Unable to mount root fs on
    unknown-block(1,0)

    # Below *would* have been a normal continuation for a successful boot

    VFS: Mounted root (ext2 filesystem).

    So you appear to be making an ext2 FS and gzipping it. Do you get the 'RAMDISK: Compressed image found' in the crash scenario?

    Yes, doing a diff between the successful and kernel panic boots, the
    RAMDISK line is the last line common between the two, thereafter they
    diverge as explained.
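
    For anyone wanting to repeat that comparison, here is a minimal sketch of finding the last shared line mechanically. The function name is hypothetical, and it assumes both logs were captured without per-line timestamps, as in the extracts above, so that identical messages compare equal.

```shell
#!/bin/sh
# Hypothetical helper: print the last line that two logs share
# before they first differ, comparing line by line from the top.
last_common_line() {
    awk 'NR == FNR { first[FNR] = $0; next }   # remember the first log
         first[FNR] != $0 { exit }             # stop at the first difference
         { last = $0 }                         # still identical so far
         END { print last }' "$1" "$2"
}
```

    Called as 'last_common_line good.log bad.log', it prints the final line the two boots have in common.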

    Searching on "Magic mismatch, very weird" comes up with some threads. One is hardware failure, the other is about using a non-1k blocksize with (old) mke2fs
    and a 2007-era ramdisk implementation that doesn't support other than 1k: https://sourceforge.net/p/e2fsprogs/bugs/175/#b0df

    Perhaps you could try -b1024 on the mkfs.ext2 command? Or experiment with other blocksizes?

    Thanks, may try that later this afternoon, but what baffles me is that I
    think I've completely followed the procedure in the original scripts, so
    why is the result so different?

    BTW, my attempt to rebuild from scratch is failing also, when building Busybox:

    CC loginutils/passwd.o
    loginutils/passwd.c: In function ‘passwd_main’:
    loginutils/passwd.c:93:16: error: storage size of ‘rlimit_fsize’ isn’t known
    struct rlimit rlimit_fsize;
    ^
    loginutils/passwd.c:180:2: warning: implicit declaration of function ‘setrlimit’ [-Wimplicit-function-declaration]
    setrlimit(RLIMIT_FSIZE, &rlimit_fsize);
    ^
    loginutils/passwd.c:180:12: error: ‘RLIMIT_FSIZE’ undeclared (first use in this function)
    setrlimit(RLIMIT_FSIZE, &rlimit_fsize);
    ^
    loginutils/passwd.c:180:12: note: each undeclared identifier is reported only once for each function it appears in
    loginutils/passwd.c:93:16: warning: unused variable ‘rlimit_fsize’ [-Wunused-variable]
    struct rlimit rlimit_fsize;
    ^
    make[2]: *** [loginutils/passwd.o] Error 1
    make[1]: *** [loginutils] Error 2
    make[1]: Leaving directory `/home/devel/zyxel/NSA-221-GPL/TestBuild/trunk/sysapps/busybox-1.17.2'
    make: *** [busybox_initrd] Error 2

    Don't understand it, can't remember ever having that error before in any
    of the previous builds.

    --

    Fake news kills!

    I may be contacted via the contact address given on my website: www.macfh.co.uk


    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Theo@2:250/1 to All on Monday, June 02, 2025 15:01:30
    In uk.comp.os.linux Java Jive <java@evij.com.invalid> wrote:
    Thanks, may try that later this afternoon, but what baffles me is that I think I've completely followed the procedure in the original scripts, so
    why is the result so different?

    Not sure, but maybe something in the layout of the modified ext2 fs is
    tripping it up that wasn't there before. Are you using a period mkfs.ext2
    or a modern one?

    BTW, my attempt to rebuild from scratch is failing also, when building Busybox:

    CC loginutils/passwd.o
    loginutils/passwd.c: In function ‘passwd_main’:
    loginutils/passwd.c:93:16: error: storage size of ‘rlimit_fsize’ isn’t known
    struct rlimit rlimit_fsize;

    Could be a compiler thing. I might try with an older compiler.
    Or maybe some library is different now from when it was then?
    Perhaps try building in a VM of a period-correct Linux distro?
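
    One possible cause, offered as an assumption rather than a diagnosis: BusyBox releases of this era have been reported to fail in this way against newer C library headers, because passwd.c uses struct rlimit without <sys/resource.h> being pulled in. If that is what is happening here, adding the include may be a smaller change than setting up an old build host. The helper below is a hypothetical sketch (its name and approach are not from BusyBox or the SDK); it adds the line after the first existing #include so that any feature macros set up by that first header still come first.

```shell
#!/bin/sh
# Hypothetical helper: add "#include <HEADER>" just after the first existing
# #include line of FILE, unless that exact include is already present.
# Files with no #include line at all are left unchanged.
ensure_include() {
    file=$1
    header=$2
    if grep -q "^#include <${header}>" "$file"; then
        return 0
    fi
    awk -v h="$header" '
        { print }
        !added && /^#include/ { print "#include <" h ">"; added = 1 }
    ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

    From the busybox-1.17.2 directory this might be run as 'ensure_include loginutils/passwd.c sys/resource.h', keeping a copy of the original file first.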

    Theo

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: University of Cambridge, England (2:250/1@fidonet)
  • From Java Jive@2:250/1 to All on Wednesday, June 04, 2025 15:43:43
    Subject: Re: Problem With Old Zyxel NSA 221 NASs & Seagate HDs - Part 2 - PART
    SOLVED

    On 2025-06-02 13:44, Java Jive wrote:

    Searching on "Magic mismatch, very weird" comes up with some threads. One is hardware failure, the other is about using a non-1k blocksize with (old) mke2fs and a 2007-era ramdisk implementation that doesn't support other than 1k: https://sourceforge.net/p/e2fsprogs/bugs/175/#b0df

    Perhaps you could try -b1024 on the mkfs.ext2 command? Or experiment with other blocksizes?

    Thanks, may try that later this afternoon,

    And it worked: adding the -b1024 parameter makes my manual copy of the original procedure work! Thanks for that.

    I'm making some progress now, I've managed to clean up Zyxel's original scripts somewhat, the originals gave lots of spurious errors in 'dmesg'.
    However, the fundamental plan, of doing an automatic reboot one time
    only if no storage is detected, didn't work from 'init', I think because 'init' can not be broken into, apparently not even programmatically.
    The 'Rebooting' message is displayed, but no reboot actually occurs.

    So I tried moving that bit of code to rcS, but I still can't get it to
    reboot. Again all the messages are correctly displayed, but no reboot actually occurs.

    Still the above problem has been solved thanks to your help, much obliged.

    --

    Fake news kills!

    I may be contacted via the contact address given on my website: www.macfh.co.uk


    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Java Jive@2:250/1 to All on Friday, June 06, 2025 12:25:35
    Subject: Re: Problem With Old Zyxel NSA 221 NASs & Seagate HDs - Part 2 -
    FULLY SOLVED

    On 2025-06-04 15:43, Java Jive wrote:
    On 2025-06-02 13:44, Java Jive wrote:

    Searching on "Magic mismatch, very weird" comes up with some threads. One is hardware failure, the other is about using a non-1k blocksize with (old) mke2fs and a 2007-era ramdisk implementation that doesn't support other than 1k: https://sourceforge.net/p/e2fsprogs/bugs/175/#b0df

    Perhaps you could try -b1024 on the mkfs.ext2 command? Or experiment with other blocksizes?

    Thanks, may try that later this afternoon,

    And it worked: adding the -b1024 parameter makes my manual copy of the original procedure work! Thanks for that.

    [...]

    So I tried moving that bit of code to rcS, but I still can't get it to reboot.  Again all the messages are correctly displayed, but no reboot actually occurs.

    I now have this fully working. If it's of any interest here's the code
    from rcS. If on first boot, fewer than 2 HDs are found, it sets a flag
    in the U-boot environment, which survives a reboot, and then does a
    reboot. On the second boot, it wipes the reboot flag and carries on the
    boot regardless of how many HDs are found. In my case, the reboot
    allows the second HD to be detected during the second boot, so the XFS
    storage area spread across both HDs becomes available.

    [Beware unintended line wrap, and note that the variables ECHO, WC, etc contain the full initrd path to the binaries concerned]

    # Check for HDs
    ${ECHO} "Checking for found hard drives ..."
    HDs="$(${SGMAP}|${GREP} 'ATA'|${WC} -l)"
    if [ "${HDs}" -lt 2 ]
      then
        case "${HDs}" in
          0) ${ECHO} "WARNING: No hard drives found!"
             ;;
          1) ${ECHO} "WARNING: Only 1 hard drive found!"
             ;;
        esac
        ${ECHO} "Checking firmware for reboot flag ..."
        REBOOTED="$(${PRINTENV} ${REBOOTFLG})"
        if [ -z "${REBOOTED}" ] || [ "${REBOOTED}" == "## Error: \"${REBOOTFLG}\" not defined" ]
          then
            # Set flag and reboot
            ${SETENV} ${REBOOTFLG} true
            ${ECHO} "Rebooting to try to pick up slow-spin-up drives ..."
            # The following command is valid according to the help parameter, but fails
            # ${UMOUNT} -a
            ${SLEEP} 5
            ${REBOOT}
            exit
          else
            # This is already a reboot, but still less than two HDs
            # nothing further that can be done here
            ${ECHO} "Less than 2 hard drives found even after reboot"
        fi
      else
        ${ECHO} "2 hard drives found!"
    fi
    # 2 hard drives were found or this is already a reboot
    # so just clear the flag and continue
    ${ECHO} "Clearing reboot flag in firmware ..."
    ${SETENV} ${REBOOTFLG}
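
    The decision in the script above depends on just two inputs, the drive count and the stored flag, so it can be tried out on a desktop shell away from the NAS. The following is a minimal sketch under that assumption: boot_action and its argument order are hypothetical, nothing here touches the U-Boot environment, and the "not defined" text is simply copied from the comparison in the script above.

```shell
#!/bin/sh
# Hypothetical helper: decide what the boot script should do.
#   $1 = number of hard drives detected
#   $2 = raw output of the command that reads the reboot flag
#   $3 = name of the flag variable
# Prints "reboot" when a one-off reboot should be attempted, else "continue".
boot_action() {
    hds=$1
    flag=$2
    name=$3
    # Treat "no output" and the not-defined error text as "flag not set".
    case "$flag" in
        ''|"## Error: \"${name}\" not defined") flag_set=no ;;
        *) flag_set=yes ;;
    esac
    if [ "$hds" -lt 2 ] && [ "$flag_set" = no ]; then
        echo reboot
    else
        echo continue
    fi
}
```

    Only the case of too few drives with no flag set yields "reboot"; every other combination continues, which is what prevents a reboot loop when a drive really is missing.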


    --

    Fake news kills!

    I may be contacted via the contact address given on my website: www.macfh.co.uk


    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Lew Pitcher@2:250/1 to All on Friday, June 06, 2025 13:45:13
    Subject: Re: Problem With Old Zyxel NSA 221 NASs & Seagate HDs - Part 2 -
    FULLY SOLVED

    On Fri, 06 Jun 2025 12:25:35 +0100, Java Jive wrote:

    [snip]

    I now have this fully working. If it's of any interest here's the code
    from rcS. If on first boot, fewer than 2 HDs are found, it sets a flag
    in the U-boot environment, which survives a reboot, and then does a
    reboot. On the second boot, it wipes the reboot flag and carries on the
    boot regardless of how many HDs are found. In my case, the reboot
    allows the second HD to be detected during the second boot, so the XFS
    storage area spread across both HDs becomes available.

    [snip]
    ${SETENV} ${REBOOTFLG} true
    ${ECHO} "Rebooting to try to pick up slow-spin-up drives ..."
    # The following command is valid according to the help parameter, but fails
    # ${UMOUNT} -a

    Yah, assuming ${UMOUNT} resolves to something like /bin/umount, then
    ${UMOUNT} -a
    probably would fail here. Primarily while trying to umount the filesystem
    that has your script's cwd, and (because the umount failure left that
    filesystem still mounted) the root filesystem.


    Remember, umount can't unmount an active mountpoint (one with mountpoints, open files or directories on it), and

    a) your script's cwd is most likely located in one of the filesystems
    mentioned in /etc/mtab (and, of course, open, because your active
    process lives in that cwd),

    b) / is probably in your /etc/mtab, and can't be umounted until all
    the filesystems that reside on it are umounted, and

    c) your use of the -a option effectively asks umount to unmount /all/
    filesystems listed in /etc/mtab ("except the proc filesystem")
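
    One way to see which mount is the busy one is to unmount them one at a time, most recently mounted first, instead of using -a. The following is a minimal sketch, assuming the kernel's mount table is readable at /proc/mounts with the mount point in the second field; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list mount points from a mounts table (default
# /proc/mounts) in reverse mount order, skipping / and /proc, so that
# filesystems mounted later (usually nested deeper) come first.
umount_order() {
    awk '{ mp[NR] = $2 }
         END { for (i = NR; i > 0; i--)
                   if (mp[i] != "/" && mp[i] != "/proc") print mp[i] }' \
        "${1:-/proc/mounts}"
}
```

    Feeding the list to a loop such as 'umount_order | while read -r mp; do umount "$mp" || echo "busy: $mp"; done' would then name each mount point that refuses to go.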

    [snip]

    HTH
    --
    Lew Pitcher
    "In Skills We Trust"

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Andy Burns@2:250/1 to All on Friday, June 06, 2025 13:54:21
    Subject: Re: Problem With Old Zyxel NSA 221 NASs & Seagate HDs - Part 2 -
    FULLY SOLVED

    Java Jive wrote:

    I now have this fully working.

    Now, how long until the drives fail :-P

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: Air Applewood, The Linux Gateway to the UK & Eire (2:250/1@fidonet)
  • From Dan Purgert@2:250/1 to All on Friday, June 06, 2025 14:02:22
    Subject: Re: Problem With Old Zyxel NSA 221 NASs & Seagate HDs - Part 2 -
    FULLY SOLVED

    On 2025-06-06, Andy Burns wrote:
    Java Jive wrote:

    I now have this fully working.

    Now, how long until the drives fail :-P

    If it's anything like my luck, they actually failed 3 weeks ago, and all
    of this fighting is BECAUSE the drives are bad :)


    --
    |_|O|_|
    |_|_|O| Github: https://github.com/dpurgert
    |O|O|O| PGP: DDAB 23FB 19FA 7D85 1CC1 E067 6D65 70E5 4CE7 2860

    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Java Jive@2:250/1 to All on Friday, June 06, 2025 16:46:44
    Subject: Re: Problem With Old Zyxel NSA 221 NASs & Seagate HDs - Part 2 -
    FULLY SOLVED

    On 2025-06-06 13:45, Lew Pitcher wrote:
    On Fri, 06 Jun 2025 12:25:35 +0100, Java Jive wrote:

    [snip]

    I now have this fully working. If it's of any interest here's the code
    from rcS. If on first boot, fewer than 2 HDs are found, it sets a flag
    in the U-boot environment, which survives a reboot, and then does a
    reboot. On the second boot, it wipes the reboot flag and carries on the
    boot regardless of how many HDs are found. In my case, the reboot
    allows the second HD to be detected during the second boot, so the XFS
    storage area spread across both HDs becomes available.

    [snip]
    ${SETENV} ${REBOOTFLG} true
    ${ECHO} "Rebooting to try to pick up slow-spin-up drives ..."
    # The following command is valid according to the help parameter, but fails
    # ${UMOUNT} -a

    Yah, assuming ${UMOUNT} resolves to something like /bin/umount, then
    ${UMOUNT} -a
    probably would fail here. Primarily while trying to umount the filesystem that has your script's cwd, and (because the umount failure left that filesystem still mounted) the root filesystem.


    Remember, umount can't unmount an active mountpoint (one with mountpoints, open files or directories on it), and

    a) your script's cwd is most likely located in one of the filesystems
    mentioned in /etc/mtab (and, of course, open, because your active
    process lives in that cwd),

    b) / is probably in your /etc/mtab, and can't be umounted until all
    the filesystems that reside on it are umounted, and

    c) your use of the -a option effectively asks umount to unmount /all/
    filesystems listed in /etc/mtab ("except the proc filesystem")

    Thanks for the explanation, which has led me to look back into PuTTY's
    log files investigating further. I think your explanation probably does
    fit my current situation, because I've now reinstated the command, and
    this is the result as of now ...

    BusyBox v1.17.2 (2017-09-14 21:33:20 BST) multi-call binary.

    Usage: umount [OPTIONS] FILESYSTEM|DIRECTORY

    umount: can't umount /proc: Device or resource busy

    ... which originally confused me because seeing an abbreviated
    explanation of the usage and not noticing the last line led me to
    believe that the '-a' parameter had not been accepted. However, I have another log file of apparently the same command used in the same
    situation that contains only the last line above, which is a much more reasonable message. In both cases, the system does still reboot.
    However, there are other places in the boot scripts, particularly
    Zyxel's original scripts, where 'umount -a' appears to fail completely
    and just displays the help, here's an example of that ...

    Usage: umount [-hV]
    umount -a [-f] [-r] [-n] [-v] [-t vfstypes] [-O opts]
    umount [-f] [-r] [-n] [-v] special | node...

    ... so I'm not really sure what is going on in that case, perhaps an invisible character such as a non-breaking space has found its way into
    the script. Generally, the command's output is somewhat confusing and
    seems to have been rather poorly written, at least in the cut-down
    BusyBox version used on this NAS box.
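
    The invisible-character theory is easy to test. Here is a minimal sketch, assuming a grep that handles bracket expressions byte-wise in the C locale; the function name is hypothetical. It flags any line containing a byte that is neither a tab nor printable ASCII, which would catch a UTF-8 non-breaking space (bytes 0xC2 0xA0) among other things.

```shell
#!/bin/sh
# Hypothetical helper: print "LINE:TEXT" for every line of FILE that contains
# a byte other than tab or printable ASCII (space through tilde).
find_odd_bytes() {
    LC_ALL=C grep -n "$(printf '[^\t -~]')" "$1"
}
```

    Running it over each boot script in turn would either show the offending line numbers or print nothing, in which case the cause lies elsewhere.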

    --

    Fake news kills!

    I may be contacted via the contact address given on my website: www.macfh.co.uk


    --- MBSE BBS v1.1.1 (Linux-x86_64)
    * Origin: A noiseless patient Spider (2:250/1@fidonet)