• very odd nfs behaviour

    From Mike Scott@2:250/1 to All on Friday, January 24, 2025 16:56:00
    A very odd situation here.

    I have a (freebsd) server serving a tree of photos and information
    files. It's large, and the paths quite long - whether that's relevant I
    don't know.

    On two of three machines all running mint at various versions all is
    well; I have problems on the third, which happens to be my desktop box.
    An example good listing would be (sorry about wrap):

    mike@troi ~ $ ls /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.*
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png

    That corresponds exactly to what's on the server.


    However, on my desktop m/c, the same command complains about a missing
    file, and triplicates all the lines bar the first, which is duplicated,
    and there's an error about not finding a file that has an incorrect name anyway:

    Desktop> ls /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.*
    ls: cannot access '/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.2': No such file or directory
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png


    If I unmount and remount the file system, I get different results -
    always works on the other machines, and fails /differently/ each time on
    mine.


    I've also seen this happen in a virtual machine running on my box.

    It happens whether I hard mount or use the automounter.


    The OS versions are different - I'm running mint 21.2, the VM is at
    21.3; while the others are both rather older versions (and different hardware). The machines are all configured the same.


    I'm at a loss! Can anyone suggest what's going on here please? I'm sure
    this used to work!

    Thanks.

    --
    Mike Scott
    Harlow, England

    --- MBSE BBS v1.1.0 (Linux-x86_64)
    * Origin: Scott family (2:250/1@fidonet)
  • From Mike Scott@2:250/1 to All on Friday, January 24, 2025 17:00:20
    On 24/01/2025 16:56, Mike Scott wrote:
    A very odd situation here.
    .....


    I should have mentioned that things do look OK in caja. The problem
    affects ls, find, as well as a perl test script using File::Find.


    --
    Mike Scott
    Harlow, England


    * Origin: Scott family (2:250/1@fidonet)
  • From Edmund@2:250/1 to All on Friday, January 24, 2025 17:01:45
    On 1/24/25 17:56, Mike Scott wrote:
    A very odd situation here.

    .....

    Wild guess, running out of disk space?





    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Mike Scott@2:250/1 to All on Friday, January 24, 2025 17:30:30
    On 24/01/2025 17:01, Edmund wrote:
    Wild guess, running out of disk space?

    Would it were so simple! No, but nice idea, thanks.


    --
    Mike Scott
    Harlow, England


    * Origin: Scott family (2:250/1@fidonet)
  • From Richard Kettlewell@2:250/1 to All on Friday, January 24, 2025 17:55:37
    Mike Scott <usenet.16@scottsonline.org.uk.invalid> writes:
    A very odd situation here.

    I have a (freebsd) server serving a tree of photos and information
    files. It's large, and the paths quite long - whether that's relevant
    I don't know.

    How many files in the directory?

    On two of three machines all running mint at various versions all is
    well; I have problems on the third, which happens to be my desktop
    box. An example good listing would be (sorry about wrap):

    mike@troi ~ $ ls /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.*
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png

    That corresponds exactly to what's on the server.


    However, on my desktop m/c, the same command complains about a missing
    file, and triplicates all the lines bar the first, which is
    duplicated, and there's an error about not finding a file that has an incorrect name anyway:

    Desktop> ls /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.*
    ls: cannot access '/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.2': No such file or directory
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png


    If I unmount and remount the file system, I get different results -
    always works on the other machines, and fails /differently/ each time
    on mine.


    I've also seen this happen in a virtual machine running on my box.

    It happens whether I hard mount or use the automounter.


    The OS versions are different - I'm running mint 21.2, the VM is at
    21.3; while the others are both rather older versions (and different hardware). The machines are all configured the same.


    I'm at a loss! Can anyone suggest what's going on here please? I'm
    sure this used to work!

    Formerly, Linux NFS servers could get confused by large directories. https://lwn.net/Articles/544520/ is the best writeup I’ve found.

    In your case the server is FreeBSD, so Linux’s historical bugs aren’t directly relevant, beyond highlighting that merely listing a directory
    is more complex than you might initially imagine. I’m not sure why a hypothetical similar bug in FreeBSD would only be visible on a subset of clients either.

    --
    https://www.greenend.org.uk/rjk/

    * Origin: terraraq NNTP server (2:250/1@fidonet)
  • From Carlos E.R.@2:250/1 to All on Friday, January 24, 2025 21:56:30
    On 2025-01-24 17:56, Mike Scott wrote:
    A very odd situation here.

    I have a (freebsd) server serving a tree of photos and information
    files. It's large, and the paths quite long - whether that's relevant I don't know.

    If you are using nfs version 3, perhaps try version 4.
    If version 4, what's in the exports file?

    All client machines have the same fstab nfs entry?



    --
    Cheers, Carlos.

    * Origin: Air Applewood, The Linux Gateway to the UK & Eire (2:250/1@fidonet)
  • From Paul@2:250/1 to All on Friday, January 24, 2025 22:34:19
    On Fri, 1/24/2025 4:56 PM, Carlos E.R. wrote:
    On 2025-01-24 17:56, Mike Scott wrote:
    A very odd situation here.

    I have a (freebsd) server serving a tree of photos and information files.
    It's large, and the paths quite long - whether that's relevant I don't know.

    If you are using nfs version 3, perhaps try version 4.
    If version 4, what's in the exports file?

    All client machines have the same fstab nfs entry?

    I would be just a bit curious about the software versions myself.

    Back when I was using nfs at work, that topic came up quite often.
    What is the version at each end?

    The FreeBSD have their own taste in software, so there's no reason
    for anything to particularly match Linux.

    I would be examining the versions on the cases that work,
    and checking the versions in the non-working case.

    In modern times, some of the computers have "power management"
    and that could influence whether things like "stale mounts"
    are showing up. You would want to find a log and see if
    there is any sign of behaviors like that (mount malfunctions
    because the disk could not be accessed in time, like a stat()
    check).

    Even your NIC can be set to power down when not in use.

    Paul

    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Arti F. Idiot@2:250/1 to All on Saturday, January 25, 2025 00:13:23
    On 1/24/25 9:56 AM, Mike Scott wrote:
    A very odd situation here.

    I have a (freebsd) server serving a tree of photos and information
    files. It's large, and the paths quite long - whether that's relevant I don't know.

    On two of three machines all running mint at various versions all is
    well; I have problems on the third, which happens to be my desktop box.
    An example good listing would be (sorry about wrap):

    <snip>

    The OS versions are different - I'm running mint 21.2, the VM is at
    21.3; while the others are both rather older versions (and different hardware). The machines are all configured the same.


    I'm at a loss! Can anyone suggest what's going on here please? I'm sure
    this used to work!

    I'm sure you've already checked for aliased commands but if not..

    Any chance the problem machine has a different filesystem, i.e. BTRFS ?
    Delayed CoW processing of large NFS mounts could cause some weirdness.

    * Origin: Anarchists of America (2:250/1@fidonet)
  • From Carlos E.R.@2:250/1 to All on Saturday, January 25, 2025 00:45:25
    On 2025-01-24 23:34, Paul wrote:
    On Fri, 1/24/2025 4:56 PM, Carlos E.R. wrote:
    On 2025-01-24 17:56, Mike Scott wrote:
    A very odd situation here.

    ....

    The FreeBSD have their own taste in software, so there's no reason
    for anything to particularly match Linux.

    I would be examining the versions on the cases that work,
    and checking the versions in the non-working case.

    In modern times, some of the computers have "power management"
    and that could influence whether things like "stale mounts"
    are showing up. You would want to find a log and see if
    there is any sign of behaviors like that (mount malfunctions
    because the disk could not be accessed in time, like a stat()
    check).

    Even your NIC can be set to power down when not in use.

    I have seen nfs survive hibernation of the machines. It is quite resilient.

    Then we typically forget about the "fsid= " number.

    --
    Cheers, Carlos.

    * Origin: Air Applewood, The Linux Gateway to the UK & Eire (2:250/1@fidonet)
  • From Mike Scott@2:250/1 to All on Saturday, January 25, 2025 17:02:35
    On 24/01/2025 21:56, Carlos E.R. wrote:
    On 2025-01-24 17:56, Mike Scott wrote:
    A very odd situation here.

    I have a (freebsd) server serving a tree of photos and information
    files. It's large, and the paths quite long - whether that's relevant
    I don't know.

    If you are using nfs version 3, perhaps try version 4.
    If version 4, what's in the exports file?

    All client machines have the same fstab nfs entry?




    Thanks to all for commenting.

    To clarify a few points:

    The clients in question all use autofs, and the tables are copies of a
    central master. So nfs options should be the same.

    On my own machine, it made no difference whether the fs was automounted
    or mounted manually.

    The precise point of error changes when the fs is remounted, manually or
    by reboot.

    BTW NFSv4 isn't really an option. A very different beast, and doesn't
    seem to offer anything I need.

    It used to work, I'm (nearly) sure - it affects my software to make a
    photo index which dates back years. I'm sure I'd have noticed an issue.

    The directories can each have several hundred files.



    Currently, I see:

    Desktop> find /nfs/mmedia/pictures/originals-index4/mike/master_digital_camera/2008_0418/ -ls >/dev/null
    find: ‘/nfs/mmedia/pictures/originals-index4/mike/master_digital_camera/2008_0418/2008_040’: No such file or directory

    ls /nfs/mmedia/pictures/originals-index4/mike/master_digital_camera/2008_0418/
    <skip>
    2008_0216_102631.jpg--slide.png
    2008_0216_102631.jpg--thumb.png
    2008_040 <<<<< odd extra entry
    2008_0406_070443.jpg.exif
    etc


    I copied just that folder to within /tmp, so local: the initial copy
    failed because of that bad entry, so I did a copy-and-paste of the
    originals. A 'diff -r' on the two directories moaned that 2008_040 only existed on the nfs folder, so at least at the moment, it's a spurious
    extra entry rather than a mangled real one.
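
    The comparison described above can be scripted. This is only a
    sketch, assuming listings with one entry name per line and no
    newlines inside file names; the helper names dup_names and
    only_in_first are made up for illustration and are not standard tools.

```shell
# dup_names: read entry names on stdin, print each name that occurs
# more than once (e.g. entries returned twice by a directory read).
dup_names() {
    sort | uniq -d
}

# only_in_first: print names found in listing file $1 but not in
# listing file $2 (e.g. client-side listing vs. server-side listing).
only_in_first() {
    a=$(mktemp)
    b=$(mktemp)
    sort -u "$1" > "$a"
    sort -u "$2" > "$b"
    comm -23 "$a" "$b"      # column 1 only: lines unique to the first file
    rm -f "$a" "$b"
}
```

    For example, "ls -f DIR | dup_names" run on the client would show
    repeated entries, and saving "ls -f DIR" output on both the client
    and the server and passing the two files to only_in_first would show
    spurious names such as the 2008_040 entry. DIR is a placeholder.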


    FWIW my machine is on Linux Mint 21.2 Victoria; one of the working ones
    is on Linux Mint 21.1 Vera, so not too different. The lappy I can't
    check ATM.


    .......

    Oh, I've just rebooted after an abortive attempt to run a DVD live
    system. Now I get

    find /nfs/mmedia/pictures/ -ls >/dev/null
    find: ‘/nfs/mmedia/pictures/originals-index4/mike/camera2018/20181210/2018-12-10/’: No such file or directory
    find: ‘/nfs/mmedia/pictures/originals-nokeys-index/mike/master_digital_camera/2008_1219/2008_121’: No such file or directory
    find: ‘/nfs/mmedia/pictures/originals-nokeys-index/mike/camera2018/20181210/2018-12-10/’: No such file or directory


    I'm pretty sure I had a "filename too long" yesterday as well.


    I'm at a loss as to what to try next - the live DVD seemed a good idea,
    but wouldn't boot: I'll have to think about a fresh thread for that one.



    --
    Mike Scott
    Harlow, England


    * Origin: Scott family (2:250/1@fidonet)
  • From Lawrence D'Oliveiro@2:250/1 to All on Sunday, January 26, 2025 00:01:47
    On Fri, 24 Jan 2025 16:56:00 +0000, Mike Scott wrote:

    On two of three machines all running mint at various versions all is
    well; I have problems on the third, which happens to be my desktop box.

    Have you checked the system logs, on both client and server side, to see
    if any interesting messages appear when you are doing these listings?

    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Grant Taylor@2:250/1 to All on Sunday, January 26, 2025 02:08:05
    On 1/24/25 10:56, Mike Scott wrote:
    A very odd situation here.

    Yes, it seems that way.

    I don't have an answer, or even a hint. But I do have some additional observations that I didn't see mentioned in the thread:

    I have a (freebsd) server serving a tree of photos and information
    files. It's large, and the paths quite long - whether that's relevant I don't know.

    How long is "quite long"? Are you tickling any sort of limits?
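
    One way to put a number on "quite long" is to measure the longest
    full path and the longest single name component in the tree. A
    rough sketch, assuming paths arrive one per line (names containing
    newlines would be miscounted); longest_path is a hypothetical helper
    name, and depending on the awk in use, lengths may be counted in
    characters rather than bytes.

```shell
# longest_path: read paths on stdin, report the longest path length
# and the longest length of any single "/"-separated component.
longest_path() {
    awk '{
        if (length($0) > maxpath) maxpath = length($0)
        n = split($0, parts, "/")
        for (i = 1; i <= n; i++)
            if (length(parts[i]) > maxname) maxname = length(parts[i])
    }
    END { printf "path=%d name=%d\n", maxpath, maxname }'
}
```

    Typical use would be "find /nfs/mmedia/pictures | longest_path",
    with the result compared against "getconf PATH_MAX /nfs/mmedia" and
    "getconf NAME_MAX /nfs/mmedia" on the client.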

    On two of three machines all running mint at various versions all is
    well; I have problems on the third, which happens to be my desktop box.

    What versions (kernel, OS, etc.) are the three machines?

    An example good listing would be (sorry about wrap):

    The line wrap actually came through well on my end.

    mike@troi ~ $ ls /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.*
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
    /nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png

    That corresponds exactly to what's on the server.

    Okay.

    However, on my desktop m/c, the same command complains about a missing
    file, and triplicates all the lines bar the first, which is duplicated,
    and there's an error about not finding a file that has an incorrect name anyway:

    This feels like NFS losing state and/or synchronization between the
    client and server when listing directories.

    The duplication -> triplication and the wild name seem like something
    has failed somewhere at the underlying RPC layer.

    If I unmount and remount the file system, I get different results -
    always works on the other machines, and fails /differently/ each time on mine.

    Different network / RPC / NFS mismatches would likely happen with
    underlying protocol problems.

    I've also seen this happen in a virtual machine running on my box.

    Is the VM running on the same box that has the problem? Or is it
    running on a different system?

    It happens whether I hard mount or use the automounter.

    I'm not surprised by that. IMHO the auto-mounter's only role is to automatically mount (and unmount when idle) the NFS export using
    standard mount methods.

    The OS versions are different - I'm running mint 21.2, the VM is at
    21.3; while the others are both rather older versions (and different hardware). The machines are all configured the same.

    So not exactly the same versions, but close to each other.

    I'm at a loss! Can anyone suggest what's going on here please? I'm sure
    this used to work!

    I'd reach for a packet capture and feed it into Wireshark or something
    similar that can analyze the underlying UDP / TCP, RPC, and NFS protocol
    and call out any oddities.
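
    A capture along those lines might be set up as sketched here. The
    helper only prints a tcpdump command so it can be inspected before
    being run as root; nfs_capture_cmd and the host name "fileserver"
    are placeholders, and port 2049 is assumed to be where the server's
    NFS service listens (the usual default).

```shell
# nfs_capture_cmd SERVER [IFACE] [OUTFILE]
# Print (do not run) a tcpdump command that writes full-size packets
# exchanged with SERVER on the NFS port to a capture file.
nfs_capture_cmd() {
    server=$1
    iface=${2:-any}         # "any" captures on all interfaces on Linux
    out=${3:-nfs.pcap}
    printf 'tcpdump -i %s -s 0 -w %s host %s and port 2049\n' \
        "$iface" "$out" "$server"
}
```

    Running the printed command while repeating the failing ls or find,
    then opening the file in Wireshark, should show the NFSv3 READDIR
    and READDIRPLUS replies (procedures 16 and 17) that carry the
    directory entries.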



    --
    Grant. . . .

    * Origin: TNet Consulting (2:250/1@fidonet)
  • From Mike Scott@2:250/1 to All on Monday, January 27, 2025 15:41:37
    [ comp.unix.bsd.freebsd.misc added ]

    On 24/01/2025 17:55, Richard Kettlewell wrote:
    Mike Scott <usenet.16@scottsonline.org.uk.invalid> writes:
    A very odd situation here.

    I have a (freebsd) server serving a tree of photos and information
    files. It's large, and the paths quite long - whether that's relevant
    I don't know.

    [ screed about file names being truncated when read over nfs ]
    ........

    I'm at a loss! Can anyone suggest what's going on here please? I'm
    sure this used to work!

    Formerly, Linux NFS servers could get confused by large directories. https://lwn.net/Articles/544520/ is the best writeup I’ve found.

    In your case the server is FreeBSD, so Linux’s historical bugs aren’t directly relevant, beyond highlighting that merely listing a directory
    is more complex than you might initially imagine. I’m not sure why a hypothetical similar bug in FreeBSD would only be visible on a subset of clients either.



    OK, I've at least found what's happened, if not the root issue. Sort of
    mea culpa, for which I apologise.

    In spite of my assertion (which I should have checked and didn't), the
    mount options differed. The working machines all specified rsize=8192.
    My box was using a much larger figure, of 131072 (ie 32 * 4096).

    It seems anything over 8192 causes this issue - that filenames get
    truncated.
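
    For anyone comparing clients the same way: on Linux the options
    actually in effect (as opposed to those written in the autofs map)
    appear in /proc/mounts. A minimal sketch that pulls out rsize and
    wsize for each NFS mount; nfs_sizes is a made-up helper name, and
    "nfsstat -m" reports similar information.

```shell
# nfs_sizes [FILE]
# Print "mountpoint rsize wsize" for every nfs/nfs4 entry in a
# /proc/mounts-style file (default: /proc/mounts).
nfs_sizes() {
    awk '$3 == "nfs" || $3 == "nfs4" {
        rsize = "?"; wsize = "?"
        n = split($4, opts, ",")            # field 4 is the option list
        for (i = 1; i <= n; i++) {
            if (opts[i] ~ /^rsize=/) rsize = substr(opts[i], 7)
            if (opts[i] ~ /^wsize=/) wsize = substr(opts[i], 7)
        }
        print $2, rsize, wsize
    }' "${1:-/proc/mounts}"
}
```

    Running nfs_sizes on each client and comparing the output side by
    side would have shown the 8192 versus 131072 difference directly.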

    Whether that's a linux client issue or a freebsd server issue, or the
    result of interworking, I've no idea. Nor can I imagine why it should
    happen without errors being flagged up somewhere (I checked the logs at
    both ends) -- which is nasty, because I had a system that met the specs
    and mostly worked but very occasionally (< about 1 in 100k times, I
    reckon) failed silently. Ouch.


    Anyway, thanks to all for comments and advice offered. I'm back 'on the
    road'; maybe if someone else hits the same issue they'll find this thread.



    --
    Mike Scott
    Harlow, England


    * Origin: Scott family (2:250/1@fidonet)
  • From Lawrence D'Oliveiro@2:250/1 to All on Monday, January 27, 2025 23:24:38
    On Mon, 27 Jan 2025 15:41:37 +0000, Mike Scott wrote:

    In spite of my assertion (which I should have checked and didn't), the
    mount options differed. The working machines all specified rsize=8192.
    My box was using a much larger figure, of 131072 (ie 32 * 4096).

    It seems anything over 8192 causes this issue - that filenames get truncated.

    I don’t understand why increasing rsize on its own would have any
    effect: according to the docs, that only controls the maximum size of
    packets that this end can receive; the maximum size the other end can
    send is limited by that end’s wsize value. So increasing rsize on its
    own should have no effect.

    Looking up NFS mount options online, this page <https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/4/html/reference_guide/s2-nfs-client-config-options#s2-nfs-client-config-options>
    does say “be careful when changing these values; some older Linux
    kernels and network cards do not work well with larger block sizes”.

    Whether that's a linux client issue or a freebsd server issue, or the
    result of interworking, I've no idea. Nor can I imagine why it should
    happen without errors being flagged up somewhere (I checked the logs at
    both ends) -- which is nasty, because I had a system that met the specs
    and mostly worked but very occasionally (< about 1 in 100k times, I
    reckon) failed silently. Ouch.

    That really baffles me, that you don’t see any errors indicating there was
    a problem.

    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Mike Scott@2:250/1 to All on Tuesday, January 28, 2025 08:06:20
    On 27/01/2025 23:24, Lawrence D'Oliveiro wrote:
    On Mon, 27 Jan 2025 15:41:37 +0000, Mike Scott wrote:

    In spite of my assertion (which I should have checked and didn't), the
    mount options differed. The working machines all specified rsize=8192.
    My box was using a much larger figure, of 131072 (ie 32 * 4096).

    It seems anything over 8192 causes this issue - that filenames get
    truncated.

    I don’t understand why increasing rsize on its own would have any
    effect: according to the docs, that only controls the maximum size of
    packets that this end can receive; the maximum size the other end can
    send is limited by that end’s wsize value. So increasing rsize on its
    own should have no effect.

    Looking up NFS mount options online, this page <https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/4/html/reference_guide/s2-nfs-client-config-options#s2-nfs-client-config-options>
    does say “be careful when changing these values; some older Linux
    kernels and network cards do not work well with larger block sizes”.

    Whether that's a linux client issue or a freebsd server issue, or the
    result of interworking, I've no idea. Nor can I imagine why it should
    happen without errors being flagged up somewhere (I checked the logs at
    both ends) -- which is nasty, because I had a system that met the specs
    and mostly worked but very occasionally (< about 1 in 100k times, I
    reckon) failed silently. Ouch.

    That really baffles me, that you don’t see any errors indicating there was a problem.

    Yes, it's an odd one in many ways. Not least because rsize/wsize are
    supposed to be irrelevant for tcp mounts (which is all the server
    provides anyway)

    I've just tried a loopback NFS mount on the server (it's the only fbsd
    box I have to hand) and can't force the problem to show. So presumably
    it's something to do with the inter-system working, but I don't have
    the knowledge to delve further :-{

    So I'll have to settle for 'it works now'. But as I noted, I'm very
    discomforted that such a problem is even possible without errors being
    flagged somewhere.

    Thanks again to all who've responded.

    --
    Mike Scott
    Harlow, England


    --- MBSE BBS v1.1.0 (Linux-x86_64)
    * Origin: Scott family (2:250/1@fidonet)
  • From Carlos E.R.@2:250/1 to All on Tuesday, January 28, 2025 11:34:42
    On 2025-01-28 09:06, Mike Scott wrote:
    On 27/01/2025 23:24, Lawrence D'Oliveiro wrote:
    On Mon, 27 Jan 2025 15:41:37 +0000, Mike Scott wrote:

    ....

    So I'll have to settle for 'it works now'. But as I noted, I'm very
    discomforted that such a problem is even possible without errors being
    flagged somewhere.

    Maybe report it to the bug trackers of the "distributions" involved.


    --
    Cheers, Carlos.

    --- MBSE BBS v1.1.0 (Linux-x86_64)
    * Origin: Air Applewood, The Linux Gateway to the UK & Eire (2:250/1@fidonet)
  • From pinnerite@2:250/1 to All on Tuesday, January 28, 2025 22:58:45
    On Tue, 28 Jan 2025 08:06:20 +0000
    Mike Scott <usenet.16@scottsonline.org.uk.invalid> wrote:

    ....

    I had a similar situation several months ago.
    I tested the drive and tried repairs, but in the end it was clear the
    drive had reached its "sell-by" date.
    Then a second one went too.
    Both were made by Seagate (in different countries) and manufactured
    five years apart.

    Regards,

    Alan

    --
    Linux Mint 21.3 kernel version 5.15.0-127-generic Cinnamon 6.0.4
    AMD Ryzen 7 7700, Radeon RX 6600, 32GB DDR5, 1TB SSD, 2TB Barracuda

    --- MBSE BBS v1.1.0 (Linux-x86_64)
    * Origin: A noiseless patient Spider (2:250/1@fidonet)