• Ubuntu 22 Boot Errors

    From Java Jive@2:250/1 to All on Sunday, December 08, 2024 11:41:05
    I'm going round my Ubuntu 22 machines trying to remove error and fail
    messages from the boot, mostly successfully, but five anomalies are
    proving hard to fix, despite which the PCs all seem to work ...
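
    For anyone doing the same clean-up, here is a minimal sketch of how the
    error-level messages of a boot might be tallied. The function name
    summarize_errors is made up for illustration, and it assumes journalctl's
    default short output format (month, day, time, hostname, then the message).

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): collapse journal lines
# into "count message" pairs so that repeated errors stand out.
# Expects lines such as:
#   Dec 07 18:49:55 HOSTNAME kernel: ERROR: Unable to locate IOAPIC for GSI 37
summarize_errors() {
    # Blank the first four fields (month, day, time, hostname) and trim the
    # leading spaces that remain. Rebuilding the line also squeezes runs of
    # spaces inside the message into single spaces.
    awk '{ $1 = $2 = $3 = $4 = ""; sub(/^ +/, ""); print }' |
        sort | uniq -c | sort -rn
}

# Typical use on a live system (current boot, priority "err" and worse):
#   journalctl -b -q -p err --no-pager | summarize_errors
```

    The -q option suppresses journalctl's informational header lines, which
    would otherwise be counted as messages.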

    1) IOAPIC

    This is occurring on more than one PC. Searching for ...

    ERROR: Unable to locate IOAPIC for GSI 37

    .... gave hits but seemingly no useful way of finding and removing the
    cause. Relevant output from 'journalctl -b' is apparently just ...

    Dec 07 18:49:55 HOSTNAME kernel: ERROR: Unable to locate IOAPIC for GSI 37
    Dec 07 18:49:55 HOSTNAME kernel: ERROR: Unable to locate IOAPIC for GSI 38

    .... but here, in case it's relevant, are those lines in a wider context:

    Dec 07 18:49:55 HOSTNAME kernel: scsi 4:0:0:0: Direct-Access ATA KINGSTON SKC600M 0105 PQ: 0 ANSI: 5
    Dec 07 18:49:55 HOSTNAME kernel: sd 4:0:0:0: Attached scsi generic sg2 type 0
    Dec 07 18:49:55 HOSTNAME kernel: sd 4:0:0:0: [sdb] 500118192 512-byte logical blocks: (256 GB/238 GiB)
    Dec 07 18:49:55 HOSTNAME kernel: sd 4:0:0:0: [sdb] 4096-byte physical blocks
    Dec 07 18:49:55 HOSTNAME kernel: sd 4:0:0:0: [sdb] Write Protect is off
    Dec 07 18:49:55 HOSTNAME kernel: sd 4:0:0:0: [sdb] Mode Sense: 00 3a 00 00
    Dec 07 18:49:55 HOSTNAME kernel: sd 4:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    Dec 07 18:49:55 HOSTNAME kernel: sd 4:0:0:0: [sdb] Preferred minimum I/O size 4096 bytes
    Dec 07 18:49:55 HOSTNAME kernel: sdb: sdb1 sdb2 sdb3 sdb4
    Dec 07 18:49:55 HOSTNAME kernel: sd 4:0:0:0: [sdb] supports TCG Opal
    Dec 07 18:49:55 HOSTNAME kernel: sd 4:0:0:0: [sdb] Attached SCSI disk
    Dec 07 18:49:55 HOSTNAME kernel: ERROR: Unable to locate IOAPIC for GSI 37
    Dec 07 18:49:55 HOSTNAME kernel: ERROR: Unable to locate IOAPIC for GSI 38
    Dec 07 18:49:55 HOSTNAME kernel: e1000e 0000:00:19.0 eno1: renamed from eth0
    Dec 07 18:49:55 HOSTNAME kernel: EXT4-fs (sdb4): mounted filesystem 028fbdaf-b201-4b38-9471-d1c09dbc51c5 ro with ordered data mode. Quota mode: none.


    2) blkmapd

    I think this is occurring on ALL my Ubuntu 22 machines. It appears to be
    related to NFS, but networking seems fine (apart from a minor, seemingly
    unrelated issue that is already solved). Searching for ...

    "blkmapd[717]: open pipe file /run/rpc_pipefs/nfs/blocklayout
    failed: No such file or directory"

    .... yielded hits linking to or repeating one potential solution here ...

    https://forums.raspberrypi.com/viewtopic.php?t=323585

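    sudo mkdir -p /etc/systemd/system/nfs-blkmap.service.d
    # The redirection must run as root, so write the file through "sudo tee"
    # rather than "sudo cat >", where the unprivileged shell opens the file.
    sudo tee /etc/systemd/system/nfs-blkmap.service.d/fixpipe.conf >/dev/null <<EOF
    #
    [Service]
    ExecStartPre=/usr/sbin/modprobe blocklayoutdriver
    EOF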

    .... but the suggestion didn't work for me on any of my machines. Full
    log is just a single line:

    Dec 07 00:24:10 HOSTNAME blkmapd[717]: open pipe file /run/rpc_pipefs/nfs/blocklayout failed: No such file or directory
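
    A minimal sketch for checking, after a boot, whether the two things
    involved here actually exist: the pipe file named in the log line and the
    drop-in from the forum recipe. The function name check_blkmap and its
    optional arguments are assumptions for illustration; the default paths are
    the ones quoted above.

```shell
#!/bin/sh
# Hypothetical helper: report whether the blocklayout pipe and the
# fixpipe.conf drop-in are present. Both paths can be overridden, which
# also allows trying the function against a scratch directory.
check_blkmap() {
    pipe=${1:-/run/rpc_pipefs/nfs/blocklayout}
    dropin=${2:-/etc/systemd/system/nfs-blkmap.service.d/fixpipe.conf}

    if [ -e "$pipe" ]; then
        echo "pipe: present"
    else
        echo "pipe: missing"
    fi

    # The drop-in only counts if it contains the modprobe line.
    if [ -f "$dropin" ] &&
        grep -q '^ExecStartPre=.*modprobe blocklayoutdriver' "$dropin"; then
        echo "drop-in: present"
    else
        echo "drop-in: missing"
    fi
}

# On a live system one might additionally run:
#   systemctl cat nfs-blkmap.service    # shows the unit plus any drop-ins
#   lsmod | grep blocklayoutdriver      # is the module actually loaded?
```

    If the drop-in is reported present but the pipe is still missing, that
    would at least suggest the module load is not the whole story.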


    3) CUPS Scheduler

    This is also occurring on many or all of my Ubuntu 22 machines, even a
    while after a successful boot. Oddly, the status of the cups service
    always shows it to be working.

    Dec 07 17:06:00 HOSTNAME systemd[1]: cups.service: start operation timed out. Terminating.
    Dec 07 17:06:00 HOSTNAME systemd[1]: cups.service: Failed with result 'timeout'.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Failed to start CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: cups.service: Scheduled restart job, restart counter is at 5.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Stopped CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: cups.path: Deactivated successfully.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Stopped CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Stopping CUPS Scheduler...
    Dec 07 17:06:00 HOSTNAME systemd[1]: Started CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: cups.socket: Deactivated successfully.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Closed CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Stopping CUPS Scheduler...
    Dec 07 17:06:00 HOSTNAME systemd[1]: Listening on CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Starting CUPS Scheduler...
    Dec 07 17:06:00 HOSTNAME audit[1760]: AVC apparmor="DENIED" operation="capable" profile="/usr/sbin/cupsd" pid=1760 comm="cupsd" capability=12 capname="net_>
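
    The last line above ends in ">", which looks like the pager cutting a long
    line short, so the capability name is incomplete as quoted. A minimal
    sketch for listing AppArmor denials per profile from untruncated output;
    the function name apparmor_denials is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count AppArmor "DENIED" audit lines per profile and
# operation. Reads journal or syslog text on stdin.
apparmor_denials() {
    grep 'apparmor="DENIED"' |
        # Keep only the profile and operation values, in that order.
        sed -n 's/.*operation="\([^"]*\)".*profile="\([^"]*\)".*/\2 \1/p' |
        sort | uniq -c
}

# Typical use on a live system; --no-pager prints full-length lines:
#   journalctl -b -q --no-pager | apparmor_denials
```

    Whether the denial is connected to the start timeouts is not something
    this shows; it only makes the denials easier to see.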


    4) UBSAN

    This is on a laptop with two on-board GPUs, and I think is related to
    that fact. However, there were no hits under DuckDuckGo, Google, or
    Yahoo for ...

    Ubuntu 22 "UBSAN: array-index-out-of-bounds in /build/linux-hwe-6.8-W0MdK2/linux-hwe-6.8-6.8.0/drivers/gpu/drm/radeon/radeon_atombios.c:633:33"

    .... full output from 'journalctl -b' is (two sections):

    Dec 06 23:15:49 HOSTNAME kernel: ------------[ cut here ]------------
    Dec 06 23:15:49 HOSTNAME kernel: UBSAN: array-index-out-of-bounds in /build/linux-hwe-6.8-W0MdK2/linux-hwe-6.8-6.8.0/drivers/gpu/drm/radeon/radeon_atombios.c:633:33
    Dec 06 23:15:49 HOSTNAME kernel: index 22 is out of range for type 'int [22]'
    Dec 06 23:15:49 HOSTNAME kernel: CPU: 1 PID: 376 Comm: systemd-udevd Not tainted 6.8.0-47-generic #47~22.04.1-Ubuntu
    Dec 06 23:15:49 HOSTNAME kernel: Hardware name: Dell Inc. Precision M6800/0F5HF3, BIOS A26 06/13/2019
    Dec 06 23:15:49 HOSTNAME kernel: Call Trace:
    Dec 06 23:15:49 HOSTNAME kernel: <TASK>
    Dec 06 23:15:49 HOSTNAME kernel: dump_stack_lvl+0x76/0xa0
    Dec 06 23:15:49 HOSTNAME kernel: dump_stack+0x10/0x20
    Dec 06 23:15:49 HOSTNAME kernel: __ubsan_handle_out_of_bounds+0xc6/0x110
    Dec 06 23:15:49 HOSTNAME kernel: radeon_get_atom_connector_info_from_object_table+0xa1e/0xa80 [radeon]
    Dec 06 23:15:49 HOSTNAME kernel: ? radeon_get_atom_connector_info_from_supported_devices_table+0x797/0x890 [radeon]
    Dec 06 23:15:49 HOSTNAME kernel: radeon_modeset_init+0x36b/0x3f0 [radeon]
    Dec 06 23:15:49 HOSTNAME kernel: ? radeon_modeset_init+0x36b/0x3f0 [radeon]
    Dec 06 23:15:49 HOSTNAME kernel: radeon_driver_load_kms+0xdf/0x300 [radeon]
    Dec 06 23:15:49 HOSTNAME kernel: drm_dev_register+0x12b/0x2a0
    Dec 06 23:15:49 HOSTNAME kernel: radeon_pci_probe+0xec/0x180 [radeon]
    Dec 06 23:15:49 HOSTNAME kernel: local_pci_probe+0x47/0xb0
    Dec 06 23:15:49 HOSTNAME kernel: pci_call_probe+0x55/0x1a0
    Dec 06 23:15:49 HOSTNAME kernel: pci_device_probe+0x84/0x120
    Dec 06 23:15:49 HOSTNAME kernel: really_probe+0x1cc/0x430
    Dec 06 23:15:49 HOSTNAME kernel: __driver_probe_device+0x8c/0x190
    Dec 06 23:15:49 HOSTNAME kernel: driver_probe_device+0x24/0xd0
    Dec 06 23:15:49 HOSTNAME kernel: __driver_attach+0x10b/0x210
    Dec 06 23:15:49 HOSTNAME kernel: ? __pfx___driver_attach+0x10/0x10
    Dec 06 23:15:49 HOSTNAME kernel: bus_for_each_dev+0x8d/0xf0
    Dec 06 23:15:49 HOSTNAME kernel: driver_attach+0x1e/0x30
    Dec 06 23:15:49 HOSTNAME kernel: bus_add_driver+0x14e/0x290
    Dec 06 23:15:49 HOSTNAME kernel: driver_register+0x5e/0x130
    Dec 06 23:15:49 HOSTNAME kernel: ? __pfx_radeon_module_init+0x10/0x10 [radeon]
    Dec 06 23:15:49 HOSTNAME kernel: __pci_register_driver+0x5e/0x70
    Dec 06 23:15:49 HOSTNAME kernel: radeon_module_init+0x4c/0xff0 [radeon]
    Dec 06 23:15:49 HOSTNAME kernel: do_one_initcall+0x5e/0x340
    Dec 06 23:15:49 HOSTNAME kernel: do_init_module+0x97/0x290
    Dec 06 23:15:49 HOSTNAME kernel: load_module+0xb85/0xcd0
    Dec 06 23:15:49 HOSTNAME kernel: ? security_kernel_post_read_file+0x75/0x90
    Dec 06 23:15:49 HOSTNAME kernel: init_module_from_file+0x96/0x100
    Dec 06 23:15:49 HOSTNAME kernel: ? init_module_from_file+0x96/0x100
    Dec 06 23:15:49 HOSTNAME kernel: idempotent_init_module+0x11c/0x2b0
    Dec 06 23:15:49 HOSTNAME kernel: __x64_sys_finit_module+0x64/0xd0
    Dec 06 23:15:49 HOSTNAME kernel: x64_sys_call+0x169c/0x24b0
    Dec 06 23:15:49 HOSTNAME kernel: do_syscall_64+0x81/0x170
    Dec 06 23:15:49 HOSTNAME kernel: ? do_syscall_64+0x8d/0x170
    Dec 06 23:15:49 HOSTNAME kernel: ? irqentry_exit+0x43/0x50
    Dec 06 23:15:49 HOSTNAME kernel: entry_SYSCALL_64_after_hwframe+0x78/0x80
    Dec 06 23:15:49 HOSTNAME kernel: RIP: 0033:0x7372f531e88d
    Dec 06 23:15:49 HOSTNAME kernel: Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 >
    Dec 06 23:15:49 HOSTNAME kernel: RSP: 002b:00007fff96f14378 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
    Dec 06 23:15:49 HOSTNAME kernel: RAX: ffffffffffffffda RBX: 000063415483a6b0 RCX: 00007372f531e88d
    Dec 06 23:15:49 HOSTNAME kernel: RDX: 0000000000000000 RSI: 00007372f5599441 RDI: 0000000000000015
    Dec 06 23:15:49 HOSTNAME kernel: RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
    Dec 06 23:15:49 HOSTNAME kernel: R10: 0000000000000015 R11: 0000000000000246 R12: 00007372f5599441
    Dec 06 23:15:49 HOSTNAME kernel: R13: 000063415473c130 R14: 000063415483a7a0 R15: 00006341548364e0
    Dec 06 23:15:49 HOSTNAME kernel: </TASK>
    Dec 06 23:15:49 HOSTNAME kernel: ---[ end trace ]---

    .... later ...

    Dec 07 17:08:16 HOSTNAME kernel: [drm] PCIE GART of 2048M enabled (table at 0x00000000001D6000).
    Dec 07 17:08:16 HOSTNAME kernel: radeon 0000:01:00.0: WB enabled
    Dec 07 17:08:16 HOSTNAME kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00
    Dec 07 17:08:16 HOSTNAME kernel: radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04
    Dec 07 17:08:16 HOSTNAME kernel: radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08
    Dec 07 17:08:16 HOSTNAME kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c
    Dec 07 17:08:16 HOSTNAME kernel: radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10
    Dec 07 17:08:16 HOSTNAME kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18
    Dec 07 17:08:16 HOSTNAME kernel: radeon 0000:01:00.0: fence driver on ring 6 use gpu addr 0x0000000080000c18
    Dec 07 17:08:16 HOSTNAME kernel: radeon 0000:01:00.0: fence driver on ring 7 use gpu addr 0x0000000080000c1c
    Dec 07 17:08:17 HOSTNAME kernel: debugfs: File 'radeon_ring_gfx' in directory '0' already present!
    Dec 07 17:08:17 HOSTNAME kernel: debugfs: File 'radeon_ring_cp1' in directory '0' already present!
    Dec 07 17:08:17 HOSTNAME kernel: debugfs: File 'radeon_ring_cp2' in directory '0' already present!
    Dec 07 17:08:17 HOSTNAME kernel: debugfs: File 'radeon_ring_dma1' in directory '0' already present!
    Dec 07 17:08:17 HOSTNAME kernel: debugfs: File 'radeon_ring_dma2' in directory '0' already present!
    Dec 07 17:08:17 HOSTNAME kernel: [drm] ring test on 0 succeeded in 1 usecs
    Dec 07 17:08:17 HOSTNAME kernel: [drm] ring test on 1 succeeded in 1 usecs
    Dec 07 17:08:17 HOSTNAME kernel: [drm] ring test on 2 succeeded in 1 usecs
    Dec 07 17:08:17 HOSTNAME kernel: [drm] ring test on 3 succeeded in 9 usecs
    Dec 07 17:08:17 HOSTNAME kernel: [drm] ring test on 4 succeeded in 3 usecs
    Dec 07 17:08:17 HOSTNAME kernel: debugfs: File 'radeon_ring_uvd' in directory '0' already present!
    Dec 07 17:08:17 HOSTNAME kernel: [drm] ring test on 5 succeeded in 2 usecs
    Dec 07 17:08:17 HOSTNAME kernel: [drm] UVD initialized successfully.
    Dec 07 17:08:17 HOSTNAME kernel: debugfs: File 'radeon_ring_vce1' in directory '0' already present!
    Dec 07 17:08:17 HOSTNAME kernel: debugfs: File 'radeon_ring_vce2' in directory '0' already present!
    Dec 07 17:08:17 HOSTNAME kernel: [drm] ring test on 6 succeeded in 12 usecs
    Dec 07 17:08:17 HOSTNAME kernel: [drm] ring test on 7 succeeded in 4 usecs
    Dec 07 17:08:17 HOSTNAME kernel: [drm] VCE initialized successfully.
    Dec 07 17:08:17 HOSTNAME kernel: [drm] ib test on ring 0 succeeded in 0 usecs
    Dec 07 17:08:17 HOSTNAME kernel: [drm] ib test on ring 1 succeeded in 0 usecs
    Dec 07 17:08:17 HOSTNAME kernel: [drm] ib test on ring 2 succeeded in 0 usecs
    Dec 07 17:08:17 HOSTNAME kernel: [drm] ib test on ring 3 succeeded in 0 usecs
    Dec 07 17:08:17 HOSTNAME kernel: [drm] ib test on ring 4 succeeded in 0 usecs
    Dec 07 17:08:18 HOSTNAME kernel: [drm] ib test on ring 5 succeeded
    Dec 07 17:08:18 HOSTNAME kernel: [drm] ib test on ring 6 succeeded
    Dec 07 17:08:19 HOSTNAME kernel: [drm] ib test on ring 7 succeeded


    5) Initiating RAM registers

    The following is occurring very early in the logs on 2 machines with
    32GB RAM, and seems to be about how to set up registers for RAM access,
    as per these two links ...

    https://web.archive.org/web/20190904223631/http://my-fuzzy-logic.de/blog/index.php?/archives/41-Solving-linux-MTRR-problems.html
    https://forum.proxmox.com/threads/solved-proxmox-8-1-3-secureboot-and-bad-gran-size-errors-how-to-fix.138470/

    Dec 06 21:18:47 HOSTNAME kernel: total RAM covered: 32702M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 64K chunk_size: 64K num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 64K chunk_size: 128K num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 64K chunk_size: 256K num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 64K chunk_size: 512K num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 64K chunk_size: 1M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 64K chunk_size: 2M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 64K chunk_size: 4M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 64K chunk_size: 8M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 64K chunk_size: 16M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 64K chunk_size: 32M num_reg: 10 lose cover RAM: 238M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 64K chunk_size: 64M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 64K chunk_size: 128M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 64K chunk_size: 256M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 64K chunk_size: 512M num_reg: 10 lose cover RAM: -274M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 64K chunk_size: 1G num_reg: 10 lose cover RAM: -272M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 64K chunk_size: 2G num_reg: 10 lose cover RAM: -1296M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 128K chunk_size: 128K num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 128K chunk_size: 256K num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 128K chunk_size: 512K num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 128K chunk_size: 1M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 128K chunk_size: 2M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 128K chunk_size: 4M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 128K chunk_size: 8M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 128K chunk_size: 16M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 128K chunk_size: 32M num_reg: 10 lose cover RAM: 238M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 128K chunk_size: 64M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 128K chunk_size: 128M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 128K chunk_size: 256M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 128K chunk_size: 512M num_reg: 10 lose cover RAM: -274M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 128K chunk_size: 1G num_reg: 10 lose cover RAM: -272M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 128K chunk_size: 2G num_reg: 10 lose cover RAM: -1296M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 256K chunk_size: 256K num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 256K chunk_size: 512K num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 256K chunk_size: 1M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 256K chunk_size: 2M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 256K chunk_size: 4M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 256K chunk_size: 8M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 256K chunk_size: 16M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 256K chunk_size: 32M num_reg: 10 lose cover RAM: 238M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 256K chunk_size: 64M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 256K chunk_size: 128M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 256K chunk_size: 256M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 256K chunk_size: 512M num_reg: 10 lose cover RAM: -274M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 256K chunk_size: 1G num_reg: 10 lose cover RAM: -272M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 256K chunk_size: 2G num_reg: 10 lose cover RAM: -1296M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 512K chunk_size: 512K num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 512K chunk_size: 1M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 512K chunk_size: 2M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 512K chunk_size: 4M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 512K chunk_size: 8M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 512K chunk_size: 16M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 512K chunk_size: 32M num_reg: 10 lose cover RAM: 238M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 512K chunk_size: 64M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 512K chunk_size: 128M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 512K chunk_size: 256M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 512K chunk_size: 512M num_reg: 10 lose cover RAM: -274M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 512K chunk_size: 1G num_reg: 10 lose cover RAM: -272M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 512K chunk_size: 2G num_reg: 10 lose cover RAM: -1296M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 1M chunk_size: 1M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 1M chunk_size: 2M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 1M chunk_size: 4M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 1M chunk_size: 8M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 1M chunk_size: 16M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 1M chunk_size: 32M num_reg: 10 lose cover RAM: 238M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 1M chunk_size: 64M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 1M chunk_size: 128M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 1M chunk_size: 256M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 1M chunk_size: 512M num_reg: 10 lose cover RAM: -274M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 1M chunk_size: 1G num_reg: 10 lose cover RAM: -272M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 1M chunk_size: 2G num_reg: 10 lose cover RAM: -1296M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 2M chunk_size: 2M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 2M chunk_size: 4M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 2M chunk_size: 8M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 2M chunk_size: 16M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 2M chunk_size: 32M num_reg: 10 lose cover RAM: 238M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 2M chunk_size: 64M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 2M chunk_size: 128M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 2M chunk_size: 256M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 2M chunk_size: 512M num_reg: 10 lose cover RAM: -274M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 2M chunk_size: 1G num_reg: 10 lose cover RAM: -272M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 2M chunk_size: 2G num_reg: 10 lose cover RAM: -1296M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 4M chunk_size: 4M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 4M chunk_size: 8M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 4M chunk_size: 16M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 4M chunk_size: 32M num_reg: 10 lose cover RAM: 238M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 4M chunk_size: 64M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 4M chunk_size: 128M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 4M chunk_size: 256M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 4M chunk_size: 512M num_reg: 10 lose cover RAM: -274M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 4M chunk_size: 1G num_reg: 10 lose cover RAM: -270M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 4M chunk_size: 2G num_reg: 10 lose cover RAM: -1294M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 8M chunk_size: 8M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 8M chunk_size: 16M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 8M chunk_size: 32M num_reg: 10 lose cover RAM: 238M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 8M chunk_size: 64M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 8M chunk_size: 128M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 8M chunk_size: 256M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 8M chunk_size: 512M num_reg: 10 lose cover RAM: -274M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 8M chunk_size: 1G num_reg: 10 lose cover RAM: -266M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 8M chunk_size: 2G num_reg: 10 lose cover RAM: -1290M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 16M chunk_size: 16M num_reg: 10 lose cover RAM: 110M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 16M chunk_size: 32M num_reg: 10 lose cover RAM: 238M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 16M chunk_size: 64M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 16M chunk_size: 128M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 16M chunk_size: 256M num_reg: 10 lose cover RAM: -18M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 16M chunk_size: 512M num_reg: 10 lose cover RAM: -274M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 16M chunk_size: 1G num_reg: 10 lose cover RAM: -242M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 16M chunk_size: 2G num_reg: 10 lose cover RAM: -1266M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 32M chunk_size: 32M num_reg: 10 lose cover RAM: 62M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 32M chunk_size: 64M num_reg: 10 lose cover RAM: 30M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 32M chunk_size: 128M num_reg: 10 lose cover RAM: 30M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 32M chunk_size: 256M num_reg: 10 lose cover RAM: 30M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 32M chunk_size: 512M num_reg: 10 lose cover RAM: -226M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 32M chunk_size: 1G num_reg: 10 lose cover RAM: 30M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 32M chunk_size: 2G num_reg: 10 lose cover RAM: -994M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 64M chunk_size: 64M num_reg: 10 lose cover RAM: 62M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 64M chunk_size: 128M num_reg: 10 lose cover RAM: 62M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 64M chunk_size: 256M num_reg: 10 lose cover RAM: 62M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 64M chunk_size: 512M num_reg: 10 lose cover RAM: -194M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 64M chunk_size: 1G num_reg: 10 lose cover RAM: 62M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 64M chunk_size: 2G num_reg: 10 lose cover RAM: -962M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 128M chunk_size: 128M num_reg: 8 lose cover RAM: 190M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 128M chunk_size: 256M num_reg: 10 lose cover RAM: 190M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 128M chunk_size: 512M num_reg: 10 lose cover RAM: -66M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 128M chunk_size: 1G num_reg: 10 lose cover RAM: 190M
    Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 128M chunk_size: 2G num_reg: 10 lose cover RAM: -834M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 256M chunk_size: 256M num_reg: 6 lose cover RAM: 446M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 256M chunk_size: 512M num_reg: 6 lose cover RAM: 446M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 256M chunk_size: 1G num_reg: 7 lose cover RAM: 446M
    Dec 06 21:18:47 HOSTNAME kernel: gran_size: 256M chunk_size: 2G num_reg: 8 lose cover RAM: 446M
    Dec 06 21:18:47 HOSTNAME kernel: Linux version 6.8.0-47-generic (buildd@lcy02-amd64-019) (x86_64-linux-gnu-gcc-12 (Ubuntu 12.3.0-1ubuntu1~22.04) 12.3.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #47~22.04.1-Ubuntu SMP PREEMPT_D


    --

    Fake news kills!

    I may be contacted via the contact address given on my website: www.macfh.co.uk

    --- MBSE BBS v1.1.0 (Linux-x86_64)
    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Andy Burns@2:250/1 to All on Sunday, December 08, 2024 12:00:34
    Java Jive wrote:

          ERROR: Unable to locate IOAPIC for GSI 37

    Possible dodgy config tables in BIOS/UEFI.

    firmware upgrade?
    or legacy options you can disable?

    --- MBSE BBS v1.1.0 (Linux-x86_64)
    * Origin: Air Applewood, The Linux Gateway to the UK & Eire (2:250/1@fidonet)
  • From vallor@2:250/1 to All on Monday, December 09, 2024 03:36:32
    On Sun, 8 Dec 2024 11:41:05 +0000, Java Jive <java@evij.com.invalid> wrote
    in <vj40kk$3p436$1@dont-email.me>:

          Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 128M chunk_size: 2G num_reg: 10 lose cover RAM: -834M

    I don't know why this would happen, but if it happened to me,
    I'd run "lsmem" and "free" and make sure all the memory eventually
    made it online...

    Also, it looks like it might have something to do with mtrr. On
    my host, dmesg reads:

    [ 0.000000] total RAM covered: 3071M
    [ 0.000000] Found optimal setting for mtrr clean up
    [ 0.000000] gran_size: 64K chunk_size: 128M num_reg: 3 lose cover RAM: 0G
    [ 0.000000] MTRR map: 7 entries (3 fixed + 4 variable; max 20), built from 9 variable MTRRs

    So these entries may be due to MTRR, which according to
    Documentation/arch/x86/mtrr.rst is getting phased out. On my
    system, I see this when I cat /proc/mtrr:

    $ sudo cat /proc/mtrr
    reg00: base=0x000000000 ( 0MB), size= 2048MB, count=1: write-back
    reg01: base=0x080000000 ( 2048MB), size= 1024MB, count=1: write-back
    reg02: base=0x0ba0a0000 ( 2976MB), size= 64KB, count=1: uncachable

    Give a gander to that document, it outlines what mtrr might be used for
    in modern-day kernels.

    https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/Documentation/arch/x86/mtrr.rst?h=v6.12.3
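
    As a rough cross-check of a /proc/mtrr listing against the "total RAM
    covered" figure, here is a minimal sketch that totals the write-back
    ranges. The function name mtrr_writeback_mb is made up for illustration,
    and it assumes sizes are printed in KB or MB as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: sum the sizes of all write-back MTRR ranges, in MB.
# Reads /proc/mtrr-style lines on stdin, for example:
#   reg00: base=0x000000000 ( 0MB), size= 2048MB, count=1: write-back
mtrr_writeback_mb() {
    awk '
        /write-back/ {
            line = $0
            sub(/.*size= */, "", line)       # keep text from the size value on
            sub(/,.*/, "", line)             # drop everything after the size
            n = line + 0                     # numeric prefix, e.g. 2048
            if (line ~ /KB$/) n = n / 1024   # convert KB entries to MB
            total += n
        }
        END { printf "%d\n", total }
    '
}

# Typical use on a live system:
#   sudo cat /proc/mtrr | mtrr_writeback_mb
```

    For the three registers listed above this gives 3072, one more than the
    3071M in dmesg, presumably because of the 64KB uncachable range.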

    --
    -v System76 Thelio Mega v1.1 x86_64 NVIDIA RTX 3090 Ti
    OS: Linux 6.12.3 Release: Mint 21.3 Mem: 258G

  • From Java Jive@2:250/1 to All on Monday, December 09, 2024 20:33:25
    On 2024-12-09 03:36, vallor wrote:

          On Sun, 8 Dec 2024 11:41:05 +0000, Java Jive <java@evij.com.invalid> wrote
          in <vj40kk$3p436$1@dont-email.me>:

                Dec 06 21:18:47 HOSTNAME kernel: *BAD*gran_size: 128M chunk_size: 2G num_reg: 10 lose cover RAM: -834M

          I don't know why this would happen, but if it happened to me,
          I'd run "lsmem" and "free" and make sure all the memory eventually
          made it online...

    My first reaction on seeing the messages was to boot into memcheck and
    let a full cycle complete, but no memory problems were found. I should
    have mentioned in my OP that I'd already done this, but it slipped my
    mind.

          Also, it looks like it might have something to do with mtrr.

    Yes, as I originally linked.

          On my host, dmesg reads:

          [ 0.000000] total RAM covered: 3071M
          [ 0.000000] Found optimal setting for mtrr clean up
          [ 0.000000] gran_size: 64K chunk_size: 128M num_reg: 3 lose cover RAM: 0G
          [ 0.000000] MTRR map: 7 entries (3 fixed + 4 variable; max 20), built from 9 variable MTRRs

          So these entries may be due to MTRR, which according to
          Documentation/arch/x86/mtrr.rst is getting phased out. On my
          system, I see this when I cat /proc/mtrr:

          $ sudo cat /proc/mtrr
          reg00: base=0x000000000 ( 0MB), size= 2048MB, count=1: write-back
          reg01: base=0x080000000 ( 2048MB), size= 1024MB, count=1: write-back
          reg02: base=0x0ba0a0000 ( 2976MB), size= 64KB, count=1: uncachable

          Give a gander to that document, it outlines what mtrr might be used for
          in modern-day kernels.

          https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/Documentation/arch/x86/mtrr.rst?h=v6.12.3

    These are both Dell Precisions, an M6700 and an M6800, so no chance of
    PAT. One of the links I gave suggests passing kernel boot parameters
    that specify a good compromise setting, one which minimises unused RAM
    and thereby skips the failed search that produced the trace I quoted.
    I wasn't very specific in my OP, but I think the best I can do is
    determine such boot parameters, and any help with that from someone
    more knowledgeable than myself would be much appreciated.
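
    A minimal sketch of one way to read such a table mechanically: skip the
    *BAD* rows and keep the first row with the smallest "lose cover RAM".
    The function name best_mtrr_setting is made up for illustration, and it
    expects each log entry on a single line. The parameter names
    enable_mtrr_cleanup, mtrr_gran_size and mtrr_chunk_size come from the
    kernel's boot parameter documentation; whether overriding the MTRR layout
    is worthwhile on these machines is a separate question this does not
    answer.

```shell
#!/bin/sh
# Hypothetical helper: from "mtrr cleanup" log lines on stdin, print boot
# parameters for the non-BAD row that loses the least RAM coverage.
best_mtrr_setting() {
    awk '
        /gran_size:/ && !/\*BAD\*/ {
            gran = chunk = loss = ""
            for (i = 1; i < NF; i++) {
                if ($i == "gran_size:")  gran  = $(i + 1)
                if ($i == "chunk_size:") chunk = $(i + 1)
                if ($i == "RAM:")        loss  = $(i + 1)
            }
            if (gran == "" || chunk == "" || loss == "") next

            # Losses are printed in M, or in G on some systems.
            if (loss ~ /G$/) mb = (loss + 0) * 1024; else mb = loss + 0
            if (mb < 0) next

            # Strict "<" keeps the first of several equally good rows.
            if (best == "" || mb < best_mb) {
                best = "mtrr_gran_size=" gran " mtrr_chunk_size=" chunk
                best_mb = mb
            }
        }
        END { if (best != "") print "enable_mtrr_cleanup " best }
    '
}

# Typical use on a live system:
#   journalctl -b -k -q --no-pager | best_mtrr_setting
```

    In the table from the original post the smallest positive loss is 30M,
    on the gran_size 32M rows. On Ubuntu the resulting parameters would
    normally be appended to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub,
    followed by "sudo update-grub" and a reboot.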

    --

    Fake news kills!

    I may be contacted via the contact address given on my website: www.macfh.co.uk


    --- MBSE BBS v1.1.0 (Linux-x86_64)
    * Origin: A noiseless patient Spider (2:250/1@fidonet)
  • From Java Jive@2:250/1 to All on Tuesday, December 10, 2024 13:42:31
    On 2024-12-08 12:00, Andy Burns wrote:

    Java Jive wrote:

           ERROR: Unable to locate IOAPIC for GSI 37

    Possible dodgy config tables in BIOS/UEFI.

    firmware upgrade?
    or legacy options you can disable?

    Yes, thanks, thinking that there might be something to any of your
    suggestions above, I've spent the last day or so trying to determine
    exactly which PCs were showing that fault. Originally, I'm fairly sure
    that there were more than just one, but now it seems to be just one,
    this one. I think the difference between the time of my OP and now is
    that in the meantime I've continued trying to clean up the boot
    messages, and this has involved uninstalling Virtualbox on all of the
    PCs, including this one. So ...

    My best guess for the other PCs is that Virtualbox was causing the messages.

    As for this PC, which is a Dell Precision M6800, something that I've
    noticed for the M6700/6800 series, which are of near identical design to
    each other, is that sometimes the COM port shows up under Windows as
    having problems and not installing properly. I'm not quite sure why
    this should be, but as there is no external serial connector anyway, and
    I haven't used an actual COM port [*] for around two decades either, I
    haven't bothered to investigate this phenomenon further.

    * I've used USB-to-low-voltage-serial cables, such as the Sony DKU-5
    cable that used to be used to connect their phones to a PC, to flash
    hardware such as routers, but not an actual COM port.

    Putting the above together with the fact that I've noticed that, on a
    Dell Inspiron, a daughterboard for connecting an NVMe only actually has
    the connector on those model variants that were supplied originally with
    an NVMe drive, other models have an otherwise identical daughterboard
    but without the actual connector - thus saving a few cents per PC
    sale! - I'm wondering if with these M6700/6800s Dell may have been
    doing something similar, populating the boards with some, but not all,
    of the hardware necessary for the COM port, this time saving on some of
    the actual chips required instead of just a connector.

    Either that or a PCB has a fault, but, being ATM in the middle of
    'churning' my hardware, I have three of these machines, and two of them
    show similar symptoms relating to the COM port. It seems rather
    improbable that 2 out of 3 machines bought pseudo-randomly from
    different eBay suppliers would have the same board fault on arrival.

    At any rate, my best guess for this PC is that the original message that
    I queried is being caused by the 'faulty' COM port.

  • From Paul@2:250/1 to All on Tuesday, December 10, 2024 14:40:30
    This is just a random suggestion, with no evidence to back it up.

    Power off the machine, remove one of the DIMMs, and try your dmesg
    readout a second time, to see if your granularity issue changes.

    It could be that the address map on the chipset is defective somehow.

    The machine I'm typing on has such a problem, and it used to freeze
    in the graphics driver, because the shared memory was somehow double-defined
    or something. It's my suspicion that with less than max RAM installed,
    it would behave itself.

    *******

    The other idea I tried out here: I figured that on a machine with Intel
    Management Engine, the addressing may need to provide space for MINIX to
    run, and maybe the offset caused by that is the problem. But when I tested
    that theory on the Optiplex 780, dmesg was as clean as could be. It looked
    like PAT had been used. So that does not look like a credible possibility.

    Paul

  • From Theo@2:250/1 to All on Wednesday, December 11, 2024 14:53:10
    In uk.comp.os.linux Java Jive <java@evij.com.invalid> wrote:
    I'm going round my Ubuntu 22 machines trying to remove error and fail messages from the boot, mostly successfully, but five anomalies are
    proving hard to fix, despite which the PCs all seem to work ...

    1) IOAPIC

    This is occurring on more than one PC. Searching for ...

    ERROR: Unable to locate IOAPIC for GSI 37

    That means it can't work out which interrupt controller is used for a particular interrupt. Likely means the ACPI tables are incomplete. If everything works you can ignore this.
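
    To see which GSIs are affected and whether anything is actually attached to them, something like the following could be tried. It is only a sketch: missing_gsis is a made-up helper name, and the lookup in /proc/interrupts assumes the Linux IRQ number matches the GSI, which is usual for IOAPIC-routed interrupts but not guaranteed.

```shell
#!/bin/sh
# Print the distinct GSI numbers named in "Unable to locate IOAPIC" messages
# found in log text on stdin. (missing_gsis is a hypothetical helper name.)
missing_gsis() {
    sed -n 's/.*Unable to locate IOAPIC for GSI \([0-9][0-9]*\).*/\1/p' | sort -nu
}

# Live use: for each GSI reported this boot, look for a matching IRQ line.
if command -v journalctl >/dev/null 2>&1 && [ -r /proc/interrupts ]; then
    for gsi in $(journalctl -k -b --no-pager 2>/dev/null | missing_gsis); do
        # Assumption: IRQ number == GSI number on this machine.
        grep -E "^ *$gsi:" /proc/interrupts ||
            echo "GSI $gsi: no matching line in /proc/interrupts"
    done
fi
```

    If no device shows up on those lines, that would fit the reading that the message is cosmetic.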

    2) blkmapd

    I think this is occurring on ALL my Ubuntu 22 machines. Appears to be related to NFS, but networking seems fine (apart from a minor seemingly unrelated issue already solved). Searching for ...

    "blkmapd[717]: open pipe file /run/rpc_pipefs/nfs/blocklayout
    failed: No such file or directory"

    Are you actually running NFS? If not you can ignore this.

    3) CUPS Scheduler

    This also is occurring on many or all of my Ubuntu 22 machines, even a
    while after a successful boot. Oddly the status of cups service always shows it to be working.

    Dec 07 17:06:00 HOSTNAME systemd[1]: cups.service: start operation timed out. Terminating.
    Dec 07 17:06:00 HOSTNAME systemd[1]: cups.service: Failed with result 'timeout'.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Failed to start CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: cups.service: Scheduled restart
    job, restart counter is at 5.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Stopped CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: cups.path: Deactivated successfully.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Stopped CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Stopping CUPS Scheduler...
    Dec 07 17:06:00 HOSTNAME systemd[1]: Started CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: cups.socket: Deactivated successfully.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Closed CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Stopping CUPS Scheduler...
    Dec 07 17:06:00 HOSTNAME systemd[1]: Listening on CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Starting CUPS Scheduler...
    Dec 07 17:06:00 HOSTNAME audit[1760]: AVC apparmor="DENIED" operation="capable" profile="/usr/sbin/cupsd" pid=1760 comm="cupsd" capability=12 capname="net_>

    I'm not seeing a problem there? CUPS didn't start first time round because something was busy, but tried again and succeeded.

    4) UBSAN

    This is on a laptop with two on-board GPUs, and I think is related to
    that fact. However, there were no hits under DuckDuckGo, Google, or
    Yahoo for ...

    Ubuntu 22 "UBSAN: array-index-out-of-bounds in /build/linux-hwe-6.8-W0MdK2/linux-hwe-6.8-6.8.0/drivers/gpu/drm/radeon/radeon_atombios.c:633:33"

    UBSAN is the Undefined Behaviour Sanitiser, ie a debugging tool. Something went wrong in the driver for AMD Radeon GPUs, ie a bug, maybe due to the relatively elderly GPU you have. If you aren't using the AMD GPU you can ignore it if it's not actually causing a crash (or could disable the driver
    if you wanted).

    5) Initiating RAM registers

    The following is occurring very early in the logs on 2 machines with
    32GB RAM, and seems to be about how to set up registers for RAM access, as
    per these two links ...

    This seems to be related to an MTRR problem - maybe the hardware doesn't let the kernel find the optimal memory layout with more RAM than it was
    originally designed for. How much RAM does Linux show you have after it's booted? Does it lose any memory, and can you live with having the amount
    that remains?


    Most of these seem like 'new Linux, old hardware' issues, but nothing
    actually to worry me there. I'd check you're on the latest BIOS as that
    might help some of the ACPI related issues.

    Theo

    --- MBSE BBS v1.1.0 (Linux-x86_64)
    * Origin: University of Cambridge, England (2:250/1@fidonet)
  • From Java Jive@2:250/1 to All on Thursday, December 12, 2024 12:11:05
    On 2024-12-11 14:53, Theo wrote:

    In uk.comp.os.linux Java Jive <java@evij.com.invalid> wrote:

    I'm going round my Ubuntu 22 machines trying to remove error and fail
    messages from the boot, mostly successfully, but five anomalies are
    proving hard to fix, despite which the PCs all seem to work ...

    1) IOAPIC

    This is occurring on more than one PC. Searching for ...

    ERROR: Unable to locate IOAPIC for GSI 37

    That means it can't work out which interrupt controller is used for a particular interrupt. Likely means the ACPI tables are incomplete. If everything works you can ignore this.

    Everything of any importance seems to be working, but see also my reply
    to Andy regarding the COM port.

    2) blkmapd

    I think this is occurring on ALL my Ubuntu 22 machines. Appears to be
    related to NFS, but networking seems fine (apart from a minor seemingly
    unrelated issue already solved). Searching for ...

    "blkmapd[717]: open pipe file /run/rpc_pipefs/nfs/blocklayout
    failed: No such file or directory"

    Are you actually running NFS? If not you can ignore this.

    Yes, and it *seems* to be running fine, so I'm not sure what is going on
    here.
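
    For what it's worth, blkmapd is the pNFS block-layout mapping daemon, so ordinary NFS mounts do not depend on it. A quick way to look at both sides is sketched below. The unit name nfs-blkmap.service is an assumption based on how nfs-utils usually packages it, and nfs_mounts is a made-up helper name.

```shell
#!/bin/sh
# List mount points whose filesystem type is nfs or nfs4, given
# /proc/mounts-style input on stdin. (nfs_mounts is a hypothetical helper.)
nfs_mounts() {
    awk '$3 ~ /^nfs4?$/ { print $2 }'
}

# Which NFS shares are mounted right now?
if [ -r /proc/mounts ]; then
    nfs_mounts < /proc/mounts
fi

# Is the block-layout daemon's unit running at all?
# Assumption: the unit is called nfs-blkmap.service on this system.
if command -v systemctl >/dev/null 2>&1; then
    systemctl is-active nfs-blkmap.service 2>/dev/null || true
fi
```

    If the shares are mounted and working while the unit is inactive or failed, that would suggest the message is unrelated to the NFS in use.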

    3) CUPS Scheduler

    This also is occurring on many or all of my Ubuntu 22 machines, even a
    while after a successful boot. Oddly the status of cups service always
    shows it to be working.

    Dec 07 17:06:00 HOSTNAME systemd[1]: cups.service: start operation timed
    out. Terminating.
    Dec 07 17:06:00 HOSTNAME systemd[1]: cups.service: Failed with result
    'timeout'.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Failed to start CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: cups.service: Scheduled restart
    job, restart counter is at 5.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Stopped CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: cups.path: Deactivated successfully.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Stopped CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Stopping CUPS Scheduler...
    Dec 07 17:06:00 HOSTNAME systemd[1]: Started CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: cups.socket: Deactivated successfully.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Closed CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Stopping CUPS Scheduler...
    Dec 07 17:06:00 HOSTNAME systemd[1]: Listening on CUPS Scheduler.
    Dec 07 17:06:00 HOSTNAME systemd[1]: Starting CUPS Scheduler...
    Dec 07 17:06:00 HOSTNAME audit[1760]: AVC apparmor="DENIED"
    operation="capable" profile="/usr/sbin/cupsd" pid=1760 comm="cupsd"
    capability=12 capname="net_>

    I'm not seeing a problem there? CUPS didn't start first time round because something was busy, but tried again and succeeded.

    I see, but then presumably it's an error by the folk who wrote the code
    to flag it as an error.
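
    A rough way to tell a one-off timeout from a service that keeps failing is to count the timeouts in this boot's journal and ask systemd how often it has restarted the unit. This is only a sketch; cups_timeouts is a made-up helper name, while NRestarts, ActiveState and SubState are standard systemd unit properties.

```shell
#!/bin/sh
# Count "start operation timed out" messages for cups.service in log text
# on stdin. (cups_timeouts is a hypothetical helper name.)
cups_timeouts() {
    grep -cF 'cups.service: start operation timed out' || true
}

# How many start timeouts has this boot logged for CUPS?
if command -v journalctl >/dev/null 2>&1; then
    echo "timeouts this boot: $(journalctl -b -u cups.service --no-pager 2>/dev/null | cups_timeouts)"
fi

# What does systemd think of the unit now, and how often was it restarted?
if command -v systemctl >/dev/null 2>&1; then
    systemctl show cups.service -p ActiveState -p SubState -p NRestarts 2>/dev/null || true
fi
```

    A restart counter that keeps climbing long after boot would point to something more persistent than a service that was merely busy the first time round.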

    4) UBSAN

    This is on a laptop with two on-board GPUs, and I think is related to
    that fact. However, there were no hits under DuckDuckGo, Google, or
    Yahoo for ...

    Ubuntu 22 "UBSAN: array-index-out-of-bounds in
    /build/linux-hwe-6.8-W0MdK2/linux-hwe-6.8-6.8.0/drivers/gpu/drm/radeon/radeon_atombios.c:633:33"

    UBSAN is the Undefined Behaviour Sanitiser, ie a debugging tool. Something went wrong in the driver for AMD Radeon GPUs, ie a bug, maybe due to the relatively elderly GPU you have. If you aren't using the AMD GPU you can ignore it if it's not actually causing a crash (or could disable the driver if you wanted).

    I understand. I'm not aware of any problems with graphics under Ubuntu.
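
    To check whether the radeon report is the only UBSAN hit, the kernel log can be reduced to a list of distinct report sites. A minimal sketch, assuming the reports keep the "UBSAN: <kind> in <path>" shape quoted above; ubsan_sites is a made-up helper name.

```shell
#!/bin/sh
# From log text on stdin, print each distinct UBSAN report as
# "<kind> <path from drivers/ onward>". (ubsan_sites is a hypothetical helper.)
ubsan_sites() {
    sed -n 's|.*UBSAN: \([^ ]*\) in .*/\(drivers/[^ ]*\).*|\1 \2|p' | sort -u
}

# Live use: scan this boot's kernel messages, if journalctl is available.
if command -v journalctl >/dev/null 2>&1; then
    journalctl -k -b --no-pager 2>/dev/null | ubsan_sites
fi
```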

    5) Initiating RAM registers

    The following is occurring very early in the logs on 2 machines with
    32GB RAM, and seems to be about how to set up registers for RAM access, as
    per these two links ...

    This seems to be related to an MTRR problem - maybe the hardware doesn't let the kernel find the optimal memory layout with more RAM than it was originally designed for. How much RAM does Linux show you have after it's booted? Does it lose any memory, and can you live with having the amount that remains?

    lsmem gives ...

    root@HOSTNAME:home# lsmem
    RANGE                                  SIZE  STATE REMOVABLE  BLOCK
    0x0000000000000000-0x00000000bfffffff    3G online       yes   0-23
    0x0000000100000000-0x000000083fffffff   29G online       yes 32-263

    Memory block size: 128M
    Total online memory: 32G
    Total offline memory: 0B

    ... so Ubuntu seems to be able to access all the actual physical RAM.

    Most of these seem like 'new Linux, old hardware' issues, but nothing actually to worry me there. I'd check you're on the latest BIOS as that might help some of the ACPI related issues.

    I see. Thanks very much for your help, Theo, much appreciated.
