Talos CM4 Node 3
# │forum
a
Hi all, I know this has been discussed to death already, but I'm trying to understand if there's a working workaround for the boot issue when trying to run Talos on Node 3 (https://github.com/siderolabs/talos/issues/7358). If I were to rebuild the Talos image with u-boot v2024.01, could I expect it to work, or is additional work needed? Apologies again for the forum post, but I'm struggling to find the current threads in the chat channel! Thanks!
c
The other necessary fix is [this commit]() landing in Linux 6.8, which stops the kernel from panicking on boot. It can be easily applied to 6.1/6.6 though.
a
Ah righty thank you!
I've been able to build u-boot and kernel images for amd64. The hardest part was working out which patches were needed. Now I'm having trouble building for arm64; I think I'd be better off finding an Arm system to do the build, as the emulation is so slow!
c
Install your distro-of-choice's packages for `aarch64-none-elf-gcc`, then you can set these environment variables once:
export ARCH=arm64 CROSS_COMPILE=aarch64-none-elf-
...and Linux and U-Boot's build systems will both understand from that what you expect. By the way, you're far from the only person to apply these patches to get Talos usable on a CM4 in slot 3. I think there are already-patched images out there (not that I could quickly find one with a 2-minute GitHub search) that you might want to use instead, unless you enjoy the challenge of compiling all this stuff yourself. 🙂
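For anyone following along, here is a rough sketch of what a cross-compiled kernel build might look like with those variables set. It assumes you are inside a Linux source tree; `have_cross_gcc` and `build_kernel` are made-up helper names, and `make defconfig` is only a generic stand-in, since Talos uses its own kernel config. Some distros ship the toolchain under a different prefix (for example `aarch64-linux-gnu-`), so adjust `CROSS_COMPILE` to match.

```shell
#!/bin/sh
# Sketch: cross-compile an arm64 kernel from an x86 host.
# Assumes the current directory is a Linux source tree and the
# cross toolchain is already installed and on PATH.
export ARCH=arm64
export CROSS_COMPILE="${CROSS_COMPILE:-aarch64-none-elf-}"

# Hypothetical helper: confirm the cross compiler can be found.
have_cross_gcc() {
    command -v "${CROSS_COMPILE}gcc" >/dev/null 2>&1
}

# Hypothetical helper: configure and build the kernel image and device trees.
build_kernel() {
    have_cross_gcc || { echo "missing ${CROSS_COMPILE}gcc" >&2; return 1; }
    make defconfig                # generic arm64 config, not the Talos one
    make -j"$(nproc)" Image dtbs  # kernel image plus device tree blobs
}
```

Calling `build_kernel` from the top of the kernel tree runs the build, and it bails out early if the compiler is not found.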
a
Thanks, I have managed to get everything built and the image now boots. Was a cool little learning experience. Deffo underestimated it, but was nice to go through everyone's contributions and learn how to build everything. Thank you for your help!
b
Thank goodness I've found this thread. Just ran into this problem today.
Does this also affect RK1? I guess I'll find out at the end of the week when mine come in...
m
It happened to me with NixOS on the CM4, but Ubuntu was fine. Now it's happening on the RK1 too, with Ubuntu.
Never mind... it was my fault
a
I see Talos 1.7.0 has dropped and uses Linux 6.6.23. The patch made it into the 6.8 release so I'll try and give the build another go. u-boot has been moved into the SBC overlay but has the updated version of u-boot so I think this should just work 🤞
Having said all that, I do want to try RK1s as well, so I wonder if I should look at pl4nty's fork, which is using 6.9-rc4. It just takes so long to build!
c
@Alex I see you got this working... Any chance you have some instructions?
a
I had it working on 1.6, but I've not got it working with 1.7 yet. Trying to see if I can get GitHub to build the images.
c
Thanks, I am gonna give it a go and try compiling it as well... I have 3x RK1 but want to run the RPi in slot 3, as I want it to use an SSD since the RK1s all have an NVMe :/
Success... cross compiling a kernel took forever but I am up and running! If anyone needs an image I am happy to share it 🙂
kgno -o wide
NAME     STATUS   ROLES           AGE    VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE         KERNEL-VERSION   CONTAINER-RUNTIME
rk1-1    Ready    control-plane   108d   v1.30.0   192.168.0.3     <none>        Talos (v1.7.1)   6.6.29-talos     containerd://1.7.16
rk1-2    Ready    control-plane   108d   v1.30.0   192.168.0.4     <none>        Talos (v1.7.1)   6.6.29-talos     containerd://1.7.16
rk1-3    Ready    control-plane   108d   v1.30.0   192.168.0.5     <none>        Talos (v1.7.1)   6.6.29-talos     containerd://1.7.16
rpi4-1   Ready    <none>          79s    v1.30.0   192.168.0.116   <none>        Talos (v1.7.1)   6.8.9-talos      containerd://1.7.16
a
GitHub doesn't have shared arm runners yet, so my build failed after 6 hours 😅
c
I cheated and used one of my work servers with 80 CPU cores and 512G of RAM to cross compile... I tried at home on my desktop and cross compiling failed, I think due to OOM.
If you want to grab the image I built, it's here: https://ocis.camsab.me/s/jeUZqoKQKONkPlj It has the iscsi-tools and util-linux-tools extensions to run Longhorn. Pinky swear I am not a hacker and I am not trying to mess with you, but if you don't want to, I totally get it.
a
Did you use a newer Linux kernel version instead of patching it? The one in the pkgs dir is 6.6.30
c
Yes I am using 6.8.9
This is what I did. Change the kernel version in the Pkgfile in the pkgs repo:
# renovate: datasource=git-tags extractVersion=^v(?<version>.*)$ depName=git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
  linux_version: 6.8.9
  linux_sha256: f905f1238ea7a8e85314bacf283302e8097006010d25fcea726d0de0ea5bc9b6
  linux_sha512: 67056eae13be9130e11ea7e4d394d1f0b6b1dccc4f080f72c136870d4486fdebc2c315d149ca4f1e57af4c79dedf849e31c439426166544691508edafca3d350
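If you bump to a different kernel version, the two checksums have to be regenerated as well. A rough sketch of one way to do that, assuming the hashes are taken over the `linux-<version>.tar.xz` tarball from kernel.org (that is my reading of the pkgs repo, so double-check it); `pkgfile_sums` is just a made-up helper name:

```shell
#!/bin/sh
# Sketch: print Pkgfile-style version and checksum entries for a kernel tarball.
# pkgfile_sums is a hypothetical helper, not something shipped in the pkgs repo.
pkgfile_sums() {
    tarball="$1"   # path to an already-downloaded tarball
    version="$2"   # version string to write into the Pkgfile
    printf '  linux_version: %s\n' "$version"
    printf '  linux_sha256: %s\n' "$(sha256sum "$tarball" | cut -d' ' -f1)"
    printf '  linux_sha512: %s\n' "$(sha512sum "$tarball" | cut -d' ' -f1)"
}

# Example (assumes linux-6.8.9.tar.xz is in the current directory):
# pkgfile_sums linux-6.8.9.tar.xz 6.8.9
```

The output can be pasted over the three existing lines in the Pkgfile.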
# In the PKG repo
make kernel REGISTRY=127.0.0.1:5005 PUSH=true TAG=6.8.9
# In the Talos repo
make kernel initramfs PKG_KERNEL=127.0.0.1:5005/siderolabs/kernel:6.8.9 PLATFORM=linux/arm64
make imager PKG_KERNEL=127.0.0.1:5005/siderolabs/kernel:6.8.9 PLATFORM=linux/amd64,linux/arm64 INSTALLER_ARCH=all REGISTRY=127.0.0.1:5005 PUSH=true
docker run --rm -t -v /dev:/dev --privileged -v $PWD/_out:/out \
  127.0.0.1:5005/siderolabs/imager:v1.7.1 rpi_generic --arch arm64 \
  --system-extension-image ghcr.io/siderolabs/iscsi-tools:v0.1.4@sha256:4370d0740f27a7ae7aee56a8da6cd4f00ed8019bd4024fa73b44cb388ec86194 \
  --system-extension-image ghcr.io/siderolabs/util-linux-tools:2.39.3@sha256:6a0d86f1cfbb296dfe2c29e033d9cb3d9f78ed98413865522238f4e1505365c4 \
  --overlay-image ghcr.io/siderolabs/sbc-raspberrypi:v0.1.0-beta.0@sha256:47c6b7dc1cf697fc1ced0928eb4e8a37e83e99898289f59aaa49f8ed97249352 \
  --overlay-name=rpi_generic
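Note that the make targets above push to REGISTRY=127.0.0.1:5005, so they assume a container registry is already listening there. A minimal sketch of starting a throwaway one with the stock `registry:2` image; the container name is made up, and the `DRY_RUN` switch is only there so you can print the commands before running them. Depending on how your buildx builder is set up, it may also need host networking to reach that address.

```shell
#!/bin/sh
# Sketch: start a throwaway local registry for the REGISTRY=127.0.0.1:5005 builds.
# The container name "talos-build-registry" is a made-up example.

# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_local_registry() {
    # registry:2 listens on port 5000 inside the container; map it to 5005 on localhost.
    run docker run -d --name talos-build-registry -p 127.0.0.1:5005:5000 registry:2
}

stop_local_registry() {
    # Remove the container (and the images pushed to it) once the build is done.
    run docker rm -f talos-build-registry
}
```

Run `start_local_registry` before the `make kernel` step and `stop_local_registry` once the final image is in `_out`.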
a
Good to see you didn't need to make any changes to u-boot. I'm going to have a quick go at using an EC2 instance before giving up and downloading your image 😄
c
How did you go? Just curious 👀
a
I have finally been able to build the image thanks to your docker instructions. I've stuck with the kernel version currently used by v1.7.2, which is 6.6.30, and patched it instead. The kernel build took just over 4 hours. I don't think I'm going to want to do that each time there's a new Talos release!
c
Cool! Yeah, it's not something I will be doing very often, but tbh I don't really see the need to upgrade Talos every time a new release comes out. So far everything has been super stable, and K8s 1.30 is gonna be fine for a long time as well.
v
Does it work with Talos NFS support? I read that you need to prepare something for extra parameter options. In my case it failed, but I've been on version 1.6 since February and have missed a lot of new messages. So we are still using a custom image, with no way to update it through the normal Talos upgrade procedure? And has anyone tested RKE2? It supports ARM now.