Talos/Ubuntu Dual Boot Guide
d
# Running Talos on NVMe via Ubuntu on eMMC: A Successful Experiment

I wondered why the official Talos 1.9 guide requires multiple steps to flash eMMC: first installing Ubuntu, then Talos, and finally U-Boot. I decided to keep Ubuntu on eMMC for emergencies, as the board defaults to booting NVMe unless interrupted. Result: Talos boots from NVMe via Ubuntu's U-Boot on eMMC.

## Setup Overview

- Hardware: Turing Pi 2.5
- Nodes: Four RK1 modules, each with NVMe
- Boot Medium: SD card for BMC

## Flashing Ubuntu on Nodes

I copied the Ubuntu image to the SD card:
```bash
scp ./ubuntu-22.04.3.img root@turingpi.local:/mnt/sdcard/
```
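Optionally, it can be worth checking the image's integrity on the BMC before flashing; the value to compare against is whatever checksum the image publisher provides (not shown here):

```bash
# Run on the BMC; compare the output against the publisher's checksum
sha256sum /mnt/sdcard/ubuntu-22.04.3.img
```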
Then I used a script for flashing:
```bash
#!/usr/bin/env bash
# flashall.sh: flash the given image to every node via the BMC's tpi tool
IMAGE="$1"
[ -n "$IMAGE" ] || { echo "usage: $0 <image>" >&2; exit 1; }

NODES=(1 2 3 4)
for node in "${NODES[@]}"; do
  tpi flash -n "$node" -l -i "$IMAGE"
done
```
Ran the script:
```bash
ssh root@turingpi.local
cd /mnt/sdcard
chmod +x ./flashall.sh
./flashall.sh /mnt/sdcard/ubuntu-22.04.3.img
```
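As an aside, if only a single node needs (re)flashing later, the same command the script wraps can be run directly on the BMC; the node number here is just an example:

```bash
# Flash only node 2 with the same locally stored image
tpi flash -n 2 -l -i /mnt/sdcard/ubuntu-22.04.3.img
```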
In about 30 minutes, all nodes were ready.

## Booting Ubuntu Manually

First, from the **BMC terminal**:
```bash
# attach to the node's serial console from the BMC (quit picocom with Ctrl-A, then Ctrl-X)
picocom /dev/ttyS1 -b 115200
```
In another terminal window, power on the node:
```bash
tpi power on -n 1
```
Quickly return to the `picocom` terminal, interrupt U-Boot (by pressing any key), and run:
```bash
# at the U-Boot prompt: boot from eMMC for this boot only
setenv boot_targets mmc0
boot
```
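Worth noting: `setenv` at the U-Boot prompt is not persistent unless the environment is explicitly saved, which is why this override is one-shot and the node falls back to its default order (NVMe first, per the post) on the next boot. To inspect the default order before overriding it:

```bash
# at the U-Boot prompt
printenv boot_targets
```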
Ubuntu then boots from eMMC.

## Flashing Talos to NVMe

Copy the Talos image to the node:
```bash
scp ./metal-arm64.raw ubuntu@x.x.x.x:/home/ubuntu/
```
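Note that the Talos metal images are usually published compressed (e.g. metal-arm64.raw.xz); if that's what was downloaded, decompress it on the node before flashing. The exact filename here is an assumption:

```bash
# Only needed if the image arrived compressed
xz -d metal-arm64.raw.xz
```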
Flash NVMe from Ubuntu:
```bash
sudo dd if=metal-arm64.raw of=/dev/nvme0n1 status=progress
sudo reboot
```
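A slightly more defensive variant of the same flash, with an explicit block size, a forced flush, and a quick sanity check before rebooting (same source file and target device as above):

```bash
sudo dd if=metal-arm64.raw of=/dev/nvme0n1 bs=4M conv=fsync status=progress
sync
# The Talos partition layout should now be visible on the NVMe drive
lsblk /dev/nvme0n1
sudo reboot
```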
## Conclusion

Each node has Ubuntu on eMMC for emergencies but boots Talos from NVMe by default. To access Ubuntu, interrupt U-Boot and boot manually. Initial tests look good. I'll finalize the Kubernetes setup soon. Thanks to @rlhailey3 for the U-Boot trick!
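For the Kubernetes side, a minimal talosctl sequence would look roughly like the sketch below; the endpoint IP, cluster name, and file paths are placeholders, and the exact flow should be checked against the current Talos docs:

```bash
# Generate machine configs for the cluster (placeholder name and endpoint)
talosctl gen config my-cluster https://10.0.0.11:6443

# Apply the control-plane config to a freshly booted, unconfigured Talos node
talosctl apply-config --insecure --nodes 10.0.0.11 --file controlplane.yaml

# Point talosctl at the node and bootstrap etcd once, on one control-plane node only
talosctl --talosconfig ./talosconfig config endpoint 10.0.0.11
talosctl --talosconfig ./talosconfig config node 10.0.0.11
talosctl --talosconfig ./talosconfig bootstrap

# Fetch a kubeconfig for kubectl
talosctl --talosconfig ./talosconfig kubeconfig .
```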
i
Very cool indeed. I'll be trying this out during the holidays. Please share the Kubernetes setup when you have time; I'll share my experience too in the coming days and weeks. Thanks for sharing! Merry Christmas! 🎄
x
Wow, I'm also going to give this a go over the holiday period. Thanks for sharing, and have a fantastic holiday break. Thanks again.
r
Does this only work with the "official" Ubuntu images for the Turing RK1 from https://firmware.turingpi.com/turing-rk1/, which are still 22.04 LTS? Or does this also work with the newer Ubuntu 24.04 LTS images from https://joshua-riek.github.io/ubuntu-rockchip-download/boards/turing-rk1.html ?
Like @ImJabro, I'm also interested in other people's Kubernetes setups. Talos recommends using 3 nodes for the control plane, which I understand from a redundancy perspective and for the ability to update the control plane without losing quorum, but on a 4 node cluster this seems a bit wasteful. I'd also like to have more than 1 node to run actual workloads.
n
This has some drawbacks: Talos ships newer versions of U-Boot, rkbin, and ARM Trusted Firmware. If you choose to use the Ubuntu U-Boot, there are some things that definitely do not work...
t
Just because it recommends 3 control planes doesn't mean you can't have any worker nodes. If I ever get my first flashed Talos node unbricked, I intend to do a 3 node cluster (or 4 if I can decide whether to get an Orin NX 16 GB or RK1 32 GB for the 4th slot). Control planes can schedule pods too. I wouldn't want to do that on large clusters, but for a home lab a control-plane-only cluster is fine; just remove the NoSchedule taint from the nodes and they can schedule pods like any worker node.
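For reference, the taint removal mentioned above is the standard Kubernetes one; a sketch is below (Talos also exposes a machine-config option for this, allowSchedulingOnControlPlanes, which is worth checking against your version):

```bash
# Remove the control-plane NoSchedule taint from all nodes so regular pods can land there
# (the trailing "-" means "remove this taint")
kubectl taint nodes --all node-role.kubernetes.io/control-plane:NoSchedule-
```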
r
Ah yes, I found out about the NoSchedule taint which, once removed, allows workloads to be scheduled on control plane nodes. I personally have a Turing Pi 2 with 4 Turing RK1 16GB nodes in it. Would you recommend using all 4 as control+worker nodes or rather stick with a 3 node control plane? As it's easier to maintain quorum with an odd number of nodes, I don't know how Talos would handle an even number of control plane nodes.
I found this old guide (https://www.talos.dev/v0.13/guides/troubleshooting-control-plane/#how-many-control-plane-nodes-should-be-deployed) which warns against an even number of control plane nodes. So I guess I'll either go for 3 control plane nodes with their taint removed in my 4 node cluster, or add an additional Talos VM on my Proxmox server to the cluster.
i
Interesting, what other things would not work? In your opinion, is it better not to do this and use Talos as the only OS? I'm planning to use Talos on 4x RK1 (16GB) with Longhorn for storage on 4x 2TB NVMes. Most likely 3 nodes (control plane and workers on all nodes) in a cluster for production and 1 node for staging/testing, using Flux for the pipeline. Or move the staging node to a Proxmox VM and use the last RK1 as a worker node.
t
@Rush I don't think there's any real downside to 4, 6 or 8 master nodes in a home lab/single-machine setup (others may correct me if I'm wrong), or anything that's not absolutely mission critical, other than that you don't gain any additional fault tolerance compared to 3, 5 and 7.

Odd numbers matter for systems that need to keep state in sync (such as etcd, ZFS raids, MinIO nodes, etc.), since they commit state and you want exactly one side to be the source of truth in case of a network split. Say you have 6 nodes and a 3/3 network split happens: neither half holds a majority, so a strict-quorum system simply stalls, and a system without strict quorum can't decide whose data/commits are authoritative when the halves rejoin. On a 7, 9 or 11 node cluster you don't have that issue on a single split, unless multiple segments of equal size split from each other at the same time (e.g. 3/3/3 on 9 nodes); but normally you'd have something like geo-redundancy where 3 nodes each are stored in a different location, and when one goes down the other two typically are still there.

Network splits are unlikely to happen with the TPi2 when everything is on the same board. It could only be an issue if you ran some kind of RAID or distributed file system/database on a 2x4 cluster across 2 TPi2 boards and have high consistency requirements.
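To make the fault-tolerance point concrete for majority-quorum systems like etcd, a quick sketch (assuming the usual floor(n/2)+1 majority rule):

```bash
# Majority quorum and tolerated failures for a cluster of n voting members
for n in 3 4 5 6 7; do
  quorum=$(( n / 2 + 1 ))
  echo "n=$n quorum=$quorum tolerates=$(( n - quorum )) failure(s)"
done
# n=4 tolerates 1 failure, same as n=3; n=6 tolerates 2, same as n=5
```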
n
- KASLR (see dmesg)
- memory frequency control using DVFS
- more optimal memory training
- different NVMe patches
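For anyone wanting to check the KASLR point on their own nodes, a rough way to look (output wording varies by kernel version, so treat the grep as approximate):

```bash
# KASLR status usually shows up in the boot log; an empty result is inconclusive
sudo dmesg | grep -i kaslr
# "nokaslr" on the kernel command line would disable it explicitly
cat /proc/cmdline
```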
d
I'm not sure: would the following commands overwrite the U-Boot?
```bash
tpi advanced msd -n 1
dd if=u-boot-rockchip-spi.bin of=/dev/sda1
```
If yes, then Ubuntu and Talos have the same U-Boot. If not, I have no idea what I have done 🙂 as everything works as before, and the U-Boot version is the same on the other nodes where Ubuntu was installed as on the ones where I "overrode" the U-Boot with the commands above.

@Nico can you comment, please?
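One way to settle the version question would be to ask U-Boot itself over the serial console (as earlier in the thread): interrupt autoboot and run the standard `version` command.

```bash
# at the U-Boot prompt, after interrupting autoboot
version
```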