Kernel flags
d
Anyone modify the kernel flags on the Orin NX? I need the following flags set in order to use Cilium for my Kubernetes cluster. https://docs.cilium.io/en/stable/operations/system_requirements/#base-requirements
CONFIG_NET_CLS_BPF=y
CONFIG_NET_SCH_INGRESS=y
CONFIG_CRYPTO_USER_API_HASH=y
CONFIG_CGROUP_BPF=y

CONFIG_NETFILTER_XT_TARGET_CT=m

CONFIG_NET_SCH_FQ=m
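Before rebuilding anything, it's worth checking which of these the running kernel already provides. A minimal sketch, assuming the kernel exposes its config via /proc/config.gz (stock L4T does); the helper name and the sample file below are made up for illustration:

```shell
#!/bin/sh
# Sketch: check a kernel config file for Cilium's required options.
# On the Jetson, dump the running kernel's config first with:
#   zcat /proc/config.gz > /tmp/running_config
check_configs() {
    config_file="$1"
    for opt in CONFIG_NET_CLS_BPF CONFIG_NET_SCH_INGRESS \
               CONFIG_CRYPTO_USER_API_HASH CONFIG_CGROUP_BPF \
               CONFIG_NETFILTER_XT_TARGET_CT CONFIG_NET_SCH_FQ; do
        # Accept both built-in (=y) and module (=m) settings
        grep "^${opt}=[ym]" "$config_file" || echo "MISSING: $opt"
    done
}

# Demo against a toy config file:
printf 'CONFIG_NET_CLS_BPF=y\nCONFIG_CGROUP_BPF=y\n' > /tmp/sample_config
check_configs /tmp/sample_config
```

Anything reported MISSING is what the rebuild has to add.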
I'm fairly new to this level of tinkering with Linux. I asked ChatGPT and got a pretty straightforward response:
cd /usr/src/linux-headers-5.10.104-tegra-ubuntu20.04_aarch64/kernel-5.10/
cp .config .config.bak
make menuconfig
make -j$(nproc)
sudo make install
sudo make modules_install
sudo update-grub
sudo reboot
uname -r
I'm a bit confused about the GRUB part, as it seems the Jetson NX uses UEFI. Checking
which update-grub
returned nothing, so I'm leery to continue. Anyone done this already? Could these flags be added during the preliminary flashing process?
c
Changing the kernel build configuration requires rebuilding the kernel. Since I'm seeing references to Ubuntu 20.04 above, here's Ubuntu's guide for doing that:
d
Could these flags be added during the preliminary flashing process?
I'd rather make the change "upstream" so to speak so that everyone with an orin NX can have success with a kubernetes cluster using the benefits of eBPF
w
I think you'd have to log a bug with NVIDIA asking for that
but I think home Kubernetes clusters are a bit of a niche use case for these modules... they are marketed for robotics (e.g. drones - they tend to be found in debris in Ukraine, for example)
d
Can we not make the changes during the flashing process? Or do I really have to rebuild the kernel? 😅
and TIL about the war debris... that's kinda crazy.
w
I mean somebody has to build the kernel with those flags
They're an input to the kernel building process
j
These changes cannot be made during the flash of the image; they require a recompile of the kernel, as CFSworks stated. The kernel is baked into the image that is flashed, so you have two options: rebuild the kernel after installing the image on the SBC, or rebuild the image with a rebuilt kernel. Nevertheless, opening a bug with NVIDIA might get this fixed in the long run, since Cilium is one of the emerging CNIs (among other things), so NVIDIA might have an interest in it.
d
There's really no such thing as an image
Even NVIDIA's flashing tool creates "the image" as it goes
You can download the kernel and compile it before flashing, but I've never attempted this and won't be of much help here
When you go to the Jetson Linux Archive: https://developer.nvidia.com/embedded/jetson-linux-archive and then to a specific version like this one: https://developer.nvidia.com/embedded/jetson-linux-r3531, you can download the Driver Package Sources, and I believe this should let you compile the kernel in a way required to run on these modules. But I can't help any further than this
The SDK Manager actually does more or less the same; there's no image that it downloads and flashes, which is why it requires a specific Ubuntu version to run
My guess is your best bet is to go to the NVIDIA Developer forums, ask for that there, and see where this gets you
d
I might just try the in-place rebuild, knowing the flash wipes the drive
j
It is nearly the same for all NVIDIA aarch64 equipment (Jetson, BlueField-2/3): pre-built rootfs/images with at least some source code to change. I'm more familiar with the latter. When you're done, don't forget to report back here, so others can benefit from your experience.
d
So far, I've been able to rebuild a kernel, but I'm still trying to figure out how to load it. I'm scared to replace
/boot/Image
... Maybe I just yolo, though, and reflash if it goes sour?
This all happens directly on the Jetson. Steps for rebuilding the kernel are as follows:
# Create a temporary home for the new files
mkdir ~/kernel
cd ~/kernel
export TEGRA_KERNEL_OUT=`pwd`

mkdir ~/modules
cd ~/modules
export TEGRA_MODULES_OUT=`pwd`

export TOP=/usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64/kernel-5.10

# Gather proper source files
cd ~
vim source_sync.sh
chmod +x source_sync.sh
./source_sync.sh -h
./source_sync.sh -k -t jetson_35.4.1
sudo cp -r ~/sources/kernel/* /usr/src/linux-headers-5.10.120-tegra-ubuntu20.04_aarch64

# Build kernel
cd $TOP
make O=$TEGRA_KERNEL_OUT tegra_defconfig
make O=$TEGRA_KERNEL_OUT nconfig

# CONFIG_NET_CLS_BPF=y
# CONFIG_NET_SCH_INGRESS=y
# CONFIG_CRYPTO_USER_API_HASH=y
# CONFIG_CGROUP_BPF=y
# CONFIG_NETFILTER_XT_TARGET_CT=m

make O=$TEGRA_KERNEL_OUT -j 8 Image
make O=$TEGRA_KERNEL_OUT -j 8 modules
make O=$TEGRA_KERNEL_OUT INSTALL_MOD_PATH=$TEGRA_MODULES_OUT modules_install

# Verify kernel exists
cd $TEGRA_KERNEL_OUT
find . -name Image
https://cdn.discordapp.com/attachments/1135261223173230725/1144769818071924886/source_sync.sh
I can't help but feel like this would be useful before flashing the board... maybe we should update the docs accordingly?
@DhanOS (Daniel Kukiela) could you help me with the steps to load this Image into place safely? The suggestion on the NVIDIA forum was to reach out here for this step, since it's not an NVIDIA carrier board.
d
What exactly do you want to do? Update U-boot? I did not follow closely the above, so I need some condensed information about what did you do and what you want to do now.
c
Boot a replacement kernel without risking bricking the Jetson if the kernel is bad
d
I'm not sure I'm the right person to ask then. What I would do is make a copy of the drive onto another drive and experiment there. You won't hard-brick it, of course, and there should still be a way to replace the kernel afterwards, but I don't know it. One thing that comes to mind is that UART might let you interrupt the boot process and boot the kernel you want (for example, this new one for testing). I haven't attempted such a thing myself, though. A quick Google search shows it should be possible to interrupt the boot at the U-Boot level.
c
I might recommend test-booting the new kernel with kexec, then if it comes up okay, overwrite the image.
d
> Incidentally, U-Boot has not been used for some time. Its last use was in the earlier releases of L4T R32.x, but this then was removed, and CBoot had U-Boot features merged directly into CBoot (or at least a subset of features). In R34.x+ all boot content was migrated to a UEFI boot (which is a big help for the future I think).
I've never heard of kexec... will check it out.
Do I need to repeat any of the flashing steps? e.g.
sed -i 's/cvb_eeprom_read_size = <0x100>/cvb_eeprom_read_size = <0x0>/g' Linux_for_Tegra/bootloader/t186ref/BCT/tegra234-mb2-bct-misc-p3767-0000.dts
sudo ./apply_binaries.sh  
sudo ./tools/l4t_flash_prerequisites.sh
sudo ./tools/l4t_create_default_user.sh
c
For kexec? It should be as easy as SCPing the new kernel image to your module and doing
sudo kexec path/to/Image
on that.
No flashing steps should be needed here.
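As a slightly more cautious variant, kexec can stage the kernel and jump into it as two separate steps (a sketch using standard kexec-tools flags; run as root on the Jetson, and note that -e switches kernels immediately without a clean shutdown):

```shell
# Stage the new kernel; nothing happens yet. --reuse-cmdline copies the
# currently booted kernel's command line so the new one boots with the
# same arguments.
sudo kexec -l ~/kernel/arch/arm64/boot/Image --reuse-cmdline

# Jump into the staged kernel now. This skips shutdown scripts and
# filesystem sync; `systemctl kexec` is the graceful alternative on
# systemd systems.
sudo kexec -e
```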
d
Are the flashing steps necessary at all to create the new Image? Meaning, should I start over? lol
Lemme give kexec a go
I killed it...
$ sudo kexec ~/kernel/arch/arm64/boot/Image
Can't open (/proc/kcore).
Can't open (/proc/kcore).
Connection to 192.168.4.53 closed by remote host.
Connection to 192.168.4.53 closed.
[~] ssh jetson1                                                                       
ssh: connect to host 192.168.4.53 port 22: Operation timed out
[~] ssh jetson1                                                                       
ssh: connect to host 192.168.4.53 port 22: Operation timed out
[~] ssh jetson1                                                                       
ssh: connect to host 192.168.4.53 port 22: Operation timed out
[~] ssh jetson1                                                                       
ssh: connect to host 192.168.4.53 port 22: Host is down
[~] ssh jetson1                                                                       
ssh: connect to host 192.168.4.53 port 22: Host is down
[~] ssh jetson1
c
Manually restart it and give it a second try? Maybe try kexecing the known-working kernel just to make sure kexec isn't at fault?
d
tpi -p off/on 👍
lemme try with /boot/Image
That did come back up. How do I know if I'm... "in" the kexec'd image?
ooohhhh,
sudo dmesg
has some goodies
c
Ah okay
d
This is hard...
c
Trying to compile the kernel and get it to boot successfully is hard enough when you aren't doing it blind. Can you first try to reproduce the exact kernel that NVIDIA provides before making changes?
d
I've been recompiling the Jetson Nano kernel to enable overclocking (it was super useful when I was compiling a lot of stuff on it and a single compilation could last almost 3 days)
d
(on line ~92)
c
mmm it's probably better to modify the
config
yourself - but, first get an unmodified kernel going, just to confirm you have that part figured out first
d
nvbuild is still running... My hopes are high! Do I need to pass
--initrd
to kexec?
w
ooh, how did you manage that?
I've been trying to get iSCSI on mine, but building the module doesn't seem to work because some kernel build flag is needed independent of building the module
d
Here's my modified nvbuild.sh. I was able to kexec the new kernel (!!!), but I had a hard time verifying that I was looking at the right thing, so updated the script to take a suffix for
LOCALVERSION
. Rebuilding now. https://cdn.discordapp.com/attachments/1135261223173230725/1145530653258039377/nvbuild.sh
Put the 3 scripts somewhere. I chose my home directory.
cd ~
vim source_sync.sh
chmod +x source_sync.sh
vim nvbuild.sh
chmod +x nvbuild.sh
vim nvcommon_build.sh
chmod +x nvcommon_build.sh
Commands are simply:
cd ~
./source_sync.sh -h
./source_sync.sh -k -t jetson_35.4.1
mkdir kernel_out
cp nvbuild.sh nvcommon_build.sh sources

cd sources
./nvbuild.sh -h
./nvbuild.sh -o ~/kernel_out -s cilium
c
I also check the build timestamp in
uname -a
to confirm I'm on the kernel I intend.
d
It doesn't appear to have worked...
$ sudo kexec ~/kernel_out/arch/arm64/boot/Image
$ uname -a
Linux jetson1 5.10.120-tegra #1 SMP PREEMPT Tue Aug 1 12:32:50 PDT 2023 aarch64 aarch64 aarch64 GNU/Linux
c
Did it go down for a reboot after you did the kexec command (and you just omitted it)?
d
$ sudo kexec ~/kernel_out/arch/arm64/boot/Image
Can't open (/proc/kcore).
Can't open (/proc/kcore).
Connection to 192.168.4.53 closed by remote host.
Connection to 192.168.4.53 closed.
[~] ssh jetson1
ssh: connect to host 192.168.4.53 port 22: Operation timed out
[~] ssh jetson1
ssh: connect to host 192.168.4.53 port 22: Operation timed out
[~] ssh jetson1
ssh: connect to host 192.168.4.53 port 22: Host is down
[~] ssh jetson1
ssh: connect to host 192.168.4.53 port 22: Host is down
[~] ssh jetson1
Welcome to Ubuntu 20.04.6 LTS (GNU/Linux 5.10.120-tegra aarch64)
c
Okay, good, so it did do the kexec. Not good that the kexec seemed to result in the same kernel booting. I wonder if it got stuck and rebooted back to the installed kernel? 🤔
d
it didn't fix itself with the first kernel I built, oddly enough
I was playing around with the
-l
flag originally, though, might have botched it even worse
I feel like --initrd is necessary? just not sure what to pass it
c
I don't expect that L4T uses an initrd but let me check mine
...turns out I do have a
/boot/initrd
d
not sure what I'd pass it. I don't have any img files in kernel_out
$ find ~/kernel_out -type f -name "*.img"
returns nada
c
Do you have
/boot/initrd
?
d
$ ls -la | grep initrd
-rw-r--r--  1 root root  9782446 Aug  1 19:49 initrd
-rw-r--r--  1 root root  9782448 Aug  1 19:49 initrd.t19x
-rw-r--r--  1 root root     4096 Aug  1 19:49 initrd.t19x.sig
c
That first one is most likely the file to pass.
d
is there a "new" one to pass? from the build process?
c
initrd isn't built as part of the kernel build process, there's usually a separate script (like
mkinitramfs
) that takes care of it. As long as you're on the same version of the kernel they should be compatible.
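If a matching initrd is ever needed for a new kernel, generating one on Ubuntu would look roughly like this (a sketch; the release string 5.10.120-tegra-cilium is an assumption based on the LOCALVERSION suffix used in the build steps):

```shell
# Build an initramfs for the new kernel; the version argument must match
# the directory name under /lib/modules (the new kernel's release string).
sudo mkinitramfs -o /boot/initrd-cilium 5.10.120-tegra-cilium

# Or let Ubuntu's wrapper choose the output path:
sudo update-initramfs -c -k 5.10.120-tegra-cilium
```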
d
Newp. That doesn't appear to do it
c
And you don't see serial output on the BMC? The serial output would greatly help.
d
I'm ssh'd onto the BMC as well... what should I be seeing?
I'm onto something. absolute path vs ~
w
Serial output from Jetson devices (Nano at least) does not go to BMC
it goes to the external UART ports
d
Ah, right
I'm not plugged in
w
v frustrating
d
I'll plug the UART later. I'm not making much progress...
ok, I see the difference. these are not the same
c
I can report that at least on the Xavier,
/dev/ttyTHS0
goes to the BMC
d
sudo kexec /home/dudo/kernel_out/arch/arm64/boot/Image --initrd=/boot/initrd

vs

sudo kexec -l /home/dudo/kernel_out/arch/arm64/boot/Image --initrd=/boot/initrd
sudo kexec -e
c
They aren't equivalent?
This is news to me.
d
-l / -e will not boot afterwards
no args reboots with the normal kernel
c
I encountered a bug in some version of Linux (6.0??) where kexec would just trigger a normal reboot.
Does -l / -e not boot even if you're booting the original kernel?
d
when I installed kexec-tools it asked me if I wanted it to control reboots, to which I answered yes
oh, lemme do the normal kernel... duh
so... -e with my image hangs, and -e with /boot/Image loads. It does appear to be an issue with the Image I'm building 😩
I'll take this back to the NVIDIA forum and see if they can't help me suss this out
c
What about without the initrd?
It might or might not work. Usually the purpose of an initrd is to load some drivers that the kernel needs to access the rootfs in order to continue booting.
d
I tried it without initrd originally 😕
c
So, if it works with no initrd, that sounds like you don't need it (which is probably good; it's one less thing that might be getting in the way)
d
omg...
$ uname -a
Linux jetson1 5.10.120-tegra-cilium #1 SMP PREEMPT Mon Aug 28 01:25:02 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
I had to copy the modules into /lib/modules 😓
This is so satisfying
$ for config in "${REQUIRED_CONFIGS[@]}"
> do
>     zcat /proc/config.gz | grep "${config}[ =]"
> done

CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
CONFIG_NET_CLS_BPF=y
CONFIG_BPF_JIT=y
CONFIG_NET_CLS_ACT=y
CONFIG_NET_SCH_INGRESS=y
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_USER_API_HASH=y
CONFIG_CGROUPS=y
CONFIG_CGROUP_BPF=y
CONFIG_PERF_EVENTS=y
CONFIG_SCHEDSTATS=y
CONFIG_NETFILTER_XT_TARGET_TPROXY=m
CONFIG_NETFILTER_XT_TARGET_CT=m
CONFIG_NETFILTER_XT_MATCH_MARK=m
CONFIG_NETFILTER_XT_MATCH_SOCKET=m
c
Ah. The command for that is
make modules_install
from the Linux source directory. Dang, sorry I didn't think to suggest that. I'm too used to building my kernel with all the stuff I need built in.
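With the out-of-tree layout used earlier in the thread, that would look roughly like this (a sketch; the paths and release string are assumptions carried over from those steps):

```shell
# Option 1: install modules straight into the running system
# (populates /lib/modules/<release> and runs depmod):
cd $TOP
sudo make O=$TEGRA_KERNEL_OUT modules_install

# Option 2: if they were staged with INSTALL_MOD_PATH, copy them over
# manually and rebuild the module dependency map for the new release:
sudo cp -a $TEGRA_MODULES_OUT/lib/modules/* /lib/modules/
sudo depmod 5.10.120-tegra-cilium
```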
d
You know… I learned a lot while floundering about. I updated the nvbuild script again to include modules_install (I’m going to try to get this pushed upstream). This could probably be an entry in our docs.
c
That's the fun of open-source for me: debug something, learn a ton in the process, and contribute the fixes so that (hopefully) I'm the last person to have the problem.
d
I’ll actually be able to sleep tonight… sweet relief
So, now that the Image is validated... I just replace /boot/Image??
c
In principle yeah. You might want to rename the old one to something else just in case (but I dunno how to access the L4T bootloader to override its selection)
d
I'd have to add an entry to
/boot/extlinux/extlinux.conf
if I wanted to fall back, I think?
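For reference, extlinux.conf supports multiple LABEL entries; a fallback setup might look like this (a sketch modeled on the stock L4T layout; the Image.backup filename is made up, and ${cbootargs} is the placeholder L4T's bootloader substitutes with the kernel command line):

```
TIMEOUT 30
DEFAULT primary

MENU TITLE L4T boot options

LABEL primary
      MENU LABEL custom kernel with Cilium flags
      LINUX /boot/Image
      INITRD /boot/initrd
      APPEND ${cbootargs}

LABEL backup
      MENU LABEL stock L4T kernel
      LINUX /boot/Image.backup
      INITRD /boot/initrd
      APPEND ${cbootargs}
```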
Just called
ls /boot
and see these. Should I be concerned?
Image
Image.t19x
Image.t19x.sig
initrd
initrd.t19x
initrd.t19x.sig
I replaced /boot/Image
and it booted!!!
I'm wondering what will happen the next time we get a tegra upgrade though. Will it just upgrade my existing kernel? Will it upgrade the previous stable tegra specific kernel? Will I need to redo this process every time a release comes out?
And cilium is installed on a Jetson Orin NX. Mission accomplished.