Has anyone gotten the NPU on the RK1 to work with ...
s
I posted this question in the #1031260012720427139 channel and got deafening silence, so I'm asking here as well, to be annoying: Has anyone got the NPU on the RK1 to work with anything at all? If so, how? I have gone over the code from Rockchip and had absolutely zero luck making any sense of it. Most of the documentation assumes deployment to an Android phone from an x86_64 PC, neither of which I have access to. Surely the number 1 use case for an NPU library for the RK3588 is to build things on an RK3588? Am I the only one on crazy pills or something? Tomeu Vizoso is working on an open-source kernel driver for the RK3588 NPU, which has been submitted for mainline Linux/Mesa (see here: https://blog.tomeuvizoso.net/2024/06/rockchip-npu-update-4-kernel-driver-for.html?m=1), but that may take many, many months to trickle down to something usable by mere mortals such as us. Then there is the talk of "the binary blob" for the Rockchip NPU. Where is it? How do I install it? Weeks of googling and reading documentation have led me round in circles. Can anybody give me a starting point to be able to try this stuff out? Honestly, it seemed like this was the "raison d'être" for the RK1, but it seems to be largely ignored. I find that extremely puzzling and disappointing.
And again… deafening silence… @DhanOS ? Anyone from @User?
s
Thanks, @Spooky. First helpful response I've had.
s
Will check that out too. Thanks @nsky
b
Hi @suranyami, yes, I did get it to run, but I have some questions too. I tested the YOLOv5 demo with a stream from YouTube and it works pretty well. But when I look at the NPU performance, 1) the NPU is only utilized at around 30%, and 2) only one core is active. If I run two demos in parallel, the second NPU core becomes active, but also at just around 25%. I wonder what the bottleneck is here. I run it from an NVMe drive with around 2 GB/s throughput. As you can see from the pictures, it's also not the CPU that's throttling. I hope someone has suggestions for unlocking the full performance. https://cdn.discordapp.com/attachments/1254048767032950784/1296043192415682613/1_NPU_Running.png?ex=6710d9c8&is=670f8848&hm=ce81506d3076bba15fe44aabfe2a31e17ffa5dc089dd7f68ceaaf0e82221e8b4& https://cdn.discordapp.com/attachments/1254048767032950784/1296043192655020128/2_NPU_Running.png?ex=6710d9c8&is=670f8848&hm=5eb66bf94ebd389e146841552304086f996fe8c133b9535b665654aeeb71f1b2&
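For anyone who wants to log the per-core numbers instead of reading them off a screenshot, here is a minimal sketch in Python. It assumes the vendor (BSP) kernel exposes the load as a text line of the form `NPU load:  Core0: 30%, Core1:  0%, Core2:  0%,` (commonly reported under `/sys/kernel/debug/rknpu/load`, readable as root). Both the path and the exact format are assumptions and may differ between kernel versions. The function names are made up for illustration, and the sample string is invented to mirror the roughly 30% single-core figure described above.

```python
import re

# Hypothetical sample line. On a real board the text would be read from the
# (assumed) debugfs file /sys/kernel/debug/rknpu/load, which needs root.
SAMPLE = "NPU load:  Core0: 30%, Core1:  0%, Core2:  0%,"


def parse_npu_load(text):
    """Return {core_index: percent} for every 'CoreN: P%' entry in text."""
    pairs = re.findall(r"Core(\d+):\s*(\d+)%", text)
    return {int(core): int(pct) for core, pct in pairs}


def summarize(load):
    """Return (list of cores with non-zero load, average load across cores)."""
    active = [core for core, pct in sorted(load.items()) if pct > 0]
    average = sum(load.values()) / len(load) if load else 0.0
    return active, average


if __name__ == "__main__":
    load = parse_npu_load(SAMPLE)
    active, average = summarize(load)
    print(f"per-core load: {load}")
    print(f"active cores: {active}, average: {average:.1f}%")
```

Sampling this in a loop while the demo runs would show whether the idle cores ever pick up work, or whether the load stays pinned to one core.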
t
@suranyami I was initially interested in building ffmpeg with hardware acceleration provided by the RK1's NPU, but I didn't get that far either. I later decided to use a Proxmox VM on my Dell R530 for transcoding instead... I'm unsure which would be faster...
b
@Teslamax: I researched this topic a little as well, and AFAIK the NPU has nothing to do with video encoding. Normally there is a separate core for video encoding. If I'm not mistaken, this is the RGA in the case of the RK1.
t
Perhaps it was the multimedia core… I couldn’t find significant information regarding that either
f
It's the VPU for video encoding/decoding. It works with the latest Jellyfin; check out the Jellyfin docs. It needs the BSP kernel (e.g. rockchip-ubuntu) at the moment, though support is being worked on in the mainline kernel. For an ffmpeg that works with the VPU, see https://github.com/nyanmisaka/ffmpeg-rockchip. That also lists the VPU capabilities.
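As a rough illustration of what a VPU transcode with that ffmpeg build could look like, here is a small Python sketch that only assembles the command line. It assumes the `ffmpeg` on your PATH is the ffmpeg-rockchip build and that it provides the `rkmpp` hwaccel and the `hevc_rkmpp` encoder as described in that project's documentation. The helper name, file names, and bitrate are placeholders, so check the project wiki for the options that apply to your build.

```python
import shlex


def build_rkmpp_transcode_cmd(src, dst, encoder="hevc_rkmpp", bitrate="2M"):
    """Build an ffmpeg argument list for a hardware decode + encode pass.

    Assumes an ffmpeg-rockchip build; this helper is illustrative only.
    """
    return [
        "ffmpeg",
        "-hwaccel", "rkmpp",                    # decode on the VPU
        "-hwaccel_output_format", "drm_prime",  # keep frames in hardware buffers
        "-i", src,
        "-c:a", "copy",                         # pass audio through untouched
        "-c:v", encoder,                        # encode on the VPU
        "-b:v", bitrate,
        dst,
    ]


if __name__ == "__main__":
    cmd = build_rkmpp_transcode_cmd("input.mp4", "output.mp4")
    # Print the command for inspection; on a real board you could run it
    # with subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building the command as a list avoids shell-quoting problems with file names that contain spaces.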
t
Please pardon the tangent, but I have a question: Would the VPU's advantage be speed? I thought speed was the main or sole advantage of hardware encoding, and that encoding quality was generally better with software encoding. Are these accurate assumptions?
f
Speed and power efficiency. As for quality, it depends on the specific implementation, whether software or hardware. I've been transcoding with Jellyfin on the RK1 for a few months now and have had no issues with performance or quality.