BMC 2.0.0 Firmware early adopters
w
Putting together a forum thread to collect experiences with the 2.0.0 BMC firmware
To summarize my adventures with the 2.0 firmware RC:
1. You can install it by downloading the ZIP and doing something like
dd if=tp2-firmware-sdcard-v2.0.0-RC1.img status=progress | ssh root@192.168.124.54 dd of=/dev/mmcblk0 bs=1M
2. Once you reboot the BMC, it goes into a "do you really want to reflash the whole board?" prompt, which you can answer by typing CONFIRM over UART or triple-tapping the power or reset button.
3. Once it has reflashed the board, you have to either pull the microSD card or erase it: connect to UART, press any key when prompted to interrupt boot, and then use
mmc erase 0 10000
to erase the microSD card.
More tips: hold KEY1 while booting to go into safe mode - you can then erase the SD card over SSH https://discord.com/channels/754950670175436841/754950670175436848/1170005285889388606
g
I was able to flash two of the three RK1 on my board, but can't get the third one to boot into Ubuntu for some reason. Have to wrap up for the weekend, but not sure what's going on there. Will dig in again next week
(running RC1)
t
We should have a separate Forum for RK1-specific topics that aren't related to flashing from the BMC.
I installed BMC firmware 2.0.0-RC1 yesterday. Still investigating how network initialization/configuration changed. I don't like the order in which /etc/setStaticNet.sh is called from /etc/init.d/S93startup, after /etc/init.d/S40network has already brought the network connections up. This causes multiple entries in my Ubiquiti DMP's DHCP server, which can be confusing. I want to use a static IP DHCP reservation based on a MAC address.
It appears the 2.0.0 firmware obtains a MAC address from somewhere and it is no longer random/dynamic. Is this MAC address unique to each TPi2 T113-S3 or will they all be the same? Makes me wonder whether the static MAC and IP address logic is an artifact that needs to either be removed or optimized. Either way, this logic should occur in a separate /etc/init.d script prior to network initialization by S40network.
Once I figure out the networking, I'll be looking at user interfaces: specifically, the functionality of the KEY1 and RESET buttons, S-LED and P-LED, and ATX power supply control (case fan and storage device power management). In addition, I'll explore whether the web UI and tpi commands yield consistent results. I really liked how BMC firmware pre-1.1.0 worked. I can already tell node power state is not being preserved across BMC reboots.
c
The MAC address is derived from the CPU serial number, so will be both unique and static. The point about setStaticNet is a great one though. That mechanism is obsolete but should perhaps be a S35* script that updates /etc/network/interfaces for backwards compatibility.
w
At least it seems to be being restored to ON rather than to OFF
s
I'm trying to consolidate the behavior of the power commands. In particular, what does it mean when you issue an 'on' command for all nodes? Do you power on all nodes, or only the ones that are active (actually have a module inserted)? The challenge is that there is no convenient way to measure which nodes are "active". There are a couple of inputs that control power. Apart from some bugs I fixed on top of RC1, the behavior I'm aiming for:
* powering on via TPI/API/web UI -> activates and powers the node
* hitting KEY_1 toggles all activated nodes to on or off.
* long press KEY_1 resets all activated nodes. If one or more nodes were on, all nodes will be labeled not activated. If the number of active nodes was 0, all nodes are labelled activated
It's therefore still possible to turn on the nodes via the tpi tool, hit the KEY_1 button, reboot, and see the nodes go on, even though they were off before the reboot happened.
t
It isn't practical for the BMC to know any node state other than power is on or off. Anything more would require an agent (Nagios), which I don't think should be a Turing Machines core competency. The BMC firmware is open source, so those who want/need more can extend the functionality. With 32GB RK1s, nodes will be running more processes, and when Turing Machines releases a TPi with more than four node sockets, there will be further horizontal scaling. There does need to be state consistency between KEY1 presses, the tpi command, the web UI and the REST API. I assume the node power state engine is centrally controlled by bmcd with incoming requests being serialized and mutexed.
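The "serialized and mutexed" assumption can be pictured with a small Python sketch. This is not bmcd code; `PowerController`, `set_node` and `snapshot` are made-up names, and the only point is that every input path takes the same lock before touching the shared state.

```python
import threading


class PowerController:
    """Hypothetical model of a four-node power state guarded by one lock."""

    def __init__(self, node_count=4):
        self._lock = threading.Lock()
        self._state = [False] * node_count

    def set_node(self, node, on):
        # Every caller (button handler, CLI, web UI, REST) takes the same
        # lock, so requests are applied one at a time.
        with self._lock:
            self._state[node] = on
            return list(self._state)

    def snapshot(self):
        # Readers take the lock too, so they never see a half-applied update.
        with self._lock:
            return list(self._state)
```

With this shape, a KEY1 handler and a REST handler would both call `set_node`, and whichever arrives second simply waits for the first to finish.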
w
I say keep it simple. The vast majority of use cases will want all nodes powered the majority of the time.
I expect that even then, most times nodes are turned off it will be temporary, to power cycle, change settings or do some maintenance
It's a general principle of interface design that simple things should be easy, complex things should be possible, and the latter should interfere with the former as little as possible.
t
@svenrademakers I investigated node power management a bit. Using either the tpi command or the web UI, node power state is persisted across BMC reboots. I set "write_timeout" in /etc/bmcd/config.yaml to 5 for testing purposes. We can debate whether 300 is too long or not. The KEY1 press logic, with respect to the KV store, appears broken. KEY1 does correctly turn all nodes on and off. However, after KEY1 is pressed the first time, the key-value store, /var/lib/bmcd/bmcd.bin, always indicates to bmcd that the prior state was "all nodes on", regardless of whether the nodes are then powered off with KEY1, the tpi command or the web UI. I don't know the binary format of the KV store, but that's where I'd look. If you want/need an issue opened, I can do so and provide two bmcd.bin files for examination.
r
This is useful for those without access to UART, press power button 3 times to begin the update process https://discord.com/channels/754950670175436841/1167785630982484018/1167882637927534642
s
@Terarex (Dan Donovan) Thanks for taking a detailed look at it. I also gave it another thought and concluded that a plain, straightforward power controller is the best option here. We were trying to do something smart with a global on/off state, but that does not make sense for the trivial use cases, or at least it's very confusing/annoying for the end user. Secondly, I made sure that all power commands go through the storage. This was not the case before, which meant that in some scenarios you could end up with a stored state that differs from the one actually active. Lastly, I think Linux is quite good at caching file reads/writes, so I share your sentiment on the high write timeout config parameter. I think we can set it even lower than 5. For those interested, you can find the related changes in this commit: https://github.com/turing-machines/bmcd/commit/d5db79d67ec725fa0a2da578f7be4f33dc1b0c57
v2-RC2 will be available soon. The build needs an hour at least: https://github.com/turing-machines/BMC-Firmware/actions/runs/6773906243/job/18409813735
v2-RC1 improvements

UI
* Updated styling of bmc-UI
* UI: Fixes in the progress-bar and labels

BMCD
* Simplified power controller
* pressing KEY_1 will toggle all nodes on if 3 or less nodes are powered
  on
* pressing KEY_1 when all nodes are off will turn all nodes on
* pressing KEY_1 when all nodes are on will turn all nodes off
* long press KEY_1 will force every node to on
* persistency: improve logging, decrease write timeout
* ban_patrol: simplify ban deadline
* Update default_config.yaml
* serial: replace custom ring buffer impl with a crate
    (Fixes #135)
* move the default www directory to /srv/bmcd/www
* authentication: auto reload users
* auth: ban consecutive failed requests
* humantime: convert durations into human readable
* expose reboot over API

TPI
* rename `tpi power get` to `tpi power status`
* return error response from server on authentication error
* throw away token if invalid. Prevents constant retry of authentication
request with wrong token
* add spinner for 'verifying checksum'
* add reboot command
* allow local file transfers to be issued outside BMC, fix phrasing
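For anyone who finds the KEY_1 bullets in the changelog above easier to read as code, here is a rough Python sketch of the described behaviour. It is only a reading of the release notes, not the bmcd implementation; `short_press` and `long_press` are invented names.

```python
def short_press(nodes):
    """Return the new power state after a short KEY_1 press.

    nodes: list of booleans, True meaning the node is powered.
    """
    if all(nodes):
        # All nodes on -> turn everything off.
        return [False] * len(nodes)
    # Three or fewer nodes on (including none) -> turn everything on.
    return [True] * len(nodes)


def long_press(nodes):
    """Return the new power state after a long KEY_1 press: every node on."""
    return [True] * len(nodes)
```

Read this way, a short press only ever produces "all on" or "all off"; a mixed state can only come from the tpi tool, the API or the web UI.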
Let me know if there are any logs I should send?
c
OTA from RC1->RC2 is expected to work. If you SSH in, what does `ubinfo -a` show?
(For brevity: is there a volume named `rootfs_new`?)
s
This was a bug in the UI of RC1. Upload should work anyway 🙂
t
Bored at work… wondering if I dare do this remotely 😉
c
I've done it remotely (from another state) many times, but I have a Bus Pirate wired to my BMC UART and reset button.
t
Where do you get such wonderful toys? (apologies for going off-topic)
I didn’t know things like this existed…
t
RC2 is looking good. After performing the μSD upgrade from RC1 to RC2, I used the web UI to stream the RC2 tpu file. Seemed to work fine. I also used the web UI's control page to reset the BMC. Seems like this is a viable remote update solution.
r
Yes, thanks for that..
perfect, just needed a reboot!
Worked for me!
@svenrademakers just some feedback, not sure if I'm missing something, but the Power Control is now very unclear. I can't see the current power status of the nodes, so I'm not sure what action I'm taking when toggling. It may be that selected means on, but select and submit makes my brain think that we are toggling between states?
I have logged an issue, #136. It might just be me in this case, but I feel that it speaks to usability.
g
Tested the RC1 version through the ssh procedure, works splendidly!! No issues
Even with a corrupted v1.1.0 version, good work
Is there a way besides compiling to get RC2?
s
If you scroll down there is a zip you can download. It contains a .tpu file which can be installed via the UI (I'm not sure if you need to be signed in to GitHub to see the zip) https://github.com/turing-machines/BMC-Firmware/actions/runs/6773906243/job/18409813735
Thank you! I agree with you that the checkboxes are unluckily chosen. I need to do another pass over the UI anyway, there are still things that can be improved
r
Always will be, but, I must thank you for all the improvements that have been made, I really like the direction you are taking the software!
s
doing my best, hope it will be useful for you guys !
w
Can you just upload the files uncompressed? It's kind of a pain to have to unzip them. Bandwidth is pretty cheap
Well upload is a relative term I know it comes from CI
s
for sure, I'm planning to have the "official" release hosted on the turingpi server
w
But I was just thinking the zipping doesn't really add much
s
im not sure we can do something about the zip container of the CI builds. Its something github enforces
c
It might not be a bad idea to reimplement the logic of `osupdate` in `bmcd` to allow streaming the volume update over the network -- including through a zip-extracting filter layer. So if someone provides a .zip instead of a .tpu the `bmcd` implementation can detect that and do the correct thing.
s
> It might not be a bad idea to reimplement the logic of `osupdate` in `bmcd` to allow streaming the volume update over the network -- including through a zip-extracting filter layer. So if someone provides a .zip instead of a .tpu the `bmcd` implementation can detect that and do the correct thing
"streaming" is a bit of a loaded term for me. `bmcd` is already capable of receiving a chunked volume; the only caveat is that it's written to tmpfs before `osupdate` executes it. This should be fine for now as we have plenty of RAM.
I would like `osupdate` to be reimplemented as well somewhere in the future; calling shell scripts from an API is not really ideal regarding security
t
got RC2 running on mine... no other comments as of yet
b
I recommend https://pikvm.org/ I like the quality of it 👍
t
seen that too and have thought about it
w
not cheap in Australia though, A$500 (US$300)
b
Definitely not cheap, but just works, and well too
w
How is RC2 working out for folks? Maybe I should give that upgrade a whirl
t
I skipped 2.0 RC1 and all seems fine. GUI looks better. I’ll be honest, I haven’t tried anything fancy though.
c
Just to make sure there's no confusion about this among people lurking in the thread: You only need the microSD upgrade method to go from 1.x -> 2.x. Upgrades from 2.x->2.x (including RC1->RC2) can be done using the OTA update method (followed by a reboot).
t
Yesterday, as an intellectual exercise, I tried to use the webUI in RC2 to flash the SD card on a CM4 with a compressed lite Raspbian image. The webUI stopped updating at less than 6%, so I aborted it. I then used the tpi command after scp'ing an uncompressed image file to the BMC's SD card. It took quite a while, but worked. I'm attempting to flash an uncompressed SD card image file via the webUI now. The image file is huge, so it may take several days. 🤓
j
Guessing that's because of the 100mbit lan on the bmc...
s
I'm curious, is transferring with `tpi` from the BMC's sdcard to the CM4 any slower compared to using rpi-imager over the `usb-a` port?
My guess is that writing eMMC over USB is the actual bottleneck currently. I don't have the data to back this up, but I'm intending to do some profiling soon to find out
t
Note that my CM4s are lite (no eMMC, just μSD cards). I didn't try that, but I'm certain it would be much faster over a USB connection. The file I'm transferring via the webUI is an uncompressed, 256GB μSD image (generated from a working card via Rufus). My guess is that, due to limited BMC memory for tmpfs, the webUI uses a relatively small block size for each transfer. Obviously, the fastest way is to image the μSD card on the host machine that is running RPi Imager.
s
block and buffer sizes are terribly chosen, perhaps a nice goal for the weekend to tune it a bit better
c
Another good benchmark (if you don't mind erasing your CM4 and thus needing to reimage it) would be to do
time dd if=/dev/zero of=/dev/sda bs=8M
on the BMC while the CM4 is in flashing mode.
And ofc play with the `bs=` parameter
b
time xz -d < metal-rpi_generic-arm64.img.xz | sudo dd of=/dev/sda bs=1M status=progress conv=fsync
893665280 bytes (894 MB, 852 MiB) copied, 3 s, 298 MB/s
1306525696 bytes (1,3 GB, 1,2 GiB) copied, 3,82763 s, 341 MB/s

0+152937 records in
0+152937 records out
1306525696 bytes (1,3 GB, 1,2 GiB) copied, 99,591 s, 13,1 MB/s

real    1m39,600s
user    0m3,607s
sys    0m0,226s
c
= ~104.8 Mbps, which is a fair amount lower than the 480 Mbps theoretical of USB 2.0.
b
c
Oh wait was this running from your desktop, not on the BMC?
b
Yes, you asked for a comparison ☺️
c
Good to have a baseline then 👍
b
Exactly
I flashed via web gui of the firmware as well, but did not time it
s
i ran some tests with my rpi CM4, and fixed the performance bug with using tpi over the network. Decreased the flashing time from 16m to 6m40s.
Conducted a dd test locally on the BMC to determine the correct block
    size for writing to a raspberry CM4.
    Times measured using a 206MB file:

    512B     1m50.22s
    512K     43.0s
    1M       43.03s
    4M       42.96s
    8M       43.19s
    16M      43.00s
    32M      45.95s

    Concluded 4M is the optimal blocksize.

    Ran some tests writing `2023-05-03-raspios-bullseye-armhf-lite.img`
    * using dd 4M                   => 6m23.86s
    * using tpi with -l flag        => 6m22s
    * using tpi over local network  => 16m02s
    * using tpi over local network with 4M blocks  => 6m40s
*these times are measured without the crc read step which accounts for an extra 3 minutes on top
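A block-size sweep like the one in the commit message above can be approximated with a short Python script. This is only a sketch: `time_write` is a made-up helper, a scratch file stands in for the real target device, and page caching on a desktop will make the numbers look nothing like the BMC's.

```python
import os
import tempfile
import time


def time_write(path, total_bytes, block_size):
    """Write total_bytes of zeros to path in block_size chunks; return seconds taken."""
    block = bytes(block_size)
    start = time.perf_counter()
    with open(path, "wb", buffering=0) as out:
        remaining = total_bytes
        while remaining > 0:
            chunk = block if remaining >= block_size else block[:remaining]
            # The payload is all zeros, so a short write needs no special handling.
            remaining -= out.write(chunk)
        os.fsync(out.fileno())  # include the final flush in the measurement
    return time.perf_counter() - start


if __name__ == "__main__":
    # Assumption: a temporary file replaces the real target (e.g. /dev/sda).
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "scratch.img")
        for size in (512, 512 * 1024, 1024 * 1024, 4 * 1024 * 1024):
            secs = time_write(target, 16 * 1024 * 1024, size)
            print(f"{size:>8} B blocks: {secs:.3f}s")
```

Pointing `target` at a real device is destructive, so treat that as something to do only on a node you intend to reimage anyway.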
c
206MB in 42.96s is 4.80MB/s or 38.4Mbps which is a lot lower than the desktop baseline determined yesterday. I have to think there's still some low-hanging-fruit improvement to be made hiding somewhere.
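The conversions quoted in this thread are easy to double-check. A few lines of Python, using only the figures already posted (13.1 MB/s from the desktop dd run, 206MB in 42.96s on the BMC); `mbps` is just a helper name:

```python
def mbps(megabytes, seconds):
    """Convert a transfer of `megabytes` MB in `seconds` s to megabits per second."""
    return megabytes / seconds * 8


desktop = 13.1 * 8          # dd reported 13.1 MB/s on the desktop
bmc = mbps(206, 42.96)      # 206 MB written in 42.96 s on the BMC
print(f"desktop: {desktop:.1f} Mbps, BMC: {bmc:.1f} Mbps")
# prints "desktop: 104.8 Mbps, BMC: 38.4 Mbps"
```

So the BMC-local write runs at a bit over a third of the desktop baseline.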
s
It's more or less equivalent to what I'm getting using a desktop and `rpiboot`. Since we use the same backend on the BMC, I would start my investigation there
c
If it's equivalent then the bottleneck might be the CM4 SD/eMMC which would mean no further improvements to the BMC's speed are needed here. Though that also seems a bit on the low end for that to be true... Hmm...
s
to be continued for sure 🙂
t
The image transfer finally finished last night. The webUI indicates a write error, but the CM4 boots up just fine. I'm not certain whether "UPGRADE NODE" is a good button caption. Maybe "FLASH NODE OPERATING SYSTEM"? https://cdn.discordapp.com/attachments/1170114820457111592/1172577879956144259/TPiv2-WebUI-CM4-flash-error.jpg?ex=6560d340&is=654e5e40&hm=4257b0872a5c91d77c538aa89d197e84bb04ebf4bf8ef95bb26fd1511c0e5ca9&
c
"OVERWRITE NODE" might also be a good option; makes the "destructive" nature of the operation clear to the user so they know to back up any important data before using it.
t
Yeah. The operation is destructive, so "UPGRADE" is misleading. Maybe just "INSTALL OS". Mentioning NODE is redundant to the tab's context. I know it's nitpicky, but don't expose the button until after the user has selected the node and the file and both inputs have been validated. It might be wise to pop up a dialog to tell the user the operation will overwrite the node's boot device and confirm their intent.
c
"INSTALL OS" + confirmation dialog sounds great to me
s
These are good suggestions! Lets see if i can slip it in!
c
btw a helpful command to run now and then (perhaps something the `bmcd` webui should do for you? 👀):
ssh root@bmc 'tar -cvC /mnt/overlay/upper .' | bzip2 > bmc-backup.tar.bz2
This will create a backup of only the files that you've modified after installing the base firmware.
s
sounds like a cool idea, we would need to think about frequency and where best to store it. Could you make a ticket for that on the BMC-Firmware repo?
c
I wasn't necessarily thinking of doing it automatically, just a button in the webui that downloads a backup .tar.bz2. I can make a ticket later today though.
s
That would work as well!
Consider it done: https://github.com/turing-machines/bmcd/commit/39774e1da2237e0b35b226827ced9e8c1321151b
2 open topics though:
* security-wise, private keys are archived as well. I would feel more comfortable if encryption were enabled; however, it would degrade the user experience. I can see myself needing my backup archive and then coming to the realization that I forgot my passphrase. Another option would be to omit them.
* it would be nice to also have an option in the UI to restore these backups
c
Since `bmcd` is once again monitoring the node UARTs, it's going to create some confusion when people try to open the /dev/ttyS# devices with microcom and see random read bytes go missing. I wonder if there's a good way to deter that and encourage users to access serial through `bmcd`.
@svenrademakers As of `bcf0d024fff03f16e7e693310f8624aa7c6f6527` has the gadget been replaced with USB CDC-ACM or is it currently just unused?
s
I just removed the old configfs UDC, haven't found time yet to add a CDC-ACM
c
I'll spend a few minutes (not going to get heavily involved into it yet) seeing if there's an easy way to do it.
s
i already managed to create a Mass storage USB gadget without any issues, i bet CDC-ACM should not be any more complex ... 'famous last words'
c
I'm right there with you, doesn't seem hard. Looks like the only thing missing is the `acm` kernel driver?
s
that would be nice, im pretty toast.. time to recuperate from my screen
yes, i think so
c
# CONFIG_USB_CONFIGFS_ACM is not set
yeah that needs to be toggled on, but it should otherwise be easy
s
the rest is already in
this ADB server was eating a lot of RAM, i think we freed up like 40 meg by purging it
c
(Ooh the other one that might be useful is RNDIS... we could do Ethernet over the USB link.)
s
what benefits does it give us compared to using the dedicated ethernet device?
c
It's just convenient in case a user wants to plug their laptop into the USB port and use that to reach through to nodes, like if they don't have an Ethernet cable and/or adapter handy.
Probably not something we do by default.
s
well we could do both 😛
c
Indeed 😄
s
but I think there is a lot of value in having the ACM driver, it means people don't have to buy USB-TTL cables anymore
w
how does one 'use bmcd'?
the obvious answer to your question is "remove microcom and picocom, and replace them with a message that tells users whatever it is that they're supposed to do instead"
c
I actually don't know. What I'm hoping is for a wonderful `screen`-like thing in `tpi` like...
tpi uart -n 1 attach
...and, like `screen`, it shows you the current terminal state and takes over CTRL+C, CTRL+D, etc. -- and connects through a WebSocket so you get a nice low-latency connection. That might be a little ways away though. 😦
I wonder if I should get my feet wet with `bmcd` development and send a PR implementing the necessary UART stuff, or if Sven/Ruslan are focused on that.
w
mmmm, that sounds nice
I like the sound of that
t
Put it in as an enhancement request (PR). Sven and Ruslan will get it sorted.
s
Random read bytes go missing?
i would expect this when you use 'tpi uart get' as its doing a lossy utf encoding at some point.
a
I was able to update to RC1 via sdcard, and OTA for RC2 (ran into the same UI bug but it did flash; note: when I did `cat /etc/*release*` it still showed the VERSION as RC1 but the git hash seemed to match the action run linked here). I didn't read everyone's messages in between, but I noticed that when I power on a node through the webui and refresh the page, it did not save the state of my powered-on nodes (they were unchecked, though powered on). edit: when refreshing, it does appear to be getting the payload correctly-- just a UI bug?
{
  "response": [
    {
      "result": [
        {
          "node1": "1",
          "node2": "0",
          "node3": "0",
          "node4": "0"
        }
      ]
    }
  ]
}
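For reference, reading that payload is straightforward. A small Python sketch that pulls out which nodes the response reports as on; `powered_nodes` is just an illustrative helper, and the payload is the one pasted above:

```python
import json

payload = '''{"response": [{"result": [{"node1": "1", "node2": "0",
                                         "node3": "0", "node4": "0"}]}]}'''


def powered_nodes(text):
    """Return the sorted names of nodes reported as "1" (on) in the payload."""
    result = json.loads(text)["response"][0]["result"][0]
    return sorted(name for name, state in result.items() if state == "1")


print(powered_nodes(payload))  # ['node1']
```

Note the states arrive as the strings "1" and "0", not as numbers or booleans, which is an easy thing for UI code to trip over.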
s
yes i think so, if you like you can try the latest version (which will be the final v2 version). I spent this weekend educating myself on frontend development and UI bugs.. i fixed a bunch. https://github.com/turing-machines/BMC-Firmware/actions/runs/6843032217
a
sick, ya that fixed it 🎉
c
If two processes have a stream (TCP socket, pipe, serial device, ...) open and both try a read(), only one of them gets the next byte(s) to arrive (I think it's whichever one did the read() first). Usually microcom wins, I'm guessing because it uses blocking read() while bmcd is using an async reactor to learn when bytes are available followed by a read(), but sometimes bmcd wins and the user on microcom misses output.
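The effect is easy to reproduce in miniature. The Python sketch below uses a pipe and a duplicated file descriptor inside one process rather than two processes on a serial device, which is a simplification, but the principle is the same: each byte in the stream is delivered to exactly one reader.

```python
import os

read_fd, write_fd = os.pipe()
second_reader = os.dup(read_fd)       # a second handle on the same stream

os.write(write_fd, b"login: ")
os.close(write_fd)

first = os.read(read_fd, 4)           # one reader takes the first 4 bytes
second = os.read(second_reader, 64)   # the other only sees what is left

print(first, second)                  # b'logi' b'n: '
os.close(read_fd)
os.close(second_reader)
```

Neither reader sees the whole prompt, which is exactly the "missing output" symptom described for microcom.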
s
I recall reviewing this.. if this is the case we should find some ways to work around it. Replacing it altogether with a nice websocket + renderer has my preference. Actually I'm hoping to port the openbmc implementation into our firmware 😅
t
The final BMC v2.0.X firmware release looks great! The OTA update and subsequent bmc reboot worked fine. My /etc config file and /etc/init.d script additions/mods were preserved so I'll do a fresh install from SD to confirm everything is good to go for me. Looking forward to RK1 (32GB) and TPi v2.5. The USB hardware changes in the TPiv2.5 could open a number of new feature possibilities.
s
Thanks for testing it! I think people will like the additional tweaks you suggested
t
I have additional usability suggestions. I'll create a PR for them.
w
Might be another conversation but for the UART it seems there are basically two options - either take the screen/tmux/mosh approach and run a local terminal emulator & sync its state to all clients. But maybe better is the kind of pub/sub approach where you multiplex the I/O instead - better for monitoring use cases like getting the board type
c
In my head I was picturing two different possible websocket requests: `wss://bmc/.../4?terminal=true` and `?terminal=false`, which indicates whether the client wants to be synced up to the terminal emulator running inside `bmcd` or just wants to be subscribed to the byte stream going forward.
The former is actually just a dump of the current terminal state followed by the latter.
Rust also has the vt100 crate which does the necessary terminal-emulator stuff (so, `bmcd` just has to shovel bytes coming out of the node into a `vt100::Parser` instance), and the crate also has `parser.screen().state_formatted()` which generates the byte sequence necessary to get a connecting terminal to the same state as the emulated terminal. In short all of this means that the `terminal=true` param just controls whether or not the client wants the output of `parser.screen().state_formatted()` prepended, and either case subscribes them to the stream bytes coming out of the node UART.
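To make the two modes concrete, here is a toy Python model of the idea. It is only a sketch: `UartHub`, `attach` and `feed` are invented names, there is no websocket or terminal emulator involved, and a raw replay buffer stands in for the formatted terminal state that the vt100 crate would produce.

```python
class UartHub:
    """Hypothetical fan-out of one node's UART bytes to many clients."""

    def __init__(self):
        self._history = bytearray()   # stand-in for the emulated terminal state
        self._clients = []

    def attach(self, terminal):
        """Register a client; with terminal=True it first receives the state dump."""
        inbox = bytearray()
        if terminal:
            inbox += self._history
        self._clients.append(inbox)
        return inbox

    def feed(self, data):
        """Called with bytes read from the node; every client gets its own copy."""
        self._history += data
        for inbox in self._clients:
            inbox += data
```

Because the hub is the only reader of the UART and copies bytes to each subscriber, no client can steal output from another, and a late `terminal=True` client still starts from the current state.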
t
@svenrademakers I may have lied when I said everything in the latest RC was good to go. I didn't check this function until this evening. By the look of it and some Chat and GitHub comments, I assume it is supposed to back up files under /mnt/overlay/upper that a user has customized to a compressed tar file on the SD card. I tried it, but received the message "Error generating backup archive" in a pop up. The installed SD card has very little on it. Note the file system on the SD card is exFAT. Upon a little additional investigation, I find the UI's text fields for "BMC" and "SD card" have some kind of issue. I'll have to look at the web content source to diagnose. Gotta put that education in HTML5 to some use. https://cdn.discordapp.com/attachments/1170114820457111592/1173814819741249576/TPiv2-v2.0.0-RC2-4.png?ex=6565533d&is=6552de3d&hm=025bda87b13e8851d537b38878962497df5f40d618a38222b31eb85198f1af5c&
r
Just wanted to check on how to engage safe mode on the new BMC, should that be possible before the BMC boots from the recovery SD card? and will the network be active in safe mode?
c
The design (it's a bug if you find an exception to this rule) is that if you hold KEY1 for 5 seconds at power-on (or after releasing BMC_RESET), it will boot to safemode. It's currently implemented as a "this boot only" reset to factory defaults, so the network and everything will come up as if you had just performed a fresh installation from the microSD.
(I should clarify, you have to hold KEY1 for 4-6 seconds. If you hold it for 15, you go into a firmware repair mode which is not safemode.)
r
Ok, with the SD recovery (BMC rev 2.0) in the board, it seems to boot the recovery partition (gets to the stage where the light and network slowly flash on and off), not safemode. This is with power completely off, KEY1 held in, power on, 5 seconds, KEY1 release; or power off, power on, KEY1 held, 5 seconds, KEY1 release. Not sure if that is by design or not.
What I am attempting to do is post SD card update of BMC, I'd like to access the BMC and wipe the SD card for normal use.
Trying to not have to remove the SD card and re-format in another machine.
The lights flash stage seems to happen about 6 seconds after power on.
c
I definitely designed for this exact thing:
If it's not working for you it may be a bug.
r
Ok, then I am doing something wrong..
c
In a pinch you can hold KEY1 for 15 seconds at power-on with the USB-B connector connected to your computer, and that will give you USB->microSD access.
r
I like this idea..
c
If that works but the 5-second safemode doesn't, I might've made a mistake with the latter.
r
I might also be confused about KEY1. I am thinking it's the same key that we use to power the nodes on and off when booted normally.. and the one I press three times to start the flash?
c
KEY1 is a little surface-mount button on the PCB next to the BMC

https://files.readme.io/3bf121b-image.png

r
Thanks - issue was me.. I thought the external key was also key1
holding the button on the board itself as you pointed out worked perfectly.
Thanks!
c
Sweet, okay, super glad that I don't have a bug in that bit of code 😅
r
Are you working on a set of documentation, I might be able to review and provide some updates, I have learnt quite a bit from a normal end-user viewpoint..
So I see things and miss things that more involved people might not think about.
c
I know it's being documented but I'm not the one writing it. I think that's @DhanOS (Daniel Kukiela)
r
@svenrademakers - love the way the power settings for the nodes now work!
s
Hi Dan, it's supposed to archive `/mnt/overlay/upper` and download it to your computer, not to your SD card. Which browser are you using? If it's failing, the stdout of the BMC should show what the cause was as well
w
fwiw I achieved this by interrupting u-boot and looking up the u-boot command to erase the sd card
s
yes we are, and an extra pair of eyes would be really appreciated. As a non-native speaker, it is quite taxing to write a sensible story. I will share a link once it's available
r
Please do, I'd be happy to try and give back to the community
t
Sven, I'm running Chrome Version 119.0.6045.124 (Official Build) (64-bit) on Windows 10. When I click on the "BACKUP USER DATA" button, the Chrome console (not BMC) displays the errors shown in the screen capture. https://cdn.discordapp.com/attachments/1170114820457111592/1174011716984635483/TPiv2-v2.0.0-RC2-4-userdata_backup.jpg?ex=65660a9d&is=6553959d&hm=efeb62cdceae369d9f11c76499d64f8b53645a6eece101d3d115b39392ca447c&
s
what is the response body of that 500 response?
t
I'll let you know when I get back.
s
FYI, in my quest to improve node flashing I did some tests with decoding xz images on the fly with the BMC. Xz archives are roughly 10 times smaller than their uncompressed variants, so I expected to see some big improvements with regard to sending huge files over a slow network. The results are a bit of a letdown: writing an xz file is not really faster, and to make it worse the system becomes unstable. Available memory is to blame here. The xz images I tested require a minimum of 66mb of RAM, which is extremely tight. bmcd needs around 40mb at least, so combined you eat up 110mb of the 128mb available. Next to that, the CPU has difficulties keeping up, degrading network performance. I parked these efforts for now. Used daemon code: https://github.com/turing-machines/bmcd/commit/fd45f4769b2e0f3328cf8980b88b8b13f8b5ca8e
I improved the archiving implementation so that it streams the archive back over HTTP instead of copying it around in memory. I'm hesitant to squeeze it in as I want to cut a release
w
TL;dr - to decompress, `xz` uses ~10-20% of the memory required to compress.
If you control the compression, might be good to not use the max settings
s
Yup. i just took the latest raspberry pi bookworm image from the official website as a baseline. RK1 images we can more or less control the compression ratio. But unfortunately not for other images
w
Maybe there's some silly way to do the decompression on the client side in wasm
Plenty of ram on whatever computer is being used to upload the image
s
I thought about this. But the only gain we have is the comfort of not having to call one extra command ‘xz -d’ before uploading your image. Which im not sure is worth the effort
w
It does seem like (only) a nice to have
You might consider using --memlimit with your patch so it fails rather than making the system unstable if one gives a highly compressed file
"nice to have" here being used as a negative term. As opposed to essential.
s
When I have some extra time, I want to see if I can lower the memory footprint of bmcd. Worst case we still don't have decompression but a more optimized program
Currently i have set a ceiling to 66m. which actually is around 6 mb too high. The challenge is that a certain memlimit works for a vanilla system, but will crash for someone who is running other stuff in parallel
I feel comfortable to merge it to master, maybe it will help a few people, but my guess is the majority of people will get an “out of memory” error back when trying to upload xz images
w
Ah. I suppose in theory you could look for free ram and set based on that but might not be worth the effort
s
i feel this is an excellent opportunity to develop and sell decompression extension boards 😂
w
Heh, they'd have to be cheaper than just throwing more ram at the problem though
s
you can fix that with proper marketing 😄
i will shutup now
> Ah. I suppose in theory you could look for free ram and set based on that but might not be worth the effort
thanks for thinking out loud here, this was actually still missing from my implementation
c
It may be too big of a step down from xz in terms of ratio, but lz4 decompression is lightning fast and requires only 64K of working memory.
And this one's more me thinking aloud than an actual suggestion, but I wonder if there's some convention for indicating "don't care" blocks in a disk image. Like, if a block in the image consists of the ASCII string
SKIPSKIPSKIPSKIP
... it's the creator of the image saying the partition layout, filesystem, or whatever doesn't use that block, so an imaging tool should just skip writing it to save time.
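A rough Python sketch of what such a "skip" convention could look like on the writing side. To be clear, this marker is purely hypothetical (no existing image format is being described here), and the 4096-byte block size is an arbitrary choice for the example.

```python
import io

BLOCK_SIZE = 4096
# Hypothetical marker: a block made up entirely of the repeated ASCII string.
SKIP_BLOCK = b"SKIPSKIPSKIPSKIP" * (BLOCK_SIZE // 16)


def write_image(src, dst, block_size: int = BLOCK_SIZE) -> int:
    """Copy src to dst block by block, seeking over "don't care" blocks.

    Returns the number of blocks that were actually written.
    """
    written = 0
    while True:
        block = src.read(block_size)
        if not block:
            break
        if block == SKIP_BLOCK:
            # Leave whatever is already on the target and just move on.
            dst.seek(len(block), io.SEEK_CUR)
        else:
            dst.write(block)
            written += 1
    return written
```

This assumes `src` returns full blocks on every read (true for regular files, not guaranteed for pipes or sockets) and that `dst` is seekable, like a block device opened for writing.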
s
adding more decompressors is quite trivial, but Im wondering what would be a better user experience. saying "here user you can upload xz images but its 50% chance it will work", or "hey user we dont do that here use lz4"
c
Good point, I imagine the vast majority of users would grumble at having to decompress the file just to recompress it with lz4 (since nobody uses this for disk images in the real world) in order to save a little BMC network bandwidth
s
good question, there must be some open container format for this. This is basically how the phoenix suit images worked. from what i could tell they had some header or meta table which described like "write this blob at this offset"
t
Sounds good. I looked at Chrome's debugging info last night, but didn't see anything useful. I also started looking at the UI source because the "User Storage" fields on the Info tab aren't being populated correctly. Unfortunately, app.js isn't exactly human readable.
s
what about this one?
im dumping "promise/then/fail/finally" cleanup in a couple of minutes though,
im learning more front end every day 😄
t
I'll take a look at that repo. I was trying to work with the files on the BMC itself since I expect functionality depends on having bmcd. (Note that I'm using a custom hostname.)
s
the easiest would be to check out the repo, build the pages and then scp them over to the board. You should also be able to insert a 'pre-setup' ajax hook to defer calls to your board so that you can just run the pages locally on your machine. but i haven't gone through the trouble yet
t
Will do. Once v2.0 firmware is released, I'll update my local build environment. Yeah, I know I could do everything on GitHub, but I'm old-school. 🤓
s
> I also started looking at the UI source because the "User Storage" fields on the Info tab aren't being populated correctly. Unfortunately, app.js isn't exactly human readable.
you mean the UI itself or the actual values? the former was fixed an hour ago
t
The UI. The field contents are garbage.
b
I may be doing something wrong here. I attempted to build off
master
today using the
README.md
instructions and the build container, and everything "works" up until the last line of
cd output/images/
cp -a ../../../tp2bmc/swupdate/* .
./genSWU.sh 1.0.0
. The shell script doesn't exist. Is there a different way to create the OTA version for 2.0-RC2?
c
Ah, yeah, the
README.md
is woefully outdated, needs to be updated for 2.0.0. The OTA update is the
buildroot/output/images/rootfs.erofs
file. The OTA files are just EROFS images, renamed to .tpu.
w
Somebody somewhere had implemented a nice new UI that I had installed on my OG Community Edition installation
( @phearzero )
b
I like the shell 🙂 🙏
k
+
b
so, I can
cp buildroot/output/images/rootfs.erofs tp2bmc-2.0-rc2.tpu
or similar and use it for OTA? Sounds easy! Thanks~
t
Building the latest commit (4 hours ago according to GitHub), BMC-Firmware-2.0.0-RC3, now. Looks like all the changes for 2.0.0 have been merged to master with finishing touches occurring there.
Built and installed (tpu file) nicely. The Info page's BMC and SD card text fields are valid now. Clicking on the "BACKUP USER DATA" changes the text to an animated spinner. There is no client-side pop-up to select a download location.
Update: Enabled the Chrome developer tools and now I see the "500 (Internal Server Error)" message from "Backup". The error in Backup points to app.js in this code block:
try {
    a.send(e.hasContent && e.data || null)   <----- line flagged as generating the error
} catch (e) {
    if (t) throw e
}
The error reported in app.js is:
app.js?_v=20231115232853:2 Uncaught ReferenceError: settimeout is not defined
    at Object.c (app.js?_v=20231115232853:2:197927)
    at l (app.js?_v=20231115232853:2:30431)
    at Object.fireWith [as rejectWith] (app.js?_v=20231115232853:2:31179)
    at _ (app.js?_v=20231115232853:2:82784)
    at XMLHttpRequest. (app.js?_v=20231115232853:2:85227)
c @ app.js?_v=20231115232853:2
l @ app.js?_v=20231115232853:2
fireWith @ app.js?_v=20231115232853:2
_ @ app.js?_v=20231115232853:2
(anonymous) @ app.js?_v=20231115232853:2
load (async)
send @ app.js?_v=20231115232853:2
ajax @ app.js?_v=20231115232853:2
(anonymous) @ app.js?_v=20231115232853:2
dispatch @ app.js?_v=20231115232853:2
g.handle @ app.js?_v=20231115232853:2
The link in this message points at this code block:
function c(e, t, n) {
    settimeout((()=>{
        let n = t + " : " + e.responsetext;
        showtoastnotification(n, "error")
    }
    ), 300)
}
r
Just installed RC3 myself off the last build on GitHub.. worked perfectly, looking good, I haven't played around too much but no issues from my side so far
s
that is correct!
ffs.. i made a big boo .. i will need to make another RC..
thanks for the stack trace though
its not going to fix your 500 error response though.
b
probably already asked by someone: can we keep the nodes powered up during a firmware upgrade, and consequent reboot? (I already like that the power state of the nodes is remembered!)
t
Yeah, "No such file or directory (os error 2)". My assumption is the code is referencing "/api/bmc/backup", under ServerRoot (/srv/bmcd/www). (I've been retired for 5 years. The mental cobwebs are showing.)
s
The BMC runs EROFS filesystem (thanks to @CFSworks). Actually its archiving
/mnt/overlay/upper/
on your board which is the upper layer that gets mounted over the read only filesystem. I wonder if your mount is broken or perhaps there is something funky going on with symlinks that you created?
c
I suppose it's also possible that the upper layer is just huge?
s
Unfortunately this is not possible due to hardware limitations. v2.5 will be able to do this as it will get latches
c
Like if you're running from microSD and not flash, you could put a bunch of node images in your /root directory and those would get included in the backup
t
No symlinks here ... oh wait ... when I changed the timezone to be America/Los_Angeles, I also removed and recreated the localtime symlink, but it's still a relative link. These and other small changes I made did get backed up into /mnt/overlay/upper. I intend to make these changes in the build environment's overlay directory.
s
This is a problem as the current implementation archives in a memory buffer.. but i doubt it will give you a ‘no such file or directory’ error back. I would expect the BMC to hang or, best case, return an out of memory response
c
Hm.
s
Also call ‘sync’ just to make sure you flushed all your fs buffers
c
This is not a very scientific test but I figured I'd put the results here anyway:
@ xz -d < ubuntu-22.04.3-preinstalled-server-arm64-turing-rk1.img.xz | ssh black-bmc 'time dd of=/dev/sda bs=4M'
real    13m 27.79s
user    0m 0.97s
sys     3m 41.30s
@ xz -d < ubuntu-22.04.3-preinstalled-server-arm64-turing-rk1.img.xz | lz4 | ssh black-bmc 'time ./lz4 -d | dd of=/dev/sda bs=4M'
real    12m 40.04s
user    1m 2.23s
sys     0m 33.84s
It takes about 6% less time to image a node if I trans-compress it to lz4 and have the BMC do the lz4 decompression, since it gets that 100Mbps Ethernet bottleneck out of the way.
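For anyone who wants to check the percentage, the arithmetic from the two `real` times above works out like this (a small Python sketch, nothing more than unit conversion and a division):

```python
def seconds(minutes: int, secs: float) -> float:
    """Convert a "13m 27.79s" style time into seconds."""
    return minutes * 60 + secs


plain = seconds(13, 27.79)    # xz -d | ssh ... dd
via_lz4 = seconds(12, 40.04)  # xz -d | lz4 | ssh ... lz4 -d | dd

saved = plain - via_lz4
percent = 100 * saved / plain
print(f"saved {saved:.2f}s of {plain:.2f}s = {percent:.1f}%")
# prints: saved 47.75s of 807.79s = 5.9%
```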
s
Btw, the tpi tool transfers with gzip compression enabled. I havent measured the performance difference of this in a while.
w
EROFS sounds like an error code
lol it is
b
Is this most recent build?
t
That's my SOP. It's a habit I got into in the mid-1980s UNIX dark ages (pre-Linux). I did not integrate my /etc and /etc/init.d customizations into the RC3 build I did on Wednesday. Just figured the .tpu update would pick them up from the on-BMC /mnt/overlay/upper in prior 2.0 RCs. Last night I installed my vanilla (no mods) RC3 build from SD to wipe the BMC's internal NAND storage of all prior customization. The Backup function worked fine. I then made several manual changes to /etc (hostname, hosts, timezone and the localtime symlink). This broke the Backup function. I haven't isolated exactly which change it was, but I'll research further. To test a theory, I modified and re-integrated all of my BMC firmware 1.X.X customizations, including scripts under /etc/init.d, into my local build tree last night and kicked off the make before hitting the sack. Installed from SD card this morning. The Backup function works now.
Doesn't look like it. The subsequent RC output files have been versioned. I downloaded RC3 from GitHub and did a local build on Wednesday. Sven has been making additional commits, but they all appear to be related to the Docker image. @bplein are you upgrading from 1.X.X BMC firmware or is it already on 2.0.X?
b
Earlier builds had exactly this error
b
I built off master/main just now, then
cp buildroot/output/images/rootfs.erofs ~/tp2-bmc-2.0-RC.tpu
and copied that from my build machine to my desktop and uploaded it.
b
Can you try RC3? I installed RC3 from the GitHub build yesterday and this 'upgrade failed' message was gone (and btw the upgrade did succeed then in my case)
b
Sorry, for some reason I didn't see the RC3 tag when I looked. I feel dumb now. Will give it a try
(I also didn't do a
make clean
so there's a chance I didn't pick up everything, well, cleanly)
t
Got RC3 running… looks beautiful
i
Hi, what steps were you following to spin it up? And what was your earlier TP FW version?
t
@svenrademakers Figured out why my change to localtime was causing backup of customized files in /mnt/overlay/upper to fail. The symlink I created between /etc/localtime and /usr/share/zoneinfo/America/Los_Angeles was relative (../usr/...), rather than absolute (/usr/...). Ordinarily, this would not be an issue, but when this relative symlink was copied into /mnt/overlay/upper/etc, the relative link no longer pointed to a valid location. When you invoke tar to create the backup, are you following symlinks (-h option)?
c
Symlinks should for sure be grabbed as-is rather than followed
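As an illustration of "grabbed as-is rather than followed", here is a small Python sketch using the standard tarfile module. It is not how bmcd builds its backup; the temp directory layout just mimics a dangling relative /etc/localtime link under an overlay upper directory.

```python
import os
import tarfile
import tempfile


def archive_tree(root: str, archive_path: str) -> None:
    """Archive `root`, storing symlinks as links rather than following them."""
    # dereference=False is tarfile's default; spelled out here for clarity.
    with tarfile.open(archive_path, "w:gz", dereference=False) as tar:
        tar.add(root, arcname=".")


def demo() -> list:
    """Build a tiny tree with a dangling relative symlink and archive it."""
    with tempfile.TemporaryDirectory() as tmp:
        upper = os.path.join(tmp, "upper")
        os.makedirs(os.path.join(upper, "etc"))
        # Relative link whose target does not exist relative to this tree.
        os.symlink("../usr/share/zoneinfo/America/Los_Angeles",
                   os.path.join(upper, "etc", "localtime"))
        archive = os.path.join(tmp, "backup.tar.gz")
        archive_tree(upper, archive)
        with tarfile.open(archive) as tar:
            return [(m.name, m.issym(), m.linkname) for m in tar.getmembers()]
```

With links stored as-is the dangling target does not matter. Following them instead means the archiver has to stat the target, which is one plausible way to end up with a "No such file or directory" error like the one reported above.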
t
I just pulled it pre-built off the repo... was running RC2
I'm just following the changes, not really testing per se.
g
Just installed RC3, NICE interface!! 👍 Upgraded through SD card; the tpu upgrade gave me error 255 and dd was not working (error: 0 blocks written)
t
ultimately... a dark mode option would be nice cosmetically (so many folks were grumpy about Proxmox until it got a dark mode option)
t
It seems at least a few people are running BMC v1.0.2 firmware because they are only now able to order CM4s or equivalent. Is there a clearly documented procedure for getting from v1.0.2 to v2.0.0 without going through v1.1.0 with Phoenix Suit? The current on-line documentation doesn't seem to explicitly address this particular scenario. I thought about downgrading my TPiv2 from v2.0.0-RC3 to v1.0.2 in order to validate a procedure. However, with the BMC NAND's new layout, I'm not certain whether that would be advisable.
w
That's what I did
I never used PhoenixSuit
I went from whatever the max firmware that doesn't need it straight to 2.0.0-RC1
t
Thanks. Since I used Phoenix Suit for the 1.0.2 upgrade to 1.1.0 (where the BMC's NAND format changed), I wasn't certain about the SD card upgrade directly from 1.0.2 to 2.0.0.
s
this is no problem, SD card upgrade is not dependent on anything that is currently present on the flash
b
I found this useful for flashing the BMC from SD card, as it avoids the use of USB and other software entirely. https://discord.com/channels/754950670175436841/1174117775460016128/1174122860172812318
t
Thanks Bill. I'll copy that link to the appropriate Tech Support thread.
@svenrademakers I noticed an error in the README.md file that was committed yesterday. It's in the last paragraph: "Both UI and tpi can be used to write the upgrade package to your board. When you try to upload the image via the firmware upgrade tab on the UI, You will notice that the file extension is not matching the one the UI expects. You can ignore this, a .tpu image is nothing more than a rename of rootfs.erofs image. To smoothen the experience, you can decide to change the extension of the file to .tpu." rootfs.erofs is the .img, not .tpu, file.
s
Actually it's written correctly in the readme: tp2-ota-v2.x.x.tpu == rootfs.erofs; it's just a UBI volume containing a full rootfs. What happens on a firmware upgrade is that a new volume, or partition for that matter, is created. The .erofs file is written to it and on a reboot, the bootloader flips a switch to boot from this new partition. If it fails it tries again with the old volume. tp2-firmware-sdcard-v2.x.x.img is a raw image file, which can contain multiple volumes, in this case bootloader + rootfs
t
Okay. My understanding was that the .erofs file was the raw image. The .img file was not present in buildroot/output/images. Otherwise, I would have dd'ed it to a raw SD card. Note, I have not built the latest commit.
s
this is what the CI does:
- name: stamp images
  run: |
    mkdir artifacts
    sudo mv buildroot/output/images/tp2-bmc-firmware-sdcard.img artifacts/tp2-firmware-sdcard-${{ env.BUILD_VERSION }}.img
    sudo mv buildroot/output/images/rootfs.erofs artifacts/tp2-ota-${{ env.BUILD_VERSION }}.tpu
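If you build locally and want the same file names the CI produces, the renaming step can be mimicked with a few lines of Python. This is only a sketch: the function name is made up, and it copies instead of moving so the build tree stays untouched.

```python
import shutil
from pathlib import Path


def stamp_images(images_dir, out_dir, version):
    """Copy the two build outputs to release-style names, like the CI step."""
    images, out = Path(images_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    targets = {
        "tp2-bmc-firmware-sdcard.img": f"tp2-firmware-sdcard-{version}.img",
        "rootfs.erofs": f"tp2-ota-{version}.tpu",
    }
    stamped = []
    for src_name, dst_name in targets.items():
        dst = out / dst_name
        shutil.copyfile(images / src_name, dst)  # copy, not move
        stamped.append(dst)
    return stamped
```

For example, `stamp_images("buildroot/output/images", "artifacts", "v2.0.0-rc3")` would leave a `.img` for the SD card and a `.tpu` for OTA in `artifacts/` (the version string here is just an example).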
b
@Terarex (Dan Donovan) this is what was built on my successful build of RC3:
# ls -la images/
total 180992
drwxr-xr-x 4 root root     4096 Nov 17 15:13 .
drwxr-xr-x 6 root root     4096 Nov 17 15:13 ..
-rw-r--r-- 1 root root  1120290 Nov 17 15:13 installer.cpio.gz
-rw-r--r-- 1 root root 39550976 Nov 17 15:13 rootfs.erofs
-rw-r--r-- 1 root root 69171200 Nov 17 15:13 rootfs.tar
-rw-r--r-- 1 root root 27794977 Nov 17 15:13 rootfs.tar.gz
drwxr-xr-x 3 root root     4096 Nov 17 15:13 sdcard-bootpart
-rw-r--r-- 1 root root 16777216 Nov 17 15:13 sdcard-bootpart.img
-rwxr-xr-x 1 root root    45830 Nov 17 15:11 sun8iw20p1-t113-turingmachines-tp2bmc.dtb
drwxr-xr-x 3 root root     4096 Nov 17 15:13 tmp
-rw-r--r-- 1 root root 57376768 Nov 17 15:13 tp2-bmc-firmware-sdcard.img
-rw-r--r-- 1 root root   611188 Nov 17 15:10 u-boot-sunxi-with-spl.bin
-rw-r--r-- 1 root root   578356 Nov 17 15:10 u-boot.bin
-rw-r--r-- 1 root root  3880528 Nov 17 15:11 zImage
Note
tp2-bmc-firmware-sdcard.img
t
I probably am working with a bit earlier version of RC3. I'll pull the latest source tar file and build from it. I tore my TPiv2 system apart yesterday to allow soldering an EMC2301 fan controller chip and PWM fan header.
c
It's a tad confusing, because while
rootfs.erofs
is the "OTA image" (containing all of userspace and the kernel), it's also a partition in the
tp2-bmc-firmware-sdcard.img
file (for direct booting and/or installing from the microSD card)
t
Thanks for that clarification. I noticed that @svenrademakers committed a license file to master this morning. That's probably the final step before release. I pulled a tar file of it, added my overlay mods and kicked off a build. I'll put the appropriate .img file on an SD and test later today or tomorrow. I want to solder the fan control components and put the system back together first.
h
Can you use the web UI to update from RC1 to RC3 or is that not avail until RC3?
t
You can use the Web UI.
h
damn shame the sdcard was not mounted on an edge to make it easier to access.
c
If you already have a microSD installed, you can back it up, image the installer on there, reset the BMC, do the installation, reset the BMC again, go into safe mode, then erase the microSD card, all without removing it.
But if you're already on RC1 you can use the web UI to update to any future version (RC or otherwise). Just note that RC1 has a bug where it throws an error despite the update working successfully.
w
+1 - I did this but with UART attached
s
Ah i see you found my git tag 😅
b
yes, really great work, big leap forwards
w
ooh, now working out where I can download the artifacts from
I think you need to add the .zip extension 😄
oh it's just not visible in the UI
I got this error when trying to install the final version from there
perhaps this is what I'm encountering
yeah ok so if you get this error it succeeded but didn't reboot
so you just have to reboot
b
And do it again for closure 👌🏻 issue will be gone
i
Hi guys, not so long time ago, Sven put the installation steps for the BMC FW and I cannot find it in the history 😄 If someone has it please paste the link here, many thanks
BTW is there any tool in the discord which can be used to save a message?
s
i will paste it in a bit..
Just so you guys are aware, there will be a v2.0.4. Which will be final final final. i promise.. It will contain a small api change in the USB tab to make sure the UI works the same as the old UI. Secondly, there are some ui fixes for windows users
Its always interesting that it was relatively quiet last week, but when i was about to hit the big release button today, some things popped up 😅
download the latest artifacts from the BMC-Firmware repo: https://github.com/turing-machines/BMC-Firmware/actions/runs/6973770338
-> it contains a tp2-firmware-sdcard-xxxx.img
-> dd this to a micro sd card
-> put it in the back of the bmc
-> reset the bmc
-> wait until the lan leds start to blink
-> hit KEY1 3 times
-> lan leds start to walk from left to right
-> wait until lan leds start to flash twice
-> eject sd card from the back and restart bmc
-> Done
Warning: flashing with the SD card method removes all data from your BMC flash
i
Successfully updated to v2.0.4 Many thanks!
g
If you are on 2.0.3 is there added value to do the sd card upgrade or is the ota upgrade sufficient?
w
just ota I believe
t
I've just upgraded from latest 1.x to 2.0.4 from GitHub. Did someone else have to run the
generate_self_signedx509.sh
script after upgrading in order to get bmcd to boot? Without this the program failed at start stating that it lacked the required pem files.
w
I didn't have to for 2.0.3 - I was wondering if there's instructions somewhere about bringing my own SSL certificate
e
me neither. my system generated the certificates on first boot
I didn't find any instructions. but in
/etc/bmcd/config.yml
are config keys for key files. you can probably configure your own cert that way https://cdn.discordapp.com/attachments/1170114820457111592/1177559625697067008/image.png?ex=6572f2dc&is=65607ddc&hm=2c92962808f4fa73f22e546f1d35c9bf53c6d5844241fba4f4cd651be87efa98&
t
Me too!
g
Me too, through tpu file!
p
@werdnum if you are still interested I can give it a polish for v2
w
Not my call, but I do like it! See what @svenrademakers thinks
s
Sorry guys, i was out for the weekend. @phearzero are you the one maintaining the xterm UI fork?
p
No worries at all, yea this project: https://turing-pi-ui.vercel.app/ Should have a ticket in GitHub where I'm most active
t
That is slick!
c
Hmm there appears to be a bug in the
rockusb
driver (?) where it mistakenly believes the eMMC of the RK1 is 2064384MiB larger than it is.
It gives the correct size modulo 32GiB but it sets 6 bits that shouldn't be set. I'm wondering if this is a Rockchip quirk rather than a driver bug though.
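Spelling out the arithmetic behind "6 bits that shouldn't be set" as a quick Python sketch. The masking helper at the end is only an illustration of what "correct modulo 32GiB" means, not a proposed fix, and it would only be valid for modules whose real capacity is below 32 GiB.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

excess = 2064384 * MIB   # how much too large the reported size is
window = 32 * GIB        # the size is correct modulo this

stray = excess // window
print(excess % window, stray, bin(stray))
# prints: 0 63 0b111111  (2016 GiB = 63 * 32 GiB, i.e. six stray bits)


def corrected_size(reported: int) -> int:
    """Drop everything at or above the 32 GiB bit (illustration only)."""
    return reported % window
```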
I'm thinking it's actually a bug in the version of
usbplug
that's loaded on the RK3588 by bmcd. I don't remember this coming up in my testing but I was also using the latest binary from Rockchip.
s
liking the terminal you made in there. We dont have any active plans for the UI currently. So any form of contribution is welcome
its finally public https://github.com/turing-machines/BMC-Firmware/releases/tag/v2.0.5 Thank you all for taking the earlier v2.0.x versions for a spin, Thanks to you guys the firmware got better
i didnt want to update the usb plug to latest as i did not want to risk it. but if there are issues with it we might need to do it anyway
u
ota upgrade works pimp
c
Anyone tested flashing from the WebUI for Raspberry Pi CM4? Is it working as intended?
t
I successfully flashed an SD-equipped CM4 from the WebUI. I believe it was RC2. Improvements were made in RC3 for performance.
c
Hmm, I hope it'll work on non-sd card CM4
d
It will. This flashing method works fine and can flash the eMMC
Flashing an SD card this way is not supported by the RPi Foundation and may fail
c
Neat! Thank you
Also, what about ttySx? I tried to use microcom commands from docs, but no output in serial still 😦
d
And what about it?
You see nothing when you boot CM4?
c
The CM4 has an OS on it for now, Ubuntu. When I power it on and use any serial device from that list: https://docs.turingpi.com/discuss/652d892d4fa166000de16cee there is no output or any reaction. Tried all slots.
Node 1: /dev/ttyS2
Node 2: /dev/ttyS1
Node 3: /dev/ttyS4
Node 4: /dev/ttyS5
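For scripting against that list, the mapping can be kept in one place. A small Python sketch: the helper name is made up and the 115200 baud default is an assumption, so adjust it to whatever your node's console is actually configured for.

```python
# Node slot -> BMC serial device, as listed above.
NODE_TTY = {1: "/dev/ttyS2", 2: "/dev/ttyS1", 3: "/dev/ttyS4", 4: "/dev/ttyS5"}


def microcom_command(node: int, baud: int = 115200) -> list:
    """Build the microcom argument list for a node slot (baud is assumed)."""
    if node not in NODE_TTY:
        raise ValueError(f"unknown node slot: {node}")
    return ["microcom", "-s", str(baud), NODE_TTY[node]]
```

`" ".join(microcom_command(3))` gives the command line to run on the BMC for slot 3.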
d
Ubuntu required setting the UART out iirc
Is there
raspi-config
available?
If not, see RPi forum on how to enable UART
c
Thanks! Will try
Login worked, thanks. But is it possible to see the whole boot sequence over serial? Or is that not an option?
d
UART is a serial console. It shows what's being sent to it. Physically it is just a serial link, so whenever you connect, you see what's being sent over. This means no history. But if you logged in, you may try
dmesg
command
b
Sorry if this is asked again, reboot without power loss of the nodes, is that in the cards? Rather not have the nodes going down on every firmware upgrade
c
Earlier this morning I was idly wondering how difficult it would be to reboot the T113 "by hand" and not touch the pin controller in the reset sequence. It wouldn't alter how the
BMC_RESET
button behaves but it would mean Linux can reboot without affecting the nodes.
d
It is, but for the Turing Pi 2 v2.5 board (the next revision) - it'll contain latches to hold the states (like node power) on the BMC reboot
b
Ah, thanks @DhanOS (Daniel Kukiela), so it's impossible on TPi v2.4?
d
As far as we currently know - no. Maybe there is a way to do so, like what CFSworks was looking at, but for now we do not know of any way to not reset the GPIO states at reboot
d
How long should we expect our CR1220 batteries to last for the RTC? After flashing my board the clock was reset to Jan 1. I'd have expected the RTC to still have an accurate time. Thinking it may be time to change mine.
s
Hey guys, I've flashed and inserted the SDcard. But I'm not getting the update prompt, is there something I am missing here?
d
My best guess is you touched the board around the battery or BMC while inserting/removing the SD card which may reset the RTC time. The battery should last for months, but no one ever tested this 🙂 If you keep the board powered on (the nodes can be down but the BMC works), the battery does not discharge, it discharges when the board has no power
So the BMC is booting normally even with the card inserted? How did you flash the card?
s
Win32 Disk Imager as directed
I'm kinda confused, because the BMC startup declares that the SD card is mounted.
So like, either there's a problem with the flash (Unsure how I would check that) or it's not reading it
d
Are you sure you... flashed the right drive?
s
100%
d
Can you SSH into BMC?
s
Yep
d
Can you list the sd card content? Will be mounted as
/mnt/sdcard
s
Can do, hold on
So seemingly the flash has failed?
d
Looks empty and it should not be empty
I prepared this part of the docs and I ran this countless times. Unless you chose not the correct drive letter, I'm not sure what else could have happened
Or maybe the card is corrupt?
Just verify I'm doing the right thing here @DhanOS (Daniel Kukiela)
c
It looks right. Just to verify it's actually writing, if you unplug and reconnect your microSD(/adapter), can you then go into
K:\
and see
install.txt
?
s
Empty
c
Try creating a file in there, then safely removing and replugging. I'm wondering if something silly is going on, like the writeprotect switch on the adapter is turned on 🤷‍♂️
So, yeah, it retains
That works, annoyingly
So its not that
@DhanOS (Daniel Kukiela) just tried it with a separate SD card, same deal
c
I wonder if it's Win32DiskImager's fault. Maybe try Rufus or BalenaEtcher?
s
Tried rufus, same deal
d
Okay I found your issue
This is not the correct file size
The file is over 50MB in size
Download it again and make sure you downloaded the full file
c
ugh, no, it's a bug in my
rockusb
code somewhere. Worse, it doesn't happen on my own system but it does happen when compiled for the BMC.
r
@svenrademakers Thanks, for all the effort, just updated to v2.0.5 - Very happy customer
s
perhaps we should add a sha hash file to the directory for these kind of things. Im thinking, it would be even nicer to have the BMC verify known images on sha hashes
u
I was going to mention shasums
d
I totally agree. Or we could have additional field to provide sha file
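Until something like that exists in the UI, checking a download against a published hash is a few lines on the client side. A minimal Python sketch, assuming a SHA-256 hex digest is published next to the image (which, as far as this thread goes, is still only a suggestion):

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large images never sit in memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected_hex: str) -> bool:
    """Compare against a published hex digest, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

A truncated download like the one above would fail this check right away, before the card is ever written.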
s
Worked @DhanOS (Daniel Kukiela)!
Thanks
b
I would love to see a free format description field besides the nodes, I now literally keep a notepad next to my machine to track where which node is 🙂
w
Heh, just name the machines after what slot they're in
b
Ah, thanks, but I want to track the modules. And what do you mean by naming them? In DNS, you mean?
w
Yeah give them host names that reflect their slot number
I have a little cheekily, slot#3 is "shamrock" (shamrocks notoriously have 3 leaves)
b
Haha
w
(my nodes are named after places in Maastricht of all things)
b
4 if you're lucky 😇
Haha, do you have enough places there 😉
w
Lol, I do need to strain a little after 4 or 5
But it's a student town, there are plenty of bars. Actually the whole scheme began as a pun, I named my first Raspberry Pi "vlaai", because of course it's a type of raspberry pie
Well more often cherry but oh well
b
I like vlaai (not from cows)
a
Should the BMC's sd card still auto-mount? I'm not seeing that behavior. I put the card in and the kernel detects it, but I have to manually invoke
mount /dev/mmcblk0p1 /mnt/sdcard
. It doesn't seem to properly auto-unmount either and I have to manually invoke
umount /mnt/sdcard
.
Oddly it seems like the turingpi UI will still recognize if I disconnect + reconnect the sdcard while the board is running since it shows in the UI when the card is there or not and the correct size, but when reconnected it's completely unusable and attempted writes result in an io error.
It does seem like after that initial manual invocation of
mount
that it will auto-mount after restart, but I did have to ssh in and manually invoke the command first.
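A quick way to tell "the kernel sees the card" apart from "the card is actually mounted" is to look at /proc/mounts. A small Python sketch (the helper name is made up; the sample line in the usage note is illustrative, not captured from a board):

```python
def mounted_device(mounts_text: str, mount_point: str = "/mnt/sdcard"):
    """Return the device mounted at mount_point, or None if nothing is."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, ...
        if len(fields) >= 2 and fields[1] == mount_point:
            return fields[0]
    return None
```

On the BMC you would call it as `mounted_device(open("/proc/mounts").read())`. A line such as `/dev/mmcblk0p1 /mnt/sdcard vfat rw 0 0` would make it return `/dev/mmcblk0p1`.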
s
I'm having a bit of brain fog, perhaps someone here can paste the correct command for this. The behavior you describe is what I usually get when not correctly ejecting the SD card. Did you try to call the obvious commands? e.g.
sync && eject /mnt/sdcard
i was born in the big metropol north of maastricht.. Sittard. Finding names related to places is not even your biggest challenge there. Im not sure if the concept of electronic boxes hooked up to the internet already landed there
w
"big" - when you said "big Metropole North" I figured you meant Eindhoven
s
Sometimes i fail to give it the right sarcastic tone 😅😀
b
@svenrademakers @werdnum @Nico when you are west of Utrecht, let me know!
a
I didn't and probably won't be able to replicate that again unless I take the whole thing back out of the chassis so that I can try hot unplugging the sdcard. I think the bigger problem is that it wasn't showing up the first time until I manually `mount`'d it rather than the hotswap not working.
i
Guys, I have a delicate situation after updating to the latest FW. My nodes are not accessible or visible via ssh/nmap, but at the same time I'm able to see them as nodes in k3s. Any suggestions on how to debug this? (Turning the nodes off/on wasn't helpful)
w
Are they actually active in k3s? Would be unusual... But you can use
kubectl debug node
to get a shell
Or create a daemonset with a privileged container
i
Yes, they are fully active in k3s. It seems like the nodes lost their statically assigned IP addresses, really strange behaviour. I'm a little bit confused: if the IP changed, how was k3s able to add the node to the pool again? Hmm. Anyhow, many thanks for your input Werdnum
b
how many K3s nodes and how many of them are masters?
i
One master and 6 additional nodes
t
Anything new since 2.0.5?
d
If you mean a new firmware version then no, not yet
t
The post-2.0.5 BMC-Firmware GitHub updates mostly seem to address better RK1 flashing support. A new branch was created today to begin modifying buildroot to produce both BMC NAND and SDcard-native versions. See "Info > github" for additional detail.
t
Will it be possible to configure nodes post-flash using cloud-init at some time using the BMC? (I assume it’s already on the “to do list”)
c
It's a goal of mine to figure out how to get the BMC to host a 169.254.169.254 virtual IP service that isn't exposed off-board and knows which node/slot is accessing it. (I don't think there's any formal plan for that, mind, but I highly doubt such a PR would be turned away unless it's overcomplicated.)
t
I’ve used Proxmox on my R530 for quite some time. Administering/configuring nodes on the TP2 like VMs in Proxmox seems like a convenient feature.
I have little experience with containers, etc. so there might be several technologies for this I’m unaware of.
s
Off the top of my head:
* Added support for downloading node and firmware images straight from a http remote.
* Added support in the UI for installing node images that are local on the BMC
* Fixed MSD mode for RK1
* Added backend for storing small node configuration, such as name, module type and baud rate (still need front end)
* Pending: serial console to a node over a web-socket. im 80% finished with the implementation. (UI front end still required)
* serial console to the BMC is exposed over the BMC_USB_OTG port. It means in theory that you wont need any USB ttl cables anymore
* Have a side branch which exposes a node as block device over the same BMC_USB_OTG port. Its not completely stable yet. Sometimes the UDC controller refuses to start the MSD function. I parked it for now