BMC 2.0.0 Firmware early adopters

werdnum

11/03/2023, 9:38 PM
Putting together a forum thread to collect experiences with the 2.0.0 BMC firmware
To summarize my adventures with the 2.0 firmware RC:
1. You can install it by downloading the ZIP and doing something like
dd if=tp2-firmware-sdcard-v2.0.0-RC1.img status=progress | ssh root@192.168.124.54 dd of=/dev/mmcblk0 bs=1M
2. Once you reboot the BMC it goes into a "do you really want to reflash the whole board?" thing, which you can skip by typing CONFIRM over UART or triple-tapping the power or reset button.
3. Once it's reflashed the board, you have to either pull the microSD card, or you can erase it by connecting to UART, pressing any key when prompted to interrupt boot, and then using
mmc erase 0 10000
to erase the microSD card.
More tips: hold KEY1 while booting to go into safe mode - you can then erase the SD card over SSH https://discord.com/channels/754950670175436841/754950670175436848/1170005285889388606

geerlingguy

11/03/2023, 10:35 PM
I was able to flash two of the three RK1 on my board, but can't get the third one to boot into Ubuntu for some reason. Have to wrap up for the weekend, but not sure what's going on there. Will dig in again next week
(running RC1)

terarex

11/05/2023, 3:52 PM
We should have a separate Forum for RK1-specific topics that aren't related to flashing from the BMC.
I installed BMC firmware 2.0.0-RC1 yesterday. Still investigating how network initialization/configuration changed. I don't like the process order of calling /etc/setStaticNet.sh from /etc/init.d/S93startup, after /etc/init.d/S40network has already brought the network connections up. This causes multiple entries in my Ubiquiti DMP's DHCP server, which can be confusing. I want to use a static IP DHCP reservation based on a MAC address.
It appears the 2.0.0 firmware obtains a MAC address from somewhere and is no longer random/dynamic. Is this MAC address unique to each TPi2 T113-S3 or will they all be the same? Makes me wonder whether the static MAC and IP address logic is an artifact that needs to either be removed or optimized. Either way, this logic should occur in a separate /etc/init.d script prior to network initialization by S40network.
Once I figure out the networking, I'll be looking at user interfaces. Specifically, the functionality of the KEY1 and RESET buttons, S-LED and P-LED and ATX power supply control (case fan and storage device power management). In addition, I'll explore whether the web UI and tpi commands yield consistent results. I really liked how BMC firmware pre-1.1.0 worked. I can already tell node power state is not being preserved across BMC reboots.

cfsworks

11/05/2023, 6:24 PM
The MAC address is derived from the CPU serial number, so will be both unique and static. The point about setStaticNet is a great one though. That mechanism is obsolete but should perhaps be a S35* script that updates /etc/network/interfaces for backwards compatibility.
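As a rough illustration of how a MAC address can be both unique and static when it is derived from a CPU serial number, here is a minimal Python sketch. The hashing scheme, the function name `mac_from_serial` and the example serial are assumptions made for illustration; the thread does not say how the firmware actually computes the address.

```python
import hashlib


def mac_from_serial(serial: str) -> str:
    """Derive a stable MAC address from a serial string (hypothetical scheme)."""
    digest = hashlib.sha256(serial.encode("ascii")).digest()
    octets = bytearray(digest[:6])
    octets[0] |= 0x02  # set the "locally administered" bit
    octets[0] &= 0xFE  # clear the multicast bit so the address is unicast
    return ":".join(f"{octet:02x}" for octet in octets)


if __name__ == "__main__":
    # The same serial always yields the same address, so a DHCP
    # reservation keyed on the MAC keeps working across reboots.
    print(mac_from_serial("0123456789abcdef"))
```

Because the result depends only on the serial, two boards with different serials get different addresses, which is the property a MAC-based DHCP reservation needs.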

werdnum

11/05/2023, 8:09 PM
At least it seems to be being restored to ON rather than to OFF

svenrademakers

11/05/2023, 10:40 PM
I'm trying to consolidate the behavior of the power commands. In particular, what does it mean when you issue an 'on' command for all nodes? Do you power on all nodes, or only the ones that are active (actually have a module inserted)? The challenge is that there is no convenient way to measure which nodes are "active". There are a couple of inputs that control power. Apart from some bugs I fixed on top of RC1, the behavior I'm aiming for:
* powering on via TPI/API/web UI -> activates and powers the node
* hitting KEY_1 toggles all activated nodes to on or off.
* long press KEY_1 resets all activated nodes. If one or more nodes were on, all nodes will be labeled not activated. If the number of active nodes was 0, all nodes are labelled activated.
It's therefore still possible to turn on the nodes via the tpi tool, hit the KEY_1 button, reboot, and see the nodes go on, even though they were off before the reboot happened.

terarex

11/06/2023, 12:27 AM
It isn't practical for the BMC to know any node state other than power is on or off. Anything more would require an agent (Nagios), which I don't think should be a Turing Machines core competency. The BMC firmware is open source, so those who want/need more can extend the functionality. With 32GB RK1s, nodes will be running more processes, and when Turing Machines releases a TPi with more than four node sockets, there will be further horizontal scaling. There does need to be state consistency between KEY1 presses, the tpi command, the web UI and the REST API. I assume the node power state engine is centrally controlled by bmcd with incoming requests being serialized and mutexed.

werdnum

11/06/2023, 1:34 AM
I say keep it simple. The vast majority of use cases will want all nodes powered the majority of the time.
I expect that even then, most times nodes are turned off it will be temporary, to power cycle, change settings or do some maintenance
It's a general principle of interface design that simple things should be easy, complex things should be possible, and the latter should interfere with the former as little as possible.

terarex

11/06/2023, 5:07 AM
@svenrademakers I investigated node power management a bit. Using either the tpi command or the web UI, node power state is persisted across BMC reboots. I set "write_timeout" in /etc/bmcd/config.yaml to 5 for testing purposes. We can debate whether 300 is too long or not. The KEY1 press logic, with respect to the KV store, appears broken. KEY1 does correctly turn all nodes on and off. However, after KEY1 is pressed the first time, the key-value store, /var/lib/bmcd/bmcd.bin, always indicates to bmcd that the prior state was "all nodes on", regardless of whether KEY1, the tpi command or the web UI is used to power the nodes off. I don't know the binary format of the KV store, but that's where I'd look. If you want/need an issue opened, I can do so and provide two bmcd.bin files for examination.

rmeier

11/06/2023, 8:39 AM
This is useful for those without access to UART, press power button 3 times to begin the update process https://discord.com/channels/754950670175436841/1167785630982484018/1167882637927534642

svenrademakers

11/06/2023, 2:17 PM
@terarex Thanks for taking a detailed look at it. I also gave it another thought and concluded that a plain, straightforward power controller is the best option here. We were trying to do something smart with having a global on/off state, but that does not make sense for the trivial use cases, or at least it's very confusing/annoying for the end user. Secondly, I made sure that all power commands are going through the storage. This was not the case before, which meant that in some scenarios you could end up with a stored state different from what is actually active. Lastly, I think Linux is quite good at caching file reads/writes, so I share your sentiment on the high write timeout config parameter. I think we can put it even lower than 5. For those interested, you can find the related changes in this commit: https://github.com/turing-machines/bmcd/commit/d5db79d67ec725fa0a2da578f7be4f33dc1b0c57
v2-RC2 will be available soon. The build needs an hour at least: https://github.com/turing-machines/BMC-Firmware/actions/runs/6773906243/job/18409813735
v2-RC1 improvements

UI
* Updated styling of bmc-UI
* UI: Fixes in the progress-bar and labels

BMCD
* Simplified power controller
* pressing KEY_1 will toggle all nodes on if 3 or less nodes are powered
  on
* pressing KEY_1 when all nodes are off will turn all nodes on
* pressing KEY_1 when all nodes are on will turn all nodes off
* long press KEY_1 will force every node to on
* persistency: improve logging, decrease write timeout
* ban_patrol: simplify ban deadline
* Update default_config.yaml
* serial: replace custom ring buffer impl with a crate
    (Fixes #135)
* move the default www directory to /srv/bmcd/www
* authentication: auto reload users
* auth: ban consecutive failed requests
* humantime: convert durations into human readable
* expose reboot over API

TPI
* rename `tpi power get` to `tpi power status`
* return error response from server on authentication error
* throw away token if invalid. Prevents constant retry of authentication
request with wrong token
* add spinner for 'verifying checksum'
* add reboot command
* allow local file transfers to be issued outside BMC, fix phrasing
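The KEY_1 rules listed in the BMCD notes above can be read as a small state function. The following Python sketch is only one interpretation of those bullet points; the function names are made up for illustration and the real logic lives in bmcd (Rust), which may differ in detail.

```python
from typing import List


def key1_short_press(nodes: List[bool]) -> List[bool]:
    """Apply a short KEY_1 press to a list of per-node power states."""
    if all(nodes):
        # All nodes are on: a press turns every node off.
        return [False] * len(nodes)
    # Fewer than all nodes are on (including none): turn every node on.
    return [True] * len(nodes)


def key1_long_press(nodes: List[bool]) -> List[bool]:
    """A long press forces every node on, whatever the previous state."""
    return [True] * len(nodes)


if __name__ == "__main__":
    print(key1_short_press([True, False, False, False]))
    print(key1_short_press([True, True, True, True]))
```

With four slots, "3 or less nodes powered on" and "all nodes off" both fall into the same branch, which is why the sketch needs only a single `all()` check.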
Let me know if there are any logs I should send?

cfsworks

11/06/2023, 8:35 PM
OTA from RC1->RC2 is expected to work. If you SSH in, what does `ubinfo -a` show?
(For brevity: is there a volume named `rootfs_new`?)

svenrademakers

11/06/2023, 9:00 PM
This was a bug in the UI of RC1. Upload should work anyway 🙂

teslamax

11/06/2023, 11:04 PM
Bored at work… wondering if I dare do this remotely 😉

cfsworks

11/06/2023, 11:18 PM
I've done it remotely (from another state) many times, but I have a Bus Pirate wired to my BMC UART and reset button.

teslamax

11/06/2023, 11:36 PM
Where do you get such wonderful toys? (apologies for going off-topic)
I didn’t know things like this existed…

terarex

11/07/2023, 2:04 AM
RC2 is looking good. After performing the μSD upgrade from RC1 to RC2, I used the web UI to stream the RC2 tpu file. Seemed to work fine. I also used the web UI's control page to reset the BMC. Seems like this is a viable remote update solution.

rmeier

11/07/2023, 5:44 AM
Yes, thanks for that..
perfect, just needed a reboot!
Worked for me!
@svenrademakers just some feedback, not sure if I'm missing something, but the Power Control is now very unclear. I can't see the current power status of the nodes, so I'm not sure what action I'm taking when toggling. It may be that selected means on, but select and submit makes my brain think that we are toggling between states?
I have logged an issue, #136. It might just be me in this case, but I feel that it speaks to usability.

gunther2908

11/07/2023, 7:47 AM
Tested the RC1 version through the SSH procedure, works splendidly!! No issues
Even with a corrupted v1.1.0 version. Good work
Is there a way besides compiling to get RC2?

svenrademakers

11/07/2023, 8:08 AM
If you scroll down there is a zip you can download. It contains a .tpu file which can be installed via the UI (I'm not sure if you need to be signed in to GitHub to see the zip) https://github.com/turing-machines/BMC-Firmware/actions/runs/6773906243/job/18409813735
Thank you! I agree with you that the checkboxes are poorly chosen. I need to do another pass over the UI anyway, there are still things that can be improved

rmeier

11/07/2023, 8:25 AM
Always will be, but, I must thank you for all the improvements that have been made, I really like the direction you are taking the software!

svenrademakers

11/07/2023, 8:31 AM
doing my best, hope it will be useful for you guys !

werdnum

11/07/2023, 8:39 AM
Can you just upload the files uncompressed? It's kind of a pain to have to unzip them. Bandwidth is pretty cheap
Well, "upload" is a relative term, I know it comes from CI

svenrademakers

11/07/2023, 8:41 AM
for sure, I'm planning to have the "official" release hosted on the turingpi server

werdnum

11/07/2023, 8:41 AM
But I was just thinking the zipping doesn't really add much

svenrademakers

11/07/2023, 8:42 AM
I'm not sure we can do something about the zip container of the CI builds. It's something GitHub enforces

cfsworks

11/07/2023, 10:38 AM
It might not be a bad idea to reimplement the logic of `osupdate` in `bmcd` to allow streaming the volume update over the network -- including through a zip-extracting filter layer. So if someone provides a .zip instead of a .tpu, the `bmcd` implementation can detect that and do the correct thing.

svenrademakers

11/07/2023, 10:44 AM
> It might not be a bad idea to reimplement the logic of `osupdate` in `bmcd` to allow streaming the volume update over the network -- including through a zip-extracting filter layer. So if someone provides a .zip instead of a .tpu the `bmcd` implementation can detect that and do the correct thing
"streaming" is a bit of a loaded term for me. `bmcd` is already capable of receiving a chunked volume; the only caveat is that it's written to tmpfs before `osupdate` executes it. This should be fine for now as we have plenty of RAM.
I would like `osupdate` to be reimplemented as well somewhere in the future; calling shell scripts from an API is not really ideal regarding security

teslamax

11/07/2023, 7:10 PM
got RC2 running on mine... no other comments as of yet

bhuism

11/07/2023, 10:54 PM
I recommend https://pikvm.org/ I like the quality of it 👍

teslamax

11/08/2023, 12:10 AM
seen that too and have thought about it

werdnum

11/08/2023, 12:23 AM
not cheap in Australia though, A$500 (US$300)

bhuism

11/08/2023, 5:41 AM
Definitely not cheap, but just works, and well too

werdnum

11/08/2023, 10:01 PM
How is RC2 working out for folks? Maybe I should give that upgrade a whirl

teslamax

11/09/2023, 12:29 AM
I skipped 2.0 RC1 and all seems fine. GUI looks better. I’ll be honest, I haven’t tried anything fancy though.

cfsworks

11/09/2023, 12:34 AM
Just to make sure there's no confusion about this among people lurking in the thread: You only need the microSD upgrade method to go from 1.x -> 2.x. Upgrades from 2.x->2.x (including RC1->RC2) can be done using the OTA update method (followed by a reboot).

terarex

11/09/2023, 3:03 AM
Yesterday, as an intellectual exercise, I tried to use the webUI in RC2 to flash the SD card on a CM4 with a compressed lite Raspbian image. The webUI stopped updating at less than 6%, so I aborted it. I then used the tpi command after scp'ing an uncompressed image file to the BMC's SD card. It took quite a while, but worked. I'm attempting to flash an uncompressed SD card image file via the webUI now. The image file is huge, so it may take several days. 🤓

joesroom

11/09/2023, 11:55 AM
Guessing that's because of the 100mbit lan on the bmc...

svenrademakers

11/09/2023, 3:15 PM
I'm curious, is transferring with `tpi` from the BMC's SD card to the CM4 any slower compared to using rpi-imager over the `usb-a` port?
My guess is that writing eMMC over USB is the actual bottleneck currently. I don't have the data to back this up, but I'm intending to do some profiling soon to find out

terarex

11/09/2023, 3:39 PM
Note that my CM4s are lite (no eMMC, just μSD cards). I didn't try that, but I'm certain it would be much faster over a USB connection. The file I'm transferring via the webUI is an uncompressed, 256GB μSD image (generated from a working card via Rufus). My guess is that, due to limited BMC memory for tmpfs, the webUI uses a relatively small block size for each transfer. Obviously, the fastest way is to image the μSD card on the host machine that is running RPi Imager.

svenrademakers

11/09/2023, 3:58 PM
block and buffer sizes are terribly chosen, perhaps a nice goal for the weekend to tune it a bit better

cfsworks

11/09/2023, 3:59 PM
Another good benchmark (if you don't mind erasing your CM4 and thus needing to reimage it) would be to do
time dd if=/dev/zero of=/dev/sda bs=8M
on the BMC while the CM4 is in flashing mode.
And ofc play with the `bs=` parameter

bhuism

11/09/2023, 4:30 PM
time xz -d < metal-rpi_generic-arm64.img.xz | sudo dd of=/dev/sda bs=1M status=progress conv=fsync
893665280 bytes (894 MB, 852 MiB) copied, 3 s, 298 MB/s1306525696 bytes (1,3 GB, 1,2 GiB) copied, 3,82763 s, 341 MB/s

0+152937 records in
0+152937 records out
1306525696 bytes (1,3 GB, 1,2 GiB) copied, 99,591 s, 13,1 MB/s

real    1m39,600s
user    0m3,607s
sys    0m0,226s

cfsworks

11/09/2023, 4:32 PM
= ~104.8 Mbps, which is a fair amount lower than the 480 Mbps theoretical of USB 2.0.

bhuism

11/09/2023, 4:32 PM

cfsworks

11/09/2023, 4:33 PM
Oh wait was this running from your desktop, not on the BMC?

bhuism

11/09/2023, 4:34 PM
Yes, you asked for a comparison ☺️

cfsworks

11/09/2023, 4:35 PM
Good to have a baseline then 👍

bhuism

11/09/2023, 4:35 PM
Exactly
I flashed via web gui of the firmware as well, but did not time it

svenrademakers

11/10/2023, 7:39 AM
i ran some tests with my rpi CM4, and fixed the performance bug with using tpi over the network. Decreased the flashing time from 16m to 6m40s.
Conducted a dd test locally on the BMC to determine the correct block
    size for writing to a raspberry CM4.
    Times measured using a 206Mb file:

    512B     1m50.22s
    512K     43.0s
    1M       43.03s
    4M       42.96s
    8M       43.19s
    16M      43.00s
    32M      45.95s

    Concluded 4M is the optimal blocksize.

    Ran some tests writing `2023-05-03-raspios-bullseye-armhf-lite.img`
    * using dd 4M                   => 6m23.86s
    * using tpi with -l flag        => 6m22s
    * using tpi over local network  => 16m02s
    * using tpi over local network with 4M blocks  => 6m40s
*these times are measured without the crc read step which accounts for an extra 3 minutes on top

cfsworks

11/10/2023, 2:08 PM
206MB in 42.96s is 4.80MB/s or 38.4Mbps which is a lot lower than the desktop baseline determined yesterday. I have to think there's still some low-hanging-fruit improvement to be made hiding somewhere.
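The arithmetic behind these figures can be reproduced with a few lines of Python. This is a sketch that takes the 206 "Mb" file in the commit message to mean 206 megabytes (as the message above does) and uses decimal units throughout; the dictionary simply restates the dd timings quoted earlier.

```python
# dd timings from the block-size test above, in seconds.
DD_TIMES = {
    "512B": 110.22,  # 1m50.22s
    "512K": 43.0,
    "1M": 43.03,
    "4M": 42.96,
    "8M": 43.19,
    "16M": 43.00,
    "32M": 45.95,
}
FILE_SIZE_MB = 206  # assumed to be megabytes, not megabits


def throughput(size_mb: float, seconds: float):
    """Return (megabytes per second, megabits per second)."""
    mb_per_s = size_mb / seconds
    return mb_per_s, mb_per_s * 8


if __name__ == "__main__":
    for block_size, seconds in DD_TIMES.items():
        mb_s, mbit_s = throughput(FILE_SIZE_MB, seconds)
        print(f"{block_size:>5}: {mb_s:5.2f} MB/s = {mbit_s:5.1f} Mbps")
    # Desktop baseline quoted earlier in the thread: 13.1 MB/s.
    print(f"desktop: {13.1 * 8:.1f} Mbps")
```

Every block size from 512K to 16M lands within a fraction of a second of each other, so the choice of 4M matters far less than avoiding 512-byte writes.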

svenrademakers

11/10/2023, 2:32 PM
It's more or less equivalent to what I'm getting using a desktop and `rpiboot`. Since we use the same backend on the BMC, I would start my investigation there

cfsworks

11/10/2023, 2:36 PM
If it's equivalent then the bottleneck might be the CM4 SD/eMMC which would mean no further improvements to the BMC's speed are needed here. Though that also seems a bit on the low end for that to be true... Hmm...

svenrademakers

11/10/2023, 2:36 PM
to be continued for sure 🙂

terarex

11/10/2023, 4:45 PM
The image transfer finally finished last night. The webUI indicates a write error, but the CM4 boots up just fine. I'm not certain whether "UPGRADE NODE" is a good button caption. Maybe "FLASH NODE OPERATING SYSTEM"? https://cdn.discordapp.com/attachments/1170114820457111592/1172577879956144259/TPiv2-WebUI-CM4-flash-error.jpg?ex=6560d340&is=654e5e40&hm=4257b0872a5c91d77c538aa89d197e84bb04ebf4bf8ef95bb26fd1511c0e5ca9&

cfsworks

11/10/2023, 4:50 PM
"OVERWRITE NODE" might also be a good option; makes the "destructive" nature of the operation clear to the user so they know to back up any important data before using it.

terarex

11/10/2023, 5:12 PM
Yeah. The operation is destructive, so "UPGRADE" is misleading. Maybe just "INSTALL OS". Mentioning NODE is redundant to the tab's context. I know it's nitpicky, but don't expose the button until after the user has selected the node and the file and both inputs have been validated. It might be wise to pop up a dialog to tell the user the operation will overwrite the node's boot device and confirm their intent.

cfsworks

11/10/2023, 5:13 PM
"INSTALL OS" + confirmation dialog sounds great to me

svenrademakers

11/10/2023, 6:52 PM
These are good suggestions! Lets see if i can slip it in!

cfsworks

11/11/2023, 4:07 PM
btw a helpful command to run now and then (perhaps something the `bmcd` webui should do for you? 👀):
ssh root@bmc 'tar -cvC /mnt/overlay/upper .' | bzip2 > bmc-backup.tar.bz2
This will create a backup of only the files that you've modified after installing the base firmware.

svenrademakers

11/11/2023, 4:35 PM
sounds like a cool idea, we would need to think of frequency and where to best store it. Could you make a ticket of that on the BMC-Firmware repo?

cfsworks

11/11/2023, 4:48 PM
I wasn't necessarily thinking of doing it automatically, just a button in the webui that downloads a backup .tar.bz2. I can make a ticket later today though.

svenrademakers

11/11/2023, 9:45 PM
That would work as well!
Consider it done: https://github.com/turing-machines/bmcd/commit/39774e1da2237e0b35b226827ced9e8c1321151b
2 open topics though:
* Security-wise, private keys are archived as well. I would feel more comfortable if encryption were enabled; however, it would degrade the user experience. I can see myself needing my backup archive and then coming to the realization that I forgot my passphrase. Another option would be to omit them.
* It would be nice to also have an option in the UI to restore these backups.
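To make the "omit them" option concrete, here is a small Python sketch of a backup routine that builds a .tar.bz2 of a directory and can skip files that look like private keys. It is not the bmcd implementation (that is in the linked commit); the function names and the `*_key` naming rule are assumptions chosen purely for illustration.

```python
import io
import tarfile
from pathlib import PurePosixPath


def looks_like_private_key(member_name: str) -> bool:
    """Hypothetical rule: treat files whose name ends in '_key' as private keys."""
    return PurePosixPath(member_name).name.endswith("_key")


def backup_overlay(upper_dir: str, omit_private_keys: bool = True) -> bytes:
    """Return a .tar.bz2 archive of upper_dir as bytes."""

    def member_filter(info: tarfile.TarInfo):
        if omit_private_keys and info.isfile() and looks_like_private_key(info.name):
            return None  # returning None leaves the member out of the archive
        return info

    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:bz2") as archive:
        # arcname="." stores paths relative to upper_dir, like `tar -C dir .`
        archive.add(upper_dir, arcname=".", filter=member_filter)
    return buffer.getvalue()
```

Calling `backup_overlay("/mnt/overlay/upper")` on the BMC would produce roughly what the `tar ... | bzip2` one-liner does, minus the filtered files. The trade-off is the one raised above: omitting keys avoids leaking secrets, but a restore then has to regenerate them.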

cfsworks

11/12/2023, 8:34 PM
Since `bmcd` is once again monitoring the node UARTs, it's going to create some confusion when people try to open the /dev/ttyS# devices with microcom and see random read bytes go missing. I wonder if there's a good way to deter that and encourage users to access serial through `bmcd`.
@svenrademakers As of `bcf0d024fff03f16e7e693310f8624aa7c6f6527`, has the gadget been replaced with USB CDC-ACM or is it currently just unused?

svenrademakers

11/12/2023, 8:53 PM
I just removed the old configfs UDC, didn't find time yet to add a CDC-ACM

cfsworks

11/12/2023, 8:53 PM
I'll spend a few minutes (not going to get heavily involved into it yet) seeing if there's an easy way to do it.

svenrademakers

11/12/2023, 8:55 PM
i already managed to create a Mass storage USB gadget without any issues, i bet CDC-ACM should not be any more complex ... 'famous last words'

cfsworks

11/12/2023, 8:57 PM
I'm right there with you, doesn't seem hard. Looks like the only thing missing is the `acm` kernel driver?

svenrademakers

11/12/2023, 8:57 PM
that would be nice, im pretty toast.. time to recuperate from my screen
yes, i think so

cfsworks

11/12/2023, 8:58 PM
# CONFIG_USB_CONFIGFS_ACM is not set
yeah that needs to be toggled on, but it should otherwise be easy

svenrademakers

11/12/2023, 8:58 PM
the rest is already in
this ADB server was eating a lot of RAM, i think we freed up like 40 meg by purging it

cfsworks

11/12/2023, 9:00 PM
(Ooh the other one that might be useful is RNDIS... we could do Ethernet over the USB link.)

svenrademakers

11/12/2023, 9:02 PM
what benefits does it give us compared to using the dedicated ethernet device?

cfsworks

11/12/2023, 9:02 PM
It's just convenient in case a user wants to plug their laptop into the USB port and use that to reach through to nodes, like if they don't have an Ethernet cable and/or adapter handy.
Probably not something we do by default.

svenrademakers

11/12/2023, 9:02 PM
well we could do both 😛

cfsworks

11/12/2023, 9:03 PM
Indeed 😄

svenrademakers

11/12/2023, 9:03 PM
but I think there is a lot of value in having the ACM driver, it means people don't have to buy USB-TTL cables anymore

werdnum

11/12/2023, 11:34 PM
how does one 'use bmcd'?
the obvious answer to your question is "remove microcom and picocom, and replace them with a message that tells users whatever it is that they're supposed to do instead"

cfsworks

11/12/2023, 11:39 PM
I actually don't know. What I'm hoping is for a wonderful `screen`-like thing in `tpi` like...
tpi uart -n 1 attach
...and, like `screen`, it shows you the current terminal state and takes over CTRL+C, CTRL+D, etc. -- and connects through a WebSocket so you get a nice low-latency connection. That might be a little ways away though. 😦
I wonder if I should get my feet wet with `bmcd` development and send a PR implementing the necessary UART stuff, or if Sven/Ruslan are focused on that.

werdnum

11/12/2023, 11:47 PM
mmmm, that sounds nice
I like the sound of that

terarex

11/13/2023, 3:32 AM
Put it in as an enhancement request (PR). Sven and Ruslan will get it sorted.

svenrademakers

11/13/2023, 8:10 AM
Random read bytes go missing?
I would expect this when you use 'tpi uart get', as it's doing a lossy UTF encoding at some point.

ajtzak

11/13/2023, 1:28 PM
I was able to update to RC1 via sdcard, and OTA for RC2 (ran into the same UI bug but it did flash). Note: when I did `cat /etc/*release*` it still showed the VERSION as RC1, but the git hash seemed to match the action run linked here. I didn't read everyone's messages in between, but I noticed that when I power on a node through the webui and refresh the page, it did not save the state of my powered-on nodes (they were unchecked, though powered on). Edit: when refreshing, it does appear to be getting the payload correctly -- just a UI bug?
{
  "response": [
    {
      "result": [
        {
          "node1": "1",
          "node2": "0",
          "node3": "0",
          "node4": "0"
        }
      ]
    }
  ]
}
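For anyone scripting against this response, the payload above can be turned into per-node booleans with a few lines of Python. This sketch only assumes the JSON shape shown in the message; the function name `powered_nodes` is made up, and how the payload is fetched (endpoint, authentication) is left out on purpose.

```python
import json

# The payload exactly as pasted above.
PAYLOAD = """
{
  "response": [
    {
      "result": [
        {
          "node1": "1",
          "node2": "0",
          "node3": "0",
          "node4": "0"
        }
      ]
    }
  ]
}
"""


def powered_nodes(payload: str) -> dict:
    """Map each node name to True when the payload reports it as "1"."""
    states = json.loads(payload)["response"][0]["result"][0]
    return {name: value == "1" for name, value in states.items()}


if __name__ == "__main__":
    print(powered_nodes(PAYLOAD))
```

Note that the states arrive as the strings "1" and "0" rather than JSON booleans, so a UI that checks truthiness directly would treat "0" as on.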

svenrademakers

11/13/2023, 1:59 PM
yes i think so, if you like you can try the latest version (which will be the final v2 version). I spent this weekend educating myself on frontend development and UI bugs.. i fixed a bunch. https://github.com/turing-machines/BMC-Firmware/actions/runs/6843032217

ajtzak

11/13/2023, 2:10 PM
sick, ya that fixed it 🎉

cfsworks

11/13/2023, 2:49 PM
If two processes have a stream (TCP socket, pipe, serial device, ...) open and both try a read(), only one of them gets the next byte(s) to arrive (I think it's whichever one did the read() first). Usually microcom wins, I'm guessing because it uses blocking read() while bmcd is using an async reactor to learn when bytes are available followed by a read(), but sometimes bmcd wins and the user on microcom misses output.
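The "only one reader gets each byte" behaviour is easy to see with a pipe and two file descriptors that refer to the same stream. The Python sketch below is a deterministic stand-in for the race described above: instead of two processes racing, it reads from the two descriptors one after the other. The function name and the sample payload are illustrative only.

```python
import os


def split_reads(payload: bytes, first_chunk: int):
    """Write payload to a pipe, then read it through two descriptors."""
    read_fd, write_fd = os.pipe()
    second_fd = os.dup(read_fd)  # a second handle on the very same stream
    try:
        os.write(write_fd, payload)
        os.close(write_fd)
        first = os.read(read_fd, first_chunk)      # reader A consumes some bytes
        second = os.read(second_fd, len(payload))  # reader B only sees the rest
        return first, second
    finally:
        os.close(read_fd)
        os.close(second_fd)


if __name__ == "__main__":
    a, b = split_reads(b"login: root\n", 7)
    print(a, b)
```

Bytes consumed by the first reader are gone for the second one. With microcom and bmcd reading the same /dev/ttyS# device, which reader gets which bytes depends on timing, hence the missing output.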

svenrademakers

11/13/2023, 3:07 PM
I recall reviewing this.. if this is the case we should find some ways to work around it. Replacing it altogether with a nice websocket + renderer has my preference. Actually I'm hoping to port the openbmc implementation into our firmware 😅

terarex

11/13/2023, 6:03 PM
The final BMC v2.0.X firmware release looks great! The OTA update and subsequent bmc reboot worked fine. My /etc config file and /etc/init.d script additions/mods were preserved so I'll do a fresh install from SD to confirm everything is good to go for me. Looking forward to RK1 (32GB) and TPi v2.5. The USB hardware changes in the TPiv2.5 could open a number of new feature possibilities.

svenrademakers

11/13/2023, 6:07 PM
Thanks for testing it! I think people will like the additional tweaks you suggested

terarex

11/13/2023, 6:14 PM
I have additional usability suggestions. I'll create a PR for them.

werdnum

11/13/2023, 11:02 PM
Might be another conversation but for the UART it seems there are basically two options - either take the screen/tmux/mosh approach and run a local terminal emulator & sync its state to all clients. But maybe better is the kind of pub/sub approach where you multiplex the I/O instead - better for monitoring use cases like getting the board type

cfsworks

11/13/2023, 11:14 PM
In my head I was picturing two different possible websocket requests: `wss://bmc/.../4?terminal=true` and `?terminal=false`, which indicates whether the client wants to be synced up to the terminal emulator running inside `bmcd` or just wants to be subscribed to the byte stream going forward.
The former is actually just a dump of the current terminal state followed by the latter.
Rust also has the vt100 crate which does the necessary terminal-emulator stuff (so, `bmcd` just has to shovel bytes coming out of the node into a `vt100::Parser` instance), and the crate also has `parser.screen().state_formatted()` which generates the byte sequence necessary to get a connecting terminal to the same state as the emulated terminal. In short, all of this means that the `terminal=true` param just controls whether or not the client wants the output of `parser.screen().state_formatted()` prepended, and either case subscribes them to the stream of bytes coming out of the node UART.
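The "snapshot first, then live stream" idea can be sketched without any websocket or terminal-emulator machinery. In the Python sketch below, a plain byte scrollback stands in for the emulated terminal state (a real implementation would use something like the vt100 parser mentioned above), and the class and method names are invented for illustration.

```python
from collections import deque


class UartHub:
    """Fan out UART bytes to subscribers, optionally with a state snapshot."""

    def __init__(self, scrollback_limit: int = 4096):
        self._scrollback = bytearray()  # stand-in for emulated terminal state
        self._limit = scrollback_limit
        self._subscribers = []

    def subscribe(self, terminal: bool) -> deque:
        queue = deque()
        if terminal and self._scrollback:
            # terminal=true: prepend a dump of the current state
            queue.append(bytes(self._scrollback))
        self._subscribers.append(queue)
        return queue

    def feed(self, data: bytes) -> None:
        """Called with every chunk of bytes read from the node UART."""
        self._scrollback += data
        del self._scrollback[:-self._limit]  # keep only the newest bytes
        for queue in self._subscribers:
            queue.append(data)


if __name__ == "__main__":
    hub = UartHub()
    hub.feed(b"boot ok\n")
    late = hub.subscribe(terminal=True)
    hub.feed(b"login: ")
    print(list(late))
```

Because a single reader owns the UART and every client gets its own queue, this layout also sidesteps the competing-readers problem discussed earlier in the thread.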

terarex

11/14/2023, 2:41 AM
@svenrademakers I may have lied when I said everything in the latest RC was good to go. I didn't check this function until this evening. By the look of it and some Chat and GitHub comments, I assume it is supposed to back up files under /mnt/overlay/upper that a user has customized to a compressed tar file on the SD card. I tried it, but received the message "Error generating backup archive" in a pop up. The installed SD card has very little on it. Note the file system on the SD card is exFAT. Upon a little additional investigation, I find the UI's text fields for "BMC" and "SD card" have some kind of issue. I'll have to look at the web content source to diagnose. Gotta put that education in HTML5 to some use. https://cdn.discordapp.com/attachments/1170114820457111592/1173814819741249576/TPiv2-v2.0.0-RC2-4.png?ex=6565533d&is=6552de3d&hm=025bda87b13e8851d537b38878962497df5f40d618a38222b31eb85198f1af5c&

rmeier

11/14/2023, 6:26 AM
Just wanted to check on how to engage safe mode on the new BMC, should that be possible before the BMC boots from the recovery SD card? and will the network be active in safe mode?

cfsworks

11/14/2023, 6:43 AM
The design (it's a bug if you find an exception to this rule) is that if you hold KEY1 for 5 seconds at power-on (or after releasing BMC_RESET), it will boot to safemode. It's currently implemented as a "this boot only" reset to factory defaults, so the network and everything will come up as if you had just performed a fresh installation from the microSD.
(I should clarify, you have to hold KEY1 for 4-6 seconds. If you hold it for 15, you go into a firmware repair mode which is not safemode.)

rmeier

11/14/2023, 7:08 AM
Ok, with the SD recovery (BMC rev 2.0) in the board, it seems to boot the recovery partition (gets to the stage where the light and network slowly flash on and off), not safemode. This is with power completely off, KEY1 held in, power on, 5 seconds, KEY1 release; or power off, power on, KEY1 held, 5 seconds, KEY1 release. Not sure if that is by design or not.
What I am attempting to do is post SD card update of BMC, I'd like to access the BMC and wipe the SD card for normal use.
Trying to not have to remove the SD card and re-format in another machine.
The lights flash stage seems to happen about 6 seconds after power on.

cfsworks

11/14/2023, 7:14 AM
I definitely designed for this exact thing:
If it's not working for you it may be a bug.

rmeier

11/14/2023, 7:20 AM
Ok, then I am doing something wrong..

cfsworks

11/14/2023, 7:21 AM
In a pinch you can hold KEY1 for 15 seconds at power-on with the USB-B connector connected to your computer, and that will give you USB->microSD access.

rmeier

11/14/2023, 7:21 AM
I like this idea..

cfsworks

11/14/2023, 7:22 AM
If that works but the 5-second safemode doesn't, I might've made a mistake with the latter.

rmeier

11/14/2023, 7:24 AM
I might also be confused about KEY1. I am thinking it's the same key that we use to power the nodes on and off when booted normally, and the one I press three times to start the flash?

cfsworks

11/14/2023, 7:24 AM
KEY1 is a little surface-mount button on the PCB next to the BMC

https://files.readme.io/3bf121b-image.png


rmeier

11/14/2023, 7:38 AM
Thanks - issue was me.. I thought the external key was also key1
holding the button on the board itself as you pointed out worked perfectly.
Thanks!

cfsworks

11/14/2023, 7:43 AM
Sweet, okay, super glad that I don't have a bug in that bit of code 😅

rmeier

11/14/2023, 7:46 AM
Are you working on a set of documentation, I might be able to review and provide some updates, I have learnt quite a bit from a normal end-user viewpoint..
So I see things and miss things that more involved people might not think about.

cfsworks

11/14/2023, 7:53 AM
I know it's being documented but I'm not the one writing it. I think that's @_dhanos_

rmeier

11/14/2023, 8:21 AM
@svenrademakers - love the way the power settings for the nodes now work!

svenrademakers

11/14/2023, 8:50 AM
Hi Dan, it's supposed to archive `/mnt/overlay/upper` and download it to your computer, not to your SD card. Which browser are you using? If it's failing, the stdout of the BMC should also show what the cause was

werdnum

11/14/2023, 10:17 AM
fwiw I achieved this by interrupting u-boot and looking up the u-boot command to erase the sd card
s

svenrademakers

11/14/2023, 10:19 AM
yes we are, and an extra pair of eyes would be really appreciated. As a non-native speaker it is quite taxing to write a sensible story. i will share a link once it's available
r

rmeier

11/14/2023, 10:41 AM
Please do, I'd be happy to try and give back to the community
t

terarex

11/14/2023, 3:43 PM
Sven, I'm running Chrome Version 119.0.6045.124 (Official Build) (64-bit) on Windows 10. When I click on the "BACKUP USER DATA" button, the Chrome console (not BMC) displays the errors shown in the screen capture. https://cdn.discordapp.com/attachments/1170114820457111592/1174011716984635483/TPiv2-v2.0.0-RC2-4-userdata_backup.jpg?ex=65660a9d&is=6553959d&hm=efeb62cdceae369d9f11c76499d64f8b53645a6eece101d3d115b39392ca447c&
s

svenrademakers

11/14/2023, 6:45 PM
what is the response body of that 500 response?
t

terarex

11/14/2023, 6:53 PM
I'll let you know when I get back.
s

svenrademakers

11/15/2023, 10:30 AM
FYI, in my quest to improve node flashing i did some tests with decoding xz images on the fly on the BMC. Xz archives are roughly 10 times smaller than their uncompressed variants, so i expected to see some big improvements when sending huge files over a slow network. The results are a bit of a letdown: writing an xz file is not really faster, and to make it worse the system becomes unstable. Available memory is to blame here. The xz images i tested require a minimum of 66 MB of RAM, which is extremely tight. bmcd needs at least around 40 MB, so combined you eat up roughly 110 MB of the 128 MB available. On top of that, the CPU has difficulty keeping up, which degrades network performance. I parked these efforts for now. Daemon code used: https://github.com/turing-machines/bmcd/commit/fd45f4769b2e0f3328cf8980b88b8b13f8b5ca8e
i improved the archiving implementation so that it streams the archive back over HTTP instead of copying it around in memory. I'm hesitant to squeeze it in as i want to cut a release
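For readers wondering what "streaming instead of copying around in memory" looks like in practice, here is a minimal Python sketch. It is not the bmcd implementation (bmcd is not written in Python); `CountingSink` and `stream_directory` are hypothetical names, and the sink only stands in for an HTTP response body. The point is that `tarfile`'s stream mode writes the archive out in chunks as it goes, so the whole archive never has to sit in RAM.

```python
import os
import tarfile
import tempfile


class CountingSink:
    """Hypothetical stand-in for an HTTP response body: accepts chunks, keeps none."""

    def __init__(self):
        self.bytes_written = 0
        self.chunks = 0

    def write(self, data):
        self.bytes_written += len(data)
        self.chunks += 1
        return len(data)


def stream_directory(src_dir, sink):
    """Archive src_dir to sink as a gzip-compressed tar stream."""
    # "w|gz" is tarfile's stream mode: output is written sequentially, never seeked.
    with tarfile.open(fileobj=sink, mode="w|gz") as tar:
        tar.add(src_dir, arcname="upper")


with tempfile.TemporaryDirectory() as upper:
    with open(os.path.join(upper, "hostname"), "w") as f:
        f.write("black-bmc\n")
    sink = CountingSink()
    stream_directory(upper, sink)
    print(sink.bytes_written, "bytes sent in", sink.chunks, "chunk(s)")
```

Because the sink only needs a `write` method, the same function would work with any object that forwards chunks to a socket.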
w

werdnum

11/15/2023, 1:10 PM
TL;dr - to decompress,
xz
uses ~10-20% of the memory required to compress.
If you control the compression, might be good to not use the max settings
s

svenrademakers

11/15/2023, 1:14 PM
Yup. i just took the latest raspberry pi bookworm image from the official website as a baseline. RK1 images we can more or less control the compression ratio. But unfortunately not for other images
w

werdnum

11/15/2023, 1:14 PM
Maybe there's some silly way to do the decompression on the client side in wasm
Plenty of ram on whatever computer is being used to upload the image
s

svenrademakers

11/15/2023, 1:18 PM
I thought about this. But the only gain we have is the comfort of not having to call one extra command ‘xz -d’ before uploading your image. Which im not sure is worth the effort
w

werdnum

11/15/2023, 1:18 PM
It does seem like (only) a nice to have
You might consider using --memlimit with your patch so it fails rather than making the system unstable if one gives a highly compressed file
"nice to have" here being used as a negative term. As opposed to essential.
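To illustrate the `--memlimit` suggestion, here is a small Python sketch using the standard `lzma` module, whose decompressor takes a comparable `memlimit` argument. This only demonstrates the fail-instead-of-thrash behaviour; it is not the bmcd patch, and the limits chosen below are arbitrary example values.

```python
import lzma

MIB = 1024 * 1024


def decompress_with_limit(blob, memlimit_bytes):
    """Decompress an .xz blob, refusing if the decoder needs more than memlimit_bytes."""
    decoder = lzma.LZMADecompressor(format=lzma.FORMAT_XZ, memlimit=memlimit_bytes)
    return decoder.decompress(blob)


payload = b"turing-pi " * 1000
# The default preset records an 8 MiB dictionary in the stream header,
# so the decoder needs a little more than 8 MiB regardless of payload size.
blob = lzma.compress(payload, format=lzma.FORMAT_XZ)

try:
    decompress_with_limit(blob, 1 * MIB)
    outcome = "decoded"
except lzma.LZMAError as err:
    outcome = f"refused: {err}"
print(outcome)

# With enough headroom the same stream decodes normally.
assert decompress_with_limit(blob, 64 * MIB) == payload
```

The failure arrives as an ordinary exception before any large allocation is made, which is what makes a limit preferable to letting the system run out of memory.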
s

svenrademakers

11/15/2023, 1:20 PM
When i have some extra time, i want to see if i can lower the memory footprint of bmcd. Worst case we still don't have decompression but a more optimized program
Currently i have set a ceiling to 66m. which actually is around 6 mb too high. The challenge is that a certain memlimit works for a vanilla system, but will crash for someone who is running other stuff in parallel
I feel comfortable to merge it to master, maybe it will help a few people, but my guess is the majority of people will get an “out of memory” error back when trying to upload xz images
w

werdnum

11/15/2023, 1:28 PM
Ah. I suppose in theory you could look for free ram and set based on that but might not be worth the effort
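A rough Python sketch of that idea, assuming a Linux `/proc/meminfo` with a `MemAvailable` line. The function names and the 16 MiB reserve are made up for illustration and are not taken from bmcd.

```python
def mem_available_bytes(meminfo_text):
    """Return MemAvailable from /proc/meminfo-style text, in bytes (None if absent)."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # Lines look like "MemAvailable:   123456 kB"
            return int(line.split()[1]) * 1024
    return None


def pick_memlimit(meminfo_text, reserve_bytes=16 * 1024 * 1024):
    """Hypothetical policy: keep reserve_bytes free, offer the rest to the decoder."""
    available = mem_available_bytes(meminfo_text)
    if available is None:
        return 0
    return max(available - reserve_bytes, 0)


# Sample text with invented numbers; on a real system read /proc/meminfo instead.
sample = "MemTotal:  118000 kB\nMemFree:  20000 kB\nMemAvailable:  81920 kB\n"
print(pick_memlimit(sample))  # 80 MiB available minus the 16 MiB reserve
```

The limit would still be a snapshot: other processes can claim memory after the check, so this narrows the failure window rather than closing it.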
s

svenrademakers

11/15/2023, 1:31 PM
i feel this is an excellent opportunity to develop and sell decompression extension boards 😂
w

werdnum

11/15/2023, 1:39 PM
Heh, they'd have to be cheaper than just throwing more ram at the problem though
s

svenrademakers

11/15/2023, 1:40 PM
you can fix that with proper marketing 😄
i will shut up now
> Ah. I suppose in theory you could look for free ram and set based on that but might not be worth the effort
thanks for thinking out loud here, this was actually still missing from my implementation
c

cfsworks

11/15/2023, 2:55 PM
It may be too big of a step down from xz in terms of ratio, but lz4 decompression is lightning fast and requires only 64K of working memory.
And this one's more me thinking aloud than an actual suggestion, but I wonder if there's some convention for indicating "don't care" blocks in a disk image. Like, if a block in the image consists of the ASCII string
SKIPSKIPSKIPSKIP
... it's the creator of the image saying the partition layout, filesystem, or whatever doesn't use that block, so an imaging tool should just skip writing it to save time.
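The "don't care" block idea can be sketched in a few lines of Python. Everything here is hypothetical: the 4 KiB block size, the marker layout, and the function name are invented to mirror the thought experiment, and no existing image format is implied.

```python
import io

BLOCK_SIZE = 4096
# Hypothetical marker: a block filled entirely with the 16-byte ASCII string.
SKIP_BLOCK = b"SKIPSKIPSKIPSKIP" * (BLOCK_SIZE // 16)


def write_image(src, dst):
    """Copy src to dst block by block, seeking past marker blocks instead of writing."""
    written = 0
    while True:
        block = src.read(BLOCK_SIZE)
        if not block:
            break
        if block == SKIP_BLOCK:
            # Leave whatever is already on the target untouched.
            dst.seek(len(block), io.SEEK_CUR)
        else:
            dst.write(block)
            written += 1
    return written


image = io.BytesIO(b"A" * BLOCK_SIZE + SKIP_BLOCK + b"B" * BLOCK_SIZE)
device = io.BytesIO(b"\xff" * (3 * BLOCK_SIZE))  # stand-in for a block device
print(write_image(image, device), "blocks written")
```

In this toy run only two of the three blocks are written; the middle block of the stand-in device keeps its previous contents.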
s

svenrademakers

11/15/2023, 3:29 PM
adding more decompressors is quite trivial, but I'm wondering what would be a better user experience: saying "here user, you can upload xz images but there's a 50% chance it will work", or "hey user, we don't do that here, use lz4"
c

cfsworks

11/15/2023, 3:31 PM
Good point, I imagine the vast majority of users would grumble at having to decompress the file just to recompress it with lz4 (since nobody uses this for disk images in the real world) in order to save a little BMC network bandwidth
s

svenrademakers

11/15/2023, 3:35 PM
good question, there must be some open container format for this. This is basically how the phoenix suit images worked. from what i could tell they had some header or meta table which described like "write this blob at this offset"
t

terarex

11/15/2023, 4:14 PM
Sounds good. I looked at Chrome's debugging info last night, but didn't see anything useful. I also started looking at the UI source because the "User Storage" fields on the Info tab aren't being populated correctly. Unfortunately, app.js isn't exactly human readable.
s

svenrademakers

11/15/2023, 4:18 PM
what about this one?
im dumping "promise/then/fail/finally" cleanup in a couple of minutes though,
im learning more front end every day 😄
t

terarex

11/15/2023, 4:22 PM
I'll take a look at that repo. I was trying to work with the files on the BMC itself since I expect functionality depends on having bmcd. (Note that I'm using a custom hostname.)
s

svenrademakers

11/15/2023, 4:26 PM
the easiest would be to check out the repo, build the pages, and then scp them over to the board. You should also be able to insert a 'pre-setup' ajax hook to defer calls to your board so that you can just run the pages locally on your machine. but i haven't gone through the trouble yet
t

terarex

11/15/2023, 4:30 PM
Will do. Once v2.0 firmware is released, I'll update my local build environment. Yeah, I know I could do everything on GitHub, but I'm old-school. 🤓
s

svenrademakers

11/15/2023, 4:45 PM
> I also started looking at the UI source because the "User Storage" fields on the Info tab aren't being populated correctly. Unfortunately, app.js isn't exactly human readable.
you mean the UI itself or the actual values? the former was fixed an hour ago
t

terarex

11/15/2023, 6:54 PM
The UI. The field contents are garbage.
b

bplein

11/15/2023, 6:58 PM
I may be doing something wrong here. I attempted to build off
master
today using the
README.md
instructions and the build container, and everything "works" up until the last line of
Copy code
cd output/images/
cp -a ../../../tp2bmc/swupdate/* .
./genSWU.sh 1.0.0
. The shell script doesn't exist. Is there a different way to create the OTA version for 2.0-RC2?
c

cfsworks

11/15/2023, 7:05 PM
Ah, yeah, the
README.md
is woefully outdated, needs to be updated for 2.0.0. The OTA update is the
buildroot/output/images/rootfs.erofs
file. The OTA files are just EROFS images, renamed to .tpu.
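Since the `.tpu` file is described as a plain rename of `rootfs.erofs`, the step can be sketched in Python with a sanity check added. The check relies on the EROFS superblock magic (`0xE0F5E1E2`, stored little-endian 1024 bytes into the image) as I recall it from the Linux kernel headers; treat both constants and the function names as assumptions rather than anything the build system does.

```python
import shutil
import struct

EROFS_SUPER_OFFSET = 1024   # assumed: superblock starts 1 KiB into the image
EROFS_MAGIC = 0xE0F5E1E2    # assumed: little-endian magic at the superblock start


def looks_like_erofs(path):
    """Return True if the file carries the EROFS superblock magic."""
    with open(path, "rb") as f:
        f.seek(EROFS_SUPER_OFFSET)
        raw = f.read(4)
    return len(raw) == 4 and struct.unpack("<I", raw)[0] == EROFS_MAGIC


def make_tpu(erofs_path, tpu_path):
    """Copy rootfs.erofs to a .tpu name; the contents are not changed."""
    if not looks_like_erofs(erofs_path):
        raise ValueError(f"{erofs_path} does not look like an EROFS image")
    shutil.copyfile(erofs_path, tpu_path)
    return tpu_path
```

A call such as `make_tpu("buildroot/output/images/rootfs.erofs", "tp2-ota-local.tpu")` would then produce the file to upload; the output name here is only an example.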
w

werdnum

11/15/2023, 7:35 PM
Somebody somewhere had implemented a nice new UI that I had installed on my OG Community Edition installation
( @phearzero )
b

bhuism

11/15/2023, 8:47 PM
I like the shell 🙂 🙏
k

kjellra

11/15/2023, 9:10 PM
+
b

bplein

11/15/2023, 9:32 PM
so, I can
cp buildroot/output/images/rootfs.erofs tp2bmc-2.0-rc2.tpu
or similar and use it for OTA? Sounds easy! Thanks~
t

terarex

11/16/2023, 3:48 AM
Building the latest commit (4 hours ago according to GitHub), BMC-Firmware-2.0.0-RC3, now. Looks like all the changes for 2.0.0 have been merged to master with finishing touches occurring there.
Built and installed (tpu file) nicely. The Info page's BMC and SD card text fields are valid now. Clicking on the "BACKUP USER DATA" changes the text to an animated spinner. There is no client-side pop-up to select a download location.
Update: Enabled the Chrome developer tools and now I see the "500 (Internal Server Error)" message from "Backup". The error in Backup points to app.js in this code block:
try {
    a.send(e.hasContent && e.data || null)   <----- line flagged as generating the error
} catch (e) {
    if (t) throw e
}
The error reported in app.js is:
app.js?_v=20231115232853:2 Uncaught ReferenceError: settimeout is not defined
    at Object.c (app.js?_v=20231115232853:2:197927)
    at l (app.js?_v=20231115232853:2:30431)
    at Object.fireWith [as rejectWith] (app.js?_v=20231115232853:2:31179)
    at _ (app.js?_v=20231115232853:2:82784)
    at XMLHttpRequest. (app.js?_v=20231115232853:2:85227)
c @ app.js?_v=20231115232853:2
l @ app.js?_v=20231115232853:2
fireWith @ app.js?_v=20231115232853:2
_ @ app.js?_v=20231115232853:2
(anonymous) @ app.js?_v=20231115232853:2
load (async)
send @ app.js?_v=20231115232853:2
ajax @ app.js?_v=20231115232853:2
(anonymous) @ app.js?_v=20231115232853:2
dispatch @ app.js?_v=20231115232853:2
g.handle @ app.js?_v=20231115232853:2
The link in this message points at this code block:
function c(e, t, n) {
    settimeout((()=>{
        let n = t + " : " + e.responsetext;
        showtoastnotification(n, "error")
    }
    ), 300)
}
r

rmeier

11/16/2023, 7:51 AM
Just installed RC3 myself off the last build on GitHub.. worked perfectly, looking good, I haven't played around too much but no issues from my side so far
s

svenrademakers

11/16/2023, 11:05 AM
that is correct!
ffs.. i made a big boo .. i will need to make another RC..
thanks for the stack trace though
its not going to fix your 500 error response though.
b

bhuism

11/16/2023, 9:11 PM
probably already asked by someone: can we keep the nodes powered up during a firmware upgrade, and consequent reboot? (I already like that the power state of the nodes is remembered!)
t

terarex

11/16/2023, 9:41 PM
Yeah, "No such file or directory (os error 2)". My assumption is the code is referencing "/api/bmc/backup", under ServerRoot (/srv/bmcd/www). (I've been retired for 5 years. The mental cobwebs are showing.)
s

svenrademakers

11/16/2023, 9:57 PM
The BMC runs an EROFS filesystem (thanks to @cfsworks). Actually, it's archiving
/mnt/overlay/upper/
on your board which is the upper layer that gets mounted over the read only filesystem. I wonder if your mount is broken or perhaps there is something funky going on with symlinks that you created?
c

cfsworks

11/16/2023, 9:58 PM
I suppose it's also possible that the upper layer is just huge?
s

svenrademakers

11/16/2023, 9:58 PM
Unfortunately this is not possible due to hardware limitations. v2.5 will be able to do this as it will get latches
c

cfsworks

11/16/2023, 9:58 PM
Like if you're running from microSD and not flash, you could put a bunch of node images in your /root directory and those would get included in the backup
t

terarex

11/16/2023, 10:01 PM
No symlinks here ... oh wait ... when I changed the timezone to be America/Los_Angeles, I also removed and recreated the localtime symlink, but it's still a relative link. These and other small changes I made did get backed up into /mnt/overlay/upper. I intend to make these changes in the build environment's overlay directory.
s

svenrademakers

11/16/2023, 10:04 PM
This is a problem as the current implementation archives into a memory buffer, but i doubt it will give you a ‘no such file or directory’ error back. I would expect the BMC to hang or, best case, return an out-of-memory response
c

cfsworks

11/16/2023, 10:04 PM
Hm.
s

svenrademakers

11/16/2023, 10:05 PM
Also call ‘sync’ just to make sure you flushed all your fs buffers
c

cfsworks

11/16/2023, 10:27 PM
This is not a very scientific test but I figured I'd put the results here anyway:
Copy code
@ xz -d < ubuntu-22.04.3-preinstalled-server-arm64-turing-rk1.img.xz | ssh black-bmc 'time dd of=/dev/sda bs=4M'
real    13m 27.79s
user    0m 0.97s
sys     3m 41.30s
@ xz -d < ubuntu-22.04.3-preinstalled-server-arm64-turing-rk1.img.xz | lz4 | ssh black-bmc 'time ./lz4 -d | dd of=/dev/sda bs=4M'
real    12m 40.04s
user    1m 2.23s
sys     0m 33.84s
It takes about ~6% less time to image a node if I trans-compress it to lz4 and have the BMC do the lz4 decompression, since it gets that 100Mbps Ethernet bottleneck out of the way.
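The ~6% figure follows directly from the two `real` timings above; a quick Python check of the arithmetic (the helper name is arbitrary):

```python
def to_seconds(minutes, seconds):
    """Convert a 'Xm Y.YYs' timing from the shell's time output to seconds."""
    return minutes * 60 + seconds


plain = to_seconds(13, 27.79)    # xz -d | ssh ... dd
via_lz4 = to_seconds(12, 40.04)  # xz -d | lz4 | ssh ... lz4 -d | dd

saving = (plain - via_lz4) / plain
print(f"{plain:.2f}s vs {via_lz4:.2f}s: {saving:.1%} less wall-clock time")
```

That is 807.79 s against 760.04 s, a difference of 47.75 s, or about 5.9% of the uncompressed run.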
s

svenrademakers

11/16/2023, 10:55 PM
Btw, the tpi tool transfers with gzip compression enabled. I havent measured the performance difference of this in a while.
w

werdnum

11/17/2023, 1:55 AM
EROFS sounds like an error code
lol it is
b

bhuism

11/17/2023, 4:29 PM
Is this most recent build?
t

terarex

11/17/2023, 4:37 PM
That's my SOP. It's a habit I got into in the mid-1980s UNIX dark ages (pre-Linux). I did not integrate my /etc and /etc/init.d customizations into the RC3 build I did on Wednesday. Just figured the .tpu update would pick them up from the on-BMC /mnt/overlay/upper in prior 2.0 RCs. Last night I installed my vanilla (no mods) RC3 build from SD to wipe the BMC's internal NAND storage of all prior customization. The Backup function worked fine. I then made several manual changes to /etc (hostname, hosts, timezone and the localtime symlink). This broke the Backup function. I haven't isolated exactly which change it was, but I'll research further. To test a theory, I modified and re-integrated all of my BMC firmware 1.X.X customizations, including scripts under /etc/init.d, into my local build tree last night and kicked off the make before hitting the sack. Installed from SD card this morning. The Backup function works now.
Doesn't look like it. The subsequent RC output files have been versioned. I downloaded RC3 from GitHub and did a local build on Wednesday. Sven has been making additional commits, but they all appear to be related to the Docker image. @bplein are you upgrading from 1.X.X BMC firmware or is it already on 2.0.X?
b

bhuism

11/17/2023, 4:45 PM
Earlier builds had exactly this error
b

bplein

11/17/2023, 5:04 PM
I built off master/main just now, then
cp buildroot/output/images/rootfs.erofs ~/tp2-bmc-2.0-RC.tpu
and copied that from my build machine to my desktop and uploaded it.
b

bhuism

11/17/2023, 5:14 PM
Can you try rc3? I installed rc3 from GitHub build, yesterday i installed rc3 and this 'upgrade failed' message was gone (and btw the upgrade did succeed then in my case)
b

bplein

11/17/2023, 8:42 PM
Sorry, for some reason I didn't see the RC3 tag when I looked. I feel dumb now. Will give it a try
(I also didn't do a
make clean
so there's a chance I didn't pick up everything, well, cleanly)
t

teslamax

11/18/2023, 1:00 AM
Got RC3 running… looks beautiful
i

inquisitor_1337

11/18/2023, 1:24 AM
Hi, What steps you were following to spin it up? And what was your earlier tp FW version?
t

terarex

11/18/2023, 5:22 AM
@svenrademakers Figured out why my change to localtime was causing backup of customized files in /mnt/overlay/upper to fail. The symlink I created between /etc/localtime and /usr/share/zoneinfo/America/Los_Angeles was relative (../usr/...), rather than absolute (/usr/...). Ordinarily, this would not be an issue, but when this relative symlink was copied into /mnt/overlay/upper/etc, the relative link no longer pointed to a valid location. When you invoke tar to create the backup, are you following symlinks (-h option)?
c

cfsworks

11/18/2023, 6:00 AM
Symlinks should for sure be grabbed as-is rather than followed
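To make the as-is versus followed distinction concrete, here is a Python sketch using the standard `tarfile` module. It is not how bmcd builds its backup; it only reproduces the situation described above with a dangling relative link in a temporary directory, and `archive_as_is` is a hypothetical helper name.

```python
import os
import tarfile
import tempfile


def archive_as_is(src_dir, tar_path):
    """Archive src_dir, storing symlinks as links (dereference=False is the default)."""
    with tarfile.open(tar_path, "w", dereference=False) as tar:
        tar.add(src_dir, arcname="upper")


with tempfile.TemporaryDirectory() as work:
    upper_etc = os.path.join(work, "upper", "etc")
    os.makedirs(upper_etc)
    # Relative link that dangles here, like the copy under /mnt/overlay/upper/etc.
    os.symlink("../usr/share/zoneinfo/America/Los_Angeles",
               os.path.join(upper_etc, "localtime"))

    tar_path = os.path.join(work, "backup.tar")
    archive_as_is(os.path.join(work, "upper"), tar_path)

    with tarfile.open(tar_path) as tar:
        member = tar.getmember("upper/etc/localtime")
        print(member.issym(), member.linkname)
```

Stored this way, the dangling link is archived without touching its target. An archiver that follows links has to open the target instead, which for a missing file would be expected to surface as a "No such file or directory" error.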
t

teslamax

11/18/2023, 8:23 AM
I just pulled it pre-built off the repo... was running RC2
I'm just following the changes, not really testing per se.
g

gunther2908

11/18/2023, 10:24 AM
Just installed RC3, NICE interface!! 👍 Upgraded through SD card; the tpu upgrade gave me error 255, and dd was not working due to error 0 blocks written
t

teslamax

11/19/2023, 8:54 PM
ultimately... a dark mode option would be nice cosmetically (so many folks were grumpy about Proxmox until it got a dark mode option)
t

terarex

11/20/2023, 5:14 AM
It seems at least a few people are running BMC v1.0.2 firmware because they are only now able to order CM4s or equivalent. Is there a clearly documented procedure for getting from v1.0.2 to v2.0.0 without going through v1.1.0 with Phoenix Suit? The current on-line documentation doesn't seem to explicitly address this particular scenario. I thought about downgrading my TPiv2 from v2.0.0-RC3 to v1.0.2 in order to validate a procedure. However, with the BMC NAND's new layout, I'm not certain whether that would be advisable.
w

werdnum

11/20/2023, 6:24 AM
That's what I did
I never used PhoenixSuit
I went from whatever the max firmware that doesn't need it straight to 2.0.0-RC1
t

terarex

11/20/2023, 1:27 PM
Thanks. Since I used Phoenix Suit for the 1.0.2 upgrade to 1.1.0 (where the BMC's NAND format changed), I wasn't certain about the SD card upgrade directly from 1.0.2 to 2.0.0.
s

svenrademakers

11/20/2023, 2:19 PM
this is no problem, SD card upgrade is not dependent on anything that is currently present on the flash
b

bplein

11/20/2023, 3:39 PM
I found this useful for flashing the BMC from SD card, as it avoids the use of USB and other software entirely. https://discord.com/channels/754950670175436841/1174117775460016128/1174122860172812318
t

terarex

11/20/2023, 3:40 PM
Thanks Bill. I'll copy that link to the appropriate Tech Support thread.
@svenrademakers I noticed an error in the README.md file that was committed yesterday. It's in the last paragraph: "Both UI and tpi can be used to write the upgrade package to your board. When you try to upload the image via the firmware upgrade tab on the UI, You will notice that the file extension is not matching the one the UI expects. You can ignore this, a .tpu image is nothing more than a rename of rootfs.erofs image. To smoothen the experience, you can decide to change the extension of the file to .tpu." rootfs.erofs is the .img, not .tpu, file.
s

svenrademakers

11/21/2023, 2:56 PM
Actually it's written correctly in the readme: tp2-ota-v2.x.x.tpu == rootfs.erofs, it's just a ubi volume containing a full rootfs. What happens on a firmware upgrade is that a new volume, or partition for that matter, is created. The .erofs file is written to it and on a reboot, the bootloader flips a switch to boot from this new partition. If it fails it tries again with the old volume. tp2-firmware-sdcard-v2.x.x.img is a raw image file, which can contain multiple volumes and in this case bootloader + rootfs
t

terarex

11/21/2023, 3:07 PM
Okay. My understanding was that the .erofs file was the raw image. The .img file was not present in buildroot/output/images. Otherwise, I would have dd'ed it to a raw SD card. Note, I have not built the latest commit.
s

svenrademakers

11/21/2023, 3:09 PM
this is what the CI does:
Copy code
- name: stamp images
        run: |
          mkdir artifacts
          sudo mv buildroot/output/images/tp2-bmc-firmware-sdcard.img artifacts/tp2-firmware-sdcard-${{ env.BUILD_VERSION }}.img
          sudo mv buildroot/output/images/rootfs.erofs artifacts/tp2-ota-${{ env.BUILD_VERSION }}.tpu
b

bplein

11/21/2023, 3:33 PM
@terarex this is what was built on my successful build of RC3:
Copy code
# ls -la images/
total 180992
drwxr-xr-x 4 root root     4096 Nov 17 15:13 .
drwxr-xr-x 6 root root     4096 Nov 17 15:13 ..
-rw-r--r-- 1 root root  1120290 Nov 17 15:13 installer.cpio.gz
-rw-r--r-- 1 root root 39550976 Nov 17 15:13 rootfs.erofs
-rw-r--r-- 1 root root 69171200 Nov 17 15:13 rootfs.tar
-rw-r--r-- 1 root root 27794977 Nov 17 15:13 rootfs.tar.gz
drwxr-xr-x 3 root root     4096 Nov 17 15:13 sdcard-bootpart
-rw-r--r-- 1 root root 16777216 Nov 17 15:13 sdcard-bootpart.img
-rwxr-xr-x 1 root root    45830 Nov 17 15:11 sun8iw20p1-t113-turingmachines-tp2bmc.dtb
drwxr-xr-x 3 root root     4096 Nov 17 15:13 tmp
-rw-r--r-- 1 root root 57376768 Nov 17 15:13 tp2-bmc-firmware-sdcard.img
-rw-r--r-- 1 root root   611188 Nov 17 15:10 u-boot-sunxi-with-spl.bin
-rw-r--r-- 1 root root   578356 Nov 17 15:10 u-boot.bin
-rw-r--r-- 1 root root  3880528 Nov 17 15:11 zImage
Note
tp2-bmc-firmware-sdcard.img
t

terarex

11/21/2023, 3:45 PM
I probably am working with a bit earlier version of RC3. I'll pull the latest source tar file and build from it. I tore my TPiv2 system apart yesterday to allow soldering an EMC2301 fan controller chip and PWM fan header.
c

cfsworks

11/21/2023, 4:46 PM
It's a tad confusing, because while
rootfs.erofs
is the "OTA image" (containing all of userspace and the kernel), it's also a partition in the
tp2-bmc-firmware-sdcard.img
file (for direct booting and/or installing from the microSD card)
t

terarex

11/21/2023, 5:08 PM
Thanks for that clarification. I noticed that @svenrademakers committed a license file to master this morning. That's probably the final step before release. I pulled a tar file of it, added my overlay mods and kicked off a build. I'll put the appropriate .img file on an SD and test later today or tomorrow. I want to solder the fan control components and put the system back together first.
h

hagak1349

11/21/2023, 8:37 PM
Can you use the web UI to update from RC1 to RC3 or is that not avail until RC3?
t

terarex

11/21/2023, 8:50 PM
You can use the Web UI.
h

hagak1349

11/21/2023, 8:57 PM
damn shame the sdcard was not mounted on an edge to make it easier to access.
c

cfsworks

11/21/2023, 9:03 PM
If you already have a microSD installed, you can back it up, image the installer on there, reset the BMC, do the installation, reset the BMC again, go into safe mode, then erase the microSD card, all without removing it.
But if you're already on RC1 you can use the web UI to update to any future version (RC or otherwise). Just note that RC1 has a bug where it throws an error despite the update working successfully.
w

werdnum

11/21/2023, 9:10 PM
+1 - I did this but with UART attached
s

svenrademakers

11/22/2023, 8:56 PM
Ah i see you found my git tag 😅
b

bhuism

11/22/2023, 9:02 PM
yes, really great work, big leap forwards
w

werdnum

11/23/2023, 1:14 AM
ooh, now working out where I can download the artifacts from
I think you need to add the .zip extension 😄
oh it's just not visible in the UI
I got this error when trying to install the final version from there
perhaps this is what I'm encountering
yeah ok so if you get this error it succeeded but didn't reboot
so you just have to reboot
b

bhuism

11/23/2023, 6:04 AM
And do it again for closure 👌🏻 issue will be gone
i

inquisitor_1337

11/23/2023, 2:02 PM
Hi guys, not so long time ago, Sven put the installation steps for the BMC FW and I cannot find it in the history 😄 If someone has it please paste the link here, many thanks
BTW is there any tool in the discord which can be used to save a message?
s

svenrademakers

11/23/2023, 3:41 PM
i will paste it in a bit..
Just so you guys are aware, there will be a v2.0.4. Which will be final final final. i promise.. It will contain a small api change in the USB tab to make sure the UI works the same as the old UI. Secondly, there are some ui fixes for windows users
Its always interesting that it was relatively quiet last week, but when i was about to hit the big release button today, some things popped up 😅
download the latest artifacts from the BMC-Firmware repo: https://github.com/turing-machines/BMC-Firmware/actions/runs/6973770338
-> it contains a tp2-firmware-sdcard-xxxx.img
-> dd this to a micro sd card
-> put it in the back of the bmc
-> reset the bmc
-> wait until the lan leds start to blink
-> hit KEY1 3 times
-> lan leds start to walk from left to right
-> wait until lan leds start to flash twice
-> eject sd card from the back and restart bmc
-> Done
Warning: flashing with SD card method removes all data from your BMC flash
i

inquisitor_1337

11/24/2023, 6:52 AM
Successfully updated to v2.0.4 Many thanks!
g

ggbiker

11/24/2023, 8:59 AM
If you are on 2.0.3 is there added value to do the sd card upgrade or is the ota upgrade sufficient?
w

werdnum

11/24/2023, 9:38 AM
just ota I believe
t

tobiaskohlbau

11/24/2023, 10:21 AM
I've just upgraded from latest 1.x to 2.0.4 from GitHub. Did someone else have to run the
generate_self_signedx509.sh
script after upgrading in order to get bmcd to boot? Without this the program failed at start stating that it lacked the required pem files.
w

werdnum

11/24/2023, 10:29 AM
I didn't have to for 2.0.3 - I was wondering if there's instructions somewhere about bringing my own SSL certificate
e

echelon101

11/24/2023, 10:34 AM
me neither. my system generated the certificates on first boot
I didn't find any instructions. but in
/etc/bmcd/config.yml
are config keys for key files. you can probably configure your own cert that way https://cdn.discordapp.com/attachments/1170114820457111592/1177559625697067008/image.png?ex=6572f2dc&is=65607ddc&hm=2c92962808f4fa73f22e546f1d35c9bf53c6d5844241fba4f4cd651be87efa98&
t

teslamax

11/26/2023, 12:42 AM
Me too!
g

gunther2908

11/26/2023, 7:06 AM
@me too through tpu file !
p

phearzero

11/27/2023, 2:15 AM
@werdnum if you are still interested I can give it a polish for v2
w

werdnum

11/27/2023, 2:16 AM
Not my call, but I do like it! See what @svenrademakers thinks
s

svenrademakers

11/27/2023, 9:37 AM
Sorry guys, i was out for the weekend. @phearzero are you the one maintaining the xterm UI fork?
p

phearzero

11/27/2023, 9:51 PM
No worries at all, yea this project: https://turing-pi-ui.vercel.app/ Should have a ticket in GitHub where I'm most active
t

teslamax

11/27/2023, 10:46 PM
That is slick!
c

cfsworks

11/28/2023, 6:20 AM
Hmm there appears to be a bug in the
rockusb
driver (?) where it mistakenly believes the eMMC of the RK1 is 2064384MiB larger than it is.
It gives the correct size modulo 32GiB but it sets 6 bits that shouldn't be set. I'm wondering if this is a Rockchip quirk rather than a driver bug though.
I'm thinking it's actually a bug in the version of
usbplug
that's loaded on the RK3588 by bmcd. I don't remember this coming up in my testing but I was also using the latest binary from Rockchip.
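The numbers in this report line up exactly, which a few lines of Python can confirm. The masking helper is a hypothetical illustration of "correct modulo 32GiB", not a proposed fix, and the 29 GiB capacity is an invented example value.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

excess = 2064384 * MIB             # how much too large the reported size is
multiples = excess // (32 * GIB)

print(multiples, bin(multiples))   # 63 0b111111: six set bits above the 32 GiB boundary


def plausible_size(reported_bytes):
    """Hypothetical illustration: keep only the part of the size below 32 GiB."""
    return reported_bytes % (32 * GIB)


actual = 29 * GIB  # assumed example capacity, not a measured RK1 value
assert plausible_size(actual + excess) == actual
```

So 2064384 MiB is 2016 GiB, which is 63 times 32 GiB: the six stray bits sit directly above bit 34 of the byte count.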
s

svenrademakers

11/28/2023, 10:34 AM
liking the terminal you made in there. We dont have any active plans for the UI currently. So any form of contribution is welcome
its finally public https://github.com/turing-machines/BMC-Firmware/releases/tag/v2.0.5 Thank you all for taking the earlier v2.0.x versions for a spin, Thanks to you guys the firmware got better
i didn't want to update the usbplug to latest as i did not want to risk it. but if there are issues with it we might need to do it anyway
u

uberdouche

11/28/2023, 3:13 PM
ota upgrade works pimp
c

cyphersnep

11/28/2023, 3:24 PM
Anyone tested flashing from WebUI for raspberry pi cm4? It's working as intended?
t

terarex

11/28/2023, 3:50 PM
I successfully flashed an SD-equipped CM4 from the WebUI. I believe it was RC2. Improvements were made in RC3 for performance.
c

cyphersnep

11/28/2023, 3:51 PM
Hmm, I hope it'll work on non-sd card CM4
u

_dhanos_

11/28/2023, 3:53 PM
It will. This flashing method works fine and can flash the eMMC
Flashing an SD card this way is not supported by the RPi Foundation and may fail
c

cyphersnep

11/28/2023, 3:53 PM
Neat! Thank you
Also, what about ttySx? I tried to use microcom commands from docs, but no output in serial still 😦
u

_dhanos_

11/28/2023, 3:55 PM
And what about it?
You see nothing when you boot CM4?
c

cyphersnep

11/28/2023, 3:56 PM
CM4 have an OS for now, Ubuntu When I power it on and use any serial from that list: https://docs.turingpi.com/discuss/652d892d4fa166000de16cee No output or any reaction Tried all slots
Copy code
Node 1: /dev/ttyS2
Node 2: /dev/ttyS1
Node 3: /dev/ttyS4
Node 4: /dev/ttyS5
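Note that the mapping above is not sequential (node 1 is on ttyS2, node 2 on ttyS1). A tiny Python helper makes that harder to get wrong; the 115200 baud default and the `microcom -s` invocation are assumptions based on common BusyBox usage, so check them against the docs page linked above.

```python
# Mapping copied from the list above.
NODE_UART = {1: "/dev/ttyS2", 2: "/dev/ttyS1", 3: "/dev/ttyS4", 4: "/dev/ttyS5"}


def console_command(node, baud=115200):
    """Build a microcom invocation for a node slot (baud rate is an assumed default)."""
    if node not in NODE_UART:
        raise ValueError(f"node must be 1-4, got {node}")
    return ["microcom", "-s", str(baud), NODE_UART[node]]


print(" ".join(console_command(3)))
```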
u

_dhanos_

11/28/2023, 3:57 PM
Ubuntu required setting the UART out iirc
Is there
raspi-config
available?
If not, see the RPi forum on how to enable UART
c

cyphersnep

11/28/2023, 3:58 PM
Thanks! Will try
Login worked thanks But is it possible to see whole boot sequence in Serial? Or it's not an option?
u

_dhanos_

11/28/2023, 4:08 PM
UART is a serial console. It shows what's being sent to it. Physically it is just a serial link, so whenever you connect, you see what's being sent over. This means no history. But if you logged in, you may try
dmesg
command
b

bhuism

11/28/2023, 6:47 PM
Sorry if this is asked again, reboot without power loss of the nodes, is that in the cards? Rather not have the nodes going down on every firmware upgrade
c

cfsworks

11/28/2023, 6:54 PM
Earlier this morning I was idly wondering how difficult it would be to reboot the T113 "by hand" and not touch the pin controller in the reset sequence. It wouldn't alter how the
BMC_RESET
button behaves but it would mean Linux can reboot without affecting the nodes.
u

_dhanos_

11/28/2023, 7:51 PM
It is, but for the Turing Pi 2 v2.5 board (the next revision) - it'll contain latches to hold the states (like node power) on the BMC reboot
b

bhuism

11/28/2023, 7:53 PM
Ah, thanks @_dhanos_, so it's impossible on TPi v2.4?
u

_dhanos_

11/28/2023, 7:55 PM
As far as we currently know, no. Maybe there is a way to do so, like what CFSworks was looking at, but for now we do not know of any way to avoid resetting the GPIO states at reboot
d

dethtungue

11/28/2023, 9:12 PM
How long should we expect our CR1220 batteries to last for the RTC? After flashing my board the clock was reset to Jan 1. I'd have expected the RTC to still have an accurate time. Thinking it may be time to change mine.
u

1solon

11/28/2023, 9:22 PM
Hey guys, I've flashed and inserted the SDcard. But I'm not getting the update prompt, is there something I am missing here?
u

_dhanos_

11/28/2023, 9:23 PM
My best guess is you touched the board around the battery or BMC while inserting/removing the SD card which may reset the RTC time. The battery should last for months, but no one ever tested this 🙂 If you keep the board powered on (the nodes can be down but the BMC works), the battery does not discharge, it discharges when the board has no power
So the BMC is booting normally even with the card inserted? How did you flash the card?
u

1solon

11/28/2023, 9:24 PM
Win32 Disk Imager as directed
I'm kinda confused, because the BMC startup declares that the SD card is mounted.
So like, either there's a problem with the flash (Unsure how I would check that) or it's not reading it
u

_dhanos_

11/28/2023, 9:25 PM
Are you sure you... flashed the right drive?
u

1solon

11/28/2023, 9:25 PM
100%
u

_dhanos_

11/28/2023, 9:25 PM
Can you SSH into BMC?
u

1solon

11/28/2023, 9:26 PM
Yep
u

_dhanos_

11/28/2023, 9:26 PM
Can you list the sd card content? Will be mounted as
/mnt/sdcard
u

1solon

11/28/2023, 9:26 PM
Can do, hold one
So seemingly the flash has failed?
u

_dhanos_

11/28/2023, 9:30 PM
Looks empty and it should not be empty
I prepared this part of the docs and I ran this countless times. Unless you chose not the correct drive letter, I'm not sure what else could have happened
Or maybe the card is corrupt?
Just verify I'm doing the right thing here @_dhanos_
c

cfsworks

11/28/2023, 9:34 PM
It looks right. Just to verify it's actually writing, if you unplug and reconnect your microSD(/adapter), can you then go into
K:\
and see
install.txt
?
u

1solon

11/28/2023, 9:34 PM
Empty
c

cfsworks

11/28/2023, 9:35 PM
Try creating a file in there, then safely removing and replugging. I'm wondering if something silly is going on, like the writeprotect switch on the adapter is turned on 🤷‍♂️
So, yeah, it retains
That works, annoyingly
So it's not that
@_dhanos_ just tried it with a separate SD card, same deal
c

cfsworks

11/28/2023, 9:44 PM
I wonder if it's Win32DiskImager's fault. Maybe try Rufus or BalenaEtcher?
u

1solon

11/28/2023, 9:49 PM
Tried Rufus, same deal
u

_dhanos_

11/28/2023, 10:29 PM
Okay I found your issue
This is not the correct file size
The file is over 50MB in size
Download it again and make sure you downloaded the full file
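A truncated download like this can be caught before flashing with a quick size check. This is only a minimal sketch: `check_min_size` is a hypothetical helper, and the 50000000-byte threshold in the usage note is an assumption based on the "over 50MB" figure above, not an official value.

```shell
# Hypothetical helper: succeed only if FILE is at least MIN_BYTES long.
# Usage: check_min_size FILE MIN_BYTES
check_min_size() {
    # Bail out early if the file does not exist.
    [ -f "$1" ] || return 2
    # Reading from stdin makes wc print only the byte count;
    # tr strips the padding some wc implementations add.
    _size=$(wc -c < "$1" | tr -d ' ')
    [ "$_size" -ge "$2" ]
}
```

For example, `check_min_size firmware.img 50000000 || echo "download looks truncated"`, where `firmware.img` stands in for whatever image file you downloaded.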
c

cfsworks

11/29/2023, 12:15 AM
ugh, no, it's a bug in my
rockusb
code somewhere. Worse, it doesn't happen on my own system but it does happen when compiled for the BMC.
r

rmeier

11/29/2023, 6:48 AM
@svenrademakers Thanks for all the effort. Just updated to v2.0.5. Very happy customer
s

svenrademakers

11/29/2023, 9:28 AM
Perhaps we should add a SHA hash file to the directory for these kinds of things. I'm thinking it would be even nicer to have the BMC verify known images against SHA hashes
u

uberdouche

11/29/2023, 9:48 AM
I was going to mention shasums
u

_dhanos_

11/29/2023, 3:48 PM
I totally agree. Or we could have an additional field to provide a SHA file
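Until something like that exists, a manual check could look roughly like the sketch below. It assumes a SHA-256 digest has been published alongside the image (which, per the discussion above, is not yet the case) and that `sha256sum` is available; `verify_sha256` is a hypothetical helper name.

```shell
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
# Usage: verify_sha256 FILE EXPECTED_HEX   (EXPECTED_HEX in lowercase)
verify_sha256() {
    [ -f "$1" ] || return 2
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    _actual=$(sha256sum "$1" | cut -d ' ' -f 1)
    [ "$_actual" = "$2" ]
}
```

Usage would be along the lines of `verify_sha256 firmware.img "<published digest>" && echo OK`. On Windows, `certutil -hashfile firmware.img SHA256` prints a digest that can be compared by eye.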
u

1solon

11/29/2023, 4:12 PM
Worked @_dhanos_!
Thanks
b

bhuism

12/02/2023, 2:30 PM
I would love to see a free-format description field beside the nodes. I now literally keep a notepad next to my machine to track which node is where 🙂
w

werdnum

12/03/2023, 6:01 AM
Heh, just name the machines after what slot they're in
b

bhuism

12/03/2023, 6:06 AM
Ah, thanks, but I want to track the modules. And what do you mean by naming them? In DNS, you mean?
w

werdnum

12/03/2023, 6:07 AM
Yeah give them host names that reflect their slot number
I've done it a little cheekily: slot #3 is "shamrock" (shamrocks notoriously have 3 leaves)
b

bhuism

12/03/2023, 6:09 AM
Haha
w

werdnum

12/03/2023, 6:09 AM
(my nodes are named after places in Maastricht of all things)
b

bhuism

12/03/2023, 6:09 AM
4 if you're lucky 😇
Haha, do you have enough places there 😉
w

werdnum

12/03/2023, 6:11 AM
Lol, I do need to strain a little after 4 or 5
But it's a student town, there are plenty of bars. Actually the whole scheme began as a pun, I named my first Raspberry Pi "vlaai", because of course it's a type of raspberry pie
Well more often cherry but oh well
b

bhuism

12/03/2023, 6:19 AM
I like vlaai (not from cows)
a

andrewbenton

12/04/2023, 3:57 AM
Should the BMC's sd card still auto-mount? I'm not seeing that behavior. I put the card in and the kernel detects it, but I have to manually invoke
mount /dev/mmcblk0p1 /mnt/sdcard
. It doesn't seem to properly auto-unmount either and I have to manually invoke
umount /mnt/sdcard
.
Oddly, the Turing Pi UI still recognizes when I disconnect and reconnect the SD card while the board is running: it shows whether the card is present and reports the correct size. But once reconnected, the card is completely unusable and attempted writes result in an I/O error.
It does seem like after that initial manual invocation of
mount
that it will auto-mount after restart, but I did have to ssh in and manually invoke the command first.
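A quick way to tell "detected but not mounted" apart from "mounted" is to look the mount point up in the mount table. This is a rough sketch, assuming `awk` is available on the BMC; `is_mounted` is a hypothetical helper, and the optional second argument exists only so it can be tried against a sample file.

```shell
# Hypothetical helper: succeed if MOUNTPOINT appears in the mount table.
# Usage: is_mounted MOUNTPOINT [MOUNTS_FILE]   (defaults to /proc/mounts)
is_mounted() {
    # Field 2 of each /proc/mounts line is the mount point.
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}
```

For example, `is_mounted /mnt/sdcard || mount /dev/mmcblk0p1 /mnt/sdcard` would only run the manual mount from this thread when the card is not already mounted.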
s

svenrademakers

12/04/2023, 3:41 PM
I'm having a bit of brain fog; perhaps someone here can paste the correct command for this. I usually see the behavior you describe when the SD card is not ejected correctly. Did you try the obvious commands? E.g.
sync && eject /mnt/sdcard
I was born in the big metropolis north of Maastricht... Sittard. Finding names related to places is not even your biggest challenge there. I'm not sure the concept of electronic boxes hooked up to the internet has landed there yet
w

werdnum

12/04/2023, 7:59 PM
"big" - when you said "big Metropole North" I figured you meant Eindhoven
s

svenrademakers

12/04/2023, 8:22 PM
Sometimes I fail to give it the right sarcastic tone 😅😀
b

bhuism

12/04/2023, 8:32 PM
@svenrademakers @werdnum @nico_06577 when you are west of Utrecht, let me know!
a

andrewbenton

12/07/2023, 4:39 AM
I didn't, and I probably won't be able to replicate that again unless I take the whole thing back out of the chassis so that I can try hot-unplugging the SD card. I think the bigger problem is that it wasn't showing up the first time until I manually `mount`'d it, rather than the hotswap not working.
i

inquisitor_1337

12/08/2023, 7:25 AM
Guys, I have a delicate situation after updating to the latest FW. My nodes are neither accessible nor visible via ssh/nmap, but at the same time I can see them as nodes in k3s. Any suggestions on how to debug this? (Turning the nodes off and on didn't help.)
w

werdnum

12/09/2023, 9:31 AM
Are they actually active in k3s? Would be unusual... But you can use
kubectl debug node
to get a shell
Or create a daemonset with a privileged container
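Spelled out, that could look something like the sketch below. The function names are hypothetical wrappers, and `busybox` is just an assumed debug image; any image that runs on the nodes' architecture would do.

```shell
# Hypothetical wrappers around standard kubectl commands.

# List each node together with the IP addresses Kubernetes has on record.
node_addresses() {
    kubectl get nodes -o wide
}

# Start an interactive debug pod on NODE; the node's root filesystem
# is mounted at /host inside the pod.
node_shell() {
    kubectl debug "node/$1" -it --image=busybox
}
```

Running `node_addresses` first shows which address k3s currently associates with each node, which helps when the nodes no longer answer on their expected IPs; `node_shell <node-name>` then gives a shell on a specific node.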
i

inquisitor_1337

12/09/2023, 3:05 PM
Yes, they are fully active in k3s. It seems the nodes lost their statically assigned IP addresses, which is really strange behaviour. I'm a little confused: if the IPs changed, how was k3s able to add the nodes to the pool again? Hmm. Anyhow, many thanks for your input, Werdnum
b

bplein

01/02/2024, 5:55 PM
how many K3s nodes and how many of them are masters?
i

inquisitor_1337

01/02/2024, 7:03 PM
One master and 6 additional nodes
t

teslamax

01/04/2024, 7:44 PM
Anything new since 2.0.5?
u

_dhanos_

01/04/2024, 7:53 PM
If you mean a new firmware version then no, not yet
t

terarex

01/04/2024, 8:13 PM
The post-2.0.5 BMC-Firmware GitHub updates mostly seem to address better RK1 flashing support. A new branch was created today to begin modifying buildroot to produce both BMC NAND and SD-card-native versions. See "Info > github" for additional detail.
t

teslamax

01/05/2024, 6:13 PM
Will it be possible at some point to configure nodes post-flash with cloud-init via the BMC? (I assume it’s already on the “to do list”)
c

cfsworks

01/05/2024, 6:20 PM
It's a goal of mine to figure out how to get the BMC to host a 169.254.169.254 virtual IP service that isn't exposed off-board and knows which node/slot is accessing it. (I don't think there's any formal plan for that, mind, but I highly doubt such a PR would be turned away unless it's overcomplicated.)
t

teslamax

01/05/2024, 6:21 PM
I’ve used Proxmox on my R530 for quite some time. Administering/configuring nodes on the TP2 like VMs in Proxmox seems like a convenient feature.
I have little experience with containers, etc. so there might be several technologies for this I’m unaware of.
s

svenrademakers

01/05/2024, 11:02 PM
Off the top of my head:
* Added support for downloading node and firmware images straight from an HTTP remote.
* Added support in the UI for installing node images that are local on the BMC.
* Fixed MSD mode for RK1.
* Added a backend for storing small node configuration, such as name, module type and baud rate (still needs a front end).
* Pending: serial console to a node over a web-socket. I'm 80% finished with the implementation (UI front end still required).
* Serial console to the BMC is exposed over the BMC_USB_OTG port. In theory, this means you won't need any USB TTL cables anymore.
* I have a side branch which exposes a node as a block device over the same BMC_USB_OTG port. It's not completely stable yet: sometimes the UDC controller refuses to start the MSD function. I've parked it for now.