expand-on-first-boot: wait for udev to create symlink

Message ID 20221208165542.2451856-1-tobias.schaffner@siemens.com
State Superseded, archived
Headers show
Series expand-on-first-boot: wait for udev to create symlink | expand

Commit Message

Tobias Schaffner Dec. 8, 2022, 4:55 p.m. UTC
From: Tobias Schaffner <tobias.schaffner@siemens.com>

systemd-growfs depends on a symlink to the partition of the filesystem
that should be resized. This symlink is created by udev in /dev/block/.

If this symlink is not yet created for example because systemd-udev is
not up yet systemd-growfs will fail.

We could use Require and After to depend on the systemd-udev service
but this could again create a race condition if udev is up but not
fast enough after the partx -u.

Resolve the symlinks in /dev/block/ periodically until the symlink
appears before running systemd-growfs.
---
 .../files/expand-last-partition.sh               | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

Comments

Henning Schild Dec. 8, 2022, 5:39 p.m. UTC | #1
Am Thu, 8 Dec 2022 17:55:42 +0100
schrieb "T. Schaffner" <tobias.schaffner@siemens.com>:

> From: Tobias Schaffner <tobias.schaffner@siemens.com>
> 
> systemd-growfs depends on a symlink to the partition of the filesystem
> that should be resized. This symlink is created by udev in
> /dev/block/.
> 
> If this symlink is not yet created for example because systemd-udev is
> not up yet systemd-growfs will fail.
> 
> We could use Require and After to depend on the systemd-udev service
> but this could again create a race condition if udev is up but not
> fast enough after the partx -u.
> 
> Resolve the symlinks in /dev/block/ periodically until the symlink
> appears before running systemd-growfs.
> ---
>  .../files/expand-last-partition.sh               | 16
> +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git
> a/meta/recipes-support/expand-on-first-boot/files/expand-last-partition.sh
> b/meta/recipes-support/expand-on-first-boot/files/expand-last-partition.sh
> index 57055cc..7ebb3e5 100755 ---
> a/meta/recipes-support/expand-on-first-boot/files/expand-last-partition.sh
> +++
> b/meta/recipes-support/expand-on-first-boot/files/expand-last-partition.sh
> @@ -38,7 +38,8 @@ if [ "$DISK_SIZE" -lt "$MINIMAL_SIZE" ]; then exit
> 0 fi -LAST_PART="$(sfdisk -d "${BOOT_DEV}" 2>/dev/null | tail -1 |
> cut -d ' ' -f 1)" +LAST_PART_NAME="$(lsblk -l -o NAME "${BOOT_DEV}" |
> tail -1)" +LAST_PART="/dev/${LAST_PART_NAME}"
>  
>  # Transform the partition table as follows:
>  #
> @@ -79,6 +80,19 @@ if [ ! -d "${MOUNT_POINT}" ]; then
>  	exit 1
>  fi
>  
> +START_TIME="$(date +%s)"
> +
> +# Wait for udev to create the symlink to the partition in

Newer systemd versions will not need that anymore. Maybe write down
which version that would be and starting from which debian distro we
can drop that. Who knows when that udev stuff changes and the symlinks
will never appear ... We should only wait for them in systemd versions
that use them.

If we already have a version out there ... like bookworm? We can
already implement it without the wait.

> /dev/block/ as +# systemd-growfs depends on it
> +while ! readlink /dev/block/* | grep -q "${LAST_PART_NAME}"; do

I would make that much stricter to not be tricked by partial matches 

sda4 vs sda42

readlink -f /dev/block/* | grep -e -q "^${LAST_PART}$"

> +    sleep 0.1

I wonder if there is anything we could do here. Maybe "udevadm trigger"
and depend on udev after all. Because the 5 is a nasty guess ...

> +    CURRENT_TIME="$(date +%s)"
> +    if [ $(( CURRENT_TIME - START_TIME )) -gt "5" ]; then

we can simply i++ > 50, no need to call date

so maybe

err=1
for i in $(seq 0 50); do
	if readlink -f /dev/block/* | grep -e -q "^${LAST_PART}$"; then
		err=0
		break
	fi
	sleep 0.1
done

if $err ...
> +        echo "Could not find symlink to last part in /dev/block/."
> +        exit 1
> +    fi
> +done
> +
>  mount "${LAST_PART}" "${MOUNT_POINT}"
>  /lib/systemd/systemd-growfs "${MOUNT_POINT}"

we could also loop over calling this until it goes "0" or we reach a
retry counter, that way we magically handle new systemd versions that
do not need symlinks and do not implement anything with symlinks.

Henning

>  umount "${MOUNT_POINT}"
Tobias Schaffner Dec. 8, 2022, 7:18 p.m. UTC | #2
On 08.12.22 18:39, Schild, Henning (T CED SES-DE) wrote:
> Am Thu, 8 Dec 2022 17:55:42 +0100
> schrieb "T. Schaffner" <tobias.schaffner@siemens.com>:
> 
>> From: Tobias Schaffner <tobias.schaffner@siemens.com>
>>
>> systemd-growfs depends on a symlink to the partition of the filesystem
>> that should be resized. This symlink is created by udev in
>> /dev/block/.
>>
>> If this symlink is not yet created for example because systemd-udev is
>> not up yet systemd-growfs will fail.
>>
>> We could use Require and After to depend on the systemd-udev service
>> but this could again create a race condition if udev is up but not
>> fast enough after the partx -u.
>>
>> Resolve the symlinks in /dev/block/ periodically until the symlink
>> appears before running systemd-growfs.
>> ---
>>   .../files/expand-last-partition.sh               | 16
>> +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git
>> a/meta/recipes-support/expand-on-first-boot/files/expand-last-partition.sh
>> b/meta/recipes-support/expand-on-first-boot/files/expand-last-partition.sh
>> index 57055cc..7ebb3e5 100755 ---
>> a/meta/recipes-support/expand-on-first-boot/files/expand-last-partition.sh
>> +++
>> b/meta/recipes-support/expand-on-first-boot/files/expand-last-partition.sh
>> @@ -38,7 +38,8 @@ if [ "$DISK_SIZE" -lt "$MINIMAL_SIZE" ]; then exit
>> 0 fi -LAST_PART="$(sfdisk -d "${BOOT_DEV}" 2>/dev/null | tail -1 |
>> cut -d ' ' -f 1)" +LAST_PART_NAME="$(lsblk -l -o NAME "${BOOT_DEV}" |
>> tail -1)" +LAST_PART="/dev/${LAST_PART_NAME}"
>>   
>>   # Transform the partition table as follows:
>>   #
>> @@ -79,6 +80,19 @@ if [ ! -d "${MOUNT_POINT}" ]; then
>>   	exit 1
>>   fi
>>   
>> +START_TIME="$(date +%s)"
>> +
>> +# Wait for udev to create the symlink to the partition in
> 
> Newer systemd versions will not need that anymore. Maybe write down
> which version that would be and starting from which debian distro we
> can drop that. Who knows when that udev stuff changes and the symlinks
> will never appear ... We should only wait for them in systemd versions
> that use them.
> 
> If we already have a version out there ... like bookworm? We can
> already implement it without the wait.

Yes you are right. I will add a check and skip the loop from V252 on.

>> /dev/block/ as +# systemd-growfs depends on it
>> +while ! readlink /dev/block/* | grep -q "${LAST_PART_NAME}"; do
> 
> I would make that much stricter to not be tricked by partial matches
> 
> sda4 vs sda42
> 
> readlink -f /dev/block/* | grep -e -q "^${LAST_PART}$"

I get a relative path from readlink. So I will have to skip the ^ but
to use match the end of the line is a good idea. -e is not needed for
start and end line matches. ;)

>> +    sleep 0.1
> 
> I wonder if there is anything we could do here. Maybe "udevadm trigger"
> and depend on udev after all. Because the 5 is a nasty guess ...


Nothing that I am aware of. The main reason for the issues that we have
seen was that the udev service was not up yet. We could maybe check if it
is up first and after that only allow a small period. But then again how
long do we wait for the service to be up and running.

>> +    CURRENT_TIME="$(date +%s)"
>> +    if [ $(( CURRENT_TIME - START_TIME )) -gt "5" ]; then
> 
> we can simply i++ > 50, no need to call date
> 
> so maybe
> 
> err=1
> for i in $(seq 0 50); do
> 	if readlink -f /dev/block/* | grep -e -q "^${LAST_PART}$"; then
> 		err=0
> 		break
> 	fi
> 	sleep 0.1
> done
> 
> if $err ...
>> +        echo "Could not find symlink to last part in /dev/block/."
>> +        exit 1
>> +    fi
>> +done
>> +
>>   mount "${LAST_PART}" "${MOUNT_POINT}"
>>   /lib/systemd/systemd-growfs "${MOUNT_POINT}"
> 
> we could also loop over calling this until it goes "0" or we reach a
> retry counter, that way we magically handle new systemd versions that
> do not need symlinks and do not implement anything with symlinks.


That's a cool idea. As systemd-growfs returns 1 in all error cases we will
have to parse stderr to determine if we ran in the error case that we want
to handle. I will give it a try.


> Henning
> 
>>   umount "${MOUNT_POINT}"
>
Henning Schild Dec. 8, 2022, 7:42 p.m. UTC | #3
Am Thu, 8 Dec 2022 20:18:28 +0100
schrieb "Schaffner, Tobias (T CED SES-DE)"
<tobias.schaffner@siemens.com>:

> On 08.12.22 18:39, Schild, Henning (T CED SES-DE) wrote:
> > Am Thu, 8 Dec 2022 17:55:42 +0100
> > schrieb "T. Schaffner" <tobias.schaffner@siemens.com>:
> >   
> >> From: Tobias Schaffner <tobias.schaffner@siemens.com>
> >>
> >> systemd-growfs depends on a symlink to the partition of the
> >> filesystem that should be resized. This symlink is created by udev
> >> in /dev/block/.
> >>
> >> If this symlink is not yet created for example because
> >> systemd-udev is not up yet systemd-growfs will fail.
> >>
> >> We could use Require and After to depend on the systemd-udev
> >> service but this could again create a race condition if udev is up
> >> but not fast enough after the partx -u.
> >>
> >> Resolve the symlinks in /dev/block/ periodically until the symlink
> >> appears before running systemd-growfs.
> >> ---
> >>   .../files/expand-last-partition.sh               | 16
> >> +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
> >>
> >> diff --git
> >> a/meta/recipes-support/expand-on-first-boot/files/expand-last-partition.sh
> >> b/meta/recipes-support/expand-on-first-boot/files/expand-last-partition.sh
> >> index 57055cc..7ebb3e5 100755 ---
> >> a/meta/recipes-support/expand-on-first-boot/files/expand-last-partition.sh
> >> +++
> >> b/meta/recipes-support/expand-on-first-boot/files/expand-last-partition.sh
> >> @@ -38,7 +38,8 @@ if [ "$DISK_SIZE" -lt "$MINIMAL_SIZE" ]; then
> >> exit 0 fi -LAST_PART="$(sfdisk -d "${BOOT_DEV}" 2>/dev/null | tail
> >> -1 | cut -d ' ' -f 1)" +LAST_PART_NAME="$(lsblk -l -o NAME
> >> "${BOOT_DEV}" | tail -1)" +LAST_PART="/dev/${LAST_PART_NAME}"
> >>   
> >>   # Transform the partition table as follows:
> >>   #
> >> @@ -79,6 +80,19 @@ if [ ! -d "${MOUNT_POINT}" ]; then
> >>   	exit 1
> >>   fi
> >>   
> >> +START_TIME="$(date +%s)"
> >> +
> >> +# Wait for udev to create the symlink to the partition in  
> > 
> > Newer systemd versions will not need that anymore. Maybe write down
> > which version that would be and starting from which debian distro we
> > can drop that. Who knows when that udev stuff changes and the
> > symlinks will never appear ... We should only wait for them in
> > systemd versions that use them.
> > 
> > If we already have a version out there ... like bookworm? We can
> > already implement it without the wait.  
> 
> Yes you are right. I will add a check and skip the loop from V252 on.
> 
> >> /dev/block/ as +# systemd-growfs depends on it
> >> +while ! readlink /dev/block/* | grep -q "${LAST_PART_NAME}"; do  
> > 
> > I would make that much stricter to not be tricked by partial matches
> > 
> > sda4 vs sda42
> > 
> > readlink -f /dev/block/* | grep -e -q "^${LAST_PART}$"  
> 
> I get a relative path from readlink. So I will have to skip the ^ but
> to use match the end of the line is a good idea. -e is not needed for
> start and end line matches. ;)
> 
> >> +    sleep 0.1  
> > 
> > I wonder if there is anything we could do here. Maybe "udevadm
> > trigger" and depend on udev after all. Because the 5 is a nasty
> > guess ...  
> 
> 
> Nothing that I am aware of. The main reason for the issues that we
> have seen was that the udev service was not up yet. We could maybe
> check if it is up first and after that only allow a small period. But
> then again how long do we wait for the service to be up and running.
> 
> >> +    CURRENT_TIME="$(date +%s)"
> >> +    if [ $(( CURRENT_TIME - START_TIME )) -gt "5" ]; then  
> > 
> > we can simply i++ > 50, no need to call date
> > 
> > so maybe
> > 
> > err=1
> > for i in $(seq 0 50); do
> > 	if readlink -f /dev/block/* | grep -e -q "^${LAST_PART}$";
> > then err=0
> > 		break
> > 	fi
> > 	sleep 0.1
> > done
> > 
> > if $err ...  
> >> +        echo "Could not find symlink to last part in /dev/block/."
> >> +        exit 1
> >> +    fi
> >> +done
> >> +
> >>   mount "${LAST_PART}" "${MOUNT_POINT}"
> >>   /lib/systemd/systemd-growfs "${MOUNT_POINT}"  
> > 
> > we could also loop over calling this until it goes "0" or we reach a
> > retry counter, that way we magically handle new systemd versions
> > that do not need symlinks and do not implement anything with
> > symlinks.  
> 
> 
> That's a cool idea. As systemd-growfs returns 1 in all error cases we
> will have to parse stderr to determine if we ran in the error case
> that we want to handle. I will give it a try.

True ... parsing the error smells bad. But let us have a look at the
code we get. We should probably do some code reading of < 252 to make
sure the parser would catch it in all affected versions since stretch.
And if >= 252 solved that for us that code will hopefully never fail
again ... famous last expand-on-first-boot patch ;)
I will do a 252 (bookworm) test run on the raspbi which i used to
repro. In theory that should work without this upcoming patch already.
Also gives the chance to add "raspios-bookworm" to isar.

Henning

> 
> > Henning
> >   
> >>   umount "${MOUNT_POINT}"  
> >
Roberto A. Foglietta Dec. 9, 2022, 12:22 a.m. UTC | #4
On Thu, 8 Dec 2022 at 18:39, Henning Schild <henning.schild@siemens.com> wrote:
>
> Am Thu, 8 Dec 2022 17:55:42 +0100
> schrieb "T. Schaffner" <tobias.schaffner@siemens.com>:
>
> > From: Tobias Schaffner <tobias.schaffner@siemens.com>
> >
> > systemd-growfs depends on a symlink to the partition of the filesystem
> > that should be resized. This symlink is created by udev in
> > /dev/block/.
> >

reading that patch I had a flash-back about two patches I did recently
for another project

https://github.com/robang74/tinycore-editor/commit/d57271e3229551ddfbd3deba1363d2efe2914908
https://github.com/robang74/tinycore-editor/commit/4ff435bb8a73c7e956ea4f9303a36fe06a136e54

both altered this file

https://github.com/robang74/tinycore-editor/blob/main/tinycore/changes/rcS

the main problem and the final result was much the same: trigger udev
and wait for the dev link.

# Start Udev to populate /dev and handle hotplug events
/sbin/udevd --daemon 2>&1 >/dev/null
/sbin/udevadm trigger --action=add 2>&1 >/dev/null
/sbin/udevadm settle --timeout=5
echo -ne "\e[1;37m"
for i in $(seq 1 10); do
    if gettcdev; then
    break;
fi >/dev/null
echo -n .
sleep 1
done
echo -e "\e[0m"
/sbin/udevadm control --exit --timeout=5

in my case, the booting device is not meaningful because the rootfs is a file.

function gettcdev() {
    blkid --label $tclabel
}

So, the search for the USB stick device is done s by label

>
> > /dev/block/ as +# systemd-growfs depends on it
> > +while ! readlink /dev/block/* | grep -q "${LAST_PART_NAME}"; do
>
> I would make that much stricter to not be tricked by partial matches
>
> sda4 vs sda42

Irrelevant, if sda42 has been found then sda4 exists as well
considering the problem is not yet started.
However, for sake of precision and to avoid corner cases in which
sda42 is created before sda4 a dollar at the end of regex will solve
this issue.

+while ! readlink /dev/block/* | grep -qe "${LAST_PART_NAME}$"; do

>
> readlink -f /dev/block/* | grep -e -q "^${LAST_PART}$"
>
> > +    sleep 0.1
>
> I wonder if there is anything we could do here. Maybe "udevadm trigger"
> and depend on udev after all. Because the 5 is a nasty guess ...

This is the reason because I cited my patches. I hope they will help you.

Best regards, R-
Tobias Schaffner Dec. 9, 2022, 10:51 a.m. UTC | #5
On 09.12.22 01:22, Roberto A. Foglietta wrote:
> On Thu, 8 Dec 2022 at 18:39, Henning Schild <henning.schild@siemens.com> wrote:
>>
>> Am Thu, 8 Dec 2022 17:55:42 +0100
>> schrieb "T. Schaffner" <tobias.schaffner@siemens.com>:
>>
>>> From: Tobias Schaffner <tobias.schaffner@siemens.com>
>>>
>>> systemd-growfs depends on a symlink to the partition of the filesystem
>>> that should be resized. This symlink is created by udev in
>>> /dev/block/.
>>>
> 
> reading that patch I had a flash-back about two patches I did recently
> for another project
> 
> https://github.com/robang74/tinycore-editor/commit/d57271e3229551ddfbd3deba1363d2efe2914908
> https://github.com/robang74/tinycore-editor/commit/4ff435bb8a73c7e956ea4f9303a36fe06a136e54
> 
> both altered this file
> 
> https://github.com/robang74/tinycore-editor/blob/main/tinycore/changes/rcS
> 
> the main problem and the final result was much the same: trigger udev
> and wait for the dev link.
> 
> # Start Udev to populate /dev and handle hotplug events
> /sbin/udevd --daemon 2>&1 >/dev/null
> /sbin/udevadm trigger --action=add 2>&1 >/dev/null
> /sbin/udevadm settle --timeout=5
> echo -ne "\e[1;37m"
> for i in $(seq 1 10); do
>      if gettcdev; then
>      break;
> fi >/dev/null
> echo -n .
> sleep 1
> done
> echo -e "\e[0m"
> /sbin/udevadm control --exit --timeout=5
> 
> in my case, the booting device is not meaningful because the rootfs is a file.
> 
> function gettcdev() {
>      blkid --label $tclabel
> }
> 
> So, the search for the USB stick device is done s by label

Thanks for providing this Roberto!

In isar we have the systemd-udevd service which is responsible for the 
start of the udevd.

The problem appears if this systemd unit is not up yet when we try to 
use systemd-growfs by hand.

Therefore a udevadm trigger/settle would also most probably hit to early.

Correct me if I am wrong.

Best,
Tobias

>>
>>> /dev/block/ as +# systemd-growfs depends on it
>>> +while ! readlink /dev/block/* | grep -q "${LAST_PART_NAME}"; do
>>
>> I would make that much stricter to not be tricked by partial matches
>>
>> sda4 vs sda42
> 
> Irrelevant, if sda42 has been found then sda4 exists as well
> considering the problem is not yet started.
> However, for sake of precision and to avoid corner cases in which
> sda42 is created before sda4 a dollar at the end of regex will solve
> this issue.
> 
> +while ! readlink /dev/block/* | grep -qe "${LAST_PART_NAME}$"; do
> 
>>
>> readlink -f /dev/block/* | grep -e -q "^${LAST_PART}$"
>>
>>> +    sleep 0.1
>>
>> I wonder if there is anything we could do here. Maybe "udevadm trigger"
>> and depend on udev after all. Because the 5 is a nasty guess ...
> 
> This is the reason because I cited my patches. I hope they will help you.
> 
> Best regards, R-
Roberto A. Foglietta Dec. 9, 2022, 5:17 p.m. UTC | #6
On Fri, 9 Dec 2022 at 11:51, Schaffner, Tobias
<tobias.schaffner@siemens.com> wrote:
>
>
> The problem appears if this systemd unit is not up yet when we try to
> use systemd-growfs by hand.
>
> Therefore a udevadm trigger/settle would also most probably hit too early.
>

Dear Tobias,

 I am not an expert of udev because I never investigated its code.
However, for the little I did with it, udevadm is the main tool to
interact with udev. In particular udevadm trigger --action=add is
exactly the request we send in order to push udev to register devices.
Because this registration is not immediate, udevadm settle is a good
way to wait for it finishes. Probably --type=subsystems
--subsystems-type=block could be better address the initial trigger
request. In my case, it seems not appropriate but further
investigation may worth a little of time. The settle could have a
timeout and in such a case a loop of pooling for the device should
run. In my specific case, udevadm also terminate udev because the
initial instance at boot time would not collide with system instance
called later. I hope this helps.

 Best regards, R-

Patch

diff --git a/meta/recipes-support/expand-on-first-boot/files/expand-last-partition.sh b/meta/recipes-support/expand-on-first-boot/files/expand-last-partition.sh
index 57055cc..7ebb3e5 100755
--- a/meta/recipes-support/expand-on-first-boot/files/expand-last-partition.sh
+++ b/meta/recipes-support/expand-on-first-boot/files/expand-last-partition.sh
@@ -38,7 +38,8 @@  if [ "$DISK_SIZE" -lt "$MINIMAL_SIZE" ]; then
 	exit 0
 fi
 
-LAST_PART="$(sfdisk -d "${BOOT_DEV}" 2>/dev/null | tail -1 | cut -d ' ' -f 1)"
+LAST_PART_NAME="$(lsblk -l -o NAME "${BOOT_DEV}" | tail -1)"
+LAST_PART="/dev/${LAST_PART_NAME}"
 
 # Transform the partition table as follows:
 #
@@ -79,6 +80,19 @@  if [ ! -d "${MOUNT_POINT}" ]; then
 	exit 1
 fi
 
+START_TIME="$(date +%s)"
+
+# Wait for udev to create the symlink to the partition in /dev/block/ as
+# systemd-growfs depends on it
+while ! readlink /dev/block/* | grep -q "${LAST_PART_NAME}"; do
+    sleep 0.1
+    CURRENT_TIME="$(date +%s)"
+    if [ $(( CURRENT_TIME - START_TIME )) -gt "5" ]; then
+        echo "Could not find symlink to last part in /dev/block/."
+        exit 1
+    fi
+done
+
 mount "${LAST_PART}" "${MOUNT_POINT}"
 /lib/systemd/systemd-growfs "${MOUNT_POINT}"
 umount "${MOUNT_POINT}"