image: limit search for *.core to regular files

Message ID 20240312073935.331457-1-cedric.hombourger@siemens.com
State Accepted, archived

Commit Message

Cedric Hombourger March 12, 2024, 7:39 a.m. UTC
Code to search and delete core dumps in the build tree assumes that
the build host has a kernel.core_pattern setting which would result
in core dumps having a .core file suffix: this is not guaranteed. One
may also argue that the build should have failed if a process executed
under qemu-user got to crash (and we should check why qemu has crashed
and fix it). My vote would be to kill that code but for now, make it
less wrong by restricting the search to regular files suffixed with
.core (this would at least stop isar from moving directories such as
"org.eclipse.equinox.p2.core" out of the image).

Signed-off-by: Cedric Hombourger <cedric.hombourger@siemens.com>
---
 meta/classes/image.bbclass | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
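
To illustrate the effect of the change, here is a minimal sketch that builds a throwaway tree and compares the two searches. The directory layout and the file name qemu_sh.core are made up for the example; only the find options come from the patch. The pattern is quoted here so that the shell does not expand it.

```shell
# Throwaway tree: one directory and one regular file, both ending in ".core".
demo=$(mktemp -d)
mkdir -p "$demo/usr/share/plugins/org.eclipse.equinox.p2.core"
: > "$demo/qemu_sh.core"   # hypothetical core dump name

# Old search: matches anything named *.core, directories included.
all_matches=$(find "$demo" -name '*.core' | wc -l)

# New search: only regular files are reported.
file_matches=$(find "$demo" -type f -name '*.core' | wc -l)

echo "without -type f: $all_matches match(es)"
echo "with -type f:    $file_matches match(es)"
rm -rf "$demo"
```

The directory is still matched by the old search, which is what led to it being moved out of the image.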

Comments

Jan Kiszka March 12, 2024, 8:11 a.m. UTC | #1
On 12.03.24 08:39, 'Cedric Hombourger' via isar-users wrote:
> Code to search and delete core dumps in the build tree assumes that
> the build host has a kernel.core_pattern setting which would result
> in core dumps having a .core file suffix: this is not guaranteed. One
> may also argue that the build should have failed if a process executed
> under qemu-user got to crash (and we should check why qemu has crashed
> and fix it). My vote would be to kill that code but for now, make it
> less wrong by restricting the search to regular files suffixed with
> .core (this would at least stop isar from moving directories such as
> "org.eclipse.equinox.p2.core" out of the image).
> 
> Signed-off-by: Cedric Hombourger <cedric.hombourger@siemens.com>
> ---
>  meta/classes/image.bbclass | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/meta/classes/image.bbclass b/meta/classes/image.bbclass
> index 73f1d52c..793c21a2 100644
> --- a/meta/classes/image.bbclass
> +++ b/meta/classes/image.bbclass
> @@ -457,7 +457,7 @@ EOSUDO
>  
>      # Sometimes qemu-user-static generates coredumps in chroot, move them
>      # to work temporary directory and inform user about it.
> -    for f in $(sudo find ${ROOTFSDIR} -name *.core); do
> +    for f in $(sudo find ${ROOTFSDIR} -type f -name *.core); do
>          sudo mv "${f}" "${WORKDIR}/temp/"
>          bbwarn "found core dump in rootfs, check it in ${WORKDIR}/temp/${f##*/}"
>      done

Yeah, too many heuristics in play now. We could add a list of valid
"core" files on top, but maybe we should rather require core file
generation to be disabled during the build and enforce that.

Jan
MOESSBAUER, Felix March 12, 2024, 8:31 a.m. UTC | #2
On Tue, 2024-03-12 at 09:11 +0100, 'Jan Kiszka' via isar-users wrote:
> On 12.03.24 08:39, 'Cedric Hombourger' via isar-users wrote:
> > Code to search and delete core dumps in the build tree assumes that
> > the build host has a kernel.core_pattern setting which would result
> > in core dumps having a .core file suffix: this is not guaranteed.
> > One
> > may also argue that the build should have failed if a process
> > executed
> > under qemu-user got to crash (and we should check why qemu has 
> > crashed

Well... It's not that easy. This is the third time for me that this
coredump discussion pops up somewhere.

Many builders (like CMake) use feature probing (e.g. to check for AVX2),
which executes test programs that either succeed or crash with a
coredump [1]. While most of our builders run inside the schroot, there
might still be cases outside the schroot where this is the expected
behavior.

> Yeah, too many heuristics in play now. We could add a list of valid
> "core" files on top, but maybe we should rather require core file
> generation to be disabled during the build and enforce that.

We had exactly that discussion on the KAS ML as well, where I tried to
introduce a warning on default coredump configurations (which are BTW
really tricky to debug on CI systems). However, that was not accepted, as
the builders themselves shall be responsible for a suitable coredump
configuration [2]. The truth is probably somewhere in between.

Just disabling coredump generation is not easily possible, as there
is no "coredump" namespace in the kernel. You would have to fiddle
around with the global system config and potentially interfere with
systemd (systemd-coredump).

Probably the best thing we could do is to look for common coredump
patterns and check with the file utility whether these are actual
coredumps.
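
A rough sketch of that idea: instead of trusting the file name, inspect the candidate's content. The file utility reports such dumps as "ELF ... core file"; the helper below approximates that check without depending on file by reading the ELF header directly (magic bytes, then e_type equal to ET_CORE, i.e. 4). The function name is_elf_core is hypothetical and not part of isar.

```shell
# Return 0 if "$1" looks like an ELF core dump, 1 otherwise.
is_elf_core() {
    # First 18 bytes as one hex string: e_ident (16 bytes) + e_type (2 bytes).
    hdr=$(od -An -tx1 -N18 "$1" 2>/dev/null | tr -d '[:space:]')
    case "$hdr" in
        # EI_DATA == 1: little-endian, so ET_CORE (4) is stored as 04 00.
        7f454c46??01????????????????????0400) return 0 ;;
        # EI_DATA == 2: big-endian, so ET_CORE (4) is stored as 00 04.
        7f454c46??02????????????????????0004) return 0 ;;
    esac
    return 1
}
```

A caller could filter the candidates found by name through this function and only report those that pass, so that unrelated files with a .core suffix are left alone.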

[1] https://github.com/DynamoRIO/dynamorio/issues/6126
[2] https://groups.google.com/g/kas-devel/c/-sEyujhICfw/m/1UtVfsDRAAAJ

Best regards,
Felix
Jan Kiszka March 12, 2024, 9:03 a.m. UTC | #3
On 12.03.24 09:31, Moessbauer, Felix (T CED OES-DE) wrote:
> The probably best thing we could do is to look for common coredump
> patterns and check with file if these are actual coredumps.
> 

Can't we read out in isar what the effective settings are and use them
at least? In addition to possibly adding some exceptions on a per-image
basis.
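
As a sketch of what reading the effective setting could look like: on Linux the pattern is exposed in /proc/sys/kernel/core_pattern, and per core(5) a leading "|" means the dump is piped to a helper program rather than written to a file named by the pattern. The function name classify_core_pattern and its output strings are made up for this illustration.

```shell
# Classify a kernel.core_pattern value (passed as "$1").
classify_core_pattern() {
    case "$1" in
        '')   echo "empty" ;;
        '|'*) echo "piped to helper: ${1#|}" ;;
        /*)   echo "absolute path template: $1" ;;
        *)    echo "relative to cwd: $1" ;;
    esac
}

# Read the effective setting if procfs provides it.
if [ -r /proc/sys/kernel/core_pattern ]; then
    classify_core_pattern "$(cat /proc/sys/kernel/core_pattern)"
fi
```

Only the last two cases describe files that could end up inside the rootfs; in the piped case the helper decides where the dump goes.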

Jan
Cedric Hombourger March 12, 2024, 9:10 a.m. UTC | #4
On Tue, 2024-03-12 at 10:03 +0100, Jan Kiszka wrote:
> Can't we read out in isar what the effective settings are and use
> them
> at least? In addition to possibly adding some exceptions on a per-
> image
> basis.

Isn't that error-prone when kernel.core_pattern is configured to use an
external program to generate the coredump?

IMO, Isar should not do anything with coredumps. If the build did crash
with a coredump, shouldn't we attempt to debug qemu and fix the issue
there instead of masking the issue for the sole purpose of having a
"reproducible" build? Or maybe people concerned with reproducible
builds should make sure they configure kernel.core_pattern correctly to
e.g. always place coredumps in /var/coredumps. To me, that code is
out of scope for Isar.

--
Cedric Hombourger
Siemens AG
http://www.siemens.com/
Uladzimir Bely March 22, 2024, 1:11 p.m. UTC | #5
Hello everyone.

The patch itself passed CI, but the discussion looks unfinished. If no
one objects, we are ready to merge it to next.

Best regards,
Uladzimir.
Uladzimir Bely March 26, 2024, 8:01 p.m. UTC | #6
Applied to next, thanks.

Patch

diff --git a/meta/classes/image.bbclass b/meta/classes/image.bbclass
index 73f1d52c..793c21a2 100644
--- a/meta/classes/image.bbclass
+++ b/meta/classes/image.bbclass
@@ -457,7 +457,7 @@  EOSUDO
 
     # Sometimes qemu-user-static generates coredumps in chroot, move them
     # to work temporary directory and inform user about it.
-    for f in $(sudo find ${ROOTFSDIR} -name *.core); do
+    for f in $(sudo find ${ROOTFSDIR} -type f -name *.core); do
         sudo mv "${f}" "${WORKDIR}/temp/"
         bbwarn "found core dump in rootfs, check it in ${WORKDIR}/temp/${f##*/}"
     done
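
Note that *.core is unquoted in the loop above, so the shell may expand it before find sees it, and the for loop splits the results on whitespace. Below is a hedged sketch of a variant that avoids both, assuming the same intent. move_core_files is a hypothetical helper, and sudo and bbwarn are left out so that it runs outside BitBake.

```shell
# Move regular files named *.core from "$1" (rootfs) to "$2" (destination).
move_core_files() {
    # The quoted pattern is passed to find untouched; -exec hands each path
    # over as a single argument, so names containing spaces survive.
    find "$1" -type f -name '*.core' -exec sh -c '
        mv "$1" "$2/" && echo "found core dump, moved to $2/${1##*/}"
    ' sh {} "$2" \;
}
```

It would be called as move_core_files "${ROOTFSDIR}" "${WORKDIR}/temp" in place of the loop; whether that fits the surrounding function is not verified here.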