[0/5] allow creation of users/groups before rootfs creation

Message ID 20230125090156.284309-1-tobias.schaffner@siemens.com
Headers show
Series allow creation of users/groups before rootfs creation | expand

Message

Tobias Schaffner Jan. 25, 2023, 9:01 a.m. UTC
From: Tobias Schaffner <tobias.schaffner@siemens.com>

This patch series will allow to specify a `pre` flag for the USER_ and
GROUP_ bitbake variables. If this flag is set to `true` the given user
or group will be created in the rootfs configuration step instead of on
rootfs postprocessing. This is helpful when a specific id should be used
which would otherwise be picked by a user or group created by one of the
installed packages.

A rewrite of the image-account-extension in python was done on the way.
This allows us to drop a lot of encoding and parsing code that was used
to transition to shell and therefore made it easier to read and maintain.

Using python functions for more complex tasks allows us the usage of
unittests. A very basic infrastructure for unittesting using the build
in python unittest and the bb.parse module was added. This was used to
test the re-implementation of the image-account-extension as a first
showcase.

Tobias Schaffner (5):
  simplify image-account-extension
  allow creation of users/groups before rootfs creation
  create a minimal python unittest infrastructure
  add unittests for the image-account-extension
  set minimal python version in user_manual to 3.5

 doc/user_manual.md                            |   4 +-
 meta/classes/image-account-extension.bbclass  | 391 +++++++-----------
 testsuite/unittests/README.md                 |  28 ++
 testsuite/unittests/bitbake.py                |  37 ++
 testsuite/unittests/rootfs.py                 |  45 ++
 .../unittests/test_image_account_extension.py | 175 ++++++++
 6 files changed, 434 insertions(+), 246 deletions(-)
 create mode 100644 testsuite/unittests/README.md
 create mode 100644 testsuite/unittests/bitbake.py
 create mode 100644 testsuite/unittests/rootfs.py
 create mode 100644 testsuite/unittests/test_image_account_extension.py

Comments

Henning Schild Jan. 25, 2023, 1:29 p.m. UTC | #1
Am Wed, 25 Jan 2023 10:01:51 +0100
schrieb "T. Schaffner" <tobias.schaffner@siemens.com>:

> From: Tobias Schaffner <tobias.schaffner@siemens.com>
> 
> This patch series will allow to specify a `pre` flag for the USER_ and
> GROUP_ bitbake variables. If this flag is set to `true` the given user
> or group will be created in the rootfs configuration step instead of
> on rootfs postprocessing. This is helpful when a specific id should
> be used which would otherwise be picked by a user or group created by
> one of the installed packages.

While i do understand the reason i am not sure how relevant that is.
Why would anything only function with a fixed ID? Whoever provided that
thing should maybe fix it.

So i am willing to say that this is super-niche! And it deserves a
niche-solution in its layer, not a feature in Isar.

You could hook in a task between bootstrap and image_install. Or you
could rebuild a bootstrap package to have reserved ids. You could run
"the thing" in namespaces ...

So is that really relevant? Please go into detail.

Whatever happens i think the python rewrite is cool. But the code may
have been coming/inspired from OE ... in which case it would not be
cool, because it would fork away further.

Henning

> A rewrite of the image-account-extension in python was done on the
> way. This allows us to drop a lot of encoding and parsing code that
> was used to transition to shell and therefore made it easier to read
> and maintain.
> 
> Using python functions for more complex tasks allows us the usage of
> unittests. A very basic infrastructure for unittesting using the build
> in python unittest and the bb.parse module was added. This was used to
> test the re-implementation of the image-account-extension as a first
> showcase.
> 
> Tobias Schaffner (5):
>   simplify image-account-extension
>   allow creation of users/groups before rootfs creation
>   create a minimal python unittest infrastructure
>   add unittests for the image-account-extension
>   set minimal python version in user_manual to 3.5
> 
>  doc/user_manual.md                            |   4 +-
>  meta/classes/image-account-extension.bbclass  | 391
> +++++++----------- testsuite/unittests/README.md                 |
> 28 ++ testsuite/unittests/bitbake.py                |  37 ++
>  testsuite/unittests/rootfs.py                 |  45 ++
>  .../unittests/test_image_account_extension.py | 175 ++++++++
>  6 files changed, 434 insertions(+), 246 deletions(-)
>  create mode 100644 testsuite/unittests/README.md
>  create mode 100644 testsuite/unittests/bitbake.py
>  create mode 100644 testsuite/unittests/rootfs.py
>  create mode 100644
> testsuite/unittests/test_image_account_extension.py
>
Quirin Gylstorff Jan. 25, 2023, 1:44 p.m. UTC | #2
On 1/25/23 14:29, Henning Schild wrote:
> Am Wed, 25 Jan 2023 10:01:51 +0100
> schrieb "T. Schaffner" <tobias.schaffner@siemens.com>:
> 
>> From: Tobias Schaffner <tobias.schaffner@siemens.com>
>>
>> This patch series will allow to specify a `pre` flag for the USER_ and
>> GROUP_ bitbake variables. If this flag is set to `true` the given user
>> or group will be created in the rootfs configuration step instead of
>> on rootfs postprocessing. This is helpful when a specific id should
>> be used which would otherwise be picked by a user or group created by
>> one of the installed packages.
> 
> While i do understand the reason i am not sure how relevant that is.
> Why would anything only function with a fixed ID? Whoever provided that
> thing should maybe fix it.

Specific IDs are necessary for Updating an A/B rootfs system with a 
persistent partition. In this case you do not want to change any IDs as
this can lead to wrong file permissions.
> 
> So i am willing to say that this is super-niche! And it deserves a
> niche-solution in its layer, not a feature in Isar.
> 
> You could hook in a task between bootstrap and image_install. Or you
> could rebuild a bootstrap package to have reserved ids. You could run
> "the thing" in namespaces ...
> 
> So is that really relevant? Please go into detail.
> 
> Whatever happens i think the python rewrite is cool. But the code may
> have been coming/inspired from OE ... in which case it would not be
> cool, because it would fork away further.


OE uses a complete different implementation than to original:
https://git.openembedded.org/openembedded-core/tree/meta/classes/useradd.bbclass

see also:

https://git.openembedded.org/openembedded-core/tree/meta-skeleton/recipes-skeleton/useradd/useradd-example.bb 



Quirin
> 
> Henning
> 
>> A rewrite of the image-account-extension in python was done on the
>> way. This allows us to drop a lot of encoding and parsing code that
>> was used to transition to shell and therefore made it easier to read
>> and maintain.
>>
>> Using python functions for more complex tasks allows us the usage of
>> unittests. A very basic infrastructure for unittesting using the build
>> in python unittest and the bb.parse module was added. This was used to
>> test the re-implementation of the image-account-extension as a first
>> showcase.
>>
>> Tobias Schaffner (5):
>>    simplify image-account-extension
>>    allow creation of users/groups before rootfs creation
>>    create a minimal python unittest infrastructure
>>    add unittests for the image-account-extension
>>    set minimal python version in user_manual to 3.5
>>
>>   doc/user_manual.md                            |   4 +-
>>   meta/classes/image-account-extension.bbclass  | 391
>> +++++++----------- testsuite/unittests/README.md                 |
>> 28 ++ testsuite/unittests/bitbake.py                |  37 ++
>>   testsuite/unittests/rootfs.py                 |  45 ++
>>   .../unittests/test_image_account_extension.py | 175 ++++++++
>>   6 files changed, 434 insertions(+), 246 deletions(-)
>>   create mode 100644 testsuite/unittests/README.md
>>   create mode 100644 testsuite/unittests/bitbake.py
>>   create mode 100644 testsuite/unittests/rootfs.py
>>   create mode 100644
>> testsuite/unittests/test_image_account_extension.py
>>
>
Henning Schild Jan. 25, 2023, 4:29 p.m. UTC | #3
Am Wed, 25 Jan 2023 14:44:40 +0100
schrieb Gylstorff Quirin <quirin.gylstorff@siemens.com>:

> On 1/25/23 14:29, Henning Schild wrote:
> > Am Wed, 25 Jan 2023 10:01:51 +0100
> > schrieb "T. Schaffner" <tobias.schaffner@siemens.com>:
> >   
> >> From: Tobias Schaffner <tobias.schaffner@siemens.com>
> >>
> >> This patch series will allow to specify a `pre` flag for the USER_
> >> and GROUP_ bitbake variables. If this flag is set to `true` the
> >> given user or group will be created in the rootfs configuration
> >> step instead of on rootfs postprocessing. This is helpful when a
> >> specific id should be used which would otherwise be picked by a
> >> user or group created by one of the installed packages.  
> > 
> > While i do understand the reason i am not sure how relevant that is.
> > Why would anything only function with a fixed ID? Whoever provided
> > that thing should maybe fix it.  
> 
> Specific IDs are necessary for Updating an A/B rootfs system with a 
> persistent partition. In this case you do not want to change any IDs
> as this can lead to wrong file permissions.

I see, that is much more understandable. It also goes into the
reproducible build direction.
But if that is the case the solution does not seem to be good enough
because it only considers users/groups created by isar. Not the ones
created by installing debian packages. Where the order could be
potentially random. Say you DEBIAN_DEPENDS or IMAGE_PREINSTALL "ftpd
wwwd" which will craete users "ftp www" where the two deamons do not
have any dep on each other and apt-get could install them in any order.
That order might in reality not change too often but it could i.e. when
you move from debian11 to debian12 or when you bring the third (or
10th) user-adding package into your new firmware.

So what you maybe really want is giving isar an /etc/passwd /etc/group
pair. Every new firmware is build with the given layer code and that
file-pair from the first release. Where you inject those files between
bootstrap and install ... hoping that bootstrap will always be the same.
Maybe one can inject before bootstrap ... or write a postproc function
that will find all different ids and all files and fix up. Or at least
start with an assertion in postproc that looks at the old database.

Is the problem of uid/gid depend on install order known in the debian
community and how do others solve it? Gentoo has moved from such
dynamic allocation to static some years ago, probably for similar
reasons.

Henning

> > 
> > So i am willing to say that this is super-niche! And it deserves a
> > niche-solution in its layer, not a feature in Isar.
> > 
> > You could hook in a task between bootstrap and image_install. Or you
> > could rebuild a bootstrap package to have reserved ids. You could
> > run "the thing" in namespaces ...
> > 
> > So is that really relevant? Please go into detail.
> > 
> > Whatever happens i think the python rewrite is cool. But the code
> > may have been coming/inspired from OE ... in which case it would
> > not be cool, because it would fork away further.  
> 
> 
> OE uses a complete different implementation than to original:
> https://git.openembedded.org/openembedded-core/tree/meta/classes/useradd.bbclass
> 
> see also:
> 
> https://git.openembedded.org/openembedded-core/tree/meta-skeleton/recipes-skeleton/useradd/useradd-example.bb 
> 
> 
> 
> Quirin
> > 
> > Henning
> >   
> >> A rewrite of the image-account-extension in python was done on the
> >> way. This allows us to drop a lot of encoding and parsing code that
> >> was used to transition to shell and therefore made it easier to
> >> read and maintain.
> >>
> >> Using python functions for more complex tasks allows us the usage
> >> of unittests. A very basic infrastructure for unittesting using
> >> the build in python unittest and the bb.parse module was added.
> >> This was used to test the re-implementation of the
> >> image-account-extension as a first showcase.
> >>
> >> Tobias Schaffner (5):
> >>    simplify image-account-extension
> >>    allow creation of users/groups before rootfs creation
> >>    create a minimal python unittest infrastructure
> >>    add unittests for the image-account-extension
> >>    set minimal python version in user_manual to 3.5
> >>
> >>   doc/user_manual.md                            |   4 +-
> >>   meta/classes/image-account-extension.bbclass  | 391
> >> +++++++----------- testsuite/unittests/README.md                 |
> >> 28 ++ testsuite/unittests/bitbake.py                |  37 ++
> >>   testsuite/unittests/rootfs.py                 |  45 ++
> >>   .../unittests/test_image_account_extension.py | 175 ++++++++
> >>   6 files changed, 434 insertions(+), 246 deletions(-)
> >>   create mode 100644 testsuite/unittests/README.md
> >>   create mode 100644 testsuite/unittests/bitbake.py
> >>   create mode 100644 testsuite/unittests/rootfs.py
> >>   create mode 100644
> >> testsuite/unittests/test_image_account_extension.py
> >>  
> >   
> 
> 
> 
>
Tobias Schaffner Jan. 25, 2023, 8:55 p.m. UTC | #4
On 25.01.23 17:29, Schild, Henning (T CED SES-DE) wrote:
> Am Wed, 25 Jan 2023 14:44:40 +0100
> schrieb Gylstorff Quirin <quirin.gylstorff@siemens.com>:
> 
>> On 1/25/23 14:29, Henning Schild wrote:
>>> Am Wed, 25 Jan 2023 10:01:51 +0100
>>> schrieb "T. Schaffner" <tobias.schaffner@siemens.com>:
>>>    
>>>> From: Tobias Schaffner <tobias.schaffner@siemens.com>
>>>>
>>>> This patch series will allow to specify a `pre` flag for the USER_
>>>> and GROUP_ bitbake variables. If this flag is set to `true` the
>>>> given user or group will be created in the rootfs configuration
>>>> step instead of on rootfs postprocessing. This is helpful when a
>>>> specific id should be used which would otherwise be picked by a
>>>> user or group created by one of the installed packages.
>>>
>>> While i do understand the reason i am not sure how relevant that is.
>>> Why would anything only function with a fixed ID? Whoever provided
>>> that thing should maybe fix it.
>>
>> Specific IDs are necessary for Updating an A/B rootfs system with a
>> persistent partition. In this case you do not want to change any IDs
>> as this can lead to wrong file permissions.
> 
> I see, that is much more understandable. It also goes into the
> reproducible build direction.
> But if that is the case the solution does not seem to be good enough
> because it only considers users/groups created by isar. Not the ones
> created by installing debian packages. Where the order could be
> potentially random. Say you DEBIAN_DEPENDS or IMAGE_PREINSTALL "ftpd
> wwwd" which will craete users "ftp www" where the two deamons do not
> have any dep on each other and apt-get could install them in any order.
> That order might in reality not change too often but it could i.e. when
> you move from debian11 to debian12 or when you bring the third (or
> 10th) user-adding package into your new firmware.

Thanks for the review!

This series is not about reproducible builds. The IDs of the users and
groups that are created by debian packages are expected to change
between _image_ versions. So we expect these IDs to be different after
a swupdate.

For reproducible builds it is important that the ordering of installed
packages and their dependencies stay the same but that is a different
story. I expect the ordering algorithm for a specific set of packages
including dependencies to be deterministic but would have to look into
that in detail.

Still, if somebody had a requirement that a user or group of a package
(e.g. ftp or www) stays the same, one could use this change to create
the user beforehand and pin its id using this change.

> So what you maybe really want is giving isar an /etc/passwd /etc/group
> pair. Every new firmware is build with the given layer code and that
> file-pair from the first release. Where you inject those files between
> bootstrap and install ... hoping that bootstrap will always be the same.
> Maybe one can inject before bootstrap ... or write a postproc function
> that will find all different ids and all files and fix up. Or at least
> start with an assertion in postproc that looks at the old database.

Care that just adding /etc/passwd and /etc/group might not be enough but
you would have to handle the side effects of a useradd/groupadd call
properly (E.g. creating the home if set). And I expect more things to
come up if you have a detailed look.

Take into account that the specific patch that introduces the pre flag
is small, easy to maintain and configure.

> Is the problem of uid/gid depend on install order known in the debian
> community and how do others solve it? Gentoo has moved from such
> dynamic allocation to static some years ago, probably for similar
> reasons.

I am open to discuss this with the reproducible build guys but again
IMHO this discussion belongs into another thread.

> Henning
> 
>>>
>>> So i am willing to say that this is super-niche! And it deserves a
>>> niche-solution in its layer, not a feature in Isar.
>>>
>>> You could hook in a task between bootstrap and image_install. Or you
>>> could rebuild a bootstrap package to have reserved ids. You could
>>> run "the thing" in namespaces ...
>>>
>>> So is that really relevant? Please go into detail.
>>>
>>> Whatever happens i think the python rewrite is cool. But the code
>>> may have been coming/inspired from OE ... in which case it would
>>> not be cool, because it would fork away further.
>>
>>
>> OE uses a complete different implementation than to original:
>> https://git.openembedded.org/openembedded-core/tree/meta/classes/useradd.bbclass
>>
>> see also:
>>
>> https://git.openembedded.org/openembedded-core/tree/meta-skeleton/recipes-skeleton/useradd/useradd-example.bb
>>
>>
>>
>> Quirin
>>>
>>> Henning
>>>    
>>>> A rewrite of the image-account-extension in python was done on the
>>>> way. This allows us to drop a lot of encoding and parsing code that
>>>> was used to transition to shell and therefore made it easier to
>>>> read and maintain.
>>>>
>>>> Using python functions for more complex tasks allows us the usage
>>>> of unittests. A very basic infrastructure for unittesting using
>>>> the build in python unittest and the bb.parse module was added.
>>>> This was used to test the re-implementation of the
>>>> image-account-extension as a first showcase.
>>>>
>>>> Tobias Schaffner (5):
>>>>     simplify image-account-extension
>>>>     allow creation of users/groups before rootfs creation
>>>>     create a minimal python unittest infrastructure
>>>>     add unittests for the image-account-extension
>>>>     set minimal python version in user_manual to 3.5
>>>>
>>>>    doc/user_manual.md                            |   4 +-
>>>>    meta/classes/image-account-extension.bbclass  | 391
>>>> +++++++----------- testsuite/unittests/README.md                 |
>>>> 28 ++ testsuite/unittests/bitbake.py                |  37 ++
>>>>    testsuite/unittests/rootfs.py                 |  45 ++
>>>>    .../unittests/test_image_account_extension.py | 175 ++++++++
>>>>    6 files changed, 434 insertions(+), 246 deletions(-)
>>>>    create mode 100644 testsuite/unittests/README.md
>>>>    create mode 100644 testsuite/unittests/bitbake.py
>>>>    create mode 100644 testsuite/unittests/rootfs.py
>>>>    create mode 100644
>>>> testsuite/unittests/test_image_account_extension.py
>>>>   
>>>    
>>
>>
>>
>>
>
Henning Schild Jan. 25, 2023, 9:38 p.m. UTC | #5
Am Wed, 25 Jan 2023 21:55:00 +0100
schrieb "Schaffner, Tobias (T CED SES-DE)"
<tobias.schaffner@siemens.com>:

> On 25.01.23 17:29, Schild, Henning (T CED SES-DE) wrote:
> > Am Wed, 25 Jan 2023 14:44:40 +0100
> > schrieb Gylstorff Quirin <quirin.gylstorff@siemens.com>:
> >   
> >> On 1/25/23 14:29, Henning Schild wrote:  
> >>> Am Wed, 25 Jan 2023 10:01:51 +0100
> >>> schrieb "T. Schaffner" <tobias.schaffner@siemens.com>:
> >>>      
> >>>> From: Tobias Schaffner <tobias.schaffner@siemens.com>
> >>>>
> >>>> This patch series will allow to specify a `pre` flag for the
> >>>> USER_ and GROUP_ bitbake variables. If this flag is set to
> >>>> `true` the given user or group will be created in the rootfs
> >>>> configuration step instead of on rootfs postprocessing. This is
> >>>> helpful when a specific id should be used which would otherwise
> >>>> be picked by a user or group created by one of the installed
> >>>> packages.  
> >>>
> >>> While i do understand the reason i am not sure how relevant that
> >>> is. Why would anything only function with a fixed ID? Whoever
> >>> provided that thing should maybe fix it.  
> >>
> >> Specific IDs are necessary for Updating an A/B rootfs system with a
> >> persistent partition. In this case you do not want to change any
> >> IDs as this can lead to wrong file permissions.  
> > 
> > I see, that is much more understandable. It also goes into the
> > reproducible build direction.
> > But if that is the case the solution does not seem to be good enough
> > because it only considers users/groups created by isar. Not the ones
> > created by installing debian packages. Where the order could be
> > potentially random. Say you DEBIAN_DEPENDS or IMAGE_PREINSTALL "ftpd
> > wwwd" which will craete users "ftp www" where the two deamons do not
> > have any dep on each other and apt-get could install them in any
> > order. That order might in reality not change too often but it
> > could i.e. when you move from debian11 to debian12 or when you
> > bring the third (or 10th) user-adding package into your new
> > firmware.  
> 
> Thanks for the review!
> 
> This series is not about reproducible builds. The IDs of the users and
> groups that are created by debian packages are expected to change
> between _image_ versions. So we expect these IDs to be different after
> a swupdate.

But your initial argument was that you have to create a user before any
debian package comes, or did i misunderstand? Maybe we need to
differentiate between upstream packages and isar-created ones?

> For reproducible builds it is important that the ordering of installed
> packages and their dependencies stay the same but that is a different
> story. I expect the ordering algorithm for a specific set of packages
> including dependencies to be deterministic but would have to look into
> that in detail.

I expect the algorithm to be non-deterministic ... potentially as you
progress using debian. I can just claim, you have to prove.

> Still, if somebody had a requirement that a user or group of a package
> (e.g. ftp or www) stays the same, one could use this change to create
> the user beforehand and pin its id using this change.

True. But we might still want to keep the old database to run
assertions so such things do not go unnoticed.

> > So what you maybe really want is giving isar an /etc/passwd
> > /etc/group pair. Every new firmware is build with the given layer
> > code and that file-pair from the first release. Where you inject
> > those files between bootstrap and install ... hoping that bootstrap
> > will always be the same. Maybe one can inject before bootstrap ...
> > or write a postproc function that will find all different ids and
> > all files and fix up. Or at least start with an assertion in
> > postproc that looks at the old database.  
> 
> Care that just adding /etc/passwd and /etc/group might not be enough
> but you would have to handle the side effects of a useradd/groupadd
> call properly (E.g. creating the home if set). And I expect more
> things to come up if you have a detailed look.

Yes. Not even want to think about LDAP or yp where the databases live
in different files.

> Take into account that the specific patch that introduces the pre flag
> is small, easy to maintain and configure.

It should still be motivated so that we later understand why we have
it. And i would not call it easy to configure if one does not
understand why ... people will "pre" randomly because they do not know
what they are doing. We moved that stuff from pre to post once, and i
can not help the feeling that people might want to want to mix pre and
post one day ... at which point it might become really un-easy.

> > Is the problem of uid/gid depend on install order known in the
> > debian community and how do others solve it? Gentoo has moved from
> > such dynamic allocation to static some years ago, probably for
> > similar reasons.  
> 
> I am open to discuss this with the reproducible build guys but again
> IMHO this discussion belongs into another thread.

IMHO not, the repro guys want full repro ... which we can happily leave
out when it comes to me. You want partial repro, which we should
discuss here.

I do not want to hold this back, just understand it and see if there
can be a better solution. Feel free to ignore me and merge if nobody
else asks questions.

Henning

> > Henning
> >   
> >>>
> >>> So i am willing to say that this is super-niche! And it deserves a
> >>> niche-solution in its layer, not a feature in Isar.
> >>>
> >>> You could hook in a task between bootstrap and image_install. Or
> >>> you could rebuild a bootstrap package to have reserved ids. You
> >>> could run "the thing" in namespaces ...
> >>>
> >>> So is that really relevant? Please go into detail.
> >>>
> >>> Whatever happens i think the python rewrite is cool. But the code
> >>> may have been coming/inspired from OE ... in which case it would
> >>> not be cool, because it would fork away further.  
> >>
> >>
> >> OE uses a complete different implementation than to original:
> >> https://git.openembedded.org/openembedded-core/tree/meta/classes/useradd.bbclass
> >>
> >> see also:
> >>
> >> https://git.openembedded.org/openembedded-core/tree/meta-skeleton/recipes-skeleton/useradd/useradd-example.bb
> >>
> >>
> >>
> >> Quirin  
> >>>
> >>> Henning
> >>>      
> >>>> A rewrite of the image-account-extension in python was done on
> >>>> the way. This allows us to drop a lot of encoding and parsing
> >>>> code that was used to transition to shell and therefore made it
> >>>> easier to read and maintain.
> >>>>
> >>>> Using python functions for more complex tasks allows us the usage
> >>>> of unittests. A very basic infrastructure for unittesting using
> >>>> the build in python unittest and the bb.parse module was added.
> >>>> This was used to test the re-implementation of the
> >>>> image-account-extension as a first showcase.
> >>>>
> >>>> Tobias Schaffner (5):
> >>>>     simplify image-account-extension
> >>>>     allow creation of users/groups before rootfs creation
> >>>>     create a minimal python unittest infrastructure
> >>>>     add unittests for the image-account-extension
> >>>>     set minimal python version in user_manual to 3.5
> >>>>
> >>>>    doc/user_manual.md                            |   4 +-
> >>>>    meta/classes/image-account-extension.bbclass  | 391
> >>>> +++++++----------- testsuite/unittests/README.md
> >>>> | 28 ++ testsuite/unittests/bitbake.py                |  37 ++
> >>>>    testsuite/unittests/rootfs.py                 |  45 ++
> >>>>    .../unittests/test_image_account_extension.py | 175 ++++++++
> >>>>    6 files changed, 434 insertions(+), 246 deletions(-)
> >>>>    create mode 100644 testsuite/unittests/README.md
> >>>>    create mode 100644 testsuite/unittests/bitbake.py
> >>>>    create mode 100644 testsuite/unittests/rootfs.py
> >>>>    create mode 100644
> >>>> testsuite/unittests/test_image_account_extension.py
> >>>>     
> >>>      
> >>
> >>
> >>
> >>  
> >
Florian Bezdeka Jan. 26, 2023, 8:48 a.m. UTC | #6
On Thu, 2023-01-26 at 08:21 +0000, Schaffner, Tobias wrote:
> On 25.01.23 22:38, Schild, Henning (T CED SES-DE) wrote:
> > Am Wed, 25 Jan 2023 21:55:00 +0100
> > schrieb "Schaffner, Tobias (T CED SES-DE)"
> > <tobias.schaffner@siemens.com>:
> > 
> > > On 25.01.23 17:29, Schild, Henning (T CED SES-DE) wrote:
> > > > Am Wed, 25 Jan 2023 14:44:40 +0100
> > > > schrieb Gylstorff Quirin <quirin.gylstorff@siemens.com>:
> > > >    
> > > > > On 1/25/23 14:29, Henning Schild wrote:
> > > > > > Am Wed, 25 Jan 2023 10:01:51 +0100
> > > > > > schrieb "T. Schaffner" <tobias.schaffner@siemens.com>:
> > > > > >       
> > > > > > > From: Tobias Schaffner <tobias.schaffner@siemens.com>
> > > > > > > 
> > > > > > > This patch series will allow to specify a `pre` flag for the
> > > > > > > USER_ and GROUP_ bitbake variables. If this flag is set to
> > > > > > > `true` the given user or group will be created in the rootfs
> > > > > > > configuration step instead of on rootfs postprocessing. This is
> > > > > > > helpful when a specific id should be used which would otherwise
> > > > > > > be picked by a user or group created by one of the installed
> > > > > > > packages.
> > > > > > 
> > > > > > While i do understand the reason i am not sure how relevant that
> > > > > > is. Why would anything only function with a fixed ID? Whoever
> > > > > > provided that thing should maybe fix it.
> > > > > 
> > > > > Specific IDs are necessary for Updating an A/B rootfs system with a
> > > > > persistent partition. In this case you do not want to change any
> > > > > IDs as this can lead to wrong file permissions.
> > > > 
> > > > I see, that is much more understandable. It also goes into the
> > > > reproducible build direction.
> > > > But if that is the case the solution does not seem to be good enough
> > > > because it only considers users/groups created by isar. Not the ones
> > > > created by installing debian packages. Where the order could be
> > > > potentially random. Say you DEBIAN_DEPENDS or IMAGE_PREINSTALL "ftpd
> > > > wwwd" which will craete users "ftp www" where the two deamons do not
> > > > have any dep on each other and apt-get could install them in any
> > > > order. That order might in reality not change too often but it
> > > > could i.e. when you move from debian11 to debian12 or when you
> > > > bring the third (or 10th) user-adding package into your new
> > > > firmware.
> > > 
> > > Thanks for the review!
> > > 
> > > This series is not about reproducible builds. The IDs of the users and
> > > groups that are created by debian packages are expected to change
> > > between _image_ versions. So we expect these IDs to be different after
> > > a swupdate.
> > 
> > But your initial argument was that you have to create a user before any
> > debian package comes, or did i misunderstand? Maybe we need to
> > differentiate between upstream packages and isar-created ones?
> 
> Lets make this clearer with an example:
> A downstream layer creates one user X that writes to a separate partition.
> This user gets the ID 1000 as there are no packages that create any users.
> 
> Now they want to create a image update that should be deployed.
> The updated image includes a new debian package that creates a user Y.
> The user Y of this package gets the ID 1000 as it is now the first user that
> is created.
> 
> User X will now get 1001 and it is not possible to change this. The data
> created on the separate partition belongs to user Y.
> 
> The only resort at the moment is to revert the patch that moved the user
> creation to the post processing.

Hm... I have the strong feeling that your approach is broken in the
same way...

I was questioning myself how Debian itself solves this problem. The
answer is: Debian has access to the "old" or "current" user database
while doing updates. So it can not happen that a user gets a different
UID. We would have to feed in the "current" user database during such
updates to do it the same way.

Why your approach is broken: Consider three image versions A, B and C.
There is no guarantee that someone will update from A to B and then to
C. A -> C is also possible.

Image A can add a user using your "pre" logic, B can remove that user,
C can introduce a new different user, but with same ID. That brings you
into the same situation.

As Isar can't tell if that is a dangerous scenario for all the possible
use cases it's a NACK security wise.


> 
> > > For reproducible builds it is important that the ordering of installed
> > > packages and their dependencies stay the same but that is a different
> > > story. I expect the ordering algorithm for a specific set of packages
> > > including dependencies to be deterministic but would have to look into
> > > that in detail.
> > 
> > I expect the algorithm to be non-deterministic ... potentially as you
> > progress using debian. I can just claim, you have to prove.
> 
>  From my point of view this problem derives from the fact that the set of
> installed packages will change on image updates. If the user Y is created
> in the first or in the last package does not matter. Its just the fact
> that an additional user gets created.
> 
> > > Still, if somebody had a requirement that a user or group of a package
> > > (e.g. ftp or www) stays the same, one could use this change to create
> > > the user beforehand and pin its id using this change.
> > 
> > True. But we might still want to keep the old database to run
> > assertions so such things do not go unnoticed.
> > 
> > > > So what you maybe really want is giving isar an /etc/passwd
> > > > /etc/group pair. Every new firmware is build with the given layer
> > > > code and that file-pair from the first release. Where you inject
> > > > those files between bootstrap and install ... hoping that bootstrap
> > > > will always be the same. Maybe one can inject before bootstrap ...
> > > > or write a postproc function that will find all different ids and
> > > > all files and fix up. Or at least start with an assertion in
> > > > postproc that looks at the old database.
> > > 
> > > Care that just adding /etc/passwd and /etc/group might not be enough
> > > but you would have to handle the side effects of a useradd/groupadd
> > > call properly (E.g. creating the home if set). And I expect more
> > > things to come up if you have a detailed look.
> > 
> > Yes. Not even want to think about LDAP or yp where the databases live
> > in different files.
> > 
> > > Take into account that the specific patch that introduces the pre flag
> > > is small, easy to maintain and configure.
> > 
> > It should still be motivated so that we later understand why we have
> > it. And i would not call it easy to configure if one does not
> > understand why ... people will "pre" randomly because they do not know
> > what they are doing. We moved that stuff from pre to post once, and i
> > can not help the feeling that people might want to want to mix pre and
> > post one day ... at which point it might become really un-easy.
> > 
> > > > Is the problem of uid/gid depend on install order known in the
> > > > debian community and how do others solve it? Gentoo has moved from
> > > > such dynamic allocation to static some years ago, probably for
> > > > similar reasons.
> > > 
> > > I am open to discuss this with the reproducible build guys but again
> > > IMHO this discussion belongs into another thread.
> > 
> > IMHO not, the repro guys want full repro ... which we can happily leave
> > out when it comes to me. You want partial repro, which we should
> > discuss here.
> 
> I think you may have a valid point that a non deterministic ordering on
> package install may be a problem for reproducable builds. Not only when
> it comes to users but also e.g. every time two packages append to the same
> file.
> 
> > I do not want to hold this back, just understand it and see if there
> > can be a better solution. Feel free to ignore me and merge if nobody
> > else asks questions.
> 
> I am happy to discuss this, I just think we are mixing things up. But
> maybe I am missing something.
> 
> Best!
> 
> > Henning
> > 
> > > > Henning
> > > >    
> > > > > > 
> > > > > > So i am willing to say that this is super-niche! And it deserves a
> > > > > > niche-solution in its layer, not a feature in Isar.
> > > > > > 
> > > > > > You could hook in a task between bootstrap and image_install. Or
> > > > > > you could rebuild a bootstrap package to have reserved ids. You
> > > > > > could run "the thing" in namespaces ...
> > > > > > 
> > > > > > So is that really relevant? Please go into detail.
> > > > > > 
> > > > > > Whatever happens i think the python rewrite is cool. But the code
> > > > > > may have been coming/inspired from OE ... in which case it would
> > > > > > not be cool, because it would fork away further.
> > > > > 
> > > > > 
> > > > > OE uses a complete different implementation than to original:
> > > > > https://git.openembedded.org/openembedded-core/tree/meta/classes/useradd.bbclass
> > > > > 
> > > > > see also:
> > > > > 
> > > > > https://git.openembedded.org/openembedded-core/tree/meta-skeleton/recipes-skeleton/useradd/useradd-example.bb
> > > > > 
> > > > > 
> > > > > 
> > > > > Quirin
> > > > > > 
> > > > > > Henning
> > > > > >       
> > > > > > > A rewrite of the image-account-extension in python was done on
> > > > > > > the way. This allows us to drop a lot of encoding and parsing
> > > > > > > code that was used to transition to shell and therefore made it
> > > > > > > easier to read and maintain.
> > > > > > > 
> > > > > > > Using python functions for more complex tasks allows us the usage
> > > > > > > of unittests. A very basic infrastructure for unittesting using
> > > > > > > the build in python unittest and the bb.parse module was added.
> > > > > > > This was used to test the re-implementation of the
> > > > > > > image-account-extension as a first showcase.
> > > > > > > 
> > > > > > > Tobias Schaffner (5):
> > > > > > >      simplify image-account-extension
> > > > > > >      allow creation of users/groups before rootfs creation
> > > > > > >      create a minimal python unittest infrastructure
> > > > > > >      add unittests for the image-account-extension
> > > > > > >      set minimal python version in user_manual to 3.5
> > > > > > > 
> > > > > > >     doc/user_manual.md                            |   4 +-
> > > > > > >     meta/classes/image-account-extension.bbclass  | 391
> > > > > > > +++++++----------- testsuite/unittests/README.md
> > > > > > > > 28 ++ testsuite/unittests/bitbake.py                |  37 ++
> > > > > > >     testsuite/unittests/rootfs.py                 |  45 ++
> > > > > > >     .../unittests/test_image_account_extension.py | 175 ++++++++
> > > > > > >     6 files changed, 434 insertions(+), 246 deletions(-)
> > > > > > >     create mode 100644 testsuite/unittests/README.md
> > > > > > >     create mode 100644 testsuite/unittests/bitbake.py
> > > > > > >     create mode 100644 testsuite/unittests/rootfs.py
> > > > > > >     create mode 100644
> > > > > > > testsuite/unittests/test_image_account_extension.py
> > > > > > >      
> > > > > >       
> > > > > 
> > > > > 
> > > > > 
> > > > >   
> > > >    
> > 
>
Henning Schild Jan. 26, 2023, 10:27 a.m. UTC | #7
Am Thu, 26 Jan 2023 09:48:06 +0100
schrieb Florian Bezdeka <florian.bezdeka@siemens.com>:

> On Thu, 2023-01-26 at 08:21 +0000, Schaffner, Tobias wrote:
> > On 25.01.23 22:38, Schild, Henning (T CED SES-DE) wrote:  
> > > Am Wed, 25 Jan 2023 21:55:00 +0100
> > > schrieb "Schaffner, Tobias (T CED SES-DE)"
> > > <tobias.schaffner@siemens.com>:
> > >   
> > > > On 25.01.23 17:29, Schild, Henning (T CED SES-DE) wrote:  
> > > > > Am Wed, 25 Jan 2023 14:44:40 +0100
> > > > > schrieb Gylstorff Quirin <quirin.gylstorff@siemens.com>:
> > > > >      
> > > > > > On 1/25/23 14:29, Henning Schild wrote:  
> > > > > > > Am Wed, 25 Jan 2023 10:01:51 +0100
> > > > > > > schrieb "T. Schaffner" <tobias.schaffner@siemens.com>:
> > > > > > >         
> > > > > > > > From: Tobias Schaffner <tobias.schaffner@siemens.com>
> > > > > > > > 
> > > > > > > > This patch series will allow to specify a `pre` flag
> > > > > > > > for the USER_ and GROUP_ bitbake variables. If this
> > > > > > > > flag is set to `true` the given user or group will be
> > > > > > > > created in the rootfs configuration step instead of on
> > > > > > > > rootfs postprocessing. This is helpful when a specific
> > > > > > > > id should be used which would otherwise be picked by a
> > > > > > > > user or group created by one of the installed packages.
> > > > > > > >  
> > > > > > > 
> > > > > > > While i do understand the reason i am not sure how
> > > > > > > relevant that is. Why would anything only function with a
> > > > > > > fixed ID? Whoever provided that thing should maybe fix
> > > > > > > it.  
> > > > > > 
> > > > > > Specific IDs are necessary for Updating an A/B rootfs
> > > > > > system with a persistent partition. In this case you do not
> > > > > > want to change any IDs as this can lead to wrong file
> > > > > > permissions.  
> > > > > 
> > > > > I see, that is much more understandable. It also goes into the
> > > > > reproducible build direction.
> > > > > But if that is the case the solution does not seem to be good
> > > > > enough because it only considers users/groups created by
> > > > > isar. Not the ones created by installing debian packages.
> > > > > Where the order could be potentially random. Say you
> > > > > DEBIAN_DEPENDS or IMAGE_PREINSTALL "ftpd wwwd" which will
> > > > > craete users "ftp www" where the two deamons do not have any
> > > > > dep on each other and apt-get could install them in any
> > > > > order. That order might in reality not change too often but
> > > > > it could i.e. when you move from debian11 to debian12 or when
> > > > > you bring the third (or 10th) user-adding package into your
> > > > > new firmware.  
> > > > 
> > > > Thanks for the review!
> > > > 
> > > > This series is not about reproducible builds. The IDs of the
> > > > users and groups that are created by debian packages are
> > > > expected to change between _image_ versions. So we expect these
> > > > IDs to be different after a swupdate.  
> > > 
> > > But your initial argument was that you have to create a user
> > > before any debian package comes, or did i misunderstand? Maybe we
> > > need to differentiate between upstream packages and isar-created
> > > ones?  
> > 
> > Lets make this clearer with an example:
> > A downstream layer creates one user X that writes to a separate
> > partition. This user gets the ID 1000 as there are no packages that
> > create any users.
> > 
> > Now they want to create a image update that should be deployed.
> > The updated image includes a new debian package that creates a user
> > Y. The user Y of this package gets the ID 1000 as it is now the
> > first user that is created.
> > 
> > User X will now get 1001 and it is not possible to change this. The
> > data created on the separate partition belongs to user Y.
> > 
> > The only resort at the moment is to revert the patch that moved the
> > user creation to the post processing.  
> 
> Hm... I have the strong feeling that your approach is broken in the
> same way...
> 
> I was questioning myself how Debian itself solves this problem. The
> answer is: Debian has access to the "old" or "current" user database
> while doing updates. So it can not happen that a user gets a different
> UID. We would have to feed in the "current" user database during such
> updates to do it the same way.
> 
> Why your approach is broken: Consider three image versions A, B and C.
> There is no guarantee that someone will update from A to B and then to
> C. A -> C is also possible.
> 
> Image A can add a user using your "pre" logic, B can remove that user,
> C can introduce a new different user, but with same ID. That brings
> you into the same situation.

Whether a firmware update supports skipping intermediates is up to
whoever builds a product. Interface changing releases should always
support old and new interface. And at some point one can drop the old.

The most trivial/conservative implementation would be to make any
update a stop-release which has to be taken. That greatly reduces the
testing effort and you have to think less about interfaces. But it can
lead to a bad user experience if you have to update too often because
you can not just jump.

Henning

> As Isar can't tell if that is a dangerous scenario for all the
> possible use cases it's a NACK security wise.
> 
> 
> >   
> > > > For reproducible builds it is important that the ordering of
> > > > installed packages and their dependencies stay the same but
> > > > that is a different story. I expect the ordering algorithm for
> > > > a specific set of packages including dependencies to be
> > > > deterministic but would have to look into that in detail.  
> > > 
> > > I expect the algorithm to be non-deterministic ... potentially as
> > > you progress using debian. I can just claim, you have to prove.  
> > 
> >  From my point of view this problem derives from the fact that the
> > set of installed packages will change on image updates. If the user
> > Y is created in the first or in the last package does not matter.
> > Its just the fact that an additional user gets created.
> >   
> > > > Still, if somebody had a requirement that a user or group of a
> > > > package (e.g. ftp or www) stays the same, one could use this
> > > > change to create the user beforehand and pin its id using this
> > > > change.  
> > > 
> > > True. But we might still want to keep the old database to run
> > > assertions so such things do not go unnoticed.
> > >   
> > > > > So what you maybe really want is giving isar an /etc/passwd
> > > > > /etc/group pair. Every new firmware is build with the given
> > > > > layer code and that file-pair from the first release. Where
> > > > > you inject those files between bootstrap and install ...
> > > > > hoping that bootstrap will always be the same. Maybe one can
> > > > > inject before bootstrap ... or write a postproc function that
> > > > > will find all different ids and all files and fix up. Or at
> > > > > least start with an assertion in postproc that looks at the
> > > > > old database.  
> > > > 
> > > > Care that just adding /etc/passwd and /etc/group might not be
> > > > enough but you would have to handle the side effects of a
> > > > useradd/groupadd call properly (E.g. creating the home if set).
> > > > And I expect more things to come up if you have a detailed
> > > > look.  
> > > 
> > > Yes. Not even want to think about LDAP or yp where the databases
> > > live in different files.
> > >   
> > > > Take into account that the specific patch that introduces the
> > > > pre flag is small, easy to maintain and configure.  
> > > 
> > > It should still be motivated so that we later understand why we
> > > have it. And i would not call it easy to configure if one does not
> > > understand why ... people will "pre" randomly because they do not
> > > know what they are doing. We moved that stuff from pre to post
> > > once, and i can not help the feeling that people might want to
> > > want to mix pre and post one day ... at which point it might
> > > become really un-easy. 
> > > > > Is the problem of uid/gid depend on install order known in the
> > > > > debian community and how do others solve it? Gentoo has moved
> > > > > from such dynamic allocation to static some years ago,
> > > > > probably for similar reasons.  
> > > > 
> > > > I am open to discuss this with the reproducible build guys but
> > > > again IMHO this discussion belongs into another thread.  
> > > 
> > > IMHO not, the repro guys want full repro ... which we can happily
> > > leave out when it comes to me. You want partial repro, which we
> > > should discuss here.  
> > 
> > I think you may have a valid point that a non deterministic
> > ordering on package install may be a problem for reproducable
> > builds. Not only when it comes to users but also e.g. every time
> > two packages append to the same file.
> >   
> > > I do not want to hold this back, just understand it and see if
> > > there can be a better solution. Feel free to ignore me and merge
> > > if nobody else asks questions.  
> > 
> > I am happy to discuss this, I just think we are mixing things up.
> > But maybe I am missing something.
> > 
> > Best!
> >   
> > > Henning
> > >   
> > > > > Henning
> > > > >      
> > > > > > > 
> > > > > > > So i am willing to say that this is super-niche! And it
> > > > > > > deserves a niche-solution in its layer, not a feature in
> > > > > > > Isar.
> > > > > > > 
> > > > > > > You could hook in a task between bootstrap and
> > > > > > > image_install. Or you could rebuild a bootstrap package
> > > > > > > to have reserved ids. You could run "the thing" in
> > > > > > > namespaces ...
> > > > > > > 
> > > > > > > So is that really relevant? Please go into detail.
> > > > > > > 
> > > > > > > Whatever happens i think the python rewrite is cool. But
> > > > > > > the code may have been coming/inspired from OE ... in
> > > > > > > which case it would not be cool, because it would fork
> > > > > > > away further.  
> > > > > > 
> > > > > > 
> > > > > > OE uses a complete different implementation than to
> > > > > > original:
> > > > > > https://git.openembedded.org/openembedded-core/tree/meta/classes/useradd.bbclass
> > > > > > 
> > > > > > see also:
> > > > > > 
> > > > > > https://git.openembedded.org/openembedded-core/tree/meta-skeleton/recipes-skeleton/useradd/useradd-example.bb
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > Quirin  
> > > > > > > 
> > > > > > > Henning
> > > > > > >         
> > > > > > > > A rewrite of the image-account-extension in python was
> > > > > > > > done on the way. This allows us to drop a lot of
> > > > > > > > encoding and parsing code that was used to transition
> > > > > > > > to shell and therefore made it easier to read and
> > > > > > > > maintain.
> > > > > > > > 
> > > > > > > > Using python functions for more complex tasks allows us
> > > > > > > > the usage of unittests. A very basic infrastructure for
> > > > > > > > unittesting using the build in python unittest and the
> > > > > > > > bb.parse module was added. This was used to test the
> > > > > > > > re-implementation of the image-account-extension as a
> > > > > > > > first showcase.
> > > > > > > > 
> > > > > > > > Tobias Schaffner (5):
> > > > > > > >      simplify image-account-extension
> > > > > > > >      allow creation of users/groups before rootfs
> > > > > > > > creation create a minimal python unittest infrastructure
> > > > > > > >      add unittests for the image-account-extension
> > > > > > > >      set minimal python version in user_manual to 3.5
> > > > > > > > 
> > > > > > > >     doc/user_manual.md                            |   4
> > > > > > > > +- meta/classes/image-account-extension.bbclass  | 391
> > > > > > > > +++++++----------- testsuite/unittests/README.md  
> > > > > > > > > 28 ++ testsuite/unittests/bitbake.py                |
> > > > > > > > >  37 ++  
> > > > > > > >     testsuite/unittests/rootfs.py                 |  45
> > > > > > > > ++ .../unittests/test_image_account_extension.py | 175
> > > > > > > > ++++++++ 6 files changed, 434 insertions(+), 246
> > > > > > > > deletions(-) create mode 100644
> > > > > > > > testsuite/unittests/README.md create mode 100644
> > > > > > > > testsuite/unittests/bitbake.py create mode 100644
> > > > > > > > testsuite/unittests/rootfs.py create mode 100644
> > > > > > > > testsuite/unittests/test_image_account_extension.py
> > > > > > > >        
> > > > > > >         
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > >     
> > > > >      
> > >   
> >   
>