bootstrap: use only valid pairs in get_apt_source_mirror()

Message ID 20250319064937.68881-1-cedric.hombourger@siemens.com
State Accepted, archived
Series bootstrap: use only valid pairs in get_apt_source_mirror()

Commit Message

Cedric Hombourger March 19, 2025, 6:49 a.m. UTC
The following construct may generate [] entries:

     mirror_list = [entry.split()
                   for entry in premirrors.split('\\n')
                   if any(entry)]

A valid pre-mirror entry is a (regex, replacement URL) tuple, so an
empty [] entry causes an unpack error when evaluating:

    for regex, replace in mirror_list

if the entry is e.g. " ".

For instance " re1 u1 \n re2 u2\n   " would be translated to
mirror_list = [['re1','u1'],['re2','u2'],[]]: only the first
two entries have two values, the last has none.
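
As an illustration only (not part of the patch), the example above can be
reproduced in a plain Python interpreter. This sketch hard-codes the value
instead of reading DISTRO_APT_PREMIRRORS through d.getVar(), and assumes
the entries are separated by the literal two-character sequence \n, as
the split('\\n') call in the class suggests:

```python
# Standalone reproduction; in bootstrap.bbclass the value comes from
# d.getVar('DISTRO_APT_PREMIRRORS') instead of a hard-coded string.
premirrors = " re1 u1 \\n re2 u2\\n   "

# Same comprehension as before the patch: any(entry) is true for a
# whitespace-only entry, so the trailing "   " survives the filter.
mirror_list = [entry.split()
               for entry in premirrors.split('\\n')
               if any(entry)]
print(mirror_list)  # [['re1', 'u1'], ['re2', 'u2'], []]

# Unpacking the empty list into two names raises ValueError.
error = None
try:
    for regex, replace in mirror_list:
        pass
except ValueError as exc:
    error = str(exc)
print(error)
```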

It should be noted that split() will do just fine when multiple
spaces are found between components of a valid entry (leading
and trailing spaces within an entry will not cause issues).

After checking that the entry is non-empty ("if any(entry)"), only
process entries with exactly two components (silently ignore
others) so we do not die with an ugly unpack error exception.
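
A minimal sketch of the same filtering idea outside of bitbake follows.
The helper name parse_premirrors is hypothetical and does not exist in
bootstrap.bbclass. Unlike the patch, it splits each entry only once and
relies on the length check alone, which also rejects empty entries:

```python
def parse_premirrors(premirrors):
    """Return only the well-formed [regex, replacement] pairs.

    Hypothetical helper for illustration; entries are assumed to be
    separated by the literal two-character sequence backslash + n.
    """
    pairs = []
    for entry in premirrors.split('\\n'):
        fields = entry.split()
        # Blank entries yield [], malformed ones yield 1 or 3+ fields;
        # both are silently skipped.
        if len(fields) == 2:
            pairs.append(fields)
    return pairs


print(parse_premirrors(" re1 u1 \\n re2 u2\\n   "))
```

With the example input from above, only the two valid pairs remain, so
the "for regex, replace in ..." loop can unpack every element.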

Signed-off-by: Cedric Hombourger <cedric.hombourger@siemens.com>
---
 meta/classes/bootstrap.bbclass | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Felix Moessbauer March 19, 2025, 6:54 a.m. UTC | #1
Hi, I'm wondering if we really should ignore the malformed ones.
Probably we want to issue a warning in this case.
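
One possible shape for this, sketched under assumptions: the function
name split_premirrors is hypothetical, and the code only separates valid
pairs from malformed entries so that a caller could decide where and how
to report them. Whitespace-only entries are treated as harmless here,
which is a design choice of this sketch, not something the patch states:

```python
def split_premirrors(premirrors):
    """Partition entries into (valid pairs, malformed entries).

    Hypothetical helper for illustration; it is not part of Isar.
    """
    valid, malformed = [], []
    for entry in premirrors.split('\\n'):
        fields = entry.split()
        if not fields:
            continue  # blank entry: nothing to report
        if len(fields) == 2:
            valid.append(tuple(fields))
        else:
            malformed.append(entry.strip())
    return valid, malformed


valid, malformed = split_premirrors(" re1 u1 \\n bad \\n   \\n a b c ")
print(valid)      # [('re1', 'u1')]
print(malformed)  # ['bad', 'a b c']
```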

Anyways, the change makes sense.

Felix
Cedric Hombourger March 19, 2025, 6:59 a.m. UTC | #2
On Wed, 2025-03-19 at 06:54 +0000, Moessbauer, Felix (FT RPD CED OES-
DE) wrote:
> Hi, I'm wondering if we really should ignore the malformed ones.
> Probably we want to issue a warning in this case.

I do not think we should either. That function has a comment noting that we cannot
produce errors from it. I was therefore wondering if I could emit
the warning at a later stage (when bitbake is done with parsing).
Hoping to have a follow-up patch soon.
Felix Moessbauer March 19, 2025, 7:09 a.m. UTC | #3
On Wed, 2025-03-19 at 06:59 +0000, Hombourger, Cedric (FT FDS CES LX)
wrote:
> > Hi, I'm wondering if we really should ignore the malformed ones.
> > Probably we want to issue a warning in this case.
> 
> I do not think we should either. That function has a comment noting that we cannot
> produce errors from it. I was therefore wondering if I could emit
> the warning at a later stage (when bitbake is done with parsing).

Ah... this rings a bell. You can produce warnings from that function,
but as it is executed dozens of times, these warnings will fill up the
terminal.
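
A generic way to limit such repetition is to remember which messages were
already emitted. The sketch below uses only the Python standard library;
warn_once and the logger name are hypothetical and it does not use any
bitbake API. If the function runs in several processes, a per-process set
like this would only reduce the duplicates, not eliminate them:

```python
import logging

logger = logging.getLogger("premirrors-sketch")  # hypothetical logger name
_already_warned = set()


def warn_once(message):
    """Log message at WARNING level the first time it is seen.

    Returns True if the message was logged, False if it was suppressed.
    """
    if message in _already_warned:
        return False
    _already_warned.add(message)
    logger.warning(message)
    return True


# Repeated calls with the same text produce a single warning.
for _ in range(3):
    warn_once("ignoring malformed pre-mirror entry: 'bad'")
```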

> Hoping to have a follow-up patch soon.

IMHO, the change is fine for now. This processing needs to be redone
anyway once deb822 is supported (or even required).

Felix
Uladzimir Bely March 25, 2025, 4:57 p.m. UTC | #4
Applied to next, thanks.

Patch

diff --git a/meta/classes/bootstrap.bbclass b/meta/classes/bootstrap.bbclass
index c0644acb..64702d5d 100644
--- a/meta/classes/bootstrap.bbclass
+++ b/meta/classes/bootstrap.bbclass
@@ -123,7 +123,7 @@  def get_apt_source_mirror(d, aptsources_entry_list):
         premirrors = d.getVar('DISTRO_APT_PREMIRRORS') or ""
     mirror_list = [entry.split()
                   for entry in premirrors.split('\\n')
-                  if any(entry)]
+                  if any(entry) and len(entry.split()) == 2]
 
     for regex, replace in mirror_list:
         match = re.search(regex, aptsources_entry_list[2])