Incorrect license for text-unidecode-1.3: gpl-2.0-plus, gpl-2.0, artistic-1.0, artistic-perl-1.0 #5109

@hesa

Description

The LICENSE file of text-unidecode is incorrectly identified as: 'gpl-2.0-plus AND gpl-2.0 AND (gpl-1.0-plus OR gpl-2.0-plus OR artistic-1.0) AND artistic-perl-1.0'

I think that the text that causes the problem is:

text-unidecode is a free software; you can redistribute
it and/or modify it under the terms of either:

* GPL or GPLv2+ (see https://www.gnu.org/licenses/license-list.html#GNUGPL), or
* Artistic License - see below:

The matches (of the text above) are:

  • gpl-2.0-plus matched by text-unidecode is a free software; you can redistribute\nit and/or modify it under the terms of either:\n\n* GPL or GPLv2+ (see https://www.gnu.org/licenses/license-list.html#GNUGPL), or
  • gpl-2.0 matched by text-unidecode is a free software; you can redistribute\nit and/or modify it under the terms of either:\n\n* GPL or GPLv2+ (see https://www.gnu.org/licenses/license-list.html#GNUGPL), or
  • gpl-1.0-plus OR gpl-2.0-plus OR artistic-1.0 matched by * GPL or GPLv2+ (see https://www.gnu.org/licenses/license-list.html#GNUGPL), or\n* Artistic License - see below:

Is scancode matching the same text sections too many times?
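
The overlap between the three matches can be checked mechanically. The following is a minimal sketch, not scancode's own logic: the line ranges are hypothetical and are numbered relative to the quoted snippet above (line 1 is "text-unidecode is a free software; ..."), and `overlapping_pairs` is an illustrative helper name.

```python
from itertools import combinations

# Hypothetical line ranges, numbered relative to the quoted snippet
# (line 1 = "text-unidecode is a free software; ...", line 5 = "* Artistic License ...").
matches = [
    ("gpl-2.0-plus", 1, 4),
    ("gpl-2.0", 1, 4),
    ("gpl-1.0-plus OR gpl-2.0-plus OR artistic-1.0", 4, 5),
]


def overlapping_pairs(matches):
    """Return pairs of expressions whose inclusive line ranges intersect."""
    pairs = []
    for (expr_a, start_a, end_a), (expr_b, start_b, end_b) in combinations(matches, 2):
        # Two inclusive ranges intersect when each starts before the other ends.
        if start_a <= end_b and start_b <= end_a:
            pairs.append((expr_a, expr_b))
    return pairs


for expr_a, expr_b in overlapping_pairs(matches):
    print(f"{expr_a!r} overlaps {expr_b!r}")
```

Under these assumed ranges every pair of matches shares at least one line, which is consistent with the suspicion that the same notice text is counted more than once.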

Note: artistic-perl-1.0 is identified from the license text included in the LICENSE file, which the notice refers to as "Artistic License".

How To Reproduce

$ curl -LJO https://files.pythonhosted.org/packages/ab/e2/e9a00f0ccb71718418230718b3d900e71a5d16e701a3dae079a21e9cd8f8/text-unidecode-1.3.tar.gz
$ tar zxvf text-unidecode-1.3.tar.gz

$ scancode -clipe --license-text --license-text-diagnostics --classify --license-clarity-score --todo --license-diagnostics --summary -n 16 --json-pp text-unidecode-1.3-scan.json text-unidecode-1.3
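
To inspect the individual matches in the resulting JSON, something like the sketch below may help. It assumes the scan output has a top-level "files" list whose entries carry "path" and "license_detections", each detection holding "matches" with "license_expression", "start_line" and "end_line"; these field names are an assumption about the 32.x JSON layout and may differ in other versions. The helper names `list_license_matches` and `load_scan` are illustrative only.

```python
import json


def list_license_matches(scan, path_suffix="LICENSE"):
    """Collect (path, expression, start_line, end_line) for files ending in path_suffix.

    The field names used here are assumed, not taken from the issue.
    """
    rows = []
    for entry in scan.get("files", []):
        path = entry.get("path", "")
        if not path.endswith(path_suffix):
            continue
        for detection in entry.get("license_detections") or []:
            for match in detection.get("matches") or []:
                rows.append(
                    (
                        path,
                        match.get("license_expression"),
                        match.get("start_line"),
                        match.get("end_line"),
                    )
                )
    return rows


def load_scan(json_path):
    """Load a scan result written with --json-pp."""
    with open(json_path, encoding="utf-8") as handle:
        return json.load(handle)


# Usage, once the scan from the command above has been written:
# for row in list_license_matches(load_scan("text-unidecode-1.3-scan.json")):
#     print(row)
```

Listing the start and end lines side by side makes it easier to see which matches cover the same part of the LICENSE file.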

Note that other files in the package are also identified with an AND between the licenses, but that is because of the following classifiers:

Classifier: License :: OSI Approved :: Artistic License
Classifier: License :: OSI Approved :: GNU General Public License (GPL)
Classifier: License :: OSI Approved :: GNU General Public License v2 or later (GPLv2+)

... which scancode really can't do much about. The safe interpretation of listing these licenses is to add an AND between them.

System configuration

OS: Ubuntu 24.04
Scancode-toolkit: 32.5.0 (using pip)
