I want to list the licenses of all the installed packages on my Ubuntu server. I can dump everything with the following (taken from a 2013 post):
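# Keep only packages marked "install" and strip any architecture suffix
# (libc6:amd64 -> libc6), because the doc directory has no such suffix.
packages=$( dpkg --get-selections | awk '$2 == "install" { sub(/:.*/, "", $1); print $1 }' | sort -u )
for package in $packages; do
    echo "$package: "
    cat "/usr/share/doc/$package/copyright"
    echo; echo
done > /tmp/licenses.txt
less /tmp/licenses.txt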
But the output is a huge useless file with all the copyright data for each package. I need something like:
package: package_name licence: licence_name
Is there a parser or some other tool to get data like this?
What you are trying to do is poorly supported at the moment, though there is an effort under way to provide machine-readable information in the /usr/share/doc/*/copyright files. See, for example, this excerpt:
Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: at
Source: git://anonscm.debian.org/collab-maint/at.git
Comment: This package was debianized by its author Thomas Koenig
 <ig25@rz.uni-karlsruhe.de>, taken over and re-packaged first by Martin
 Schulze <joey@debian.org> and then by Siggy Brentrup <bsb@winnegan.de>,
 and then taken over by Ryan Murray <rmurray@debian.org>.
 .
 In August 2009 the upstream development and Debian packaging were taken over
 by Ansgar Burchardt <ansgar@debian.org> and Cyril Brulebois <kibi@debian.org>.
 .
 This may be considered the experimental upstream source, and since there
 doesn't seem to be any other upstream source, the only upstream source.

Files: *
Copyright: 1993-1997, Thomas Koenig <ig25@rz.uni-karlsruhe.de>
           1993, David Parsons
           2002, 2005, Ryan Murray <rmurray@debian.org>
License: GPL-2+

Files: getloadavg.c
Copyright: 1985-1995, Free Software Foundation Inc
License: GPL-2+

Files: posixtm.*
Copyright: 1989-2007, Free Software Foundation Inc
License: GPL-3+

Files: parsetime.pl
Copyright: 2009, Ansgar Burchardt <ansgar@debian.org>
License: ISC

License: GPL-2+
 This program is free software; you can redistribute it
 and/or modify it under the terms of the GNU General Public
 License as published by the Free Software Foundation; either
 version 2 of the License, or (at your option) any later
 version.
See the specification at http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/ (the URL in the Format: field above) for details.
As you can see, the basic assumption that there is necessarily a single license per package is false. A single package may contain files under several different licenses. Depending on which problem you are trying to solve, it may of course be possible to ignore many of them; for example, if you want to find out whether you have anything under the Apache license, that should be easy to do for packages which have transitioned to this new format.
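For packages that have transitioned, something along these lines might produce the one-line-per-package report asked for in the question. This is only a rough sketch: `license_names` and `report_licenses` are hypothetical helper names, not an existing tool, and the sketch assumes that a machine-readable file can be recognised by a `Format:` field on its first non-empty line and that every directory under /usr/share/doc corresponds to a package. Anything else is reported as "unknown".

```shell
#!/bin/sh
# Sketch only: license_names and report_licenses are made-up helper names.

# Print the distinct License: short names found in one copyright file,
# joined with ", ". Prints nothing for files that do not start with a
# Format: field (assumed here to mean "not machine-readable").
license_names() {
    awk '
        # Only trust files whose first non-empty line is the Format: header.
        !checked && NF {
            checked = 1
            if ($0 !~ /^Format:/) exit
        }
        # Field lines start in column 0; continuation lines are indented,
        # so the full license text is never matched here.
        /^License:/ {
            sub(/^License:[ \t]*/, "")
            sub(/[ \t]+$/, "")
            if ($0 != "" && !($0 in seen)) {
                seen[$0] = 1
                if (out == "") out = $0
                else out = out ", " $0
            }
        }
        END { if (out != "") print out }
    ' "$1"
}

# Print "package: NAME licence: NAMES" for every copyright file below the
# given doc root (default: /usr/share/doc).
report_licenses() {
    doc_root=${1:-/usr/share/doc}
    for file in "$doc_root"/*/copyright; do
        [ -f "$file" ] || continue
        package=$(basename "$(dirname "$file")")
        names=$(license_names "$file")
        echo "package: $package licence: ${names:-unknown}"
    done
}
```

Calling `report_licenses` with no argument scans /usr/share/doc. A package with files under several licenses gets all of them on its line, so for the excerpt above the licence part would read GPL-2+, GPL-3+, ISC.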
This is new with Debian Jessie, released in 2015; older versions of Debian do not have anything like this. The best you can do if you need to audit a system with older packages is probably to grep the copyright files for fragments which look like GPL, BSD, MIT, etc. and then hope you're not missing too much; but hope on top of some flimsy grepping seems anathema to any proper legal work, which I think we can assume is the reason you are attempting this. A better approach might be to find the current copyright files for the packages you are auditing, with the roughly machine-readable information, and hope (there's that word again) that they are adequate for the older versions you have installed, too.
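To make the "flimsy grepping" concrete, a sketch of that fallback might look like this. The function name `guess_licenses` and the phrase list are my own assumptions; the phrases are just well-known fragments of the respective license texts, and the result is a guess, not a finding. In particular, grep works line by line, so a phrase that is wrapped across two lines in the file will be missed.

```shell
#!/bin/sh
# Sketch only: guess_licenses is a made-up helper, and the phrase table
# below is an illustrative, incomplete list of telltale fragments.

# Print a comma-separated list of license families whose telltale phrase
# occurs in the given file, or "unknown" if none does.
guess_licenses() {
    guesses=""
    # Each table row is LABEL:PATTERN; the pattern is an extended regex
    # matched case-insensitively against each line of the file.
    while IFS=: read -r label pattern; do
        if grep -qiE -- "$pattern" "$1"; then
            guesses="${guesses:+$guesses, }$label"
        fi
    done <<'EOF'
GPL:GNU General Public License
LGPL:GNU (Lesser|Library) General Public License
Apache:Apache License
MIT:Permission is hereby granted, free of charge
BSD:Redistribution and use in source and binary forms
EOF
    echo "${guesses:-unknown}"
}
```

For example, `guess_licenses /usr/share/doc/at/copyright` would print the families it thinks it recognises; treat an "unknown" as "needs a human to read the file", not as "no license".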
(For comparison, older versions are also available at http://metadata.ftp-master.debian.org/changelogs/main/a/at/ for you to examine.)
I don't follow Ubuntu very closely any longer, but I assume they have been picking up this change for a few releases now. Indeed, http://packages.ubuntu.com/xenial/at seems to have the same copyright file.
Source: https://stackoverflow.com/questions/35044841/how-to-list-licences-of-all-installed-packages-in-debian-based-distros