Introduction to UnZip
The UnZip package contains ZIP extraction utilities. These are useful
for extracting files from ZIP archives. ZIP archives are created with
PKZIP or Info-ZIP utilities, primarily in a DOS environment.
This package is known to build and work properly
using an LFS 11.3 platform.
Caution
The previous version of the UnZip package had some locale-related
issues. Currently there are no BLFS editors capable of testing these
locale issues. Therefore, the locale-related information is left on
this page, but has not been tested. A more general discussion of these
problems can be found in the Program Assumes Encoding section of the
Locale Related Issues page.
Package Information
Additional Downloads
User Notes:
https://wiki.linuxfromscratch.org/blfs/wiki/unzip
UnZip Locale Issues
Note
Use of UnZip in the JDK, Mozilla, DocBook or any other BLFS package
installation is not a problem, as BLFS instructions never use UnZip to
extract a file with non-ASCII characters in the file's name.
These issues are thought to be fixed in the patch. But since none
of the editors have data to test this, the following workarounds are
retained in case they might still be needed.
The UnZip package assumes that filenames
stored in the ZIP archives created on non-Unix systems are encoded in
CP850, and that they should be converted to ISO-8859-1 when writing files
onto the filesystem. Such assumptions are not always valid. In fact,
inside the ZIP archive, filenames are encoded in the DOS codepage that is
in use in the relevant country, and the filenames on disk should be in
the locale encoding. In MS Windows, the OemToChar() C function (from
User32.DLL) does the correct conversion (which is indeed the conversion
from CP850 to a superset of ISO-8859-1 if MS Windows is set up to use
the US English language), but there is no equivalent in Linux.
When using unzip to unpack a ZIP archive
containing non-ASCII filenames, the filenames are damaged because
unzip uses improper conversion when any of its
encoding assumptions are incorrect. For example, in the ru_RU.KOI8-R
locale, conversion of filenames from CP866 to KOI8-R is required, but
conversion from CP850 to ISO-8859-1 is done, which produces filenames
consisting of undecipherable characters instead of words (the closest
equivalent understandable example for English-only users is rot13). There
are several ways around this limitation:
1) For unpacking ZIP archives with filenames containing non-ASCII
characters, use WinZip while running the Wine Windows emulator.
2) Use bsdtar -xf from libarchive-3.6.2 to unpack the ZIP archive.
Then fix the damage made to the filenames using the convmv tool
(https://j3e.de/linux/convmv/). The following is an example for the
zh_CN.UTF-8 locale:
convmv -f cp936 -t utf-8 -r --nosmart --notest \
</path/to/unzipped/files>
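The mis-conversion described above can be reproduced on a small scale
with iconv. This is only an illustrative sketch, not part of the BLFS
instructions: it assumes a UTF-8 locale and an iconv that knows the
CP850 and CP866 charsets (as the one in Glibc does), and the sample
filename is made up. It encodes a Russian name the way a DOS archiver
using CP866 would store it, then decodes the same bytes once under the
incorrect CP850 assumption and once as CP866:

```shell
# Hypothetical sample filename; any Cyrillic name would do.
name='привет.txt'

# Bytes as stored in the archive (CP866), decoded as if they were CP850.
wrong=$(printf '%s' "$name" | iconv -f UTF-8 -t CP866 | iconv -f CP850 -t UTF-8)

# The same bytes decoded with the codepage that was actually used.
right=$(printf '%s' "$name" | iconv -f UTF-8 -t CP866 | iconv -f CP866 -t UTF-8)

printf 'wrong: %s\nright: %s\n' "$wrong" "$right"
```

The first line printed is a string of unrelated Latin letters and
symbols, while the second matches the original name. The bytes
themselves are unchanged, which is why renaming with the correct source
encoding can repair the damage.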
Installation of UnZip
First apply the patch:
patch -Np1 -i ../unzip-6.0-consolidated_fixes-1.patch
Now compile the package:
make -f unix/Makefile generic
The test suite does not work for the “generic” target.
Now, as the root user:
make prefix=/usr MANDIR=/usr/share/man/man1 \
-f unix/Makefile install
Command Explanations
make -f unix/Makefile generic:
This target begins by running a configure script (unlike the older
targets such as linux and linux_noasm) which creates a flags file that
is then used in the build. This ensures that the 32-bit x86 build
receives the right flags to unzip files which are larger than 2GB
when extracted.
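One way to check the result after installation, offered as a sketch
rather than as part of the official instructions, is to look at the
compile-time options that unzip -v prints. The option names searched
for below (LARGE_FILE_SUPPORT and ZIP64_SUPPORT) are what UnZip 6.0
builds typically report. Treat them as an assumption and inspect the
full unzip -v output if nothing matches:

```shell
# Collect the large-file related lines from the build options that
# "unzip -v" reports. Assumes the freshly installed unzip is on PATH.
if command -v unzip >/dev/null 2>&1; then
    report=$(unzip -v | grep -E 'LARGE_FILE_SUPPORT|ZIP64_SUPPORT' \
        || echo "large-file options not reported")
else
    report="unzip not found on PATH"
fi
printf '%s\n' "$report"
```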