UnZip
Locale Issues
Note
Use of UnZip in the JDK, Mozilla, DocBook or any other BLFS package
installation is not a problem, as BLFS instructions never use
UnZip to extract a file with
non-ASCII characters in the file's name.
These issues were thought to be fixed in the consolidated patch,
but the patch still fails for at least some ZIP archives containing
Chinese file names, so the following workaround is retained.
The UnZip package assumes that
filenames stored in the ZIP archives created on non-Unix systems
are encoded in CP850, and that they should be converted to
ISO-8859-1 when writing files onto the filesystem. Such assumptions
are not always valid. In fact, inside the ZIP archive, filenames
are encoded in the DOS codepage that is in use in the relevant
country, and the filenames on disk should be in the locale
encoding. In MS Windows, the OemToChar() C function (from User32.DLL)
does the correct
conversion (which is indeed the conversion from CP850 to a superset
of ISO-8859-1 if MS Windows is set up to use the US English
language), but there is no equivalent in Linux.
When unzip is used to
unpack a ZIP archive containing non-ASCII filenames, the filenames
are damaged because unzip applies the wrong conversion
whenever any of its encoding assumptions is incorrect. For example, if
the ZIP archive was created on a Windows system and it contains
Simplified Chinese file names, and the current locale is
zh_CN.UTF-8 (or any UTF-8 locale), conversion of filenames from
CP936 to UTF-8 is required, but conversion from CP850 to UTF-8 is
done, which produces filenames consisting of undecipherable
characters instead of words (the closest equivalent understandable
example for English-only users is rot13).
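The mismatch described above can be reproduced with iconv. This is only an illustrative sketch, not something unzip itself runs. It assumes an iconv that supports the CP850 and CP936 charsets (as the glibc version does), and the four bytes are the CP936 encoding of a two-character Chinese sample name chosen for this example.

```sh
# Sample bytes (an assumption for illustration): the CP936 encoding of a
# two-character Chinese name, as a Windows zh_CN system would store it
# in a ZIP archive. Octal escapes keep this portable across shells.
stored='\326\320\316\304'

# What unzip effectively does: treat the bytes as CP850.
garbled=$(printf "$stored" | iconv -f CP850 -t UTF-8)

# What is actually required in a UTF-8 locale: treat the bytes as CP936.
correct=$(printf "$stored" | iconv -f CP936 -t UTF-8)

printf 'read as CP850: %s\n' "$garbled"
printf 'read as CP936: %s\n' "$correct"
```

In a UTF-8 terminal the first line shows unrelated accented letters and box-drawing characters, while the second shows the intended Chinese characters.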
The easiest way to work around this issue is to use the bsdunzip utility from libarchive-3.7.7 to unpack the archive,
specifying the encoding of the file names in the ZIP archive with the
-I option. For example, assuming archive.zip was created on
Windows and contains Simplified Chinese file names, extract it
with:
bsdunzip archive.zip -I cp936
Installation of UnZip
First apply the patches:
patch -Np1 -i ../unzip-6.0-consolidated_fixes-1.patch
patch -Np1 -i ../unzip-6.0-gcc14-1.patch
Now compile the package:
make -f unix/Makefile generic
The test suite does not work for target generic.
Now, as the root user:
make prefix=/usr MANDIR=/usr/share/man/man1 \
-f unix/Makefile install
Command Explanations
make -f unix/Makefile generic: This target begins by running a configure
script (unlike the older targets such as linux and linux_noasm)
which creates a flags file that is then used in the build. This
ensures that the 32-bit x86 build receives the right flags to unzip
files which are larger than 2 GB when extracted.