gensprep(8) — Linux manual page

gensprep(8)                 ICU 73.0.1 Manual                gensprep(8)

NAME

       gensprep - compile StringPrep data from files filtered by
       filterRFC3454.pl

SYNOPSIS

       gensprep [ -h, -?, --help ] [ -v, --verbose ] [ -c, --copyright ]
       [ -s, --sourcedir source ] [ -d, --destdir destination ]

DESCRIPTION

       gensprep reads filtered RFC 3454 files and compiles their
       information into a binary form.  The resulting file, <name>.icu,
       can then be read directly by ICU, or used by pkgdata(8) for
       incorporation into a larger archive or library.

       The files read by gensprep are described in the FILES section.

OPTIONS

       -h, -?, --help
              Print help about usage and exit.

       -v, --verbose
              Display extra informative messages during execution.

       -c, --copyright
              Include a copyright notice in the binary data.

       -s, --sourcedir source
              Set the source directory to source.  The default source
              directory is specified by the environment variable
              ICU_DATA.

       -d, --destdir destination
              Set the destination directory to destination.  The default
              destination directory is specified by the environment
              variable ICU_DATA.
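
       The options above can be combined into a single command line.
       The following is a minimal Python sketch, not part of the ICU
       distribution, that assembles an argument list from the documented
       options only.  The function name build_gensprep_argv and the
       directory names data and out are hypothetical.  The SYNOPSIS on
       this page lists no input operands, so an actual invocation may
       need further arguments that are not shown here.

```python
import shlex


def build_gensprep_argv(source=None, destination=None,
                        verbose=False, copyright_notice=False):
    """Assemble a gensprep argument list from the documented options.

    Only flags listed in the OPTIONS section are used. The helper itself
    is illustrative and is not shipped with ICU.
    """
    argv = ["gensprep"]
    if verbose:
        argv.append("--verbose")
    if copyright_notice:
        argv.append("--copyright")
    if source is not None:
        argv += ["--sourcedir", source]
    if destination is not None:
        argv += ["--destdir", destination]
    return argv


if __name__ == "__main__":
    # "data" and "out" are placeholder directory names.
    argv = build_gensprep_argv(source="data", destination="out", verbose=True)
    # Print the command instead of running it, so the sketch works
    # without gensprep installed.
    print(shlex.join(argv))
```

       Passing the returned list to subprocess.run would execute the
       command.  When -s or -d is omitted, the page states that the
       directory named by ICU_DATA is used instead.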

ENVIRONMENT

       ICU_DATA
              Specifies the directory containing ICU data. Defaults to
              ${prefix}/share/icu/73.0.1/.  Some tools in ICU depend on
              the presence of the trailing slash. It is thus important
              to make sure that it is present if ICU_DATA is set.
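
       Because some tools depend on the trailing slash, a wrapper script
       may want to normalize ICU_DATA before launching them.  The
       following Python sketch shows one way to do that.  The helper
       names icu_data_with_slash and environment_for_icu_tools are
       hypothetical, and the sketch assumes a POSIX-style path
       separator.

```python
import os


def icu_data_with_slash(value):
    """Return value unchanged if it ends with '/', else append one."""
    return value if value.endswith("/") else value + "/"


def environment_for_icu_tools(environ=None):
    """Copy an environment mapping and normalize ICU_DATA if it is set.

    An unset or empty ICU_DATA is left alone so that the tool falls back
    to its built-in default directory.
    """
    env = dict(os.environ if environ is None else environ)
    if env.get("ICU_DATA"):
        env["ICU_DATA"] = icu_data_with_slash(env["ICU_DATA"])
    return env


if __name__ == "__main__":
    # "/opt/icu/data" is a placeholder path.
    env = environment_for_icu_tools({"ICU_DATA": "/opt/icu/data"})
    print(env["ICU_DATA"])
```

       The returned mapping can be passed as the env argument of
       subprocess.run when starting an ICU tool.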

FILES

       gensprep reads the following files.  It looks for the
       rfc3454_*.txt files in source/misc and for
       NormalizationCorrections.txt in source/unidata.

       rfc3454_A_1.txt
              Contains the list of unassigned code points in Unicode
              version 3.2.0....

       rfc3454_B_1.txt
              Contains the list of code points that are commonly mapped
              to nothing....

       rfc3454_B_2.txt
              Contains the list of mappings for casefolding of code
              points when Normalization form NFKC is specified....

       rfc3454_C_X.txt
              Contains the list of code points that are prohibited for
              IDNA.

       NormalizationCorrections.txt
              Contains the list of code points whose normalization has
              changed since Unicode Version 3.2.0.
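
       Before running gensprep it can be useful to confirm that the
       input files are where the tool expects them.  The following
       Python sketch checks a source directory for that layout.  It
       assumes the paths are the misc and unidata subdirectories of the
       source directory, and the function name find_gensprep_inputs is
       hypothetical.

```python
from pathlib import Path


def find_gensprep_inputs(source):
    """Locate the input files described in FILES under a source directory.

    Assumes rfc3454_*.txt lives in <source>/misc and
    NormalizationCorrections.txt in <source>/unidata.
    """
    root = Path(source)
    rfc_files = sorted((root / "misc").glob("rfc3454_*.txt"))
    corrections = root / "unidata" / "NormalizationCorrections.txt"
    return {
        "rfc3454": rfc_files,
        # None signals that the corrections file was not found.
        "corrections": corrections if corrections.is_file() else None,
    }


if __name__ == "__main__":
    # "." is a placeholder for the actual source directory.
    found = find_gensprep_inputs(".")
    for path in found["rfc3454"]:
        print("found", path)
    if found["corrections"] is None:
        print("NormalizationCorrections.txt is missing")
```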

VERSION

       73.0.1

COPYRIGHT

       Copyright (C) 2000-2002 IBM, Inc. and others.

SEE ALSO

       pkgdata(8)

COLOPHON

       This page is part of the ICU (International Components for
       Unicode) project.  Information about the project can be found at
       ⟨http://site.icu-project.org/home⟩.  If you have a bug report for
       this manual page, see ⟨http://site.icu-project.org/bugs⟩.  This
       page was obtained from the project's upstream Git repository
       ⟨https://github.com/unicode-org/icu⟩ on 2023-12-22.  (At that
       time, the date of the most recent commit that was found in the
       repository was 2023-12-22.)  If you discover any rendering
       problems in this HTML version of the page, or you believe there
       is a better or more up-to-date source for the page, or you have
       corrections or improvements to the information in this COLOPHON
       (which is not part of the original manual page), send a mail to
       man-pages@man7.org

ICU MANPAGE                   18 March 2003                  gensprep(8)