gensprep(8) — Linux manual page

gensprep(8)                 ICU 76.0.1 Manual                 gensprep(8)

NAME

       gensprep - compile StringPrep data from files filtered by
       filterRFC3454.pl

SYNOPSIS

       gensprep [ -h, -?, --help ] [ -v, --verbose ] [ -c, --copyright ]
       [ -s, --sourcedir source ] [ -d, --destdir destination ]

DESCRIPTION

       gensprep reads filtered RFC 3454 files and compiles their
       contents into binary form.  The resulting file, <name>.icu,
       can then be read directly by ICU, or passed to pkgdata(8) for
       incorporation into a larger archive or library.

       The files read by gensprep are described in the FILES section.

OPTIONS

       -h, -?, --help
              Print help about usage and exit.

       -v, --verbose
              Display extra informative messages during execution.

       -c, --copyright
              Include a copyright notice in the binary data.

       -s, --sourcedir source
              Set the source directory to source.  The default source
              directory is specified by the environment variable
              ICU_DATA.

       -d, --destdir destination
              Set the destination directory to destination.  The default
              destination directory is specified by the environment
              variable ICU_DATA.
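
       As an illustration of how the options above combine, the following
       Python sketch assembles an argument list using only the long
       options documented in this page.  The helper name
       build_gensprep_argv and the example directories are hypothetical.
       Any positional arguments that gensprep may expect are not described
       in this page and are deliberately left out.

```python
from typing import List, Optional


def build_gensprep_argv(
    source_dir: Optional[str] = None,
    dest_dir: Optional[str] = None,
    verbose: bool = False,
    copyright_notice: bool = False,
) -> List[str]:
    """Assemble a gensprep argument list from the documented options.

    Hypothetical helper: it only builds the list and does not run gensprep.
    """
    argv = ["gensprep"]
    if verbose:
        argv.append("--verbose")
    if copyright_notice:
        argv.append("--copyright")
    # When a directory is omitted, gensprep falls back to ICU_DATA.
    if source_dir is not None:
        argv += ["--sourcedir", source_dir]
    if dest_dir is not None:
        argv += ["--destdir", dest_dir]
    return argv


if __name__ == "__main__":
    # Example directories are assumptions chosen for illustration only.
    print(build_gensprep_argv("/tmp/icu-src", "/tmp/icu-out", verbose=True))
```

       The resulting list could be handed to subprocess.run() once any
       additional arguments required by the installed gensprep have been
       appended.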

ENVIRONMENT

       ICU_DATA
              Specifies the directory containing ICU data.  Defaults to
              ${prefix}/share/icu/76.0.1/.  Some ICU tools depend on the
              trailing slash, so make sure it is present whenever
              ICU_DATA is set.
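
       The trailing-slash requirement can be enforced before a tool is
       launched.  The following Python sketch shows one way to do so.
       The function names are hypothetical, and the sketch assumes a
       POSIX-style "/" separator.

```python
import os
from typing import Dict, Mapping, Optional


def icu_data_with_trailing_slash(value: str) -> str:
    """Return value with exactly the trailing slash ICU tools may rely on."""
    if not value:
        # An empty setting is left untouched rather than turned into "/".
        return value
    return value if value.endswith("/") else value + "/"


def normalized_environment(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy an environment mapping, fixing ICU_DATA only if it is set."""
    result = dict(os.environ if env is None else env)
    if "ICU_DATA" in result:
        result["ICU_DATA"] = icu_data_with_trailing_slash(result["ICU_DATA"])
    return result
```

       The returned mapping can be passed as the env argument of
       subprocess.run() when invoking gensprep or another ICU tool.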

FILES

       gensprep reads the following files.  It looks for the
       rfc3454_*.txt files in source/misc and for
       NormalizationCorrections.txt in source/unidata.

       rfc3454_A_1.txt
              Contains the list of unassigned code points in Unicode
              version 3.2.0....

       rfc3454_B_1.txt
              Contains the list of code points that are commonly mapped
              to nothing....

       rfc3454_B_2.txt
              Contains the list of mappings for casefolding of code
              points when Normalization Form NFKC is specified....

       rfc3454_C_X.txt
              Contains the list of code points that are prohibited for
              IDNA.

       NormalizationCorrections.txt
              Contains the list of code points whose normalization has
              changed since Unicode Version 3.2.0.
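
       To check that a source directory follows the layout described
       above before running gensprep, a small Python sketch such as the
       following can be used.  The function name locate_gensprep_inputs
       and the shape of its return value are hypothetical.  Only the
       misc and unidata subdirectories and the file names come from this
       page.

```python
from pathlib import Path
from typing import Dict, List


def locate_gensprep_inputs(source: str) -> Dict[str, List[Path]]:
    """List the input files expected under a gensprep source directory."""
    root = Path(source)
    # Filtered RFC 3454 tables live in <source>/misc.
    rfc_files = sorted((root / "misc").glob("rfc3454_*.txt"))
    # The normalization corrections file lives in <source>/unidata.
    corrections = root / "unidata" / "NormalizationCorrections.txt"
    return {
        "rfc3454": rfc_files,
        "corrections": [corrections] if corrections.is_file() else [],
    }
```

       An empty list under either key indicates that the corresponding
       files were not found where this page says gensprep looks for them.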

VERSION

       76.0.1

COPYRIGHT

       Copyright (C) 2000-2002 IBM, Inc. and others.

COLOPHON

       This page is part of the ICU (International Components for
       Unicode) project.  Information about the project can be found at
       ⟨http://site.icu-project.org/home⟩.  If you have a bug report for
       this manual page, see ⟨http://site.icu-project.org/bugs⟩.  This
       page was obtained from the project's upstream Git repository
       ⟨https://github.com/unicode-org/icu⟩ on 2026-05-24.  (At that
       time, the date of the most recent commit that was found in the
       repository was 2026-05-22.)  If you discover any rendering
       problems in this HTML version of the page, or you believe there is
       a better or more up-to-date source for the page, or you have
       corrections or improvements to the information in this COLOPHON
       (which is not part of the original manual page), send a mail to
       man-pages@man7.org

ICU MANPAGE                   18 March 2003                   gensprep(8)
