SYNOPSIS
#include <utf-8.h>
unsigned int
sgetu8(unsigned int *chars, char *string)
unsigned int
utf8sgetc(unsigned int *chars, char *string)
DESCRIPTION
The sgetu8() function of the utf-8 library reads a sequence of one or more
characters (bytes) from a UTF-8 encoded string and converts it to a single
UCS-4 (Unicode) value.
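
As a worked illustration of the conversion described above, the sketch
below combines the two bytes of a two-byte UTF-8 sequence into a single
value. demo_decode2() is a hypothetical helper written for this page, not
part of the utf-8 library, and it covers only the two-byte form defined in
RFC 2279.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of <utf-8.h>.
 * A two-byte UTF-8 sequence has the bit layout 110xxxxx 10yyyyyy.
 * The decoded value is the 11-bit number xxxxxyyyyyy.
 */
static unsigned int
demo_decode2(unsigned char lead, unsigned char cont)
{
    /* Preconditions: a two-byte lead byte and a continuation byte. */
    assert((lead & 0xE0) == 0xC0);
    assert((cont & 0xC0) == 0x80);

    /* Keep 5 payload bits of the lead byte, then append 6 more bits. */
    return ((unsigned int)(lead & 0x1F) << 6) | (unsigned int)(cont & 0x3F);
}
```

For example, demo_decode2(0xC3, 0xA9) yields 0xE9, the code for LATIN SMALL
LETTER E WITH ACUTE.
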
utf8sgetc() is a macro which simply gives sgetu8() a name that may be
more convenient to remember. It is defined in <utf-8.h>.
The sgetu8() function and the utf8sgetc() macro take two arguments:
chars, a pointer to an unsigned int that receives an additional return
value, and string, which points to the sequence of one or more UTF-8
characters to convert.
RETURN VALUES
If string is NULL, or if a premature end-of-string condition occurs,
sgetu8() returns EOF. If string contains valid UTF-8 codes, sgetu8()
returns the converted UCS-4 value. Otherwise, it returns UTF8INVALID,
defined in <utf-8.h>.
Additionally, if chars is not NULL, sgetu8() stores in *chars the number
of characters (bytes) read from string. Adding that number to string gives
the address where the next UTF-8 encoded character sequence starts.
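
The following sketch shows one way the chars argument can be used to walk
through a whole string. Because <utf-8.h> may not be installed, the sketch
uses demo_sgetu8(), a hypothetical stand-in that follows the contract
described on this page, and DEMO_UTF8INVALID, an assumed placeholder for
UTF8INVALID (the real value is defined in <utf-8.h>). The stand-in accepts
the one- to six-byte forms of RFC 2279 and does not reject overlong
encodings; the real sgetu8() may be stricter. It also takes a const char *,
which is an assumption of the sketch rather than the library prototype.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>   /* EOF */

/* Assumed placeholder; the real UTF8INVALID is defined in <utf-8.h>. */
#define DEMO_UTF8INVALID 0xFFFFFFFEu

/* Hypothetical stand-in for sgetu8(), following the documented contract. */
static unsigned int
demo_sgetu8(unsigned int *chars, const char *string)
{
    const unsigned char *s = (const unsigned char *)string;
    unsigned int value, n, i;

    if (s == NULL)
        return (unsigned int)EOF;

    /* The lead byte determines the length of the sequence. */
    if (s[0] < 0x80)                { value = s[0];        n = 1; }
    else if ((s[0] & 0xE0) == 0xC0) { value = s[0] & 0x1F; n = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { value = s[0] & 0x0F; n = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { value = s[0] & 0x07; n = 4; }
    else if ((s[0] & 0xFC) == 0xF8) { value = s[0] & 0x03; n = 5; }
    else if ((s[0] & 0xFE) == 0xFC) { value = s[0] & 0x01; n = 6; }
    else
        return DEMO_UTF8INVALID;    /* stray continuation byte, 0xFE, 0xFF */

    for (i = 1; i < n; i++) {
        if (s[i] == '\0')
            return (unsigned int)EOF;       /* premature end of string */
        if ((s[i] & 0xC0) != 0x80)
            return DEMO_UTF8INVALID;        /* not a continuation byte */
        value = (value << 6) | (unsigned int)(s[i] & 0x3F);
    }

    if (chars != NULL)
        *chars = n;                 /* number of bytes consumed */
    return value;
}

/* Count the encoded characters in s; return -1 on EOF or invalid input. */
static long
demo_count_codepoints(const char *s)
{
    long count = 0;

    assert(s != NULL);
    while (*s != '\0') {
        unsigned int used = 0;
        unsigned int c = demo_sgetu8(&used, s);

        if (c == (unsigned int)EOF || c == DEMO_UTF8INVALID)
            return -1;
        s += used;                  /* start of the next sequence */
        count++;
    }
    return count;
}
```

With the library installed, the same loop would presumably include
<utf-8.h> and call sgetu8() and compare against UTF8INVALID instead of the
stand-ins. The loop tests for the terminating NUL itself, so it never
depends on how the decoder treats an empty string.
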
SEE ALSO
libutf-8(3), fgetu8(3), fputu8(3), sputu8(3)
F. Yergeau, UTF-8, a transformation format of Unicode and ISO 10646,
RFC 2279.
D. Goldsmith, M. Davis, Using Unicode with MIME, RFC 1641.
STANDARDS
ISO 10646-1: 1993 ("Unicode"), RFC 2279: 1998 ("UTF-8"), ISO
9899: 1990 ("ISO C").
DIAGNOSTICS
Always compare the value returned by sgetu8() against both EOF and
UTF8INVALID before using it as a character.
AUTHORS
This manual page was written by G. Adam Stanislav <adam@whizkidtech.net>.
BUGS
None known.
BSD April 1, 1999