Unicode (3)    RAPTOR Library

Unicode - Unicode and UTF-8 utility functions.

Synopsis

    typedef raptor_unichar;
    int raptor_unicode_char_to_utf8 (raptor_unichar c, unsigned char *output);
    int raptor_utf8_to_unicode_char (raptor_unichar *output, unsigned char *input, int length);
    int raptor_unicode_is_xml11_namestartchar (raptor_unichar c);
    int raptor_unicode_is_xml10_namestartchar (raptor_unichar c);
    int raptor_unicode_is_xml11_namechar (raptor_unichar c);
    int raptor_unicode_is_xml10_namechar (raptor_unichar c);
    int raptor_utf8_check (unsigned char *string, size_t length);

Description

Functions to support converting to and from Unicode written in UTF-8, which is the native internal string format of all the Redland libraries. Also includes checks for Unicode names using either the XML 1.0 or XML 1.1 rules.

Details

raptor_unichar

    typedef unsigned long raptor_unichar;

Raptor Unicode codepoint.

raptor_unicode_char_to_utf8 ()

    int raptor_unicode_char_to_utf8 (raptor_unichar c, unsigned char *output);

Convert a Unicode character to UTF-8 encoding.

Based on librdf_unicode_char_to_utf8() with no need to calculate length since the encoded character is always copied into a buffer with sufficient size.

    c :       Unicode character
    output :  UTF-8 string buffer or NULL
    Returns : bytes encoded to output buffer or <0 on failure

raptor_utf8_to_unicode_char ()

    int raptor_utf8_to_unicode_char (raptor_unichar *output, unsigned char *input, int length);

Convert a UTF-8 encoded buffer to a Unicode character.

If output is NULL, the function only calculates the number of bytes that would be used from the input buffer and does not perform the conversion.

    output :  pointer to the Unicode character or NULL
    input :   UTF-8 string buffer
    length :  buffer size
    Returns : bytes used from input buffer or <0 on failure:
              -1 input buffer too short or length error,
              -2 overlong UTF-8 sequence,
              -3 illegal code positions,
              -4 code out of range U+0000 to U+10FFFF.
              In cases -2, -3 and -4 the decoded character is still stored in output.

raptor_unicode_is_xml11_namestartchar ()

    int raptor_unicode_is_xml11_namestartchar (raptor_unichar c);

Check if a Unicode character is legal to start an XML 1.1 Name.

Namespaces in XML 1.1 REC 2004-02-04
http://www.w3.org/TR/2004/REC-xml11-20040204/#NT-NameStartChar
updating Extensible Markup Language (XML) 1.1 REC 2004-02-04
http://www.w3.org/TR/2004/REC-xml11-20040204/ sec 2.3, [4a],
excluding the ':'.

    c :       Unicode character to check
    Returns : non-0 if legal

raptor_unicode_is_xml10_namestartchar ()

    int raptor_unicode_is_xml10_namestartchar (raptor_unichar c);

Check if a Unicode character is legal to start an XML 1.0 Name.

Namespaces in XML REC 1999-01-14
http://www.w3.org/TR/1999/REC-xml-names-19990114/#NT-NCName
updating Extensible Markup Language (XML) 1.0 (Third Edition) REC 2004-02-04
http://www.w3.org/TR/2004/REC-xml-20040204/,
excluding the ':'.

    c :       Unicode character to check
    Returns : non-0 if legal

raptor_unicode_is_xml11_namechar ()

    int raptor_unicode_is_xml11_namechar (raptor_unichar c);

Check if a Unicode codepoint is legal to continue an XML 1.1 Name.

Namespaces in XML 1.1 REC 2004-02-04
http://www.w3.org/TR/2004/REC-xml11-20040204/
updating Extensible Markup Language (XML) 1.1 REC 2004-02-04
http://www.w3.org/TR/2004/REC-xml11-20040204/ sec 2.3, [4a],
excluding the ':'.

    c :       Unicode character
    Returns : non-0 if legal

raptor_unicode_is_xml10_namechar ()

    int raptor_unicode_is_xml10_namechar (raptor_unichar c);

Check if a Unicode codepoint is legal to continue an XML 1.0 Name.

Namespaces in XML REC 1999-01-14
http://www.w3.org/TR/1999/REC-xml-names-19990114/#NT-NCNameChar
updating Extensible Markup Language (XML) 1.0 (Third Edition) REC 2004-02-04
http://www.w3.org/TR/2004/REC-xml-20040204/,
excluding the ':'.

    c :       Unicode character
    Returns : non-0 if legal

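
To make the "start" versus "continue" distinction in the XML 1.1 checks concrete, the following is a minimal self-contained sketch of what such predicates can look like. It is not the library's implementation: every sketch_ name is hypothetical, and the ranges are taken from the NameStartChar and NameChar productions of XML 1.1 with ':' left out, as described above. Code that links against the library would call raptor_unicode_is_xml11_namestartchar() and raptor_unicode_is_xml11_namechar() instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for raptor_unichar (documented as unsigned long). */
typedef unsigned long sketch_unichar;

/* Non-0 if c may start an XML 1.1 Name, with ':' excluded.
 * Ranges follow the NameStartChar production of XML 1.1. */
int sketch_is_xml11_namestartchar(sketch_unichar c)
{
  return (c >= 'A' && c <= 'Z') || c == '_' || (c >= 'a' && c <= 'z') ||
         (c >= 0xC0 && c <= 0xD6) || (c >= 0xD8 && c <= 0xF6) ||
         (c >= 0xF8 && c <= 0x2FF) || (c >= 0x370 && c <= 0x37D) ||
         (c >= 0x37F && c <= 0x1FFF) || (c >= 0x200C && c <= 0x200D) ||
         (c >= 0x2070 && c <= 0x218F) || (c >= 0x2C00 && c <= 0x2FEF) ||
         (c >= 0x3001 && c <= 0xD7FF) || (c >= 0xF900 && c <= 0xFDCF) ||
         (c >= 0xFDF0 && c <= 0xFFFD) || (c >= 0x10000 && c <= 0xEFFFF);
}

/* Non-0 if c may continue an XML 1.1 Name (production [4a]), ':' excluded. */
int sketch_is_xml11_namechar(sketch_unichar c)
{
  return sketch_is_xml11_namestartchar(c) ||
         c == '-' || c == '.' || (c >= '0' && c <= '9') || c == 0xB7 ||
         (c >= 0x0300 && c <= 0x036F) || (c >= 0x203F && c <= 0x2040);
}

/* Non-0 if the n codepoints in cps form a colon-free XML 1.1 name:
 * one start character followed by zero or more name characters. */
int sketch_is_xml11_ncname(const sketch_unichar *cps, size_t n)
{
  size_t i;

  assert(cps != NULL || n == 0);
  if (n == 0 || !sketch_is_xml11_namestartchar(cps[0]))
    return 0;
  for (i = 1; i < n; i++)
    if (!sketch_is_xml11_namechar(cps[i]))
      return 0;
  return 1;
}
```

Under these rules a digit or '-' is accepted inside a name but rejected as its first character, which is why the two predicates are documented separately.
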
raptor_utf8_check ()

    int raptor_utf8_check (unsigned char *string, size_t length);

Check that a string is valid UTF-8.

    string :  UTF-8 string
    length :  length of string
    Returns : non-0 if the string is UTF-8
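
The return-code contracts documented for raptor_unicode_char_to_utf8(), raptor_utf8_to_unicode_char() and raptor_utf8_check() can be illustrated with a small self-contained sketch. This is not the library's code and all sketch_ names are hypothetical. Several details are assumptions made for illustration: the encoder returns only the byte count when output is NULL, "illegal code positions" is read as the UTF-16 surrogates plus U+FFFE and U+FFFF, and a malformed lead or continuation byte is reported as -1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for raptor_unichar (documented as unsigned long). */
typedef unsigned long sketch_unichar;

/* Encode c as UTF-8. Returns the byte count (1 to 4) or -1 if c is above
 * U+10FFFF. Assumption: with out == NULL only the count is returned. */
int sketch_char_to_utf8(sketch_unichar c, unsigned char *out)
{
  static const unsigned char lead[5] = {0, 0, 0xC0, 0xE0, 0xF0};
  int size, i;

  if (c < 0x80) size = 1;
  else if (c < 0x800) size = 2;
  else if (c < 0x10000) size = 3;
  else if (c <= 0x10FFFF) size = 4;
  else return -1;

  if (!out) return size;
  if (size == 1) { out[0] = (unsigned char)c; return 1; }

  /* Fill continuation bytes from the end, six payload bits at a time. */
  for (i = size - 1; i > 0; i--) {
    out[i] = (unsigned char)(0x80 | (c & 0x3F));
    c >>= 6;
  }
  out[0] = (unsigned char)(lead[size] | c);
  return size;
}

/* Decode one character. Returns bytes used, or the documented codes:
 * -1 short/malformed input, -2 overlong, -3 illegal position, -4 too large. */
int sketch_utf8_to_char(sketch_unichar *out, const unsigned char *in, int length)
{
  static const sketch_unichar min_for_size[5] = {0, 0, 0x80, 0x800, 0x10000};
  sketch_unichar c;
  int size, i;

  if (!in || length < 1) return -1;
  if (in[0] < 0x80)                { size = 1; c = in[0]; }
  else if ((in[0] & 0xE0) == 0xC0) { size = 2; c = in[0] & 0x1F; }
  else if ((in[0] & 0xF0) == 0xE0) { size = 3; c = in[0] & 0x0F; }
  else if ((in[0] & 0xF8) == 0xF0) { size = 4; c = in[0] & 0x07; }
  else return -1;                  /* not a valid lead byte */

  if (length < size) return -1;
  if (!out) return size;           /* length only, no conversion */

  for (i = 1; i < size; i++) {
    if ((in[i] & 0xC0) != 0x80) return -1; /* bad continuation byte */
    c = (c << 6) | (in[i] & 0x3F);
  }

  *out = c; /* stored even when -2, -3 or -4 is returned below */
  if (c < min_for_size[size]) return -2;
  if ((c >= 0xD800 && c <= 0xDFFF) || c == 0xFFFE || c == 0xFFFF) return -3;
  if (c > 0x10FFFF) return -4;
  return size;
}

/* Non-0 if every character in the buffer decodes without error. */
int sketch_utf8_check(const unsigned char *s, size_t length)
{
  while (length > 0) {
    sketch_unichar c;
    int n = sketch_utf8_to_char(&c, s, length > 4 ? 4 : (int)length);

    if (n < 0) return 0;
    assert((size_t)n <= length); /* the decoder never over-reads */
    s += n;
    length -= (size_t)n;
  }
  return 1;
}
```

For example, encoding U+20AC with this sketch yields the three bytes E2 82 AC, and decoding those bytes returns 3 with the same codepoint. Storing the character before the range checks mirrors the documented behaviour for the -2, -3 and -4 cases.
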