audacity/lib-src/libraptor/docs/html/restrict-parser-network-access.html

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Restrict parser network access</title>
<meta name="generator" content="DocBook XSL Stylesheets V1.73.2">
<link rel="start" href="index.html" title="Raptor RDF Syntax Parsing and Serializing Library Manual">
<link rel="up" href="tutorial-parsing.html" title="Parsing syntaxes to RDF Triples">
<link rel="prev" href="tutorial-parser-content.html" title="Provide syntax content to parse">
<link rel="next" href="tutorial-parser-static-info.html" title="Querying parser static information">
<meta name="generator" content="GTK-Doc V1.10 (XML mode)">
<link rel="stylesheet" href="style.css" type="text/css">
<link rel="chapter" href="introduction.html" title="Raptor Overview">
<link rel="part" href="tutorial.html" title="Part I. Raptor Tutorial">
<link rel="chapter" href="tutorial-initialising-finishing.html" title="Initialising and Finishing using the Library">
<link rel="chapter" href="tutorial-querying-functionality.html" title="Listing built-in functionality">
<link rel="chapter" href="tutorial-parsing.html" title="Parsing syntaxes to RDF Triples">
<link rel="chapter" href="tutorial-serializing.html" title="Serializing RDF triples to a syntax">
<link rel="part" href="reference-manual.html" title="Part II. Raptor Reference Manual">
<link rel="chapter" href="raptor-parsers.html" title="Parsers in Raptor (syntax to triples)">
<link rel="chapter" href="raptor-serializers.html" title="Serializers in Raptor (triples to syntax)">
<link rel="index" href="ix01.html" title="Index">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table class="navigation" id="top" width="100%" summary="Navigation header" cellpadding="2" cellspacing="2"><tr valign="middle">
<td><a accesskey="p" href="tutorial-parser-content.html"><img src="left.png" width="24" height="24" border="0" alt="Prev"></a></td>
<td><a accesskey="u" href="tutorial-parsing.html"><img src="up.png" width="24" height="24" border="0" alt="Up"></a></td>
<td><a accesskey="h" href="index.html"><img src="home.png" width="24" height="24" border="0" alt="Home"></a></td>
<th width="100%" align="center">Raptor RDF Syntax Parsing and Serializing Library Manual</th>
<td><a accesskey="n" href="tutorial-parser-static-info.html"><img src="right.png" width="24" height="24" border="0" alt="Next"></a></td>
</tr></table>
<div class="section" lang="en">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="restrict-parser-network-access"></a>Restrict parser network access</h2></div></div></div>
<p>
Parsing can cause network requests to be performed, especially
if a URI is given as an argument such as with
<a class="link" href="raptor-section-parser.html#raptor-parse-uri" title="raptor_parse_uri ()"><code class="function">raptor_parse_uri()</code></a>
however there may also be indirect requests such as with the
GRDDL parser that retrieves URIs depending on the results of
initial parse requests.  The URIs requested may not be wanted
to be fetched or need to be filtered, and this can be done in
three ways.
</p>
<div class="section" lang="en">
<div class="titlepage"><div><div><h3 class="title">
<a name="tutorial-filter-network-with-feature"></a>Filtering parser network requests with feature <a class="link" href="raptor-section-feature.html#RAPTOR-FEATURE-NO-NET:CAPS"><code class="literal">RAPTOR_FEATURE_NO_NET</code></a>
</h3></div></div></div>
<p>
The parser feature
<a class="link" href="raptor-section-feature.html#RAPTOR-FEATURE-NO-NET:CAPS"><code class="literal">RAPTOR_FEATURE_NO_NET</code></a>
can be set with
<a class="link" href="raptor-section-parser.html#raptor-set-feature" title="raptor_set_feature ()"><code class="function">raptor_set_feature()</code></a>
and forbids all network requests.  There is no customisation with
this approach, for that see the URI filter in the next section.
</p>
<pre class="programlisting">
  rdf_parser = raptor_new_parser("rdfxml");

  /* Disable internal network requests */
  raptor_set_feature(rdf_parser, RAPTOR_FEATURE_NO_NET, 1);
</pre>
</div>
<div class="section" lang="en">
<div class="titlepage"><div><div><h3 class="title">
<a name="tutorial-filter-network-www-uri-filter"></a>Filtering parser network requests with <a class="link" href="raptor-section-www.html#raptor-www-set-uri-filter" title="raptor_www_set_uri_filter ()"><code class="function">raptor_www_set_uri_filter()</code></a>
</h3></div></div></div>
<p>
The
<a class="link" href="raptor-section-www.html#raptor-www-set-uri-filter" title="raptor_www_set_uri_filter ()"><code class="function">raptor_www_set_uri_filter()</code></a>

allows setting of a filtering function to operate on all URIs
retrieved by a WWW connection.  This connection can be used in
parsing when operated by hand.
</p>
<pre class="programlisting">
void write_bytes_handler(raptor_www* www, void *user_data,
                         const void *ptr, size_t size, size_t nmemb) {
{
  raptor_parser* rdf_parser=(raptor_parser*)user_data;
  raptor_parse_chunk(rdf_parser, (unsigned char*)ptr, size*nmemb, 0);
}

int uri_filter(void* filter_user_data, raptor_uri* uri) {
  /* return non-0 to forbid the request */
}

int main(int argc, char *argv[]) {
  ...

  rdf_parser = raptor_new_parser("rdfxml");
  www = raptor_new_www();

  /* filter all URI requests */
  raptor_www_set_uri_filter(www, uri_filter, filter_user_data);

  /* make WWW write bytes to parser */
  raptor_www_set_write_bytes_handler(www, write_bytes_handler, rdf_parser);

  raptor_start_parse(rdf_parser, uri);
  raptor_www_fetch(www, uri);
  /* tell the parser that we are done */
  raptor_parse_chunk(rdf_parser, NULL, 0, 1);

  raptor_www_free(www);
  raptor_free_parser(rdf_parser);

  ...
}

</pre>
</div>
<div class="section" lang="en">
<div class="titlepage"><div><div><h3 class="title">
<a name="tutorial-filter-network-parser-uri-filter"></a>Filtering parser network requests with <a class="link" href="raptor-section-parser.html#raptor-parser-set-uri-filter" title="raptor_parser_set_uri_filter ()"><code class="function">raptor_parser_set_uri_filter()</code></a>
</h3></div></div></div>
<p>
The
<a class="link" href="raptor-section-parser.html#raptor-parser-set-uri-filter" title="raptor_parser_set_uri_filter ()"><code class="function">raptor_parser_set_uri_filter()</code></a>
allows setting of a filtering function to operate on all URIs that
the parser sees.  This operates on the internal raptor_www object
used inside parsing to retrieve URIs, similar to that described in
the <a class="link" href="restrict-parser-network-access.html#tutorial-filter-network-www-uri-filter" title="Filtering parser network requests with raptor_www_set_uri_filter()">previous section</a>.
</p>
<pre class="programlisting">
  int uri_filter(void* filter_user_data, raptor_uri* uri) {
    /* return non-0 to forbid the request */
  }

  rdf_parser = raptor_new_parser("rdfxml");
  raptor_parser_set_uri_filter(rdf_parser, uri_filter, filter_user_data);

  /* parse content as normal */
  raptor_parse_uri(rdf_parser, uri, base_uri);
</pre>
</div>
<div class="section" lang="en">
<div class="titlepage"><div><div><h3 class="title">
<a name="tutorial-filter-network-parser-timeout"></a>Setting timeout for parser network requests with feature <a class="link" href="raptor-section-feature.html#RAPTOR-FEATURE-WWW-TIMEOUT:CAPS"><code class="literal">RAPTOR_FEATURE_WWW_TIMEOUT</code></a>
</h3></div></div></div>
<p>If the value of feature
<a class="link" href="raptor-section-feature.html#RAPTOR-FEATURE-WWW-TIMEOUT:CAPS"><code class="literal">RAPTOR_FEATURE_WWW_TIMEOUT</code></a>
if set to a number &gt;0, it is used as the timeout in seconds
for retrieving of URIs during parsing (primarily for GRDDL).
This uses
<a class="link" href="raptor-section-www.html#raptor-www-set-connection-timeout" title="raptor_www_set_connection_timeout ()"><code class="function">raptor_www_set_connection_timeout()</code></a>
internally.
</p>
<pre class="programlisting">
  rdf_parser = raptor_new_parser("grddl");

  /* set internal URI retrieval maximum time to 5 seconds */
  raptor_set_feature(rdf_parser, RAPTOR_FEATURE_WWW_TIMEOUT , 5);
</pre>
</div>
</div>
<div class="footer">
<hr>
          Generated by GTK-Doc V1.10</div>
</body>
</html>