ELinks 0.16.1.1
parser.h File Reference
#include "dom/code.h"
#include "dom/node.h"
#include "dom/stack.h"
#include "dom/sgml/sgml.h"
#include "dom/scanner.h"

Data Structures

struct  sgml_parser_state
 SGML parser state.
struct  sgml_parser
 The SGML parser.

Typedefs

typedef enum dom_code(* sgml_error_T) (struct sgml_parser *, struct dom_string *, unsigned int)
 SGML error callback.

Enumerations

enum  sgml_parser_type { SGML_PARSER_STREAM , SGML_PARSER_TREE }
 SGML parser type.
enum  sgml_parser_flag { SGML_PARSER_COUNT_LINES = 1 , SGML_PARSER_COMPLETE = 2 , SGML_PARSER_INCREMENTAL = 4 , SGML_PARSER_DETECT_ERRORS = 8 }
 SGML parser flags.

Functions

struct sgml_parser * init_sgml_parser (enum sgml_parser_type type, enum sgml_document_type doctype, struct dom_string *uri, unsigned int flags)
 Initialise an SGML parser.
void done_sgml_parser (struct sgml_parser *parser)
 Release an SGML parser.
enum dom_code parse_sgml (struct sgml_parser *parser, char *buf, size_t bufsize, int complete)
 Parse a chunk of SGML source.
unsigned int get_sgml_parser_line_number (struct sgml_parser *parser)
 Get the line position in the source.

Typedef Documentation

◆ sgml_error_T

typedef enum dom_code(* sgml_error_T) (struct sgml_parser *, struct dom_string *, unsigned int)

SGML error callback.

Called by the SGML parser when a parsing error has occurred.

If the return code is not DOM_CODE_OK the parsing will be ended and that code will be returned.
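
The return-code rule above can be sketched with a handler that tolerates a few errors and then asks the parser to stop. This is a minimal illustration, not ELinks code: the type declarations are local stand-ins so the block compiles on its own, `DEMO_CODE_STOP`, `DEMO_MAX_ERRORS`, `demo_error_count`, `demo_error_handler` and `demo_handler` are hypothetical names, and the meaning of the second and third callback arguments is an assumption, since this page does not name them. How a handler is attached to a parser is not covered on this page and is not shown.

```c
#include <stddef.h>

/* Stand-ins so the sketch compiles alone; real code includes parser.h. */
struct sgml_parser;   /* opaque here */
struct dom_string;    /* opaque here */

enum dom_code {
    DOM_CODE_OK,      /* named in the text above */
    DEMO_CODE_STOP    /* hypothetical non-OK code, not an ELinks name */
};

/* Same shape as the typedef documented above. */
typedef enum dom_code (*sgml_error_T)(struct sgml_parser *,
                                      struct dom_string *, unsigned int);

#define DEMO_MAX_ERRORS 3            /* hypothetical tolerance */
static unsigned int demo_error_count;

/* Assumption: the string relates to the error and the integer is a line
 * number; this page does not name the arguments. */
static enum dom_code
demo_error_handler(struct sgml_parser *parser, struct dom_string *string,
                   unsigned int line)
{
    (void) parser;
    (void) string;
    (void) line;

    demo_error_count++;
    if (demo_error_count > DEMO_MAX_ERRORS)
        return DEMO_CODE_STOP;   /* non-OK: parsing ends with this code */

    return DOM_CODE_OK;          /* OK: parsing continues */
}

/* Compile-time check that the handler matches the callback type. */
static const sgml_error_T demo_handler = demo_error_handler;
```

With this handler the first three reported errors are accepted and the fourth ends parsing with the non-OK code.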

Enumeration Type Documentation

◆ sgml_parser_flag

SGML parser flags.

These flags control how the parser behaves.

Enumerator
SGML_PARSER_COUNT_LINES 

Make line numbers available.

SGML_PARSER_COMPLETE 

Used internally when incremental.

SGML_PARSER_INCREMENTAL 

Parse chunks of input.

SGML_PARSER_DETECT_ERRORS 

Report errors.
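
The flag values are distinct powers of two, so they can be combined with bitwise OR into the `unsigned int flags` argument of init_sgml_parser. The sketch below assumes a local copy of the enumeration so that it compiles on its own (real code would include parser.h instead); `incremental_checked_flags` and `has_flag` are hypothetical helpers, not part of ELinks.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copy of the values listed in the enumeration, for illustration. */
enum sgml_parser_flag {
    SGML_PARSER_COUNT_LINES   = 1,
    SGML_PARSER_COMPLETE      = 2,
    SGML_PARSER_INCREMENTAL   = 4,
    SGML_PARSER_DETECT_ERRORS = 8
};

/* Hypothetical helper: flags for chunked parsing with error reporting. */
static unsigned int
incremental_checked_flags(void)
{
    return SGML_PARSER_INCREMENTAL | SGML_PARSER_DETECT_ERRORS;
}

/* Hypothetical helper: test whether a single flag is set in a mask. */
static bool
has_flag(unsigned int flags, enum sgml_parser_flag flag)
{
    unsigned int bit = (unsigned int) flag;

    /* Precondition: exactly one bit is set in the flag being tested. */
    assert(bit != 0 && (bit & (bit - 1)) == 0);

    return (flags & bit) != 0;
}
```

SGML_PARSER_COMPLETE is left out of the combined mask because the description above marks it as used internally.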

◆ sgml_parser_type

SGML parser type.

There are two parser types: one that optimises one-time access to the DOM tree and one that creates a persistent DOM tree.

Enumerator
SGML_PARSER_STREAM 

The first one will simply push nodes on the stack, not building a DOM tree.

This interface is similar to that of SAX (Simple API for XML) where events are fired when nodes are entered and exited. It is useful when you are not actually interested in the DOM tree, but can do all processing in a stream-like manner, such as when highlighting HTML code.

SGML_PARSER_TREE 

The second one is a DOM tree builder that builds a persistent DOM tree.

When using this type, it is possible to do even more (pre)processing than for parser streams. For example, you can sort element child nodes, or purge various nodes, such as text nodes that only contain space characters.

Function Documentation

◆ done_sgml_parser()

void done_sgml_parser ( struct sgml_parser * parser)

Release an SGML parser.

Deallocates all resources, except the root node.

Parameters
parser    The parser being released.

◆ get_sgml_parser_line_number()

unsigned int get_sgml_parser_line_number ( struct sgml_parser * parser)

Get the line position in the source.

Note
Line numbers are recorded in the scanner tokens.
Parameters
parser    A parser created with init_sgml_parser.
Returns
The line number the parser is currently at, or zero if nothing has been parsed yet.

◆ init_sgml_parser()

struct sgml_parser * init_sgml_parser ( enum sgml_parser_type type,
enum sgml_document_type doctype,
struct dom_string * uri,
unsigned int flags )

Initialise an SGML parser.

Initialise an SGML parser with the given properties.

Parameters
type      Stream or tree; one-time or persistent.
doctype   The document type; this affects what subtypes nodes are given.
uri       The URI of the document root.
flags     Flags controlling the behaviour of the parser.
Returns
The created parser or NULL.

◆ parse_sgml()

enum dom_code parse_sgml ( struct sgml_parser * parser,
char * buf,
size_t bufsize,
int complete )

Parse a chunk of SGML source.

Parses the given buffer. For incremental rendering, the last buffer can be signalled through the complete parameter.

Parameters
parser    A parser created with init_sgml_parser.
buf       A buffer containing the chunk to parse.
bufsize   The size of the buffer given in the buf parameter.
complete  Whether this is the last chunk to parse.
Returns
DOM_CODE_OK if the buffer was successfully parsed, else a code hinting at the error.
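
The chunked calling pattern described above can be sketched as a loop that sets complete only for the final chunk and stops at the first code other than DOM_CODE_OK. This is a sketch under stated assumptions, not ELinks code: the types are local stand-ins, `parse_chunk_fn`, `feed_in_chunks`, `demo_parse` and `DEMO_CODE_FAIL` are hypothetical names, and `demo_parse` is a counting stub with the same shape as parse_sgml rather than the real parser. Whether the real parser returns other non-error codes for intermediate chunks is not stated on this page.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles alone; real code includes parser.h. */
struct sgml_parser;
enum dom_code { DOM_CODE_OK, DEMO_CODE_FAIL /* hypothetical */ };

/* Same shape as the parse_sgml() prototype above. */
typedef enum dom_code (*parse_chunk_fn)(struct sgml_parser *, char *,
                                        size_t, int);

/* Feed len bytes of buf to parse in pieces of at most chunk bytes. */
static enum dom_code
feed_in_chunks(parse_chunk_fn parse, struct sgml_parser *parser,
               char *buf, size_t len, size_t chunk)
{
    size_t off = 0;

    assert(parse != NULL && buf != NULL && chunk > 0);

    do {
        size_t n = (len - off < chunk) ? len - off : chunk;
        int complete = (off + n == len);   /* true only for the last chunk */
        enum dom_code code = parse(parser, buf + off, n, complete);

        if (code != DOM_CODE_OK)
            return code;                   /* pass the error code through */
        off += n;
    } while (off < len);

    return DOM_CODE_OK;
}

/* Counting stub standing in for parse_sgml(); it parses nothing. */
static unsigned int demo_calls;
static size_t demo_bytes;
static int demo_last_complete;

static enum dom_code
demo_parse(struct sgml_parser *parser, char *buf, size_t bufsize, int complete)
{
    (void) parser;
    (void) buf;
    demo_calls++;
    demo_bytes += bufsize;
    demo_last_complete = complete;
    return DOM_CODE_OK;
}
```

Passing the parse function as a pointer keeps the loop independent of the stub; in real code the pointer would presumably be parse_sgml itself, with a parser created using the SGML_PARSER_INCREMENTAL flag.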