The document is assumed to be encoded as UTF-8.
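A reader who wants to check this assumption rather than trust it could do so along the following lines. This is only a minimal sketch in Python: the helper names `is_valid_utf8` and `decode_utf8` are hypothetical and not part of the original text, and stripping a leading byte-order mark is an added convenience that the statement above does not require.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form a well-formed UTF-8 sequence."""
    try:
        # The default error handler is "strict", which raises on invalid bytes.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def decode_utf8(data: bytes) -> str:
    """Decode bytes as UTF-8, dropping a leading byte-order mark if present.

    Hypothetical helper: BOM handling is an assumption made for this sketch.
    """
    # "utf-8-sig" decodes like "utf-8" but removes an optional leading BOM.
    # Invalid input still raises UnicodeDecodeError.
    return data.decode("utf-8-sig", errors="strict")


if __name__ == "__main__":
    sample = "héllo".encode("utf-8")
    print(is_valid_utf8(sample))          # well-formed UTF-8
    print(is_valid_utf8(b"h\xe9llo"))     # Latin-1 bytes, not valid UTF-8
    print(decode_utf8(b"\xef\xbb\xbfabc"))
```

Strict decoding is used here so that a document in another encoding fails loudly instead of being silently misread.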