JSoup doesn't properly parse encoded HTML characters, which confuses the XML Collector

Description

When using the XML Collector to parse HTML documents, the JSoup library is used to convert any HTML to a well-formed XML document.

The problem is that when the document contains encoded characters such as "Cura&ccedil;ao" for "Curaçao", the JSoup document must be initialized in a special way in order to parse the data properly and avoid exceptions.
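To illustrate why an encoded character can cause exceptions further down the line, here is a minimal sketch using only the JDK's XML parser. It assumes the encoded form is a named HTML entity such as `&ccedil;`. The class name `EntityDemo`, the helper `isWellFormedXml`, and the sample strings are hypothetical, and this is not the collector's actual code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, for illustration only.
class EntityDemo {

    /** Returns true if the JDK XML parser accepts the text as well-formed XML. */
    static boolean isWellFormedXml(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Raised, for example, when an entity is referenced but never declared.
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Named HTML entity: XML predefines only amp, lt, gt, apos and quot.
        System.out.println(isWellFormedXml("<p>Cura&ccedil;ao</p>"));
        // Numeric character reference: valid in XML without any declaration.
        System.out.println(isWellFormedXml("<p>Cura&#231;ao</p>"));
    }
}
```

If the XML handed to the parser still contains the named entity, parsing fails. A numeric reference or the literal character parses cleanly. In Jsoup, `Document.OutputSettings#escapeMode` with `Entities.EscapeMode.xhtml` is one setting that controls which entity names are written on output. Whether that is the initialization this issue refers to is an assumption.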

On a customer installation, this problem was generating a DatacollectionFailed on 17.0.0-SNAPSHOT.

Acceptance / Success Criteria

None

Attachments

1

Fixed

Created November 4, 2015 at 3:59 PM
Updated November 4, 2015 at 11:40 PM
Resolved November 4, 2015 at 4:31 PM