JSoup doesn't properly parse encoded HTML characters, which confuses the XML Collector
Description
Acceptance / Success Criteria
None
Attachments
1
Fixed
Details
Assignee: Alejandro Galue
Reporter: Alejandro Galue
Priority: Major
Created November 4, 2015 at 3:59 PM
Updated November 4, 2015 at 11:40 PM
Resolved November 4, 2015 at 4:31 PM
When the XML Collector parses HTML documents, it uses the JSoup library to convert the HTML into a well-formed XML document.
The problem is that when the document contains HTML-encoded characters, such as the character entity that stands for the "ç" in "Curaçao", the JSoup document must be initialized in a special way in order to parse the data properly and avoid exceptions.
On a customer installation running 17.0.0-SNAPSHOT, this problem was generating a DatacollectionFailed.
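
To illustrate why an encoded character can end in an exception once the HTML is treated as XML, here is a minimal sketch that uses only the JDK's XML parser. It is not the fix applied for this issue and does not use JSoup. It assumes the offending markup used a named HTML entity such as `&ccedil;` (the issue text does not show the exact entity), and the class and method names are hypothetical. XML predefines only five named entities (`amp`, `lt`, `gt`, `apos`, `quot`), so any other named entity that survives the HTML-to-XML conversion is rejected unless it is rewritten, for example as a numeric character reference.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class EntityParsingSketch {

    // Returns true if the JDK XML parser accepts the text as well-formed XML.
    static boolean isWellFormedXml(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Raised, for example, when an entity is referenced but not declared.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed input: a named HTML entity that XML does not define.
        String namedEntity = "<p>Cura&ccedil;ao</p>";
        // The same character written as a numeric reference, which XML accepts.
        String numericReference = "<p>Cura&#231;ao</p>";

        System.out.println("named entity well-formed: "
                + isWellFormedXml(namedEntity));
        System.out.println("numeric reference well-formed: "
                + isWellFormedXml(numericReference));
    }
}
```

On the JSoup side, `Document.OutputSettings` offers `escapeMode(Entities.EscapeMode.xhtml)` and `syntax(Document.OutputSettings.Syntax.xml)`, which are plausible settings to check when the converted output still contains HTML-only entities. Whether they correspond to the special initialization mentioned above is an assumption.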