Details
Bug
Status: Resolved
Blocker
Resolution: Fixed
17.0.0
Security Level: Default (Default Security Scheme)
None
Description
A long time ago, the requisitions ReST API was modified to return a 303 "See Other" with the URL of the modified entity after a request that adds or modifies an entity (by entity I mean requisition, node, interface, monitored-service, assets, categories, etc.).
Almost all ReST clients expect an HTTP 200 for a successful response; when a 303 is sent instead, some follow the redirect and others don't. In fact, provision.pl has been modified to treat a 303 as a successful response.
Modern browsers will always follow the redirect, and there is no way to change this behavior in the browser.
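A client that has to cope with both behaviors can treat a 303 as success and decide explicitly whether to fetch the Location target. The following is only a minimal Python sketch of that idea; the function name `classify_response` is hypothetical and is not part of OpenNMS or provision.pl.

```python
from typing import Optional, Tuple


def classify_response(status: int, location: Optional[str]) -> Tuple[bool, Optional[str]]:
    """Return (success, follow_url) for a requisition write request.

    Any 2xx status is a plain success. A 303 "See Other" is also treated
    as success, and its Location header is handed back so the caller can
    choose whether to GET the modified entity.
    """
    if 200 <= status < 300:
        return True, None
    if status == 303:
        return True, location
    return False, None
```

With this shape, a caller that does not want to follow the redirect simply ignores the second element of the result.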
When the redirects are followed, the workflow of manipulating requisitions through the ReST API will break under the following condition:
"Adding a node to an existing requisition and following the redirect, then importing the requisition and following the redirect on the PUT request response."
I think this is a race condition: each GET request triggers the generation of the requisition XML file in the pending directory while, at the same time, the ReST API is trying to read that file to send it to the requester.
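To illustrate the kind of failure suspected here, the sketch below shows a reader parsing an XML file while the writer is only halfway through it. It is a deterministic Python illustration of the hypothesis, not OpenNMS code; the file name, the sample XML, and the helper `try_read` are all assumptions made for the example.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical, minimal requisition-like document (no namespace).
xml_text = (
    '<model-import foreign-source="Test">'
    '<node foreign-id="1" node-label="n1"/>'
    '</model-import>'
)


def try_read(path):
    """Parse the file and return the root tag, or None if it is not well-formed."""
    try:
        return ET.parse(path).getroot().tag
    except ET.ParseError:
        return None


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Test.xml")
    half = len(xml_text) // 2
    with open(path, "w", encoding="utf-8") as f:
        f.write(xml_text[:half])
        f.flush()
        # The reader arrives while the writer is mid-way: parsing fails.
        mid_write = try_read(path)
        f.write(xml_text[half:])
    # Once the writer has finished, the same read succeeds.
    after_write = try_read(path)
```

In this sketch `mid_write` is `None` and `after_write` is the root tag, which mirrors an unmarshalling failure that only appears when a read overlaps a write.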
The exception generated is the following:
org.opennms.core.xml.MarshallingResourceFailureException
    at org.opennms.core.xml.MarshallingExceptionTranslator.translate(MarshallingExceptionTranslator.java:73)
    at org.opennms.core.xml.JaxbUtils.unmarshal(JaxbUtils.java:242)
    at org.opennms.core.xml.JaxbUtils.unmarshal(JaxbUtils.java:173)
    at org.opennms.core.xml.JaxbUtils.unmarshal(JaxbUtils.java:166)
    at org.opennms.netmgt.provision.persist.RequisitionFileUtils.getRequisitionFromFile(RequisitionFileUtils.java:70)
    at org.opennms.netmgt.provision.persist.FilesystemForeignSourceRepository.getRequisition(FilesystemForeignSourceRepository.java:267)
    at org.opennms.web.svclayer.support.DefaultRequisitionAccessService$RequisitionAccessor.getActiveRequisition(DefaultRequisitionAccessService.java:134)
    at org.opennms.web.svclayer.support.DefaultRequisitionAccessService$RequisitionAccessor.getRequisition(DefaultRequisitionAccessService.java:410)
    at org.opennms.web.svclayer.support.DefaultRequisitionAccessService$6.call(DefaultRequisitionAccessService.java:570)
    at org.opennms.web.svclayer.support.DefaultRequisitionAccessService$6.call(DefaultRequisitionAccessService.java:568)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
I've created a simple shell script that recreates the problem using curl.
The important thing is that on OpenNMS 14 and Meridian this is not a problem, but it can be reproduced every time on the OpenNMS 17 snapshot.
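The request sequence described above could be sketched roughly as follows. This is not the attached shell script, only a Python approximation of the same steps under several assumptions: a local install at `http://localhost:8980/opennms/rest`, the default `admin`/`admin` credentials, an existing requisition named `Test`, and a minimal node payload. The endpoint paths reflect my understanding of the requisitions ReST API, and the helper names are hypothetical.

```python
import base64
import http.client
from urllib.parse import urljoin, urlsplit

BASE = "http://localhost:8980/opennms/rest"  # assumed default install URL
REQUISITION = "Test"                         # hypothetical requisition name
NODE_XML = '<node foreign-id="n1" node-label="n1"/>'  # minimal sample node


def build_steps(base=BASE, name=REQUISITION):
    """Return the (method, url, body) steps: add a node, then import."""
    return [
        ("POST", f"{base}/requisitions/{name}/nodes", NODE_XML),
        ("PUT", f"{base}/requisitions/{name}/import", None),
    ]


def request(method, url, body=None, user="admin", password="admin"):
    """Send one request without auto-following redirects.

    Returns (status, location), where location is the Location header or None.
    """
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=30)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}"}
    data = None
    if body is not None:
        headers["Content-Type"] = "application/xml"
        data = body.encode("utf-8")
    try:
        conn.request(method, parts.path, body=data, headers=headers)
        resp = conn.getresponse()
        resp.read()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def run(steps):
    """Execute each step and follow a 303 with an explicit GET."""
    for method, url, body in steps:
        status, location = request(method, url, body)
        print(method, url, status)
        if status == 303 and location:
            target = urljoin(url, location)
            print("GET", target, request("GET", target)[0])
```

Calling `run(build_steps())` against a test server would perform the add-node and import steps, following each 303 with a GET, which is the combination reported to trigger the exception.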
SOLUTION:
A long time ago, I modified the requisition file handling to keep an in-memory cache that accelerates the Requisitions ReST API.
I left the default "file" implementation enabled for backward compatibility; the accelerated version (i.e. "fastFile") is the recommended option for very large requisitions.
If I enable "fastFile", the problem disappears:
org.opennms.provisiond.repositoryImplementation=fastFile
Before making "fastFile" the new default, I would like to know whether the changes related to Apache CXF (or something else) are responsible for this behavior.
I'm attaching the web.log generated after running the script, with the full stack trace of the exception.
Attachments
Issue Links
- depends on NMS-7926 FasterFilesystemForeignSourceRepository is not working as expected (Resolved)