OpenNMS / NMS-12226

Wrong PID in opennms.pid

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 24.1.2
    • Fix Version/s: 25.0.0
    • Component/s: Build / Packaging
    • Security Level: Default (Default Security Scheme)
    • Labels:
      None
    • Environment:
      Seen on CentOS 7, but probably distribution independent
    • Sprint:
      Horizon 2019 - September 11th

      Description

      Steps to reproduce:

      1) Start OpenNMS

      2) Find the actual PID of the OpenNMS JVM via the ps command:

      [vagrant@horizon-24-1-2 opennms]$ ps -ef | grep 'java.*opennms.*bootstra[p]'
      root      7021  7019  3 16:15 ?        00:03:59 /usr/lib/jvm/java-1.8.0-openjdk/bin/java -Djava.endorsed.dirs=/opt/opennms/lib/endorsed -Dopennms.home=/opt/opennms -Xmx1024m -XX:+HeapDumpOnOutOfMemoryError -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.login.config=opennms -Dcom.sun.management.jmxremote.access.file=/opt/opennms/etc/jmxremote.access -DisThreadContextMapInheritable=true -Dgroovy.use.classvalue=true -XX:MaxMetaspaceSize=512m -Djava.io.tmpdir=/opt/opennms/data/tmp -XX:+StartAttachListener -jar /opt/opennms/lib/opennms_bootstrap.jar start 

      3) Compare against the value in /var/log/opennms/opennms.pid:

      [vagrant@horizon-24-1-2 opennms]$ cat /var/log/opennms/opennms.pid 
      7019
      

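      Expected result: the PID in the file matches the PID in column 2 of the ps output from step 2 (7021 in this case)

      Actual result: the PID in the file differs (7019, which is the parent PID shown in column 3 of the ps output)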
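
The comparison in steps 2 and 3 can be scripted. The following is a minimal sketch, assuming a Linux system where /proc/<pid>/status is readable; the function names parent_pid_of and classify_pidfile are hypothetical helpers and not part of OpenNMS.

```shell
#!/bin/sh
# Hypothetical helpers for checking a PID file; not part of OpenNMS.

# Print the parent PID of process $1 by reading /proc (Linux only).
parent_pid_of() {
    [ -r "/proc/$1/status" ] || return 1
    while read -r key value; do
        if [ "$key" = "PPid:" ]; then
            echo "$value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# Compare the PID stored in file $1 with the real process PID $2.
# Prints "match", "parent" (the file holds the process's parent PID),
# or "mismatch".
classify_pidfile() {
    recorded=$(cat "$1") || return 2
    if [ "$recorded" = "$2" ]; then
        echo match
    elif [ "$recorded" = "$(parent_pid_of "$2")" ]; then
        echo parent
    else
        echo mismatch
    fi
}
```

With the values from this report, `classify_pidfile /var/log/opennms/opennms.pid 7021` should report that the file holds the parent PID rather than the JVM's own PID.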

       

      Notes:

      • The value in the PID file seems to be consistently equal to the parent PID, i.e. the PID of the OPENNMS_HOME/bin/opennms shell script
      • The value in karaf.pid is the correct value that should be in opennms.pid
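      • I think the introduction of the runCmd function in the start / stop script (commit 24221ac2d0d92c839878209e328477fec4265c76) is the proximate cause: running the JVM through a function changed the semantics of the "last background process" (presumably the shell's $! value), which would now refer to the shell running the function rather than to the JVM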
      • This bug doesn't seem to break the control script, perhaps because it uses the Attach API, but it does break tools like generate-opennms-thread-dump which read opennms.pid
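
The suspected mechanism can be reproduced outside OpenNMS. The sketch below is an illustration under stated assumptions, not the actual start script: run_cmd is a hypothetical stand-in for runCmd, and `sh -c 'echo $$'` stands in for the JVM by printing its own PID. When a shell function is put in the background, the shell forks a subshell to run it, so $! is the subshell's PID and the real command runs as that subshell's child.

```shell
#!/bin/sh
# Hypothetical stand-in for a wrapper such as runCmd; the real function
# in the OpenNMS script may look different.
run_cmd() {
    "$@"
    status=$?
    return "$status"
}

# Launch the command directly in the background.
# Prints "<PID from $!> <PID the command saw as its own>".
launch_direct() {
    out=$(mktemp)
    sh -c 'echo $$' > "$out" &
    recorded=$!
    wait "$recorded"
    echo "$recorded $(cat "$out")"
    rm -f "$out"
}

# Launch the same command through the wrapper function in the background.
# Here $! refers to the subshell that runs run_cmd, not to the command.
launch_wrapped() {
    out=$(mktemp)
    run_cmd sh -c 'echo $$' > "$out" &
    recorded=$!
    wait "$recorded"
    echo "$recorded $(cat "$out")"
    rm -f "$out"
}

launch_direct    # the two PIDs are the same
launch_wrapped   # the two PIDs differ
```

If this is indeed what happens in the start script, writing $! to opennms.pid after a wrapped launch would record the parent PID, which matches the 7019 / 7021 pair seen above.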

            People

            • Assignee: Ronny Trommer (indigo)
            • Reporter: Jeff Gehlbach (jeffg)