Jenkins / JENKINS-52408

Multiple jobs on same pipeline: java.io.NotSerializableException: org.jenkinsci.plugins.workflow.job.WorkflowJob

      Description

      Hi guys,

      I have a complex workflow with multiple stages and steps (some of which run in parallel).

      Everything worked fine until last month.

      Now, when I launch my job on a different GitHub branch (with a multibranch pipeline job), I get the following error (not every time):

      java.io.NotSerializableException: org.jenkinsci.plugins.workflow.job.WorkflowJob 

      Here are the steps of the stage that fails:

      Stage : Start - (13 sec in block) Configuration Success

      Configuration - (11 sec in block) Success

      Shell Script - (1.8 sec in self) mkdir -p logs cache/container cache/dev cache/test Console Output Success

      Shell Script - (1.7 sec in self) cp alk_service_auth/application/config/parameters_test.yml.dist alk_service_auth/application/config/parameters_test.yml ||: Console Output
      Success

      Bind credentials to variables : Start - (6.6 sec in block) Console Output Success

      Bind credentials to variables : Body : Start - (4.2 sec in block) Success

      Shell Script - (1.3 sec in self) Console Output Success

      Shell Script - (1.7 sec in self) sed -i s/DB_SCHEMA/alk_auth_b1267c9d/g alk_service_auth/application/config/parameters_test.yml Console Output Success

      Stage : Start - (4 min 13 sec in block) Build & Test Success

      Build & Test - (4 min 11 sec in block) Success

      Verify if file exists in workspace - (1.7 sec in self) /var/www/service-auth.alkemics.com/shared/env Success

      Shell Script - (1.7 sec in self) Console Output Success

      Shell Script - (14 sec in self) /var/www/service-auth.alkemics.com/shared/env/bin/pip install -e .[test] --process-dependency-links Console Output Success

      Shell Script - (3.3 sec in self) /var/www/service-auth.alkemics.com/shared/env/bin/pip check Console Output Success

      Shell Script - (2.1 sec in self) Console Output Success

      Shell Script - (2.8 sec in self) Console Output Success

      Shell Script - (3.6 sec in self) Console Output Success

      Shell Script - (4.4 sec in self) /var/www/service-auth.alkemics.com/shared/env/bin/flake8 alk_service_auth Console Output Success

      Shell Script - (3 sec in self) Console Output Success

      Shell Script - (3 min 21 sec in self) /var/www/service-auth.alkemics.com/shared/env/bin/nosetests --nologcapture --verbose --with-xunit --xunit-file=xunit.xml
      alk_service_auth Console Output Success

      Shell Script - (6.2 sec in self) Console Output Success

      Shell Script - (2.2 sec in self) Console Output Success

      Shell Script - (1.7 sec in self) Console Output Success

      Error signal - (0.39 sec in self) Python unit tests failed: java.io.NotSerializableException: org.jenkinsci.plugins.workflow.job.WorkflowJob Failed

      Print Message - (1.6 sec in self) hudson.AbortException: Python unit tests failed: java.io.NotSerializableException: org.jenkinsci.plugins.workflow.job.WorkflowJob

       

      It cuts the unit-test step short for no obvious reason.

       

      Do you have any idea about it?

       

        Attachments

          Activity

          Romain N created issue -
          Oleg Nenashev added a comment -

          Please provide a full stack trace if possible. It should be in the system or agent log if it is not in the Pipeline log (otherwise it is a diagnosability bug).

          "Python unit tests failed: java.io.NotSerializableException: org.jenkinsci.plugins.workflow.job.WorkflowJob" — could you please provide your Pipeline? Which plugin do you use to run the Python unit tests?

          I will conditionally mark it as a JEP-200 issue since that looks like a plausible root cause (to be confirmed).

          Oleg Nenashev made changes -
          Labels: exception pipeline → JEP-200 exception pipeline
          Romain N added a comment -

          I can't find a specific exception linked to this.

          However, I do have an exception linked to the docker plugin:

          Failed to send back a reply to the request hudson.remoting.Request$2@3c508964
          java.io.IOException
              at hudson.remoting.Channel.close(Channel.java:1447)
              at hudson.remoting.Channel.close(Channel.java:1403)
              at hudson.slaves.SlaveComputer.closeChannel(SlaveComputer.java:746)
              at hudson.slaves.SlaveComputer.access$800(SlaveComputer.java:99)
              at hudson.slaves.SlaveComputer$3.run(SlaveComputer.java:664)
              at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
              at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          Caused: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@32d69667:docker-58a6cb2018e5d": channel is already closed
              at hudson.remoting.Channel.send(Channel.java:717)
              at hudson.remoting.Request$2.run(Request.java:382)
              at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
              at org.jenkinsci.remoting.CallableDecorator.call(CallableDecorator.java:19)
              at hudson.remoting.CallableDecoratorList$1.call(CallableDecoratorList.java:21)
              at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
              at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:71)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)

          I can't give you all our pipeline code; I don't have permission.

          Here is the code that launches the unit tests:

          node("docker") {
              def database = null
              try {
                  stage("Checkout") { checkout scm }
                  if (create_database) {
                      stage("Database Setup") {
                          database = database_setup(app_config, database, app_root, repo_slug, args)
                      }
                  }
                  if (push_config) {
                      stage("Configuration") {
                          common.deploy_config("${app_config}/config", database, create_database, use_elasticsearch)
                      }
                  }
                  stage("Build & Test") {
                      build_test(repo_slug, app_root, clone_translations, check_coverage, args)
                      step([$class: 'JUnitResultArchiver', testResults: '*.xml', allowEmptyResults: true])
                  }
              } // (catch/finally omitted in this excerpt)
          }

          // Here is the build_test method:
          def service = env.JOB_NAME.split("-pipeline")[0]
          startDate = datadog.initStep('python-unittests', service)
          try {
              sh("${vars} ${env_path}/bin/nosetests --nologcapture --verbose --with-xunit --xunit-file=xunit.xml ${path} ${coverage}")
              datadog.endStep('python-unittests', service, startDate, true)
          } catch (e) {
              datadog.endStep('python-unittests', service, startDate, false)
              error "Python unit tests failed: " + e
          }

          The docker node is a node managed by the docker plugin.

          As you can see, we don't use a plugin for the unit tests; they are launched with nosetests via the sh step.

           

          Romain N added a comment - edited

          Hi,

          I have some logs from the agent:

           

          [07/11/18 09:29:15] [SSH] Opening SSH connection to int-oro-docker-7:32908.
          [07/11/18 09:29:15] [SSH] WARNING: SSH Host Keys are not being verified. Man-in-the-middle attacks may be possible against this connection.
          [07/11/18 09:29:15] [SSH] Authentication successful.
          [07/11/18 09:29:15] [SSH] The remote user's environment is: BASH=/bin/bash BASHOPTS=cmdhist:complete_fullquote:extquote:force_fignore:hostcomplete:interactive_comments:progcomp:promptvars:sourcepath BASH_ALIASES=() BASH_ARGC=() BASH_ARGV=() BASH_CMDS=() BASH_EXECUTION_STRING=set BASH_LINENO=() BASH_SOURCE=() BASH_VERSINFO=([0]="4" [1]="3" [2]="48" [3]="1" [4]="release" [5]="x86_64-pc-linux-gnu") BASH_VERSION='4.3.48(1)-release' DIRSTACK=() EUID=999 GROUPS=() HOME=/home/deploy HOSTNAME=44b22e41f61f HOSTTYPE=x86_64 IFS=$' \t\n' LOGNAME=deploy MACHTYPE=x86_64-pc-linux-gnu MAIL=/var/mail/deploy OPTERR=1 OPTIND=1 OSTYPE=linux-gnu PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games PIPESTATUS=([0]="0") PPID=37 PS4='+ ' PWD=/home/deploy SHELL=/bin/bash SHELLOPTS=braceexpand:hashall:interactive-comments SHLVL=1 SSH_CLIENT='172.31.16.52 40404 22' SSH_CONNECTION='172.31.16.52 40404 172.17.0.7 22' TERM=dumb UID=999 USER=deploy _=']'
          [07/11/18 09:29:15] [SSH] Checking java version of java
          [07/11/18 09:29:15] [SSH] java -version returned 1.8.0_171.
          [07/11/18 09:29:15] [SSH] Starting sftp client.
          [07/11/18 09:29:15] [SSH] Copying latest slave.jar...
          [07/11/18 09:29:16] [SSH] Copied 770,802 bytes.
          Expanded the channel window size to 4MB
          [07/11/18 09:29:16] [SSH] Starting slave process: cd "/home/deploy" && java -jar slave.jar
          <===[JENKINS REMOTING CAPACITY]===>channel started
          Remoting version: 3.20
          This is a Unix agent
          Evacuated stdout
          Agent successfully connected and online
          Jul 11, 2018 9:29:25 AM org.jenkinsci.remoting.util.AnonymousClassWarnings warn
          WARNING: Attempt to (de-)serialize anonymous class net.bull.javamelody.RemoteCallHelper$1; see: https://jenkins.io/redirect/serialization-of-anonymous-classes/
          Jul 11, 2018 9:29:30 AM org.jenkinsci.remoting.util.AnonymousClassWarnings warn
          WARNING: Attempt to (de-)serialize anonymous class org.jenkinsci.plugins.gitclient.Git$1; see: https://jenkins.io/redirect/serialization-of-anonymous-classes/
          Jul 11, 2018 9:29:40 AM org.jenkinsci.remoting.util.AnonymousClassWarnings warn
          WARNING: Attempt to (de-)serialize anonymous class org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1; see: https://jenkins.io/redirect/serialization-of-anonymous-classes/
          

           

          Oleg Nenashev added a comment -

          > channel is already closed at hudson.remoting.Channel.send

          This is "fine" behavior for cloud agents. Not related to this report, at least.

           

          > org.jenkinsci.plugins.gitclient.Git$1
          > org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1

          Josh Soref has recently created a patch:  https://github.com/jenkinsci/git-client-plugin/pull/330

           

          > net.bull.javamelody.RemoteCallHelper$1

          Just a minor defect; it may make sense to create a Jira ticket for the Monitoring plugin.

           

          None of these issues are related; digging into the code.

           

          Oleg Nenashev added a comment -

          Romain N, I think it is something in your Pipeline library. "Python unit tests failed" — there is no such text pattern in the Jenkins codebase. My guess is that datadog.endStep() is a Pipeline library method which accesses the WorkflowJob class through a whitelisted Java API and saves it to a local variable. This logic happens in a method without the @NonCPS annotation, so Pipeline tries to save the context and legitimately fails, since the class must not be serialized that way.

          Please check your Pipeline lib to verify my theory.
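
          The failure mode Oleg describes can be illustrated with a minimal sketch (method names here are hypothetical, not from the reporter's library): a CPS-transformed method that holds a live WorkflowJob in a local variable can fail when Pipeline checkpoints its state, while a @NonCPS method runs in one shot and never has its locals serialized.

          ```groovy
          import com.cloudbees.groovy.cps.NonCPS
          import jenkins.model.Jenkins

          // Problematic: a plain (CPS-transformed) method keeps a non-serializable
          // WorkflowJob in a local variable. If Pipeline checkpoints while the
          // variable is live, it throws java.io.NotSerializableException.
          def jobDisplayName(String fullName) {
              def job = Jenkins.getInstance().getItemByFullName(fullName) // e.g. a WorkflowJob
              return job.getDisplayName()
          }

          // Safer: @NonCPS methods execute as ordinary Groovy in a single step,
          // so their locals are never serialized; only the String result crosses
          // back into CPS-transformed code.
          @NonCPS
          def jobDisplayNameSafe(String fullName) {
              def job = Jenkins.getInstance().getItemByFullName(fullName)
              return job.getDisplayName()
          }
          ```

          (Note that @NonCPS methods must not call Pipeline steps such as sh or echo.)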

          Romain N added a comment -

          Hi,

          datadog is not an external library; it's a class we wrote ourselves to send metrics to Datadog.

          Here is the code if you want to see it:

           

          // Datadog.groovy
          
          import groovy.transform.Field
          
          import java.time.Instant
          import java.time.temporal.ChronoUnit
          
          @Field def configFileDatadog = "/etc/datadog-cli.conf"
          @Field def commandDatadog = "/opt/datadog-agent/bin/dog --config ${configFileDatadog} metric post "
          @Field def datadog_enabled = true
          
          def initStep(step, repo_slug = "") {
              metricsStep(step, repo_slug)
              return Instant.now()
          }
          
          def getTags(repo_slug, tag_separator=',', value_separator=':') {
              if (repo_slug == '') {
                  return ''
              }
              tags = "role${value_separator}${repo_slug}${tag_separator}workflow${value_separator}"
              if (common.is_integration_branch()) {
                  tags = tags + "PR${tag_separator}branch${value_separator}"
                  if (common.is_pr_staging_master()) {
                      tags = tags + "master"
                  } else {
                      tags = tags + "staging"
                  }
              } else {
                  tags = tags + "build${tag_separator}branch${value_separator}${env.BRANCH_NAME}"
              }
              return tags
          }
          
          def metricsStep(step, repo_slug = "", unit = "total") {
              postCountMetric("${step}.${unit}", 1, getTags(repo_slug))
          }
          
          def endStep(step, repo_slug, date, success = true) {
              state_name = success ? 'success' : 'fail'
              metricsStep(step, repo_slug, state_name)
              if (date != null) {
                  def duration = ChronoUnit.MILLIS.between(date, Instant.now())
                  postGaugeMetric("${step}.${state_name}.duration", duration, getTags(repo_slug))
                  // Keep the old one for retrocompat
                  postGaugeMetric("${step}.duration", duration, getTags(repo_slug))
              }
          }
          
          def postCountMetric(name, value = 1, tags = "") {
              postMetric(name, value, "rate", tags)
          }
          
          def postGaugeMetric(name, value, tags = "") {
              postMetric(name, value, "gauge", tags)
          }
          
          def postMetric(name, value, type, tags) {
              if (!(name ==~ /^ci\..*$/))
                  name = "ci.${name}"
              if (datadog_enabled) {
                  sh returnStatus: true, script:"${commandDatadog} --no_host --type ${type} --tags 'jenkins,${tags}' ${name} ${value}"
              }
          }
          
          return this
          
          Oleg Nenashev added a comment -

          Hard to say what exactly goes wrong, but it is not JEP-200 from what I see.

          WorkflowJob is likely referenced from "step". Although a String is passed as the first parameter to endStep() in the original Jenkinsfile, the provided library method clearly references an object like "${step.duration}". I am not sure what happens, but it is something you first need to check on your side. If you want to pass non-serializable objects as method arguments, the methods should be @NonCPS.

          Since it is not JEP-200 so far, I will leave the investigation (if needed) to the plugin maintainers.

          Oleg Nenashev made changes -
          Labels: JEP-200 exception pipeline → exception pipeline
          Romain N added a comment -

          You misread the code.
          It's not "${step.duration}" but "${step}.duration".
          step is indeed a String, as you said.

          Vivek Pandey made changes -
          Labels: exception pipeline → exception pipeline triaged-2018-11
          Romain N added a comment -

          After much investigation: I had used Jenkins.getInstance().getJobs() in a Groovy file, and its return value is not serializable.

          Adding a @NonCPS annotation resolved the problem.
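
          A minimal sketch of that kind of fix, assuming the job list is reduced to plain Strings before returning (the helper name is hypothetical; only the Jenkins.getInstance().getJobs() call comes from the report):

          ```groovy
          import com.cloudbees.groovy.cps.NonCPS
          import jenkins.model.Jenkins

          // Run the lookup as ordinary (non-CPS) Groovy so the live job objects
          // never become part of Pipeline's serialized state; return only
          // serializable Strings to the CPS-transformed caller.
          @NonCPS
          def jobNames() {
              return Jenkins.getInstance().getJobs().collect { it.getFullName() }
          }
          ```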

          Romain N made changes -
          Status: Open → Closed
          Resolution: Not A Defect

            People

            • Assignee: Unassigned
            • Reporter: Romain N
            • Votes: 0
            • Watchers: 3