While connecting to a Windows machine as a slave, I am getting the following error. I think it is a network-related issue, but I need some help on where to start looking or what is a pos
I was facing the same issue and fixed it using the steps below.
I experienced the same issue. I found that the Windows slave had switched to "sleep" mode, which happens especially if your jobs are not running against a GUI.
To solve it on a Windows 7 slave, here is what I did:
Select High performance
Control Panel\Hardware and Sound\Power Options\Edit Plan Settings
It should be OK after this procedure.
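The same change can probably be scripted with the built-in powercfg tool instead of clicking through the Control Panel. The following is only a minimal sketch: it assumes the goal of the plan edit is to stop the machine from going to sleep (the answer does not say which plan setting was changed), and the function names are made up for illustration. Run it from an elevated prompt on the Windows slave, with dry_run=False, once the printed commands look right.

```python
import subprocess


def power_commands():
    """Return powercfg invocations meant to keep a Windows agent awake."""
    return [
        # SCHEME_MIN is powercfg's alias for the "High performance" plan.
        ["powercfg", "-setactive", "SCHEME_MIN"],
        # A timeout of 0 means "never" for sleep and hibernate on AC power.
        # Assumption: these are the plan settings the answer had in mind.
        ["powercfg", "-change", "-standby-timeout-ac", "0"],
        ["powercfg", "-change", "-hibernate-timeout-ac", "0"],
    ]


def apply_power_settings(dry_run=True):
    """Print the commands (dry run) or execute them on the slave."""
    for cmd in power_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    apply_power_settings(dry_run=True)
```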
On Windows, I found that I needed to add the "-noCertificateCheck" option to the arguments in the jenkins-slave.xml in the workdir. We use a cert from an internal PKI on the master, and this was the easiest way to work around it (since everything is on the internal network).
<arguments>-Xrs -jar "%BASE%\slave.jar" -jnlpUrl https://jenkins.ourdomain.com/computer/Windows%20build%20server%20-%20Bare%20metal/slave-agent.jnlp -secret abc -noCertificateCheck</arguments>
I found this out by manually running the agent from the command prompt:
java -jar agent.jar -jnlpUrl https://jenkins.ourdomain.com/computer/Windows%20build%20server%20-%20Bare%20metal/slave-agent.jnlp -secret abc -workDir "D:\agentroot" -noCertificateCheck
OK, here is how I solved my special case:
I had some VMs with libvirt/QEMU running as slaves. Because the libvirt plugin was too unreliable for me, I started those VMs on my own. I asked myself: "Why does this libvirt plugin have a mandatory delay time?" Impatience...
So when the libvirt client (slave) says hello to Jenkins, you should probably wait a few seconds after it starts up to let this poor guy breathe a bit.
The slave was Windows 7; the host was Ubuntu 18.04.
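If you start the VMs yourself, the delay can be added by hand before asking Jenkins to connect. Here is a rough sketch of that idea, not what the libvirt plugin actually does: it waits until some TCP port on the VM answers and then sleeps for an extra settle period. The function names, the port to probe, and the 30-second default are all assumptions; pick whatever fits your setup.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=2.0):
    """Poll until a TCP connection to host:port succeeds or timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # VM not reachable yet (refused, unreachable, timed out): retry.
            time.sleep(interval)
    return False


def wait_until_settled(host, port, settle_seconds=30.0, **kwargs):
    """Wait for the VM to answer, then give it extra time to finish booting."""
    if not wait_for_port(host, port, **kwargs):
        return False
    time.sleep(settle_seconds)
    return True


# Example (hypothetical host name and port):
#   if wait_until_settled("win7-slave.example", 3389):
#       ...start or relaunch the Jenkins agent here...
```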
In addition to the error log in the post, I also got this error log under the Jenkins directory on the slave (for me it was C:\jenkins\jenkins-slave.err.log):
JNLP file http://jenkins.domain.com/computer/my_slave_name/slave-agent.jnlp?encrypt=true has invalid arguments: [#####################################, my_slave_name, -workDir, c:\jenkins, -internalDir, remoting, -url, http://jenkins.domain.com/, -headless, -jar-cache, C:\Users\Administrator.jenkins\cache\jars] Most likely a configuration error in the master "-workDir" is not a valid option
My solution:
1) Windows slave level: close the Services console in the GUI for all users. This is a must; for some reason Windows locks installation/removal of Windows services while it is open.
2) Windows slave level: kill all java and jenkins-slave processes (if any exist).
3) Windows slave level: delete the Jenkins slave service (if it exists) from cmd: sc delete jenkinsslave-c__jenkins /force (the service name in my case).
4) Windows slave level: verify that you have Java 8 installed (I'm using jdk1.8.0_151). Uninstall all old Java versions.
5) Jenkins master UI level: change the way Jenkins connects to the slave under the slave's configure page --> Launch method: "Let Jenkins control this Windows slave as a Windows service" (instead of "Launch agent via Java Web Start").
6) AWS level: increase the AWS ELB idle timeout to 600 (from 60), as @njtman suggested.
7) Jenkins master UI level: relaunch the agent in Jenkins and wait several minutes.
My environment:
Jenkins: 2.89.2, OS: Windows 2012 R2, Java: jdk1.8.0_151
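Steps 2 and 3 above could be scripted roughly like this. It is only a sketch under a few assumptions: the wrapper process is called jenkins-slave.exe (guessed from the jenkins-slave.xml and jenkins-slave.err.log file names), the service name is the one from step 3, and only the service name is passed to sc delete. Be aware that killing java.exe by image name also stops any other Java program on the machine. The function names are made up.

```python
import subprocess


def cleanup_commands(service_name):
    """Commands for steps 2 and 3: kill leftover processes, delete the service."""
    return [
        # /F forces termination, /IM selects processes by image name.
        ["taskkill", "/F", "/IM", "java.exe"],
        # Assumption: the service wrapper executable is jenkins-slave.exe.
        ["taskkill", "/F", "/IM", "jenkins-slave.exe"],
        ["sc", "delete", service_name],
    ]


def run_cleanup(service_name, dry_run=True):
    """Print the commands (dry run) or run them, reporting non-zero exits."""
    for cmd in cleanup_commands(service_name):
        if dry_run:
            print(" ".join(cmd))
            continue
        # A non-zero exit is expected when the process or service does not
        # exist, so it is reported instead of raised.
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print("exit code", result.returncode, "from", " ".join(cmd))


if __name__ == "__main__":
    run_cleanup("jenkinsslave-c__jenkins", dry_run=True)
```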
I was experiencing a similar error to the OP, where the connection to my slave was dropping. The root cause of the issue was not a mismatch in Java versions between the Jenkins slave and master hosts.
Solution: If you are running Jenkins in an EC2 instance on AWS behind an Elastic Load Balancer (ELB), increase the "idle timeout" value under the "attributes" section from the default 60 seconds. I set the new value to 600 and no longer experienced the error.
It appears that if a single command in your build process takes longer than 60 seconds with no log output, the ELB will terminate the session due to inactivity.
Source: https://issues.jenkins-ci.org/browse/JENKINS-44001?focusedCommentId=312412&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-312412
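If you would rather script the idle timeout change than use the AWS console, something like the following should work with boto3 (the AWS SDK for Python, which has to be installed separately). This is a sketch that assumes a Classic Load Balancer, since that is what the "attributes" section with a 60-second default suggests; the load balancer name and the function names are hypothetical.

```python
def idle_timeout_attributes(seconds):
    """Build the attribute payload for a Classic ELB idle timeout."""
    if seconds <= 0:
        raise ValueError("idle timeout must be a positive number of seconds")
    return {"ConnectionSettings": {"IdleTimeout": int(seconds)}}


def set_idle_timeout(load_balancer_name, seconds=600):
    """Apply the idle timeout using the Classic Load Balancer ("elb") API."""
    import boto3  # third-party; imported here so the helper above works without it

    client = boto3.client("elb")
    client.modify_load_balancer_attributes(
        LoadBalancerName=load_balancer_name,
        LoadBalancerAttributes=idle_timeout_attributes(seconds),
    )


# Example (hypothetical load balancer name, needs AWS credentials):
#   set_idle_timeout("my-jenkins-elb", 600)
```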