问题
I'm using a Jenkins declarative pipeline with Docker Agents to build and test my software, including running integration tests using testcontainers. I can run my testcontainers tests OK in my development environment (not using Jenkins), but they fail under Jenkins.
The testcontainers Ryuk resource reaping daemon does not work
16:29:20.255 [testcontainers-ryuk] WARN o.t.utility.ResourceReaper - Can not connect to Ryuk at 172.17.0.1:32769
java.net.NoRouteToHostException: No route to host (Host unreachable)
at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399)
at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242)
at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224)
at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403)
at java.base/java.net.Socket.connect(Socket.java:591)
at java.base/java.net.Socket.connect(Socket.java:540)
at java.base/java.net.Socket.<init>(Socket.java:436)
at java.base/java.net.Socket.<init>(Socket.java:213)
at org.testcontainers.utility.ResourceReaper.lambda$start$1(ResourceReaper.java:112)
at java.base/java.lang.Thread.run(Thread.java:834)
I was able to work around that problem by disabling the daemon by setting the TESTCONTAINERS_RYUK_DISABLED
environment variable to true
. But some of the integration tests still repeatedly fail, however.
An integration test using an ElasticsearchContainer
repeatedly fails to start: it times out waiting for the HTTP port to respond.
17:04:57.595 [main] INFO d.e.c.7.1] - Starting container with ID: f5c653442103b9073c76f6ed91fc9117f7cb388d576606be8bd85bd9f3b2051d
17:04:58.465 [main] INFO d.e.c.7.1] - Container docker.elastic.co/elasticsearch/elasticsearch:6.7.1 is starting: f5c653442103b9073c76f6ed91fc9117f7cb388d576606be8bd85bd9f3b2051d
17:04:58.479 [main] INFO o.t.c.wait.strategy.HttpWaitStrategy - /loving_swartz: Waiting for 240 seconds for URL: http://172.17.0.1:32833/
17:08:58.480 [main] ERROR d.e.c.7.1] - Could not start container
org.testcontainers.containers.ContainerLaunchException: Timed out waiting for URL to be accessible (http://172.17.0.1:32833/ should return HTTP 200)
at org.testcontainers.containers.wait.strategy.HttpWaitStrategy.waitUntilReady(HttpWaitStrategy.java:197)
at org.testcontainers.containers.wait.strategy.AbstractWaitStrategy.waitUntilReady(AbstractWaitStrategy.java:35)
at org.testcontainers.containers.GenericContainer.waitUntilContainerStarted(GenericContainer.java:582)
at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:259)
at org.testcontainers.containers.GenericContainer.lambda$doStart$0(GenericContainer.java:212)
at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:76)
at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:210)
at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:199)
at
...
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
17:08:58.513 [main] ERROR d.e.c.7.1] - Log output from the failed container:
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
OpenJDK 64-Bit Server VM warning: UseAVX=2 is not supported on this CPU, setting it to UseAVX=0
[2019-04-11T17:05:02,527][INFO ][o.e.e.NodeEnvironment ] [1a_XhBT] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [1.2tb], net total_space [1.2tb], types [rootfs]
[2019-04-11T17:05:02,532][INFO ][o.e.e.NodeEnvironment ] [1a_XhBT] heap size [989.8mb], compressed ordinary object pointers [true]
[2019-04-11T17:05:02,536][INFO ][o.e.n.Node ] [1a_XhBT] node name derived from node ID [1a_XhBTfQZWw1XLZMXrp4A]; set [node.name] to override
[2019-04-11T17:05:02,536][INFO ][o.e.n.Node ] [1a_XhBT] version[6.7.1], pid[1], build[default/docker/2f32220/2019-04-02T15:59:27.961366Z], OS[Linux/3.10.0-957.10.1.el7.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/12/12+33]
[2019-04-11T17:05:02,536][INFO ][o.e.n.Node ] [1a_XhBT] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch-14081126934203442674, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Djava.locale.providers=COMPAT, -XX:UseAVX=2, -Des.cgroups.hierarchy.override=/, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/usr/share/elasticsearch/config, -Des.distribution.flavor=default, -Des.distribution.type=docker]
...
[2019-04-11T17:05:16,338][INFO ][o.e.d.DiscoveryModule ] [1a_XhBT] using discovery type [single-node] and host providers [settings]
[2019-04-11T17:05:17,795][INFO ][o.e.n.Node ] [1a_XhBT] initialized
[2019-04-11T17:05:17,795][INFO ][o.e.n.Node ] [1a_XhBT] starting ...
[2019-04-11T17:05:18,086][INFO ][o.e.t.TransportService ] [1a_XhBT] publish_address {172.28.0.3:9300}, bound_addresses {0.0.0.0:9300}
[2019-04-11T17:05:18,128][WARN ][o.e.b.BootstrapChecks ] [1a_XhBT] max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2019-04-11T17:05:18,299][INFO ][o.e.h.n.Netty4HttpServerTransport] [1a_XhBT] publish_address {172.28.0.3:9200}, bound_addresses {0.0.0.0:9200}
[2019-04-11T17:05:18,299][INFO ][o.e.n.Node ] [1a_XhBT] started
[2019-04-11T17:05:18,461][WARN ][o.e.x.s.a.s.m.NativeRoleMappingStore] [1a_XhBT] Failed to clear cache for realms [[]]
[2019-04-11T17:05:18,542][INFO ][o.e.g.GatewayService ] [1a_XhBT] recovered [0] indices into cluster_state
[2019-04-11T17:05:18,822][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.watch-history-9] for index patterns [.watcher-history-9*]
[2019-04-11T17:05:18,871][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.watches] for index patterns [.watches*]
[2019-04-11T17:05:18,906][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.triggered_watches] for index patterns [.triggered_watches*]
[2019-04-11T17:05:18,955][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.monitoring-logstash] for index patterns [.monitoring-logstash-6-*]
[2019-04-11T17:05:19,017][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.monitoring-es] for index patterns [.monitoring-es-6-*]
[2019-04-11T17:05:19,054][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.monitoring-alerts] for index patterns [.monitoring-alerts-6]
[2019-04-11T17:05:19,100][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.monitoring-beats] for index patterns [.monitoring-beats-6-*]
[2019-04-11T17:05:19,148][INFO ][o.e.c.m.MetaDataIndexTemplateService] [1a_XhBT] adding template [.monitoring-kibana] for index patterns [.monitoring-kibana-6-*]
[2019-04-11T17:05:19,480][INFO ][o.e.l.LicenseService ] [1a_XhBT] license [17853035-5cf6-49c8-96ca-4d14b26325f6] mode [basic] - valid
Yet the Elasticsearch log file looks OK, and includes the last log message that Elasticsearch writes during start up (about the license).
Manually changing that container to use a HostPortWaitStrategy
instead of the default HttpWaitStrategy
did not help.
While trying to investigate or work around this problem, I changed my test code to explicitly start the Docker network, by calling network.getId()
for the testcontainers Network
object. That then failed with a NoRouteToHostException
.
How do I fix this?
回答1:
After some experimentation, I've discovered the cause of the problem. The crucial action is trying to create a Docker bridge network (using docker network create
, or a testcontainers Network
object) inside a Docker container that is itself running in a Docker bridge network. If you do this you will not get an error message from Docker, nor will the Docker daemon log file include any useful messages. But attempts to use the network will result in there being "no route to host".
I fixed the problem by giving my outermost Docker containers (the Jenkins Agents) access to the host network, by having Jenkins provide a --network="host"
option to its docker run
command:
pipeline {
agent {
dockerfile {
filename 'Dockerfile.jenkinsAgent'
additionalBuildArgs ...
args '-v /var/run/docker.sock:/var/run/docker.sock ... --network="host" -u jenkins:docker'
}
}
stages {
...
That is OK because the Jenkins Agents do not need the level of isolation given by a bridge network.
来源:https://stackoverflow.com/questions/55653747/using-testcontainers-in-a-jenkins-docker-agent-containers-fail-to-start-norout