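1. Symptom:
After a VM is created on the OpenStack platform, the web dashboard shows that the VM has been assigned an IP, but checking the network state from the VM console shows that the VM has no IP address. The screenshot below shows the fault: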
2. Analysis:
(1) Check the status of the neutron services and make sure the DHCP service is running normally
root@controller22:15:11~#neutron agent-list
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+--------------------+------------+-------------------+-------+----------------+---------------------------+
| id                                   | agent_type         | host       | availability_zone | alive | admin_state_up | binary                    |
+--------------------------------------+--------------------+------------+-------------------+-------+----------------+---------------------------+
| 3812cb30-7440-4080-bd75-9634687937f6 | DHCP agent         | controller | nova              | :-)   | True           | neutron-dhcp-agent        |
| 51a30db0-9525-42de-b5d8-6b04e2a13baf | Open vSwitch agent | storage    |                   | :-)   | True           | neutron-openvswitch-agent |
| 63416b42-376b-4576-b89d-12694faa2bf9 | L3 agent           | controller | nova              | :-)   | True           | neutron-l3-agent          |
| 7ce3b592-240f-4090-bf09-9a7ecbfa7d3c | Open vSwitch agent | controller |                   | :-)   | True           | neutron-openvswitch-agent |
| 851ccdd9-ff14-4e8f-971c-9343787ef056 | Open vSwitch agent | compute    |                   | :-)   | True           | neutron-openvswitch-agent |
| 8c458dca-a306-4882-a851-1c47a19ab3c1 | Metadata agent     | controller |                   | :-)   | True           | neutron-metadata-agent    |
+--------------------------------------+--------------------+------------+-------------------+-------+----------------+---------------------------+
root@controller22:16:24~#
(2) Check whether the dnsmasq process is running normally
root@controller22:48:28/var/log/neutron#ps aux | grep dnsmasq
nobody 28488 0.0 0.0 53888 1164 ? S 21:51 0:00 dnsmasq --no-hosts --no-resolv --strict-order --except-interface=lo --pid-file=/var/lib/neutron/dhcp/1a426ffe-2bf0-4785-96a5-74402004a17b/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/1a426ffe-2bf0-4785-96a5-74402004a17b/host --addn-hosts=/var/lib/neutron/dhcp/1a426ffe-2bf0-4785-96a5-74402004a17b/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/1a426ffe-2bf0-4785-96a5-74402004a17b/opts --dhcp-leasefile=/var/lib/neutron/dhcp/1a426ffe-2bf0-4785-96a5-74402004a17b/leases --dhcp-match=set:ipxe,175 --bind-interfaces --interface=tap2c7d9cb9-96 --dhcp-range=set:tag0,172.16.0.0,static,86400s --dhcp-option-force=option:mtu,1450 --dhcp-lease-max=65536 --conf-file= --domain=openstacklocal
root 30314 0.0 0.0 112724 988 pts/0 R+ 22:48 0:00 grep --color=auto dnsmasq
root@controller22:48:52/var/log/neutron#
(3) Check whether the br-int integration bridge in OVS has a tap device connected to the dhcp-agent namespace
root@controller22:16:29~#ovs-vsctl show
552eea67-4365-410a-b683-644af569c52d
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-ex
        Port "eth2"
            Interface "eth2"
        Port br-ex
            Interface br-ex
                type: internal
        Port "qg-91819abf-e1"
            Interface "qg-91819abf-e1"
                type: internal
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port "tap2c7d9cb9-96"
            tag: 2
            Interface "tap2c7d9cb9-96"
                type: internal
        Port br-int
            Interface br-int
                type: internal
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port "qr-4056447b-ea"
            tag: 1
            Interface "qr-4056447b-ea"
                type: internal
    Bridge br-tun
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port "vxlan-c0a8fe97"
            Interface "vxlan-c0a8fe97"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="192.168.254.150", out_key=flow, remote_ip="192.168.254.151"}
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port br-tun
            Interface br-tun
                type: internal
    ovs_version: "2.9.0"
root@controller22:16:48~#
In the DHCP namespace of the network, find the tap device that matches the one on the br-int bridge, then check its IP configuration:
root@controller22:26:29/var/log/neutron#ip netns show
qrouter-3028515a-106a-4d77-b2bb-edd34ddbc7c7 (id: 1)
qdhcp-1a426ffe-2bf0-4785-96a5-74402004a17b (id: 0)
root@controller22:26:44/var/log/neutron#
root@controller22:27:14/var/log/neutron#
root@controller22:27:14/var/log/neutron#ip netns exec qdhcp-1a426ffe-2bf0-4785-96a5-74402004a17b ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
15: tap2c7d9cb9-96: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:18:16:47 brd ff:ff:ff:ff:ff:ff
    inet 172.16.199.10/16 brd 172.16.255.255 scope global tap2c7d9cb9-96
       valid_lft forever preferred_lft forever
    inet 169.254.169.254/16 brd 169.254.255.255 scope global tap2c7d9cb9-96
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe18:1647/64 scope link
       valid_lft forever preferred_lft forever
root@controller22:27:47/var/log/neutron#
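The agent check in step (1) can be scripted rather than read by eye. The following is a minimal sketch, not part of the original procedure: the helper name dead_agents is hypothetical, and it assumes that "alive" is the fifth data column of the table and that a healthy agent is shown as ":-)", as in the output above.

```shell
# Print every agent row whose "alive" column is not ":-)".
# Reads the table printed by `neutron agent-list` (or
# `openstack network agent list`) on stdin.
# Assumption: "alive" is the 5th data column, as in the output shown above.
dead_agents() {
    awk -F'|' '
        # Skip border lines (no "|" separators) and the header row.
        NF > 6 && $2 !~ /^ *(id|ID) *$/ {
            alive = $6
            gsub(/ /, "", alive)
            if (alive != ":-)") print
        }
    '
}
```

Usage: neutron agent-list 2>/dev/null | dead_agents. No output means every listed agent reported as alive.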
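3. Locating the problem:
The checks above show that br-int does have a tap device connected to a DHCP namespace, but that namespace holds the DHCP service IP of the external network. There is no DHCP namespace for 192.168.168.0/24, the network the VM is attached to.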
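The missing namespace can also be detected directly from the network ID, since the namespaces listed above are named qdhcp-<network-id>. A small sketch under that assumption follows; the helper name has_dhcp_namespace is hypothetical.

```shell
# Succeed (exit 0) if a qdhcp namespace exists for the given network ID.
# Reads `ip netns show` output on stdin; $1 is the network ID.
has_dhcp_namespace() {
    net_id=$1
    # The namespace name is the first whitespace-separated field of each line.
    awk -v ns="qdhcp-$net_id" '$1 == ns { found = 1 } END { exit !found }'
}
```

Usage: ip netns show | has_dhcp_namespace <network-id> || echo "no DHCP namespace for this network"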
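4. Fix procedure:
(1) Find the subnet of the affected network and enable DHCP on it (tick the checkbox)
(2) Then check in the subnet whether a DHCP port now exists and has an IP, and check whether the tap device in the DHCP namespace has an IP
(3) List the namespaces again; there is now one more DHCP namespace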
root@controller23:26:15/var/log/neutron#ip netns show
qdhcp-cb06eada-2249-46e7-bcd8-c9c07937231d (id: 2)
qrouter-3028515a-106a-4d77-b2bb-edd34ddbc7c7 (id: 1)
qdhcp-1a426ffe-2bf0-4785-96a5-74402004a17b (id: 0)
(4) Check the IP configuration of the new DHCP namespace; it holds exactly the DHCP service IP of the VM's network
root@controller23:26:20/var/log/neutron#ip netns exec qdhcp-cb06eada-2249-46e7-bcd8-c9c07937231d ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
21: tap865fcb34-fc: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether fa:16:3e:ce:2f:9b brd ff:ff:ff:ff:ff:ff
    inet 192.168.168.2/24 brd 192.168.168.255 scope global tap865fcb34-fc
       valid_lft forever preferred_lft forever
    inet 169.254.169.254/16 brd 169.254.255.255 scope global tap865fcb34-fc
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fece:2f9b/64 scope link
       valid_lft forever preferred_lft forever
root@controller23:26:54/var/log/neutron#
(5) Reboot the VM; it now obtains an IP
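The procedure above was carried out through the web page. As a rough command-line equivalent of steps (1) and (2), the sketch below enables DHCP on a subnet and then lists the DHCP ports on its network. It was not part of the original fix; the function name enable_subnet_dhcp and its two arguments are hypothetical, and it assumes the openstack client is installed and admin credentials are loaded.

```shell
# Enable DHCP on a subnet, then list the DHCP ports on its network.
# $1 = subnet name or ID, $2 = network name or ID (supplied by the caller).
enable_subnet_dhcp() {
    subnet=$1
    network=$2
    openstack subnet set --dhcp "$subnet" || return 1
    # Once DHCP is enabled, a port owned by network:dhcp should appear
    # on the network, with an address from the subnet.
    openstack port list --network "$network" --device-owner network:dhcp
}
```

If the port list comes back empty after a short wait, the DHCP agent log under /var/log/neutron is the next place to look.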
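5. Summary
After the request to create a VM is issued, Neutron allocates a MAC address and an IP address for the VM's port, and the DHCP agent writes them to the host file under /var/lib/neutron/dhcp/<network-id>/, which dnsmasq reads. When the VM broadcasts on the internal network to obtain an IP, dnsmasq hears the request inside the DHCP namespace and hands the VM the matching IP from the host file.
So for a VM to obtain its IP, the packets it sends must be able to reach the DHCP namespace, and every virtual network device along the path must exist and work properly.
If DHCP is not enabled on the subnet as described above, the network has no DHCP namespace and therefore no DHCP service, so the VM cannot obtain an IP.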
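To see which address dnsmasq has been told to hand out to a given VM, the host file mentioned above can be queried by MAC address. This is a sketch only: the helper name ip_for_mac is hypothetical, and it assumes each line of the host file has the form mac,hostname,ip (possibly with extra trailing fields), which may differ between Neutron releases.

```shell
# Print the IP that a dnsmasq host file assigns to a MAC address.
# Reads the host file on stdin; $1 is the MAC to look up.
# Exits non-zero if the MAC is not listed.
# Assumption: each line is "mac,hostname,ip[,...]".
ip_for_mac() {
    awk -F',' -v mac="$1" '
        # Compare case-insensitively, since MACs may be written either way.
        tolower($1) == tolower(mac) { print $3; found = 1 }
        END { exit !found }
    '
}
```

Usage: ip_for_mac <vm-mac> < /var/lib/neutron/dhcp/<network-id>/host. A MAC that is missing from the file suggests the port was never pushed to the DHCP agent for that network.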
Source: oschina
Link: https://my.oschina.net/u/4281209/blog/4462218