Question
We're having a strange issue today brought on by an unexpected power failure back in our office while we're all working remotely. After dispatching someone to restart the equipment, our office internet connection came back up and we're able to reach some services, but our Site-to-Site (S2S) VPN between our office network and the cloud is no longer functioning. The odd part is that Azure indicates that the VPN is "Connected", and -- after some creative tunneling -- I was able to confirm that Windows Server 2019 in the office also indicates the connection as "Connected", so this looks like a routing issue. This VPN has worked faithfully for 10 months, through reboots and Windows Updates, and yet today it's inexplicably down.
Now, some history: Back in June 2019, we set up an S2S VPN between our office in LA and resources in Azure. The goal was to start using Windows Virtual Desktop on Azure for remote employee virtual desktops, while enabling them to access the same resources as on-site employees. Back then, we ran the following PowerShell script on the domain controller in LA to configure Windows Server 2019 with the S2S VPN to Azure:
Install-WindowsFeature Routing, RemoteAccess, RSAT-RemoteAccess-PowerShell
# Only needed if "RestartNeeded" is "Yes"
# Restart-Computer
# After the machine reboots, launch PowerShell again to resume the configuration
Install-RemoteAccess -VpnType VpnS2S
# Setting variables
$rrasInterfaceName = "Azure (vpn-subnet-to-la)"
$azureGatewayIpAddress = "12.74.131.73"
$virtualNetworkRange = "10.3.0.0/16"
$sharedKey = "redacted-psk"
Function Invoke-WindowsApi(
    [string] $dllName,
    [Type] $returnType,
    [string] $methodName,
    [Type[]] $parameterTypes,
    [Object[]] $parameters
)
{
    ## Begin to build the dynamic assembly
    $domain = [AppDomain]::CurrentDomain
    $name = New-Object Reflection.AssemblyName 'PInvokeAssembly'
    $assembly = $domain.DefineDynamicAssembly($name, 'Run')
    $module = $assembly.DefineDynamicModule('PInvokeModule')
    $type = $module.DefineType('PInvokeType', "Public,BeforeFieldInit")
    $inputParameters = @()
    for($counter = 1; $counter -le $parameterTypes.Length; $counter++)
    {
        $inputParameters += $parameters[$counter - 1]
    }
    $method = $type.DefineMethod($methodName, 'Public,HideBySig,Static,PinvokeImpl',$returnType, $parameterTypes)
    ## Apply the P/Invoke constructor
    $ctor = [Runtime.InteropServices.DllImportAttribute].GetConstructor([string])
    $attr = New-Object Reflection.Emit.CustomAttributeBuilder $ctor, $dllName
    $method.SetCustomAttribute($attr)
    ## Create the temporary type, and invoke the method.
    $realType = $type.CreateType()
    $ret = $realType.InvokeMember($methodName, 'Public,Static,InvokeMethod', $null, $null, $inputParameters)
    return $ret
}
Function Set-PrivateProfileString(
    $file,
    $category,
    $key,
    $value)
{
    ## Prepare the parameter types and parameter values for the Invoke-WindowsApi script
    $parameterTypes = [string], [string], [string], [string]
    $parameters = [string] $category, [string] $key, [string] $value, [string] $file
    ## Invoke the API
    [void] (Invoke-WindowsApi "kernel32.dll" ([UInt32]) "WritePrivateProfileString" $parameterTypes $parameters)
}
# Add and configure S2S VPN interface for VNet1
Add-VpnS2SInterface -Protocol IKEv2 -AuthenticationMethod PSKOnly -ResponderAuthenticationMethod PSKOnly `
    -Name $rrasInterfaceName -Destination $azureGatewayIpAddress -IPv4Subnet @("$($virtualNetworkRange):256") `
    -NumberOfTries 3 -SharedSecret $sharedKey
Set-VpnServerIPsecConfiguration -EncryptionType MaximumEncryption
# default value for Windows 2012 is 100MB, which is way too small. Increase it to 32GB.
Set-VpnServerIPsecConfiguration -SADataSizeForRenegotiationKilobytes 33553408
# TODO: Confirm why this setting is needed/what it does
# Seems related to this: https://tools.ietf.org/html/draft-dukes-ikev2-config-payload-00
New-ItemProperty -Path HKLM:\System\CurrentControlSet\Services\RemoteAccess\Parameters\IKEV2 -Name SkipConfigPayload -PropertyType DWord -Value 1 -Force
# Set S2S VPN connections to be persistent by editing the router.pbk file (requires admin privileges).
# Note that IdleDisconnectSeconds and RedialOnLinkFailure are set for each adapter.
Set-PrivateProfileString $env:windir\System32\ras\router.pbk "$($rrasInterfaceName)" "IdleDisconnectSeconds" "0"
Set-PrivateProfileString $env:windir\System32\ras\router.pbk "$($rrasInterfaceName)" "RedialOnLinkFailure" "1"
# Restart the RRAS service
Restart-Service RemoteAccess
Connect-VpnS2SInterface -Name $rrasInterfaceName
route -p ADD 10.1.0.0 MASK 255.255.0.0 10.3.0.1 IF 30
The static routing rule at the end ensures that packets destined for the Windows Virtual Desktop machines in the 10.1.0.0/16 range get routed through the gateway on the other side of the S2S VPN at 10.3.0.1. The S2S VPN VNet is peered to the VNet that WVD is connected to.
Again, I want to emphasize we have made no changes to either the Azure VPN or server configuration since this was set up in June.
The routing table looks like this:
===========================================================================
Interface List
15...6c 4b 90 21 ab 9b ......Intel(R) Ethernet Connection (2) I219-LM
27...........................Azure (vpn-subnet-to-la)
1...........................Software Loopback Interface 1
===========================================================================
IPv4 Route Table
===========================================================================
Active Routes:
Network Destination  Netmask          Gateway          Interface      Metric
0.0.0.0              0.0.0.0          192.168.100.254  192.168.100.1  281
10.3.0.0             255.255.0.0      On-link          169.254.0.27   281
10.3.255.255         255.255.255.255  On-link          169.254.0.27   281
127.0.0.0            255.0.0.0        On-link          127.0.0.1      331
127.0.0.1            255.255.255.255  On-link          127.0.0.1      331
127.255.255.255      255.255.255.255  On-link          127.0.0.1      331
169.254.0.0          255.255.0.0      On-link          169.254.0.27   281
169.254.0.27         255.255.255.255  On-link          169.254.0.27   281
169.254.255.255      255.255.255.255  On-link          169.254.0.27   281
192.168.100.0        255.255.255.0    On-link          192.168.100.1  281
192.168.100.1        255.255.255.255  On-link          192.168.100.1  281
192.168.100.255      255.255.255.255  On-link          192.168.100.1  281
224.0.0.0            240.0.0.0        On-link          127.0.0.1      331
224.0.0.0            240.0.0.0        On-link          192.168.100.1  281
224.0.0.0            240.0.0.0        On-link          169.254.0.27   281
255.255.255.255      255.255.255.255  On-link          127.0.0.1      331
255.255.255.255      255.255.255.255  On-link          192.168.100.1  281
255.255.255.255      255.255.255.255  On-link          169.254.0.27   281
===========================================================================
Persistent Routes:
Network Address  Netmask      Gateway Address  Metric
10.1.0.0         255.255.0.0  10.3.0.1         1
0.0.0.0          0.0.0.0      192.168.100.254  Default
===========================================================================
IPv6 Route Table
===========================================================================
Active Routes:
If Metric Network Destination Gateway
1 331 ::1/128 On-link
15 281 fe80::/64 On-link
15 281 fe80::6430:2788:424f:47fb/128
On-link
1 331 ff00::/8 On-link
15 281 ff00::/8 On-link
===========================================================================
Persistent Routes:
None
Highlights:
- 192.168.100.1 is the Domain Controller that is providing the VPN connection to Azure.
- 192.168.100.254 is the router to the internet.
- The default gateway of the DC is 192.168.100.254 (so, by default, the DC routes traffic out to the internet via the router).
- The network is configured to obtain DHCP leases from the DC rather than the router.
- The DC is configured to issue DHCP leases that use the DC as the default gateway, so that packets from the rest of the office network that are destined for the cloud go through the VPN, while packets destined for the internet are forwarded to the router.
With this configuration, everything on the office network can reach the internet. But the cloud can't reach anything on the local network, and vice versa.
Here's the status of the S2S interface as reported by the server:
Get-VpnS2SInterface -Name "Azure (vpn-subnet-to-la)"
RoutingDomain Name                 Destination    AdminStatus ConnectionState IPv4Subnet
------------- ----                 -----------    ----------- --------------- ----------
-             Azure (vpn-subnet... {12.74.131.73} True        Connected       {10.3.0.0/16:256}
Here's a trace route showing that traffic destined for the cloud is incorrectly being sent to the internet router instead of through the VPN:
tracert 10.1.2.7
Tracing route to 10.1.2.7 over a maximum of 30 hops
1 <1 ms <1 ms <1 ms dsldevice.attlocal.net [192.168.100.254]
2 * * *
3 * * *
Why is Windows not routing through the correct interface?
Answer 1:
It appears that the unexpected power outage caused Windows to reinitialize the S2S interface so that it has a different interface ID. Note that in the original script I ran back in June, the interface number was 30.
But, when I deleted the static route and re-added it, I got:
route delete 10.1.0.0
route -p ADD 10.1.0.0 MASK 255.255.0.0 10.3.0.1 IF 30
The route addition failed: The system cannot find the file specified.
This prompted me to review the interface list at the top of the route print output:
===========================================================================
Interface List
15...6c 4b 90 21 ab 9b ......Intel(R) Ethernet Connection (2) I219-LM
27...........................Azure (vpn-subnet-to-la)
1...........................Software Loopback Interface 1
===========================================================================
Note that the interface number is now 27. So I ran:
route -p ADD 10.1.0.0 MASK 255.255.0.0 10.3.0.1 IF 27
OK!
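Before re-running the trace, it can help to confirm that the route actually went active on the VPN interface. The following is only a sketch, assuming the NetTCPIP cmdlets (Get-NetRoute, Find-NetRoute) are available on the server and that the S2S interface is listed there under the same name as in the route print output. The prefix and test address are the ones used in the question.

```powershell
# Show the active 10.1.0.0/16 route and the interface it is bound to.
# After the fix, InterfaceAlias should be the S2S interface and
# InterfaceIndex should match the number from the interface list (27 here).
Get-NetRoute -DestinationPrefix "10.1.0.0/16" -AddressFamily IPv4 |
    Format-Table DestinationPrefix, NextHop, InterfaceAlias, InterfaceIndex, RouteMetric

# Ask the IP stack which source address and route it would pick for a WVD host.
Find-NetRoute -RemoteIPAddress "10.1.2.7"
```

If Find-NetRoute still reports the 0.0.0.0/0 route via 192.168.100.254, the static route is not active and traffic will keep going to the internet router.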
Now when I run trace route:
tracert 10.1.2.7
Tracing route to 10.1.2.7 over a maximum of 30 hops
1 <1 ms <1 ms <1 ms server.subdomain.mydomain.com [192.168.100.1]
2 34 ms 33 ms 35 ms 10.1.2.7
Trace complete.
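Since the interface number is evidently not guaranteed to stay the same, one way to avoid hard-coding it is to look it up by name whenever the route is added. A rough sketch, assuming the S2S interface is visible to Get-NetIPInterface under the alias used in the original script (this is an assumption, not something verified above); $vpnInterface and $ifIndex are names introduced here for illustration.

```powershell
$rrasInterfaceName = "Azure (vpn-subnet-to-la)"

# Look up the index Windows currently assigns to the S2S interface.
# -ErrorAction Stop aborts here if no interface with that alias exists.
$vpnInterface = Get-NetIPInterface -InterfaceAlias $rrasInterfaceName -AddressFamily IPv4 -ErrorAction Stop
$ifIndex = $vpnInterface.InterfaceIndex

# Drop the stale route (if any) and re-add it against the current index.
route delete 10.1.0.0
route -p ADD 10.1.0.0 MASK 255.255.0.0 10.3.0.1 IF $ifIndex
```

Running this after a reboot or power loss would re-create the route without having to read the interface list by hand.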
Source: https://stackoverflow.com/questions/61151643/why-is-my-site-to-site-vpn-connection-between-windows-server-2019-and-azure-sudd