Last night a customer called, frantic, because Google had cached versions of private employee information. The information is not available unless you log in.
They had do
Though the question mainly references session identifiers, the length of the identifier struck me as unusual.
There are at least two types of cookie/cookieless operations that can modify the query string to include an ID.
They are completely independent of each other (as far as I can tell).
A cookieless session allows the server to look up session state data from a unique ID in the URL rather than a unique ID in a cookie. This is usually considered a fine practice, though ASP.NET reuses session IDs, which makes it more prone to session fixation attempts (a separate topic, but worth knowing about).
Does session identity in ASP.net depend solely on the cookie? Can anyone, from any IP, with the cookie-url, access that session? Does ASP.net not, by default, also take into account?
The session ID is all that is required.
General Session Security Reading
Based on the length of the example data, I'm guessing your URL actually contains a forms authentication value, not a session ID. The source code suggests that cookieless mode is not something you must explicitly enable.
/// <summary>ASP.NET determines whether to use cookies based on
/// <see cref="T:System.Web.HttpBrowserCapabilities" /> setting.
/// If the setting indicates that the browser or device supports cookies,
/// cookies are used; otherwise, an identifier is used in the query string.</summary>
UseDeviceProfile
Here's how the determination is made:
// System.Web.Security.CookielessHelperClass
internal static bool UseCookieless( HttpContext context, bool doRedirect, HttpCookieMode cookieMode )
{
    switch( cookieMode )
    {
        case HttpCookieMode.UseUri:
            return true;
        case HttpCookieMode.UseCookies:
            return false;
        case HttpCookieMode.AutoDetect:
        {
            // omitted for length
            return false;
        }
        case HttpCookieMode.UseDeviceProfile:
            if( context == null )
            {
                context = HttpContext.Current;
            }
            return context != null && ( !context.Request.Browser.Cookies || !context.Request.Browser.SupportsRedirectWithCookie );
        default:
            return false;
    }
}
Guess what the default is? HttpCookieMode.UseDeviceProfile. ASP.NET maintains a list of devices and capabilities. This list is generally a very bad thing; for example, IE11 gives a false positive for being a downlevel browser on par with Netscape 4.
I think Gene's explanation is very likely; Google found the URL from some user action and crawled it.
It's completely conceivable that the Google bot is deemed to not support cookies. But this doesn't explain the origin of the URL, i.e. what user action resulted in Google seeing a URL with an ID already in it? A simple explanation could be a user with a browser that was deemed to not support cookies. Depending on the browser, everything else could look fine to the user.
The timing, i.e. the duration of validity, seems long, though I'm not that familiar with how long authentication tickets are valid or under what circumstances they can be renewed. It's entirely possible ASP.NET continued to reissue/renew tickets, as it would for a continually active user.
I'm making a lot of assumptions here, but if I'm correct:
Explicitly disable cookieless behavior by using HttpCookieMode.UseCookies.
web.config:
<authentication mode="Forms">
  <forms loginUrl="~/Account/Login.aspx" name=".ASPXFORMSAUTH" timeout="26297438"
         cookieless="UseCookies" />
</authentication>
While this should resolve the behavior, you might investigate extending the forms authentication HTTP module and adding additional validation (or at least logging/diagnostics).
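Since the session-state and forms-authentication cookieless settings look independent, it may be worth pinning the session-state mode as well. A minimal sketch, assuming the element sits under the usual system.web section; the 20-minute timeout is only an illustrative value, not something taken from your configuration:

```xml
<system.web>
  <!-- Assumption: keep the session ID in a cookie, never in the URL -->
  <sessionState cookieless="UseCookies" timeout="20" />
</system.web>
```

With both settings on UseCookies, a client that is flagged as not supporting cookies should simply fail to keep its login rather than be handed an ID in the URL.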
You asked for thoughts, so I'll give some. No warranty expressed or implied.
Give up the idea that your site is configured not to encode session information in URIs. With very high probability it did so. Either you're wrong about the configuration or (more likely) there's a bug that caused it to do so.
That leaves the central question: how did Google obtain the session URI?
You didn't say anything about the customer base. Here's a guess:
A customer logged into the system in a way that produced a URI encoding of the session, then emailed that URI to someone else from a Gmail account. Google scanned the email and provided the URI to the crawler bot.
There are other, similar ways that a customer whose client produced the URI could inadvertently surrender it to Google. Google Drive document. Google Plus posting. Etc.
Google may not be evil, but they're nonetheless everywhere. Their use agreement lets them move links across product boundaries, in this case mail (etc.) to search.
The real question you should be thinking about is why your site is not protected from cross-site request forgery. The Rails folks explain this pretty nicely. The Rails protect_from_forgery mechanism would have prevented the reported problem.
A related question is why the encoded cookie (apparently) never expires. It ought to be easy to make sessions contain timestamps to make this so.
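For forms authentication specifically, the ticket lifetime comes from the timeout attribute, which is expressed in minutes. A value like the timeout="26297438" in the configuration quoted above works out to roughly 50 years, which would explain a URL that still works long after it leaked. A hedged sketch with illustrative values (the 30-minute figure is my assumption; pick whatever suits the site):

```xml
<authentication mode="Forms">
  <!-- Illustrative values: a ticket lapses roughly 30 minutes after its last renewal -->
  <forms loginUrl="~/Account/Login.aspx" name=".ASPXFORMSAUTH"
         timeout="30" slidingExpiration="true" cookieless="UseCookies" />
</authentication>
```

With slidingExpiration enabled, an active user keeps getting renewed tickets; setting it to false gives a hard cutoff instead.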