jsessionid is the parameter that a servlet engine appends to your site’s URLs when your config uses cookies for session tracking but the user viewing the site doesn’t have cookies enabled.
It then allows a cookie-less user to use your site and maintain their session.
It seems like a good idea but it’s a bit flawed.
The author of randomCoder has summarised the flaws quite well.
Every link on your site needs manual intervention
Cookieless sessions are achieved in Java by appending a string of the format ;jsessionid=SESSION_IDENTIFIER to the end of a URL. To do this, all links emitted by your website need to be passed through HttpServletResponse.encodeURL(), either directly or through mechanisms such as the JSTL <c:url /> tag. Failure to do this for even a single link can result in your users losing their session forever.
Using URL-encoded sessions can damage your search engine placement
To prevent abuse, search engines such as Google associate web content with a single URL, and penalize sites which have identical content reachable from multiple, unique URLs. Because a URL-encoded session is unique per visit, multiple visits by the same search engine bot will return identical content with different URLs. This is not an uncommon problem; a test search for ;jsessionid in URLs returned around 79 million search results.
It’s a security risk
Because the session identifier is included in the URL, an attacker could potentially impersonate a victim by getting the victim to follow a session-encoded URL to your site. If the victim logs in, the attacker is logged in as well – exposing any personal or confidential information the victim has access to. This can be mitigated somewhat by using short timeouts on sessions, but that tends to annoy legitimate users.
There’s one other factor for me too; public users of my site don’t require cookies – so I really don’t need jsessionids at all.
Fortunately, he also presents an excellent solution to the problem.
The solution is to create a servlet filter which will intercept calls to HttpServletResponse.encodeURL() and skip the generation of session identifiers. This will require a servlet engine that implements the Servlet API version 2.3 or later (J2EE 1.3 for you enterprise folks). Let’s start with a basic servlet filter:
He then goes on to dissect the code section by section and presents a link at the end to download it all.
So I downloaded it, reviewed it, tested it and implemented it on my site.
It works a treat!
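For anyone curious what such a filter boils down to, here is a minimal sketch of the idea. It is not the randomCoder code. In a real filter you would subclass HttpServletResponseWrapper, override encodeURL() and encodeRedirectURL(), and pass the wrapper down the filter chain. To keep the example runnable without a servlet container, the UrlEncoder interface and the ContainerResponse class below are hypothetical stand-ins for HttpServletResponse and the container's own implementation.

```java
final class NoSessionIdSketch {

    /** Hypothetical stand-in for the relevant slice of HttpServletResponse. */
    interface UrlEncoder {
        String encodeURL(String url);
        String encodeRedirectURL(String url);
        String getCharacterEncoding();
    }

    /** Mimics a container that falls back to URL rewriting for a cookie-less client. */
    static final class ContainerResponse implements UrlEncoder {
        private final String sessionId;

        ContainerResponse(String sessionId) {
            this.sessionId = sessionId;
        }

        @Override
        public String encodeURL(String url) {
            // The session id is a path parameter, so it goes before any query string.
            String suffix = ";jsessionid=" + sessionId;
            int query = url.indexOf('?');
            return query < 0
                    ? url + suffix
                    : url.substring(0, query) + suffix + url.substring(query);
        }

        @Override
        public String encodeRedirectURL(String url) {
            return encodeURL(url);
        }

        @Override
        public String getCharacterEncoding() {
            return "UTF-8";
        }
    }

    /** The wrapper a filter would hand to the rest of the chain. */
    static final class NoSessionIdResponse implements UrlEncoder {
        private final UrlEncoder wrapped;

        NoSessionIdResponse(UrlEncoder wrapped) {
            this.wrapped = wrapped;
        }

        // The two URL-encoding methods return the URL untouched...
        @Override
        public String encodeURL(String url) {
            return url;
        }

        @Override
        public String encodeRedirectURL(String url) {
            return url;
        }

        // ...while everything else is delegated to the wrapped response.
        @Override
        public String getCharacterEncoding() {
            return wrapped.getCharacterEncoding();
        }
    }

    public static void main(String[] args) {
        UrlEncoder container = new ContainerResponse("ABC123");
        UrlEncoder filtered = new NoSessionIdResponse(container);
        System.out.println(container.encodeURL("/index.go")); // /index.go;jsessionid=ABC123
        System.out.println(filtered.encodeURL("/index.go"));  // /index.go
    }
}
```

The point of the wrapper is that page code keeps calling encodeURL() exactly as before, but no session identifier ever reaches the markup.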
However, I still had a problem: Google and other engines still have lots of links to my site with jsessionid in the URL.
I wanted a clean way to remove those links from their indexes.
Obviously I can’t make Google do that directly.
But I can do it indirectly.
The trick is first to find a way to rewrite incoming URLs that contain a jsessionid to drop that part of the URL.
Then I need to tell the caller of the URL to stop using it and to use the new one, without the jsessionid, in future.
Sounds complicated, but there are ways of doing both.
I achieved the first part using Apache’s mod_rewrite module.
This allows me to map an incoming URL to a different URL – it’s commonly used to provide clean URLs on Web sites.
For the second part there is a feature of the HTTP spec that allows me to indicate that a link has been permanently changed and that the caller should update their link to my site.
301 Moved Permanently
The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs. Clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server, where possible.
So, putting these two together, I wrote the following mod_rewrite rules for Apache.
RewriteRule ^/(\w+);jsessionid=\w+$ /$1 [L,R=301]
RewriteRule ^/(\w+\.go);jsessionid=\w+$ /$1 [L,R=301]
The first rule says that any URL made up of a single path segment followed by ;jsessionid=... will be redirected to the same path without the jsessionid.
The second does the same for paths ending in .go – I was too lazy to work out a single pattern to do both types of URLs in one line.
And I used that all-important 301 code to persuade Google to update its index to the new link.
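To check what the two patterns actually match, the same regular expressions can be exercised in plain Java, whose \w behaves like Apache's here. This is only a sketch: the class name, the rewrite() helper and the sample paths are made up for illustration. The COMBINED pattern is one possible way to fold both rules into one by making the .go suffix optional. I haven't run it through Apache itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class JsessionidRules {

    // The same expressions as the two RewriteRule patterns.
    static final Pattern PLAIN = Pattern.compile("^/(\\w+);jsessionid=\\w+$");
    static final Pattern DOT_GO = Pattern.compile("^/(\\w+\\.go);jsessionid=\\w+$");

    // Assumption: a possible single-rule replacement with an optional ".go" suffix.
    static final Pattern COMBINED = Pattern.compile("^/(\\w+(?:\\.go)?);jsessionid=\\w+$");

    /** Returns the redirect target ("/" plus group 1), or null if the rule does not match. */
    static String rewrite(Pattern rule, String path) {
        Matcher m = rule.matcher(path);
        return m.matches() ? "/" + m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] paths = {
            "/about;jsessionid=ABC123",
            "/index.go;jsessionid=ABC123",
            "/docs/page;jsessionid=ABC123", // nested path: no rule matches
            "/about"                        // already clean: no rule matches
        };
        for (String path : paths) {
            System.out.println(path
                    + " | plain=" + rewrite(PLAIN, path)
                    + " | go=" + rewrite(DOT_GO, path)
                    + " | combined=" + rewrite(COMBINED, path));
        }
    }
}
```

Two limits show up straight away: paths with more than one segment are left alone, and a session id containing anything other than word characters (a dot, for instance) will not match \w+ either.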
So, from now on – my pages will no longer output jsessionids and any incoming links that include them will have them stripped out.
In other words: jsessionids purged.