A pair of URL path-handling bugs/oddities

Consider this simple bit of code:
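
The results below, and the later talk of removing the last path component, point to -[NSURL URLByDeletingLastPathComponent]. Treating that as an assumption, the code amounts to something like this minimal sketch (the input list and the logging are purely illustrative):

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // Sample inputs, taken from the table in this post.
        NSArray *inputs = @[@"http://example.com/foo/bar",
                            @"http://example.com/foo/bar/",
                            @"http://example.com/",
                            @"http://example.com/foo/bar//"];

        for (NSString *string in inputs) {
            NSURL *url = [NSURL URLWithString:string];

            // Assumed to be the call under discussion.
            NSURL *parent = [url URLByDeletingLastPathComponent];

            NSLog(@"%@ -> %@", [url absoluteString], [parent absoluteString]);
        }
    }
    return 0;
}
```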

Feed in some URLs, and see what you get:

Input                           Output
http://example.com/foo/bar      http://example.com/foo/
http://example.com/foo/bar/     http://example.com/foo/
http://example.com/             http://example.com/../
http://example.com/foo/bar//    http://example.com/foo/bar//../

It all makes sense until the last one, I think.

Possibly the system sees the double slash as being an empty path component, which it can only cancel out with a .. component? But then, why does -lastPathComponent correctly return @"bar"? And if that were the logic, wouldn't the result be http://example.com/foo/bar/ ?


Hmm, no problem, -standardizedURL is documented to resolve . and .. components!
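
A minimal sketch of that attempt, assuming the URL being standardized is the double-slash result from the earlier table (the variable names are illustrative):

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // The odd result reported above for the double-slash input.
        NSURL *odd = [NSURL URLWithString:@"http://example.com/foo/bar//../"];

        // -standardizedURL is the documented way to resolve . and .. components.
        NSURL *standardized = [odd standardizedURL];

        NSLog(@"%@ -> %@", [odd absoluteString], [standardized absoluteString]);
    }
    return 0;
}
```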

Input                              Output
http://example.com/foo/bar//../    http://example.com/foo/bar//

What?! It seems the standardisation routine sees the empty path component and cancels that out with the .. component. Maddening!


So the end result here is that if you need to remove the last path component of a URL, and stand a chance of being passed one ending in two or more slashes, you're kinda stuck.


This does give the desired result for a double-slash-ending URL:

But it does the wrong thing for all others! (Single- or no-slash URLs lose too many components; URLs ending in three or more slashes remain stuck.)
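
One possible way around all of this, sketched here under assumptions rather than offered as the fix, is to trim the path by hand so that any run of trailing slashes counts as a single separator. The helper name below is hypothetical (it is not part of Foundation), and it relies on NSURLComponents, which needs OS X 10.9 or iOS 7:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not a Foundation API: drops the last non-empty path
// component, treating any number of trailing slashes as one separator.
static NSURL *MyURLByDeletingLastNonEmptyPathComponent(NSURL *url) {
    NSURLComponents *components = [NSURLComponents componentsWithURL:url
                                             resolvingAgainstBaseURL:YES];
    NSString *path = components.percentEncodedPath ?: @"";
    NSUInteger end = path.length;

    // Step 1: skip every trailing slash ("/foo/bar//" -> "/foo/bar").
    while (end > 0 && [path characterAtIndex:end - 1] == '/') {
        end--;
    }

    // Step 2: skip the last component itself ("/foo/bar" -> "/foo/").
    while (end > 0 && [path characterAtIndex:end - 1] != '/') {
        end--;
    }

    // Nothing left means we were already at the root; stay there instead of
    // appending a ".." component. Query and fragment are left untouched.
    components.percentEncodedPath = (end > 0) ? [path substringToIndex:end] : @"/";
    return components.URL;
}

int main(void) {
    @autoreleasepool {
        NSURL *url = [NSURL URLWithString:@"http://example.com/foo/bar//"];
        NSLog(@"%@", [MyURLByDeletingLastNonEmptyPathComponent(url) absoluteString]);
    }
    return 0;
}
```

The trimming is plain string work on the path: /foo/bar, /foo/bar/ and /foo/bar// all reduce to /foo/. Whether collapsing repeated slashes like this is acceptable depends on the server, since some treat // as significant.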

© Mike Abdullah 2007-2013