A pair of URL path-handling bugs/oddities

Consider this simple bit of code:
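
A minimal sketch of such code, assuming the method in question is -[NSURL URLByDeletingLastPathComponent] (an assumption based on the outputs listed below and the later talk of removing the last path component). The helper name ParentURLString is hypothetical, and what gets logged may differ between OS versions:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the string form of the URL with its
// last path component removed, using the assumed Foundation method.
static NSString *ParentURLString(NSString *string)
{
    NSURL *url = [NSURL URLWithString:string];
    return [[url URLByDeletingLastPathComponent] absoluteString];
}

int main(void)
{
    @autoreleasepool {
        // Sample inputs taken from the table in this post.
        NSArray *inputs = @[@"http://example.com/foo/bar",
                            @"http://example.com/foo/bar/",
                            @"http://example.com/",
                            @"http://example.com/foo/bar//"];

        for (NSString *input in inputs) {
            NSLog(@"%@ -> %@", input, ParentURLString(input));
        }
    }
    return 0;
}
```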

Feed in some URLs, and see what you get:

Input                           Output
http://example.com/foo/bar      http://example.com/foo/
http://example.com/foo/bar/     http://example.com/foo/
http://example.com/             http://example.com/../
http://example.com/foo/bar//    http://example.com/foo/bar//../


It all makes sense until the last one, I think.

Possibly the system sees the double slash as being an empty path component, which it can only cancel out with a .. component? But then, why does -lastPathComponent correctly return @"bar"? And if that were the case, wouldn't it return http://example.com/foo/bar/ instead?

rdar://problem/12842744

Hmm, no problem, -standardizedURL is documented to resolve . and .. components!
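
For reference, a small sketch of that attempt, feeding the odd-looking URL from above straight into -standardizedURL. The helper name StandardizedURLString is hypothetical, and no particular output is promised here since it may vary by OS version:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: standardizes a URL string via -[NSURL standardizedURL].
static NSString *StandardizedURLString(NSString *string)
{
    NSURL *url = [NSURL URLWithString:string];
    return [[url standardizedURL] absoluteString];
}

int main(void)
{
    @autoreleasepool {
        // The result of the earlier step, taken from the table in this post.
        NSString *input = @"http://example.com/foo/bar//../";
        NSLog(@"%@ -> %@", input, StandardizedURLString(input));
    }
    return 0;
}
```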

Input                              Output
http://example.com/foo/bar//../    http://example.com/foo/bar//


What?! It seems the standardisation routine sees the empty path component and cancels that out with the .. component. Maddening!

rdar://problem/12842781

So the end result here is that if you need to remove the last path component of a URL, and there's a chance of being passed one ending in two or more slashes, you're kinda stuck.

Workarounds

This does give the desired result for a double-slash-ending URL:

But it does the wrong thing for all others! (URLs ending in a single slash, or none, have too many components taken off; those ending in three or more slashes remain stuck.)
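
One possible alternative, sketched here rather than offered as a definitive fix, is to sidestep both methods and trim the path by hand. This assumes NSURLComponents is available (OS X 10.9 / iOS 7 or later); the function name MYURLByDeletingLastPathComponent is hypothetical and not part of Foundation. It treats any run of trailing slashes as a single separator, which is a design choice rather than anything the frameworks promise:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper, not part of Foundation. Removes the last non-empty
// path component, ignoring however many slashes trail it.
static NSURL *MYURLByDeletingLastPathComponent(NSURL *url)
{
    NSURLComponents *components =
        [NSURLComponents componentsWithURL:url resolvingAgainstBaseURL:YES];

    // Work on the percent-encoded path so an escaped slash (%2F) inside a
    // component is not mistaken for a separator.
    NSString *path = components.percentEncodedPath ?: @"";
    NSUInteger end = path.length;

    // 1. Skip any trailing slashes: "/foo/bar//" is treated like "/foo/bar".
    while (end > 0 && [path characterAtIndex:end - 1] == '/') end--;

    // 2. Skip the last component itself, stopping just after the previous slash.
    while (end > 0 && [path characterAtIndex:end - 1] != '/') end--;

    // 3. Keep everything up to and including that slash; fall back to the root.
    NSString *parent = [path substringToIndex:end];
    if (parent.length == 0) parent = @"/";

    components.percentEncodedPath = parent;
    return components.URL;
}

int main(void)
{
    @autoreleasepool {
        NSURL *url = [NSURL URLWithString:@"http://example.com/foo/bar//"];
        NSLog(@"%@", MYURLByDeletingLastPathComponent(url).absoluteString);
    }
    return 0;
}
```

With this approach the paths /foo/bar, /foo/bar/ and /foo/bar// all reduce to /foo/, and the root path stays at / instead of gaining a .. component. Only the path is modified, so any query or fragment is left untouched.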

© Mike Abdullah 2007-2013