I thought those pages would have to either send a 404 / 410 header or be blocked by robots.txt to have the URL removal tool work, no?
No, not as I understand it.
The removal tool only removes a URL for 90 days. If the page no longer exists after you remove it with the removal tool, you need to make sure that a 404/410 is sent before the 90 days are up to keep Google from re-including the URL. I don't believe a 404 or 410 is necessary for the tool to work in the first place, though.
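If you want to double-check what a removed page is actually sending, here's a rough sketch in Python. The helper names (fetch_status, signals_gone) are my own, not anything from Google, and it only looks at the status code, so a "not found" page that answers 200 would slip past it.

```python
import urllib.error
import urllib.request

# Status codes that tell a crawler the page is really gone.
GONE_STATUSES = {404, 410}


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status code for url (hypothetical helper).

    Sends a HEAD request. urllib follows redirects, so the code
    returned is the one for wherever the URL ends up.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is on the exception.
        return err.code


def signals_gone(status_code: int) -> bool:
    """True if the status code is a 404 or a 410."""
    return status_code in GONE_STATUSES
```

You'd call it as signals_gone(fetch_status("http://www.example.com/old-page.html")), with your own URL in place of the example.com placeholder, some time before the 90 days run out.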
And, if the page will still exist after the 90 days, you can use robots.txt to keep Google from spidering the page. robots.txt, though, will not necessarily keep all references to the page out of the index.
You can use meta robots noindex (without robots.txt) to keep references to an existing page out of the visible index once the 90 days is up. Again, noindex should not be necessary for the tool to work.
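The "without robots.txt" part matters because a crawler has to fetch the page to see the noindex tag at all. Here's a small sketch of that check using only the Python standard library. The function and class names are mine, "Googlebot" as the agent string and the example.com URL are assumptions for illustration, and it only looks at the meta tag, not at an X-Robots-Tag header.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Records whether the page carries a meta robots noindex tag."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").lower()
        directives = [part.strip() for part in content.split(",")]
        if name in ("robots", "googlebot") and "noindex" in directives:
            self.noindex = True


def noindex_can_take_effect(robots_txt, url, html, agent="Googlebot"):
    """True only if the page has noindex AND robots.txt lets the agent fetch it."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    crawlable = rules.can_fetch(agent, url)

    meta = RobotsMetaParser()
    meta.feed(html)
    return crawlable and meta.noindex


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex"></head></html>'
    url = "http://www.example.com/old/page.html"  # placeholder URL
    blocking_rules = "User-agent: *\nDisallow: /old/"
    print(noindex_can_take_effect("", url, page))
    print(noindex_can_take_effect(blocking_rules, url, page))
```

With the blocking rules in place the function returns False: the tag is on the page, but the crawler is told not to fetch it, so it never gets to read it.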
And here's an excellent discussion on the distinction between indexing and crawling... not necessarily relevant to a discussion about the removal tool, but since I've mentioned robots.txt and noindex in the same discussion, someone will surely leap in to clarify, and we'll be doomed to repeat this discussion again ;) ...
Pages are indexed even after blocking in robots.txt: http://www.webmasterworld.com/google/4490125.htm