The buffers don't flip at a fixed rate matched to the monitor's refresh; they flip once the frame has been drawn. The tearing comes from the buffer flipping while the monitor is mid-refresh, not from the buffer flipping while the graphics card is still drawing to it.
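To make that concrete, here's a rough toy simulation (no real graphics API involved; the 1080-line scanout and the 4-12 ms render times are just made-up assumptions) of a renderer that flips the instant a frame finishes while the monitor scans out at 60Hz. The flip almost always lands partway down the scanout, and that's the line where the tear shows up:

    // Toy simulation: a renderer that flips the back buffer the instant a frame
    // finishes, while the monitor scans out at 60 Hz. All numbers here
    // (1080 visible lines, 4-12 ms render times) are made-up assumptions.
    #include <cmath>
    #include <cstdio>
    #include <random>

    int main() {
        const double refresh_ms = 1000.0 / 60.0; // one full scanout takes ~16.7 ms
        const int visible_lines = 1080;          // lines the monitor draws per refresh

        std::mt19937 rng(42);
        std::uniform_real_distribution<double> render_ms(4.0, 12.0); // per-frame render time

        double now = 0.0;
        for (int frame = 0; frame < 8; ++frame) {
            now += render_ms(rng);                     // frame finished -> flip immediately
            double phase = std::fmod(now, refresh_ms); // where in the current refresh are we?
            int line = static_cast<int>(phase / refresh_ms * visible_lines);
            std::printf("flip at %6.2f ms -> scanout on line %4d%s\n",
                        now, line,
                        line == 0 ? " (flip landed between refreshes, no tear)"
                                  : " (tear appears at this line)");
        }
        return 0;
    }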
I think there is a minor delay between refreshes on the screen. With CRTs there was a small gap while the cathode ray repositioned from the bottom of the screen back to the top, ready to start the next draw, but I believe it takes most of the 16.7ms to actually draw the frame. If a 60Hz monitor could draw the entire screen in, say, 8ms instead of 16.7ms, they'd sell it as a 120Hz monitor.
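Putting rough numbers on it: 1000ms / 60 ≈ 16.7ms per refresh, and the repositioning gap (the vertical blanking interval) is only a small slice of that, so the panel spends most of the interval actually scanning lines out. 1000ms / 120 ≈ 8.3ms, which is roughly the "8ms" figure above, so a screen that could complete a full scanout in 8ms would indeed be in 120Hz territory.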
But fundamentally it's possible for the buffer to flip while the screen is between refreshes, in which case you don't get a tear, but that's very unlikely.
I think he was referring to the time it takes to send the data over DVI/HDMI/whatever. Remember, the buffer gets sent over a cable to the monitor, and the monitor draws from that data; it doesn't draw directly from the buffer. But AFAIK DVI isn't set up like that (the monitor doesn't receive the whole frame up front and then display it instantly), so it still takes 16.7ms to "draw" from the buffer, even though in this case "draw" actually means transmitting over a cable.
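As a rough worked example (assuming standard 1080p60 timings, which I believe use a total of about 2200 × 1125 clocks per frame including blanking): 2200 × 1125 × 60 ≈ 148.5 million pixels per second. In other words, the link streams pixels continuously for essentially the whole 16.7ms refresh interval rather than bursting the frame across and letting the monitor show it all at once.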
You don't perceive it frame by frame, but the overall movement of objects through the scene is going to seem smoother, because you're seeing the scene over a period of time rather than as a single snapshot. It's kind of like real motion blur captured by cameras with long exposure times: film at 24-25fps seems smooth because each frame captures information about the world over the whole duration of that frame, which shows up visually as motion blur, whereas traditional rendering shows instantaneous snapshots and needs a higher frame rate to seem smooth.
But what I'm saying is that it doesn't appear like that. It still looks like a single snapshot of time. There isn't any added blur as a result, so it won't be any smoother.
Yes, you still only get 60 complete frames/refreshes, but you see more information than a single frame could convey; as I said, you can infer the direction and speed of objects moving through the scene from one frame. That's good for fast-moving targets or rapidly changing viewport direction.
No, you can't. You can't infer direction or speed at all from one frame, even if that displayed frame is composed of multiple rendered frames, at least not without seriously analyzing the scene (and depending on where the tears are, it may not be possible at all), and I dispute that you'd be able to do that analysis in 1/60th of a second.