The best, simplest explanation of the Vis process I've ever seen is in this short (under 3 minutes) presentation by Michael Abrash. He steps you through exactly what the solution is, and it seriously couldn't be clearer.
https://m.youtube.com/watch?v=1AUxDCHaw84
One problem with classic Quake Vis is that every surface in a map contributes exponentially to the complexity. This is manageable for coarse structures like the ID1 maps. When lots of small details and trim are added, the number of surfaces jumps and the complexity becomes too much to handle.
A second problem is that the "visibility through portals" approach doesn't work well with large open spaces.
A third problem is that the generated PVS is static, so moving objects, such as doors, don't help to reduce map complexity.
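To make the "static" part concrete, here's a rough sketch of how a Quake-style PVS row gets looked up at runtime. It's in Python for brevity rather than the engine's C, and the function names are made up for illustration. The row is a bit vector with one bit per leaf, run-length compressed so that a zero byte is followed by a count of how many zero bytes it stands for. Nothing in the lookup knows about doors or anything else that moves.

```python
def decompress_pvs(data: bytes, num_leafs: int) -> bytes:
    """Expand one run-length-compressed PVS row into a plain bit vector.

    Non-zero bytes are copied as-is; a zero byte is followed by a count
    of zero bytes to emit. Assumes well-formed data (counts of at least 1).
    """
    row_bytes = (num_leafs + 7) // 8
    out = bytearray()
    i = 0
    while len(out) < row_bytes:
        if data[i] != 0:
            out.append(data[i])
            i += 1
        else:
            out.extend(b"\x00" * data[i + 1])
            i += 2
    return bytes(out[:row_bytes])


def leaf_visible(row: bytes, bit_index: int) -> bool:
    """Test one bit of a decompressed row.

    Simplification: bit_index is used directly. The real engine offsets
    it by one because leaf 0 is the shared solid leaf.
    """
    return bool(row[bit_index >> 3] & (1 << (bit_index & 7)))


# The row was baked by the Vis tool; opening or closing a door never changes it.
row = decompress_pvs(bytes([0x05, 0x00, 0x02, 0x80]), 32)
print([n for n in range(32) if leaf_visible(row, n)])
```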
Much of this is solved by Quake 2, where the BSP format separates structural brushes from detail brushes, and where areaportals can allow doors to exclude parts of the map behind them.
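The areaportal idea, as I understand it, goes roughly like this: the map is split into areas, each areaportal joins two of them, and a door opens or closes its portal. The engine then floods outward from the viewer's area through open portals only, and anything in an area the flood never reaches can be skipped even if the static PVS says it's visible. A minimal Python sketch, with hypothetical names and data layout (not the engine's actual structures):

```python
from collections import deque


def reachable_areas(start, portals, open_portals):
    """Return the set of areas reachable from `start` through open portals.

    portals:      dict of portal id -> (area_a, area_b)
    open_portals: set of portal ids whose doors are currently open
    """
    # Build adjacency from open portals only; closed doors block the flood.
    adjacency = {}
    for portal_id, (a, b) in portals.items():
        if portal_id in open_portals:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)

    seen = {start}
    queue = deque([start])
    while queue:
        area = queue.popleft()
        for neighbour in adjacency.get(area, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


portals = {"door1": (1, 2), "door2": (2, 3)}
print(reachable_areas(1, portals, {"door1"}))  # area 3 stays hidden behind door2
```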
It's true that rendering optimisations can help reduce the requirement for Vis, but only in the renderer. Vis is also used to optimise entity interactions on the server, as well as server-to-client traffic. An unvised map won't be able to use those optimisations.
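As a rough illustration of the server-side use, here's a simplified Python sketch of filtering which entities get sent to a client. The names are hypothetical, the PVS is modelled as plain sets instead of bit vectors, and each entity is assumed to sit in a single leaf (in practice an entity can touch several).

```python
def entities_to_send(client_leaf, entity_leafs, pvs):
    """Pick the entities worth sending to a client.

    client_leaf:  leaf the client's viewpoint is in
    entity_leafs: dict of entity name -> leaf it occupies
    pvs:          dict of leaf -> set of leafs visible from it,
                  or an empty dict for an unvised map
    """
    visible = pvs.get(client_leaf)
    if visible is None:
        # No Vis data: everything counts as visible, so nothing is culled.
        return sorted(entity_leafs)
    return sorted(name for name, leaf in entity_leafs.items() if leaf in visible)


entity_leafs = {"ogre": 2, "shambler": 7, "health": 3}
vised = {1: {1, 2, 3}}

print(entities_to_send(1, entity_leafs, vised))  # shambler in leaf 7 is skipped
print(entities_to_send(1, entity_leafs, {}))     # unvised: all three are sent
```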