Kinn
I got poor performance on some maps with Mark V on my Surface Pro. My Pro can run Dark Souls, Burnout Paradise and some other decent games, so it's no slouch.
DarkPlaces and RMQ tend to perform better on big maps.
#63 posted by JneeraZ on 2015/09/23 13:23:41
I think it's just the old rendering code. It probably spends more time figuring out what not to draw than it would take to just draw it...
What Warren Said
TB just renders everything in every frame and it's fairly smooth. I'm sure with proper optimization it could be faster by a factor of two.
#65 posted by Spirit on 2015/09/23 15:14:16
Most engines were focused on features and that was many years ago. Modern engines that break some hardware compatibilities can do it. Iirc it's VBO, as mh did.
#66 posted by Kinn on 2015/09/23 16:17:10
I think it's just the old rendering code. It probably spends more time figuring out what not to draw than it would take to just draw it...
So...in an engine that uses some reasonably modern rendering code, an unvised map would run quicker than the same vised map? Is this the case with Quakespasm?
#67 posted by JneeraZ on 2015/09/23 16:46:12
Potentially. A lot of modern games are throwing around props in scenes that have more triangles than entire Quake levels.
#68 posted by metlslime on 2015/09/23 17:07:51
an unvised map would not run quicker -- but it might run at nearly the same speed with the right engine code. The only way unvised would be quicker is if the engine detected this case and used different code to render it compared to a vised map.
But the main point still stands, polygons are not the bottleneck for quake on modern hardware, it's things like draw calls, lightmap uploads, and (perhaps) geometry uploads.
I believe the ideal quake renderer for modern hardware would reduce draw calls down to 1 per texture, put all geometry in vertex buffers (potentially ignoring vis to make this work), and do lightstyle calculations in hardware.
#69 posted by JneeraZ on 2015/09/23 17:27:28
It would definitely be a revolution to leave VIS behind ... imagine the iteration times if level designers only had to QBSP and LIGHT.
#70 posted by Spike on 2015/09/23 20:05:32
Even if you stick the entire map into a single vbo and draw it with a single draw call (which is possible, except for sky+water), you will still suffer from overdraw.
those individual props with more polys than entire quake maps do not have 15 different rooms overlapping each other.
it might be interesting to throw the entire map at the screen in a single draw call, both with and without an early z pass...
Kinn
#71 posted by ericw on 2015/09/23 21:39:25
Quakespasm has upgraded world and mdl rendering paths vs Fitzquake, if I disable these with the "-novbo" option my framerate goes from 60 -> high 20s in that map.
However, I think QS would still run faster if this map were vised - it could skip drawing entities, and skip draw calls for the world (there's a draw call for each batch of polys that share a texture and a lightmap.)
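A rough sketch of the batching described above, written in Python for brevity. The surface records and the helper name are made up for illustration and are not Quakespasm's actual data structures; the point is only that the number of draw calls follows the number of distinct texture/lightmap pairs, not the number of polys.

```python
from collections import defaultdict

def batch_by_texture_and_lightmap(surfaces):
    """Group surface ids by their (texture, lightmap) pair.

    Each resulting group stands for one draw call.
    """
    batches = defaultdict(list)
    for surf in surfaces:
        batches[(surf["texture"], surf["lightmap"])].append(surf["id"])
    return dict(batches)

# Hypothetical visible surfaces; texture names and lightmap indices are invented.
surfaces = [
    {"id": 0, "texture": "rock1", "lightmap": 0},
    {"id": 1, "texture": "rock1", "lightmap": 0},
    {"id": 2, "texture": "rock1", "lightmap": 1},
    {"id": 3, "texture": "metal2", "lightmap": 1},
    {"id": 4, "texture": "metal2", "lightmap": 1},
]

batches = batch_by_texture_and_lightmap(surfaces)
draw_calls = len(batches)
print(draw_calls)  # 5 surfaces collapse into 3 batches
```

Anything vis removes from the surface list before batching can only shrink this count, which is the saving being described.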
IMO all maps should still be vised; if someone did develop an engine that was faster ignoring vis data, that engine could just ignore the visdata.
What would be interesting, I think, is an upgraded / state of the art vis tool that could give less exact results but finish in a reasonable time. I wonder if there's some off the shelf, open source vis algorithm that could be used. I tried to vis the jam version of jam6_ericwtronyn but the progress indicator got stuck at the end after 2 weeks, it didn't finish in the following week so I killed it.
Vis isn't going to do a ton on this map, but if it could just separate the cave and outdoor sections that would help.
it might be interesting to throw the entire map at the screen in a single draw call, both with and without an early z pass...
I'm curious how you'd handle the texture switches when drawing the world in 1 call, is there a better way than a texture atlas? When I read about atlases, they sounded like a big pain because you have to partially tile the textures in the atlas so texture filtering doesn't read into adjacent textures.
Ericw
#72 posted by Kinn on 2015/09/23 21:53:06
Thanks, some cool info there :)
I tried to vis the jam version of jam6_ericwtronyn but the progress indicator got stuck at the end after 2 weeks
o_O
To what extent were detail brushes used in that map? I figured that kind of solved the vis time problem, even for crazy bonkers tronyn stuff.
#73 posted by ericw on 2015/09/23 21:58:40
It made good use of func_detail too, had something like 7k clusters, 24k leafs - so vis only had to be computed for the 7k clusters.
Could be it's just the map layout being open. For comparison I've tested ijed's telefragged.bsp and it full vises in about a minute!
Jam6_ericwtronyn
#74 posted by Kinn on 2015/09/23 22:08:17
I just realised the map source is provided so I fired it up in radiant, pressed Alt-2 to only show the world brushes, and...yes I see the problem :)
#70
#75 posted by metlslime on 2015/09/23 22:25:34
Yes, overdraw is the thing I forgot... Still need a good solution for that.
Noob Question About Overdraw
#76 posted by Kinn on 2015/09/23 22:48:49
I'm having some trouble understanding the technical bits - so I assume drawing X triangles that are all stacked behind each other (lots of overdraw) is slower than drawing the same number of triangles spread out on a sheet but still all in view (no overdraw) ?
#77 posted by rebb on 2015/09/23 23:41:16
Carmack himself stated a while ago, you can ultimately throw brute force at every problem once the hardware is fast enough. Vis is a clever solution to a problem, precalculate instead of doing things during runtime.
Afaik the main problem with overdraw is related to shading, ie too many fragment-shader invocations per pixel ( scenes with many overlapping particles tend to show this, makes older GPUs rev up nicely ), but shouldn't early-Z take care of this for opaque surfaces ?
But it probably still overdraws during the z-phase as you end up trying to draw a lot of unnecessary polygons, so it depends on the player hardware.
Any software engine people want to chime in ? I guess it might be quite bad for them at least.
How Do I Show FPS In Quakespasm?
#78 posted by mwh on 2015/09/23 23:50:20
Everything seems to run well for me, I guess the intel GPUs are finally good with Broadwell :-)
#79 posted by JneeraZ on 2015/09/24 00:06:28
"but shouldn't early-Z take care of this for opaque surfaces "
It does. If the depth buffer rejects a pixel, you won't eat the rendering.
Overdraw is really only a problem on systems without depth buffers, or with stacks of non-opaque surfaces - like particle systems or glass.
#80 posted by Spike on 2015/09/24 00:28:10
The main reason to always use vis, even if an engine is faster by disregarding vis entirely, is serverside culling and networking bandwidth even if the client totally disregards it.
culling realtime lights via vis is also very useful, although I suppose you could also use occlusion queries for that.
@ericw
use GL_ARB_bindless_texture
atlasing or texture arrays are also an option, but more fiddly (but also more likely to be supported by hardware).
water+skys could be done with subroutines.
@Kinn
overdraw is when you draw the same pixel multiple times. the earlier draws become redundant and are essentially a waste of memory bandwidth.
typically, graphics cards utilize an 'early z' optimisation which massively reduces the cost of overdraw, so if you draw only the world's depth first, then draw it normally (with depthfunc GL_EQUAL), then you're not wasting time calculating the colours+textures of geometry which will never be seen.
really the advantage depends on how expensive your fragment shaders are (including the cost of texture lookups+bandwidth).
Quake's software renderer had a zero-overdraw strategy. vanilla glquake draws triangles as they come from the bsp tree (nearest first). all modern glquake ports instead batch by texture, which can result in excess overdraw.
it'd be nice to return to a single-draw-call nearest-first renderer. best of both worlds - assuming your hardware+drivers are recent enough...
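A toy model of the above for a single pixel, as a hedged sketch rather than how any real GPU or engine works: each number is the depth of one opaque surface covering that pixel, listed in draw order, and the function counts how many times the "fragment shader" would run. The function name and the 15-layer figure (borrowed from the "15 different rooms" remark earlier in the thread) are just for illustration.

```python
def shaded_fragments(depths, prepass=False):
    """Count fragment shades at one pixel for opaque surfaces drawn in order.

    depths: depth of each surface at this pixel, in draw order (smaller = nearer).
    prepass: if True, assume depth was laid down first and the colour pass
             uses an EQUAL depth test, so only the nearest surface is shaded.
    """
    if prepass:
        nearest = min(depths)
        return sum(1 for d in depths if d == nearest)
    zbuf = float("inf")
    shaded = 0
    for d in depths:
        if d < zbuf:  # plain LESS depth test: passes, so it gets shaded
            zbuf = d
            shaded += 1
    return shaded

back_to_front = list(range(15, 0, -1))  # worst case: farthest surface drawn first
front_to_back = list(range(1, 16))      # nearest first, as a bsp walk would give
texture_order = [5, 9, 3, 12, 1]        # arbitrary order, as texture batching might give

print(shaded_fragments(back_to_front))                # 15
print(shaded_fragments(front_to_back))                # 1
print(shaded_fragments(texture_order))                # 3
print(shaded_fragments(back_to_front, prepass=True))  # 1
```

The depth-only pass still has to rasterise every surface, so this only models the shading cost, not the vertex or fill cost of the prepass itself.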
#81 posted by metlslime on 2015/09/24 02:31:49
it'd be nice to return to a single-draw-call nearest-first renderer
I don't know much about current hardware capabilities, but is this even possible, given that there are typically dozens to 100 textures in a bsp, plus a bunch of lightmaps that, even with atlassing, probably can't fit in a single texture? Are there enough texture units on modern cards to accommodate all of this?
@metlslime
#82 posted by Spike on 2015/09/24 05:23:36
GL_ARB_bindless_texture
no binding = no texture unit limit.
pass the texture via a vertex attribute.
GL_ARB_shader_subroutine
efficient branching, based upon vertex attributes.
both together and you have some serious dependencies on modern hardware... but should be able to draw the entire world in a single draw call - so long as your graphics card has enough memory (probably not an issue with vanilla textures, but will undoubtedly be an issue with replacements).
I've not used either, so while I'm sure it's possible, I'm not sure on the actual practicalities, but hey...
New Vis Tool?
#83 posted by Kinn on 2016/01/08 20:53:28
Post #38, quoting here (from mh)
I've been thinking more about the idea I mentioned above (using occlusion queries) and it seems to me that something may be possible by using a combination of the world bounds and reducing to the same 16-texel scale as lightmaps.
So what I'm thinking here is to divide the world into a grid of 16x16x16 boxes, then for each box do a Mod_PointInLeaf on the center. If it's in solid don't bother, otherwise run a 6-view-draw with occlusion queries and merge the resulting leafs into visibility for this leaf (which will have been cleared to nothing visible at initialization).
Obviously there are probably edge cases that I haven't fully thought through, but overall this is an interesting enough approach that I might even code something up.
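The loop in that quote could be sketched roughly like this. It is an illustration only, in Python rather than engine C: `point_in_leaf` stands in for Mod_PointInLeaf (returning None for solid), and `visible_from` stands in for the 6-view draw with occlusion queries; both are hypothetical callbacks, and the toy world below is invented so the sketch can run.

```python
from itertools import product

def build_grid_vis(mins, maxs, point_in_leaf, visible_from, cell=16):
    """Sample the world on a cell-sized grid and merge visibility per leaf.

    point_in_leaf(p) -> leaf id, or None if p is in solid (hypothetical).
    visible_from(p)  -> set of leaf ids seen from p (hypothetical stand-in
                        for rendering 6 views with occlusion queries).
    """
    vis = {}
    axes = [range(lo, hi, cell) for lo, hi in zip(mins, maxs)]
    for corner in product(*axes):
        center = tuple(c + cell / 2 for c in corner)
        leaf = point_in_leaf(center)
        if leaf is None:  # sample point is in solid: don't bother
            continue
        vis.setdefault(leaf, set()).update(visible_from(center))
    return vis

# Toy world: leaf "A" for x < 32, solid for 32 <= x < 48, leaf "B" beyond.
def toy_point_in_leaf(p):
    if p[0] < 32:
        return "A"
    if p[0] < 48:
        return None
    return "B"

# Pretend only the sample at x == 24 can see through a hole into "B".
def toy_visible_from(p):
    if p[0] == 24:
        return {"A", "B"}
    return {toy_point_in_leaf(p)}

vis = build_grid_vis((0, 0, 0), (64, 16, 16), toy_point_in_leaf, toy_visible_from)
print(vis)
```

In the toy run, leaf "A" ends up seeing "B" only because one sample point happened to line up with the hole.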
Ignoring the realtime thing (the subject of this thread) - if an offline vis tool was developed that used such a GPU-based occlusion approach to create the vis data - would this theoretically lead to higher quality visdata than the current portal-based method?
It would certainly allow for much more open maps, surely?
#84 posted by ericw on 2016/01/08 21:43:30
Kinn, it does sound like a tempting/cool idea.
I'm not sure if the vis quality difference would be noticeable.
The biggest advantage, I think, would be vis time being proportional only to the interior volume of the map. Also func_detail would be unnecessary.
The disadvantage is you'd be moving to a system that could, in corner cases, draw less than it should. I'm thinking a hole in the wall where you have to stand just in the right place to see through, and the sample points used by vis never line up with that spot. Probably a 16x16x16 grid would be fine enough that it'd never happen in practice.
The other concern is, how fast will it be? An 8192x8192x8192 box is the worst case. That's 512^3 vis sample points using a 16x16x16 grid, and if each vis sample point can be computed in 1ms (rendering all 6 views and getting the occlusion query results) that gives you 37 hours.
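The arithmetic behind that figure, as a quick check. The 1ms per sample point is the assumption stated above, not a measurement, and the function name is made up.

```python
def worst_case_vis_hours(world_size=8192, cell=16, ms_per_sample=1.0):
    """Return (sample point count, hours) for a solid-free cube of world_size units."""
    per_axis = world_size // cell          # 8192 / 16 = 512
    samples = per_axis ** 3                # 512^3 sample points
    seconds = samples * ms_per_sample / 1000.0
    return samples, seconds / 3600.0

samples, hours = worst_case_vis_hours()
print(samples, round(hours, 1))  # 134217728 37.3
```

Since the cost is linear in the sample count, halving the grid spacing to 8 units would multiply the time by 8.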
#85 posted by Lunaran on 2016/01/08 23:45:08
The problem with open maps is not how vis tests visibility, it's how the world it's testing is split up into a tree. If the splits in the tree don't correspond to occlude-able pockets of geometry, it doesn't matter what method you use to determine which ones can see which ones, you're always going to be 'seeing' geometry you think you shouldn't.
The solution you're looking for is careful construction and planning of your big open map, and hint brushes.
Or
#86 posted by ijed on 2016/01/11 04:37:59
Hint the lot and trust the player has 256 allocated...