Q3RT
If I Have Time
#52 posted by ericw on 2015/05/10 20:44:45
I want to try making a version of light.exe that runs on the gpu (OpenCL).
I'm pretty sure it's possible, only question is whether it will be much faster than the cpu version.
Source Code
#53 posted by inertia on 2015/05/10 21:16:04
Where can I find source code for modern vis tools? I'd like to learn how some of this stuff is implemented.
#54 posted by JneeraZ on 2015/05/11 01:12:53
You probably don't. :) Modern tools are stapled on top of the old code. And the old code will drive you to drink, trust me.
#55 posted by inertia on 2015/05/11 01:31:32
There's no hope for this project if the code is THAT bad :-)
#56 posted by JneeraZ on 2015/05/11 01:37:22
He's talking about a whole new methodology ... something done at runtime. I haven't seen the game code itself so I don't know how hard it is to mod but I suspect it's been improved in the various engine code bases.
The tools ... not so much. :)
It's A Hefty Task
and shit, if you're re-writing engine code to vis during runtime, then you might as well make a whole new map format to boot.
The Problem With Vis...
#58 posted by mh on 2015/05/11 18:58:17
...is that it really operates on too fine-grained a level for a modern renderer. Like a lot of things in Quake, it made sense for a software renderer on a lower-specced PC, where every polygon you could avoid drawing was a performance win, but with even halfway decent hardware acceleration that just goes out the window.
Some relevant notes about the Xbox 360 port of Quake 2: http://www.eurogamer.net/articles/digitalfoundry-2015-quake-2-on-xbox-360-the-first-console-hd-remaster - it just didn't bother using vis at all and still managed 60 fps with 4x MSAA at HD resolutions.
Culling of unseen polygons is also eliminated in the Xbox 360 version, deemed unnecessary due to the paltry number of triangles used per map - meaning that the entire world is drawn each and every frame.
That's fine for original content but is obviously going to fall down (badly) on some of the more brutal modern maps. But it does highlight that the really fine-grained per-leaf visibility is essentially disposable when dealing with more modern PCs than the original engines targeted.
if you're re-writing engine code to vis during runtime, then you might as well make a whole new map format to boot
This can seem to make sense on the surface, but you need to dig a little deeper. One of the reasons why BSP2 was successful is that it changed as little as possible in the format. There were discussions about what features it should have while it was being specced (and I did the original spec and implementation so I can be 100% certain about this) and it kept on coming back to making it as easy for other engine authors to implement as possible. So while it could have had features like built-in RGB lightmaps, 32-bit textures (or even a separate palette per-texture), or others, it didn't. It didn't even change the .map format so that mappers could continue using their favourite editors, and all that was required in the engine and tools was a few #defines, some new structs and a bunch of copy-and-paste code.
What's really required to make Vis more efficient is to change its granularity from per-leaf to something like per-room. I have no idea what that would entail in terms of tool-work, but engine-side it could lead to better efficiencies from less BSP tracing while drawing and being able to build static batches from world geometry.
Although
#59 posted by ericw on 2015/05/11 19:47:37
we have per-room vis already, in a way, if the mapper makes heavy use of func_detail.
crazy idea, maybe you can recover the "leaf clusters" in the engine if you want coarser granularity vis data for rendering. Just group all leafs together that have the same visdata?
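A minimal sketch of the grouping idea above, in Python rather than engine C. It assumes each leaf's offset into the visdata lump (visofs in the Quake BSP leaf struct) has already been read into a list, and that leafs compiled into the same cluster end up sharing one offset; whether a given compiler really writes them that way would need checking. recover_clusters is a hypothetical helper name, not something from an existing engine.

```python
from collections import defaultdict

def recover_clusters(leaf_visofs):
    """Group leaf indices by their offset into the visdata lump.

    leaf_visofs[i] is the visdata offset of leaf i, or -1 when the leaf
    has no vis data. Assumption: leafs in the same cluster share an offset.
    Returns a list of clusters, each a list of leaf indices, ordered by
    the first leaf seen in each cluster.
    """
    groups = defaultdict(list)
    for leaf, ofs in enumerate(leaf_visofs):
        if ofs < 0:
            continue  # no vis data for this leaf, so leave it unclustered
        groups[ofs].append(leaf)
    return list(groups.values())

# Example: leaf 0 has no vis data, leafs 1, 2 and 4 share one offset,
# leafs 3 and 5 share another.
clusters = recover_clusters([-1, 0, 0, 40, 0, 40])
```

The renderer could then mark and batch geometry per cluster instead of per leaf, since every leaf in a group would decompress to the same visibility row anyway.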
Stuff
#60 posted by gb on 2015/05/11 22:54:21
BSP2 was limited by the requirement that it needed to work with Worldcraft and a Fitzquake derived engine, because switching editors proved to be an unpopular idea and dropping our sympathetic newborn engine for Darkplaces or FTE seemed too heartless even to me at the time, although it would have been the right thing to do in retrospect (and it was the first thing I did afterwards.)
In the big picture, BSP2 is a foul compromise but a nice thing if you want to keep the Q1BSP pipeline.
About reducing the VIS detail:
Add a compiler switch that lets the mapper disable automatic vising. Then add a new custom texture (like "trigger") that lets the mapper create portals manually.
I did a similar thing in my single-player maps (which are both very large and very detailed) when I still used FBSP and it resulted in a HUGE performance boost. Despite already using detail brushes. I inserted just enough portals to cull far away areas of the map, instead of going overboard with it like the Quake compilers do by default. Performance is then mostly limited by batching.
The performance increase was comparable to the improvement in Vis time after using func_detail.
I got the idea from looking at how Call of Duty (1) does it, since that's a Quake 3 based game with relatively large outdoor maps. Turns out they changed it completely and yes, the mapper has to manually portal the map in that game.
Quake 1 (and Quake 3) vising was developed for corridor shooters running on 90s consumer hardware. No wonder they tried to cull every little bit whenever possible. But hardware and Quake mapping have changed so much that this formerly very effective method has turned into an obstacle, and a massive obstacle at that.
It is probably less noticeable with deathmatch maps, and thus Quake 3 maps. But single player maps are bogged down by this massive amount of unnecessary info.
Naive Musings...
#61 posted by Kinn on 2015/09/23 11:14:36
So just for laughs I flew around jam6_ericwtronyn - which is about the heaviest thing I can throw at the quake engine right now (I think) - and noted that in the most epic view I could find, I was getting around 30,000 wpoly and 70,000 epoly. I think this map is unvised, but looking at the structure of it, I can't imagine vising it would bring those polycounts down much.
Now, unless I'm missing something, those kinds of polycounts shouldn't trouble any sort of even vaguely modern hardware (didn't Doom 3 have like 150,000 polys in a typical scene in 2004?)
So...questions...
I get a solid 60fps with jam6_ericwtronyn on a reasonably modern laptop running Quakespasm. Does anyone here get bad performance in this map, and if so - what hardware/engine are you running it on?
Are there other factors at work that cause unvised quake maps to perform slowly that are not to do with polycount? Things like 400 monsters running LOS checks?
Kinn
I got poor performance on some maps with Mark V on my Surface Pro. My Pro can run Dark Souls, Burnout Paradise and some other decent games so it's no slouch.
Darkplaces and RMQ tend to give better performance on big maps
#63 posted by JneeraZ on 2015/09/23 13:23:41
I think it's just the old rendering code. It probably spends more time figuring out what not to draw than it would take to just draw it...
What Warren Said
TB just renders everything in every frame and it's fairly smooth. I'm sure with proper optimization it could be faster by a factor of two.
#65 posted by Spirit on 2015/09/23 15:14:16
Most engines were focused on features and that was many years ago. Modern engines that break some hardware compatibilities can do it. Iirc it's VBO, as mh did.
#66 posted by Kinn on 2015/09/23 16:17:10
I think it's just the old rendering code. It probably spends more time figuring out what not to draw than it would take to just draw it...
So...in an engine that uses some reasonably modern rendering code, an unvised map would run quicker than the same vised map? Is this the case with Quakespasm?
#67 posted by JneeraZ on 2015/09/23 16:46:12
Potentially. A lot of modern games are throwing around props in scenes that have more triangles than entire Quake levels.
#68 posted by metlslime on 2015/09/23 17:07:51
an unvised map would not run quicker -- but it might run at nearly the same speed with the right engine code. The only way unvised would be quicker is if the engine detected this case and used different code to render it compared to a vised map.
But the main point still stands, polygons are not the bottleneck for quake on modern hardware, it's things like draw calls, lightmap uploads, and (perhaps) geometry uploads.
I believe the ideal quake renderer for modern hardware would reduce draw calls down to 1 per texture, put all geometry in vertex buffers (potentially ignoring vis to make this work,) and do lightstyle calculations in hardware.
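A rough sketch of the "1 draw call per texture" part, in Python for brevity. It assumes each world surface is a convex polygon given as a texture name plus a list of vertex indices into one shared vertex buffer; build_texture_batches and the sample data are made up for illustration, and lightmaps are ignored entirely.

```python
from collections import defaultdict

def build_texture_batches(surfaces):
    """Merge world surfaces into one triangle index list per texture.

    surfaces: iterable of (texture_name, vertex_indices), where
    vertex_indices describes a convex polygon (fan order) in a shared
    vertex buffer. Returns {texture_name: [i0, i1, i2, ...]}.
    """
    batches = defaultdict(list)
    for texture, poly in surfaces:
        # Triangulate the polygon as a fan: (v0, v1, v2), (v0, v2, v3), ...
        for i in range(1, len(poly) - 1):
            batches[texture].extend((poly[0], poly[i], poly[i + 1]))
    return dict(batches)

# Three surfaces, two textures: the result has two batches, so the
# world would need two draw calls instead of three.
surfaces = [
    ("wall", [0, 1, 2, 3]),
    ("floor", [4, 5, 6]),
    ("wall", [7, 8, 9]),
]
batches = build_texture_batches(surfaces)
```

Built once at map load, each index list can sit in a static buffer. The trade-off is the one raised above: a static batch draws everything using that texture, so vis can no longer trim individual surfaces out of it.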
#69 posted by JneeraZ on 2015/09/23 17:27:28
It would definitely be a revolution to leave VIS behind ... imagine the iteration times if level designers only had to QBSP and LIGHT.
#70 posted by Spike on 2015/09/23 20:05:32
Even if you stick the entire map into a single vbo and draw it with a single draw call (which is possible, except for sky+water), you will still suffer from overdraw.
those individual props with more polys than entire quake maps do not have 15 different rooms overlapping each other.
it might be interesting to throw the entire map at the screen in a single draw call, both with and without an early z pass...
Kinn
#71 posted by ericw on 2015/09/23 21:39:25
Quakespasm has upgraded world and mdl rendering paths vs Fitzquake, if I disable these with the "-novbo" option my framerate goes from 60 -> high 20s in that map.
However, I think QS would still run faster if this map were vised - it could skip drawing entities, and skip draw calls for the world (there's a draw call for each batch of polys that share a texture and a lightmap.)
IMO all maps should still be vised, if someone did develop an engine that was faster ignoring vis data, that engine could just ignore the visdata.
What would be interesting, I think, is an upgraded / state of the art vis tool that could give less exact results but finish in a reasonable time. I wonder if there's some off the shelf, open source vis algorithm that could be used. I tried to vis the jam version of jam6_ericwtronyn but the progress indicator got stuck at the end after 2 weeks, it didn't finish in the following week so I killed it.
Vis isn't going to do a ton on this map, but if it could just separate the cave and outdoor sections that would help.
it might be interesting to throw the entire map at the screen in a single draw call, both with and without an early z pass...
I'm curious how you'd handle the texture switches when drawing the world in 1 call, is there a better way than a texture atlas? When I read about atlases, they sounded like a big pain because you have to partially tile the textures in the atlas so texture filtering doesn't read into adjacent textures.
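For reference, the atlas lookup being described might look roughly like this (Python standing in for shader code). It assumes a square atlas and that each texture's interior rectangle is stored in texels, with the tiled border padding placed around that rectangle so filtering at the edges reads repeated texels instead of a neighbour. atlas_uv and its parameters are hypothetical names.

```python
def atlas_uv(u, v, rect, atlas_size):
    """Map a repeating texture coordinate into a padded atlas.

    u, v: original texture coordinates, which may lie outside [0, 1)
          because Quake world textures tile.
    rect: (x, y, w, h) of the texture's interior in atlas texels,
          not counting the padding border around it.
    atlas_size: width and height of the (square) atlas in texels.
    """
    x, y, w, h = rect
    # Wrap by hand, since hardware repeat would wrap across the whole atlas.
    fu = u % 1.0
    fv = v % 1.0
    return ((x + fu * w) / atlas_size, (y + fv * h) / atlas_size)

# A 64x32 texture stored at texel (8, 8) inside a 256x256 atlas.
uv = atlas_uv(1.25, 0.5, (8, 8, 64, 32), 256)
```

The manual wrap is where the pain comes from: it has to happen per pixel, and mipmapping near the wrap seam needs extra care, which is why the border has to be at least partially tiled.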
Ericw
#72 posted by Kinn on 2015/09/23 21:53:06
Thanks, some cool info there :)
I tried to vis the jam version of jam6_ericwtronyn but the progress indicator got stuck at the end after 2 weeks
o_O
To what extent were detail brushes used in that map? I figured that kind of solved the vis time problem, even for crazy bonkers tronyn stuff.
#73 posted by ericw on 2015/09/23 21:58:40
It made good use of func_detail too, had something like 7k clusters, 24k leafs - so vis only had to be computed for the 7k clusters.
Could be it's just the map layout being open. For comparison I've tested ijed's telefragged.bsp and it full vises in about a minute!
Jam6_ericwtronyn
#74 posted by Kinn on 2015/09/23 22:08:17
I just realised the map source is provided so I fired it up in radiant, pressed Alt-2 to only show the world brushes, and...yes I see the problem :)
#70
#75 posted by metlslime on 2015/09/23 22:25:34
Yes, overdraw is the thing I forgot... Still need a good solution for that.