@QuakePone
#2641 posted by ericw on 2017/01/16 23:41:09
I implemented a lightmap format change that may help the AD start map performance on your Intel graphics system:
1. unmodified svn r1371
2. new lightmap format (GL_BGRA GL_UNSIGNED_INT_8_8_8_8_REV)
I wonder if you could test these two builds on your system and report what FPS range you get in ad1.5's start map for each build.
link to source code
Fyi
#2642 posted by ericw on 2017/01/17 00:32:09
I can't measure any difference between those builds on:
-Intel HD Graphics 4400 (2013? laptop)
-Intel 855GM (2004 laptop)
#2643 posted by Baker on 2017/01/17 01:11:36
Back a few years, I tried that modification. Tried the binary on several machines, couldn't locate one where it made a difference (Intel, GeForce, Radeon, etc.)
I believe such machines do exist, but I just didn't have any of them.
#2644 posted by ericw on 2017/01/17 04:03:49
Baker, yeah.. MH mentioned in the insideqc thread:
This is only really important for Intels that have hardware T&L - say, the 965 onwards; it seems as though earlier generations are more tolerant.
I guess my old intel is the more tolerant earlier generation.. and my newer 4400 came out several years after that thread, so apparently Intel fixed the perf issue by that time.
#2645 posted by mh on 2017/01/17 08:30:58
Yeah, I haven't tested this in a long time but I don't believe it's an issue any more either. GL_RGB should still be an issue, but GL_RGBA should be fine.
#2646 posted by mh on 2017/01/17 09:08:30
Just for reference, I've tested these builds on a more recent Intel 520 machine without measuring any difference either. This supports my guess that with DX10+ class hardware (where Microsoft forced the vendors to standardise on RGBA) it no longer matters.
A more interesting benchmark would be gl_flashblend 1 versus gl_flashblend 0. That would really help determine if it's dynamic light uploads.
#2647 posted by ericw on 2017/01/17 10:02:54
Thanks. For reference, build #1 I posted above (and QS releases) are using GL_RGBA / GL_UNSIGNED_BYTE.
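For readers following along, the difference between the two builds comes down to how each lightmap texel is laid out before upload. The following is a minimal sketch, not the actual QuakeSpasm code; the function names `pack_bgra8_rev` and `store_rgba8` are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Pack one texel for upload with format GL_BGRA and type
 * GL_UNSIGNED_INT_8_8_8_8_REV. With the _REV type the first
 * component of the format (blue) occupies the lowest 8 bits and
 * the last (alpha) the highest 8 bits of each 32-bit value.
 * Hypothetical helper name, for illustration only. */
uint32_t pack_bgra8_rev(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
           ((uint32_t)g << 8)  |  (uint32_t)b;
}

/* The same texel laid out for GL_RGBA / GL_UNSIGNED_BYTE:
 * four bytes in memory order R, G, B, A, regardless of CPU
 * endianness. Hypothetical helper name as well. */
void store_rgba8(uint8_t *dst, uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    dst[0] = r;
    dst[1] = g;
    dst[2] = b;
    dst[3] = a;
}
```

In both cases the upload itself would go through `glTexSubImage2D`, with only the `format` and `type` arguments changing to match the layout.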
--
Question for Shamblernaut:
Can you still reproduce the console retraction bug with sdl1.2? I just tried again and can't trigger it (though I remember seeing it like 1-2 years ago).
I tried in the default ubuntu 16.04 Unity desktop as well as the "ubuntu flashback (metacity)" one. Tried both windowed and fullscreen.
Further Comparison
#2648 posted by mh on 2017/01/17 10:43:54
This time on an Intel HD2000, which is closer in performance to @QuakePone's, but still no measurable difference.
Shamblernaut
#2649 posted by ericw on 2017/01/17 23:48:34
You can disregard my question, managed to reproduce the console retracting bug.
I Didn't Even See The Question
That's what I get for skim reading.
As an aside, is there any particular reason for console commands being case sensitive? I understand that filenames require it, especially in *nix based systems. Is this just a legacy code thing or are some commands actually case sensitive?
#2651 posted by Spike on 2017/01/18 10:32:28
in certain cases, yes, especially if you're exposing them to qc.
there's no good reason for case-sensitive tab completion though.
I Honestly Can't Even
#2652 posted by QuakePone on 2017/01/21 05:46:09
I did test ad_start, first by doing nothing and then b-hopping around until I caught those frame drops in each version. The more I tested, the fewer times I encountered them, and I usually got different results every time I restarted the map in 92.1 before this.
FPS goes around 27, but in 92.1 I sometimes got the frame drops during late tests. It's very inconsistent.
You can't really take my word here, I am very confused about all of this and in the end I will probably need an engine that takes more advantage of unused GPU power. Sorry, I can't be of much help here.
#2653 posted by ericw on 2017/01/21 06:34:16
Ok, thanks for checking anyway. It sounds like the GL_UNSIGNED_INT_8_8_8_8_REV thing is no longer an issue.
My laptop with Intel HD 4400 is faster, but QS doesn't exactly fly on content like AD; some time I will have to look more closely at whether I get the same speed increase when switching to MarkV DX9.
AD Content
#2654 posted by mh on 2017/01/21 11:21:02
This isn't something I've sat down and analyzed. I'm just making general observations here.
Even with the fastest lightmap updates in the world, you'll still bottleneck on lightmap updates. In RMQ we had a scene with 20+ flickering candles; every lightmap in the scene would get updated and it would pull the NV 280 (I think) machine I was using at the time down to 20 fps.
GPU lightmaps are the future. Dynamics still have a cost but lightstyles are completely free.
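To illustrate why animated lightstyles are costly on the CPU path, here is a rough sketch of the per-texel blend that has to be redone and re-uploaded whenever a style value changes. It is an assumption-laden simplification, not the engine's actual code: the function name `blend_lightstyles` is hypothetical, and a scale of 256 is assumed to mean full brightness.

```c
#include <stdint.h>

/* Combine several per-style lightmaps into one output lightmap.
 * maps[s][i] is the stored light for style s at texel i;
 * scales[s] is the current (non-negative) brightness of style s,
 * with 256 assumed to mean full brightness in this sketch. */
void blend_lightstyles(const uint8_t *const *maps, const int *scales,
                       int num_styles, int num_texels, uint8_t *out)
{
    for (int i = 0; i < num_texels; i++) {
        unsigned sum = 0;
        for (int s = 0; s < num_styles; s++)
            sum += (unsigned)maps[s][i] * (unsigned)scales[s];
        sum >>= 8;                                  /* undo the 256 scale */
        out[i] = (uint8_t)(sum > 255 ? 255 : sum);  /* clamp to 8 bits */
    }
}
```

A GPU lightmap approach would keep the per-style maps on the card and do this multiply-add in the fragment shader, so a flickering style only changes a few scale values rather than triggering a texture upload.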
FitzQuake's brush model renderer sucks. If you were to be actively malicious and deliberately set out to write a bad brush model renderer you might do worse but I doubt it. Ideally brush models should go through the same renderer as the world. A map with lots of brush models is always going to run slow.
FitzQuake's sky renderer sucks. If you were to be actively malicious and deliberately set out to write a bad sky renderer you might do worse but I doubt it. I had one of those "sweet baby Jesus" moments when I analyzed a scene in PIX and saw that 80% of the time was spent drawing sky. Any scene with sky in it is going to bog down on slower hardware.
What both of those have in common is per-polygon state changes. In the absence of batching in your own code the driver will make reasonable efforts to batch itself, but state changes will break that.
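A toy sketch of the batching point above, with texture binds standing in for state changes in general. The `surf_t` record and the function names are hypothetical, and texture ids are assumed to be non-negative; a real renderer would track more state than just the texture.

```c
#include <stdlib.h>

/* Hypothetical minimal surface record: only the texture it needs. */
typedef struct { int texture; } surf_t;

/* Count the texture binds a draw loop would issue if it rebinds
 * only when the texture differs from the previous surface.
 * Assumes texture ids are >= 0 (so -1 means "nothing bound"). */
int count_binds(const surf_t *surfs, int n)
{
    int binds = 0;
    int current = -1;
    for (int i = 0; i < n; i++) {
        if (surfs[i].texture != current) {
            current = surfs[i].texture;
            binds++;
        }
    }
    return binds;
}

/* qsort comparator ordering surfaces by texture id. */
static int by_texture(const void *a, const void *b)
{
    int ta = ((const surf_t *)a)->texture;
    int tb = ((const surf_t *)b)->texture;
    return (ta > tb) - (ta < tb);
}

/* Group surfaces by texture so each texture is bound once. */
void sort_by_texture(surf_t *surfs, int n)
{
    qsort(surfs, (size_t)n, sizeof *surfs, by_texture);
}
```

With six surfaces alternating between two textures, drawing in the original order costs six binds; after sorting it costs two, and the runs of same-state polygons can then be submitted as batches.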
There are certain items of "Quake lore", such as Quake can't do big scenes, sky is slow, disable multitexturing, and so on. They're not true. The problem is in the implementation, and they can all be solved.
@mh
#2655 posted by Baker on 2017/01/21 12:18:12
In Quakespasm, the brush models are drawn with the world, if I recall.
As a general note, since I don't know if you are aware of this -- Arcane Dimensions is a bit less optimal than some other single releases in a couple of departments:
1) First, it uses sprites as particles. So each little floaty pixel is an entity. You can turn this off by doing "temp1 0" and restarting a map, if I recall.
2) Second, you know static entities like torches? Arcane Dimensions doesn't do torches as static entities. And Arcane Dimensions uses a lot of torches.
I haven't checked, but combined with the fact that Arcane Dimensions maps tend to have tons of details and complex geometry and generally the maps tend to have 200-300 monsters and super-tons of items/books/etc ...
It's a bigger load than most maps tend to be.
And because of the torches and particles, I suspect it puts more pressure on the QuakeC interpreter too.
The end result is very cool and creative.
Also: I have never checked, but in complicated maps with huge entity counts -- how much is vis helping? Engines with r_lockvis you might not be able to tell ...
(If I "r_lockpvs 1" in Mark V, then noclip outside the start map, it looks like this ...)
http://quakeone.com/markv/media/start_map_r_lockpvs.png
Now, mh, look at this ...
On ad_mountain ...
I type "r_lockpvs 1" and noclip outside map ....
http://quakeone.com/markv/media/ad_mountain_r_lockpvs.png <--- RED ALERT!!!
So ... whatever is going on ... vis is doing a very poor job ...
Imagine how many entities are drawn per frame with vis doing that? 500? 1000? I don't know, but frustum culling cuts some of that down.
So I wouldn't approach this assuming that the maps are optimal, the mod is optimal from a performance standpoint, and the Quake tools are doing a good job.
Arcane Dimensions is a great experience, but there is plenty of room for performance.
#2656 posted by Baker on 2017/01/21 12:45:13
2nd opinion on "r_lockpvs 1" using DirectQ, just to make sure.
http://quakeone.com/markv/media/ad_mountain_r_lockpvs_directq.png
And let's throw ARWOP Roman1 into the mix which isn't Arcane Dimensions ....
http://quakeone.com/markv/media/arwop_roman1_r_lockpvs.png
^^^ With vis generating results like the above, well --- let's just say the wide open outdoors map thing sure doesn't vis very well.
Vis On Big Maps
Is a struggle.
In the old days mappers used water volumes to control vis as they were vis blockers. I asked ericw to add a vis brush to perform this task instead when compiling. Don't think it was ever implemented.
#2658 posted by Baker on 2017/01/21 13:13:52
Vis doesn't do its job all that well, not even on many of the original Quake maps.
But sometimes it does.
But the real problem is even more insidious.
Vis is used to determine what entities Quake thinks you can see!
If it is drawing damn near the whole map, it's also drawing all those spritey particles and perhaps most of the monsters in the map.
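As a rough illustration of how vis feeds into entity visibility: the PVS is a bit vector with one bit per leaf, and an entity is treated as potentially visible if any leaf it touches has its bit set. The sketch below assumes an already decompressed PVS; the function names are hypothetical and not taken from any particular engine.

```c
#include <stdint.h>

/* Test one leaf against a decompressed PVS bit vector:
 * one bit per leaf, eight leaves per byte. */
int leaf_in_pvs(const uint8_t *pvs, int leafnum)
{
    return (pvs[leafnum >> 3] >> (leafnum & 7)) & 1;
}

/* An entity can touch several leaves. It counts as potentially
 * visible if any one of those leaves is in the viewer's PVS. */
int entity_in_pvs(const uint8_t *pvs, const int *leafnums, int num_leafs)
{
    for (int i = 0; i < num_leafs; i++) {
        if (leaf_in_pvs(pvs, leafnums[i]))
            return 1;
    }
    return 0;
}
```

If the PVS marks nearly every leaf as visible, this test passes for nearly every entity, which is why a poor vis result drags the sprites and monsters along with the geometry.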
@mh ----
#2659 posted by Baker on 2017/01/21 13:20:34
In Mark V, if I type sv_cullentities 2 in the console.
I get a significant frames per second boost. 15-20% on ad_mountain.
sv_cullentities 2 is a very strict "is the entity visible" check on the server side against all entities. It is otherwise known as "anti-wallhack".
It's also somewhat CPU expensive, but I guess it is saving several hundred entities from being drawn at all so it is paying off here.
That Ad_mountain Pic
#2660 posted by Kinn on 2017/01/21 13:25:06
Could that be detail brush shenanigans? I remember a discussion once where it was concluded that detail brushes can sometimes mess up the PVS if they totally cover a world face.
Baker
Try it in id1 Zendar and then ad_zendar?
I recall sock doing some hint brush fuckery, maybe that's the culprit.
#2662 posted by mh on 2017/01/21 14:33:54
I wouldn't read too much into ARWOP behaviour, ARWOP is very much a special case.
The AD screenshots definitely indicate that something different is going on too. I'm going to do a bunch of testing on that and see can I figure what it is.
Re: VIS I'm amazed that no Q1 VIS tool has implemented area portals. Being able to lob off huge chunks of geometry just by closing a door is a win. I'd love to see Q1 VIS moving to legacy; engines can still load & use it for compat but let's standardize on Q2 VIS for the future.
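The area portal idea can be sketched as a flood fill over areas joined by open portals: anything not reached from the viewer's area is dropped wholesale. This is only a simplified illustration of the concept, not Quake 2's actual implementation; `MAX_AREAS`, the adjacency matrix layout, and the function name are assumptions made for the sketch.

```c
#include <string.h>

#define MAX_AREAS 8   /* hypothetical limit for this sketch */

/* portal_open[a][b] != 0 means areas a and b are joined by a
 * portal that is currently open (for example, an open door).
 * On return, visible[a] is 1 for every area reachable from
 * 'start' through open portals, and 0 otherwise. */
void flood_visible_areas(int portal_open[MAX_AREAS][MAX_AREAS],
                         int num_areas, int start,
                         int visible[MAX_AREAS])
{
    int stack[MAX_AREAS];
    int top = 0;

    memset(visible, 0, sizeof(int) * MAX_AREAS);
    visible[start] = 1;
    stack[top++] = start;

    /* Each area is marked before it is pushed, so it is pushed
     * at most once and the stack cannot overflow. */
    while (top > 0) {
        int a = stack[--top];
        for (int b = 0; b < num_areas; b++) {
            if (portal_open[a][b] && !visible[b]) {
                visible[b] = 1;
                stack[top++] = b;
            }
        }
    }
}
```

Closing a door just clears the matching matrix entries and reruns the flood, so the areas behind it stop being considered at all until the door opens again.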
Hypothetical Question
#2663 posted by Kinn on 2017/01/21 15:26:29
is there a different culling method that could potentially be written into a quake engine that would be better suited to big open maps?
Vising Is Often On The Mapper
designing as intelligently as possible. Don't make giant box maps (something I need to work on myself).
However it would be nice to have some extra toys to work with.
#2665 posted by mh on 2017/01/21 17:02:28
Big open maps and culling are not the problem. The current culling is perfectly capable of handling them, a minor optimization to R_RecursiveWorldNode makes it a little better (software Quake has the optimization, GL doesn't).
The problem is in the renderer and server processing. QuakeSpasm now has the necessary renderer work to be able to handle big scenes. It could go faster by bumping the hardware requirements to GL 3.x and implementing array textures for lightmaps, as well as putting in some instancing for MDLs, but that might be too intrusive a change to the codebase.
Server processing is a problem and QC is just too slow for large entity counts. That's the next tree that has to fall.