#2646 posted by mh on 2017/01/17 09:08:30
Just for reference, I've tested these builds on a more recent Intel 520 machine without measuring any difference either. This supports my guess that with DX10+ class hardware (where Microsoft forced the vendors to standardise on RGBA) it no longer matters.
A more interesting benchmark would be gl_flashblend 1 versus gl_flashblend 0. That would really help determine if it's dynamic light uploads.
#2647 posted by ericw on 2017/01/17 10:02:54
Thanks. For reference, build #1 I posted above (and QS releases) are using GL_RGBA / GL_UNSIGNED_BYTE.
--
Question for Shamblernaut:
Can you still reproduce the console retraction bug with sdl1.2? I just tried again and can't trigger it (though I remember seeing it like 1-2 years ago).
I tried in the default ubuntu 16.04 Unity desktop as well as the "ubuntu flashback (metacity)" one. Tried both windowed and fullscreen.
Further Comparison
#2648 posted by mh on 2017/01/17 10:43:54
This time on an Intel HD2000, which is closer in performance to @QuakePone's, but still no measurable difference.
Shamblernaut
#2649 posted by ericw on 2017/01/17 23:48:34
You can disregard my question, managed to reproduce the console retracting bug
I Didn't Even See The Question
That's what I get for skim reading.
As an aside, is there any particular reason for console commands being case sensitive? I understand that filenames require it, especially in *nix based systems. Is this just a legacy code thing or are some commands actually case sensitive?
#2651 posted by Spike on 2017/01/18 10:32:28
in certain cases, yes, especially if you're exposing them to qc.
there's no good reason for case-sensitive tab completion though.
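A minimal sketch of what case-insensitive tab completion could look like, assuming a flat list of command names. The helper names (prefix_match_nocase, complete_command) are made up for illustration and are not taken from any engine:

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if `partial` is a case-insensitive prefix of `name`, else 0. */
static int prefix_match_nocase(const char *partial, const char *name)
{
    while (*partial) {
        if (*name == '\0')
            return 0; /* partial is longer than name */
        if (tolower((unsigned char)*partial) != tolower((unsigned char)*name))
            return 0;
        partial++;
        name++;
    }
    return 1;
}

/* Hypothetical completion helper: returns the first name in the list that
   matches `partial` ignoring case, or NULL if nothing matches. */
static const char *complete_command(const char *partial,
                                    const char *const *names, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        if (prefix_match_nocase(partial, names[i]))
            return names[i];
    return NULL;
}
```

Only the completion lookup ignores case here; the name it returns keeps its original spelling, so commands that really are case sensitive (e.g. ones exposed to qc) would still be executed exactly as registered.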
I Honestly Can't Even
#2652 posted by QuakePone on 2017/01/21 05:46:09
I did test ad_start first by doing nothing and then b-hopping around until I caught those frame drops in each version. The more I tested, the fewer times I encountered them, and I usually got different results every time I restarted the map in 92.1 before this.
FPS go around 27 but in 92.1 I sometimes got the framedrops during late tests. It's very inconsistent.
You can't really take my word here, I am very confused about all of this and in the end I will probably need an engine that takes more advantage of unused GPU power. Sorry, I can't be of much help here.
#2653 posted by ericw on 2017/01/21 06:34:16
Ok, thanks for checking anyway. It sounds like the GL_UNSIGNED_INT_8_8_8_8_REV thing is no longer an issue.
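For anyone following along, a rough sketch of how the two pixel layouts differ in memory. This assumes the alternative being tested pairs GL_UNSIGNED_INT_8_8_8_8_REV with GL_BGRA (the usual combination, but an assumption here); the function names are illustrative only and no GL calls are made:

```c
#include <stdint.h>

/* One pixel as GL_BGRA + GL_UNSIGNED_INT_8_8_8_8_REV expects it: a single
   32-bit value with blue in the lowest byte and alpha in the highest. */
static uint32_t pack_bgra_8888_rev(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
           ((uint32_t)g << 8) | (uint32_t)b;
}

/* One pixel as GL_RGBA + GL_UNSIGNED_BYTE expects it: four separate bytes
   in memory order R, G, B, A, independent of CPU endianness. */
static void store_rgba_bytes(uint8_t *dst,
                             uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    dst[0] = r;
    dst[1] = g;
    dst[2] = b;
    dst[3] = a;
}
```

On a little-endian CPU the packed value lands in memory as B, G, R, A. The old argument was that this matched what some hardware stored natively, so the driver could skip a swizzle on upload.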
My laptop with Intel HD 4400 is faster, but QS doesn't exactly fly on content like AD; some time I will have to look more closely at whether I get the same speed increase when switching to MarkV DX9.
AD Content
#2654 posted by mh on 2017/01/21 11:21:02
This isn't something I've sat down and analyzed. I'm just making general observations here.
Even with the fastest lightmap updates in the world, you'll still bottleneck on lightmap updates. In RMQ we had a scene with 20+ flickering candles; every lightmap in the scene would get updated and it would pull the NV 280 (I think) machine I was using at the time down to 20 fps.
GPU lightmaps are the future. Dynamics still have a cost but lightstyles are completely free.
FitzQuake's brush model renderer sucks. If you were to be actively malicious and deliberately set out to write a bad brush model renderer you might do worse but I doubt it. Ideally brush models should go through the same renderer as the world. A map with lots of brush models is always going to run slow.
FitzQuake's sky renderer sucks. If you were to be actively malicious and deliberately set out to write a bad sky renderer you might do worse but I doubt it. I had one of those "sweet baby Jesus" moments when I analyzed a scene in PIX and saw that 80% of the time was spent drawing sky. Any scene with sky in it is going to bog down on slower hardware.
What both of those have in common is per-polygon state changes. In the absence of batching in your own code the driver will make reasonable efforts to batch itself, but state changes will break that.
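To illustrate the state change point, here is a small sketch that counts how many texture binds a naive draw loop would issue before and after sorting surfaces by texture. surf_t and its fields are hypothetical stand-ins rather than FitzQuake's structures, and texture ids are assumed to be non-negative:

```c
#include <stdlib.h>

typedef struct {
    int texture;  /* hypothetical texture id, assumed >= 0 */
    int numverts;
} surf_t;

static int compare_by_texture(const void *a, const void *b)
{
    const surf_t *sa = a;
    const surf_t *sb = b;
    return (sa->texture > sb->texture) - (sa->texture < sb->texture);
}

/* Counts how many times a draw loop would have to switch texture
   if it walked the surfaces in the given order. */
static int count_texture_binds(const surf_t *surfs, int count)
{
    int i, binds = 0, current = -1;
    for (i = 0; i < count; i++) {
        if (surfs[i].texture != current) {
            current = surfs[i].texture;
            binds++;
        }
    }
    return binds;
}

/* Groups surfaces that share a texture so each texture is bound once. */
static void sort_surfaces_by_texture(surf_t *surfs, int count)
{
    qsort(surfs, (size_t)count, sizeof(surf_t), compare_by_texture);
}
```

With five surfaces alternating between two textures, the unsorted order needs five binds and the sorted order needs two. Everything between two binds can go out as one batch.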
There are certain items of "Quake lore", such as Quake can't do big scenes, sky is slow, disable multitexturing, and so on. They're not true. The problem is in the implementation, and they can all be solved.
@mh
#2655 posted by Baker on 2017/01/21 12:18:12
In Quakespasm, the brush models are drawn with the world, if I recall.
As a general note, since I don't know if you are aware of this -- Arcane Dimensions is a bit less optimal than some other single releases in a couple of departments:
1) First, it uses sprites as particles. So each little floaty pixel is an entity. You can turn this off by doing "temp1 0" and restarting a map, if I recall.
2) Second, you know static entities like torches? Arcane Dimensions doesn't do torches as static entities. And Arcane Dimensions uses a lot of torches.
I haven't checked, but combined with the fact that Arcane Dimensions maps tend to have tons of details and complex geometry and generally the maps tend to have 200-300 monsters and super-tons of items/books/etc ...
It's a bigger load than most maps tend to be.
And because of the torches and particles, I suspect it puts more pressure on the QuakeC interpreter too.
The end result is very cool and creative.
Also: I have never checked, but in complicated maps with huge entity counts -- how much is vis helping? Engines with r_lockvis you might not be able to tell ...
(If I "r_lockpvs 1" in Mark V, then noclip outside the start map it looks like this ...)
http://quakeone.com/markv/media/start_map_r_lockpvs.png
Now, mh, look at this ...
On ad_mountain ...
I type "r_lockpvs 1" and noclip outside map ....
http://quakeone.com/markv/media/ad_mountain_r_lockpvs.png <--- RED ALERT!!!
So ... whatever is going on ... vis is doing a very poor job ...
Imagine how many entities are drawn per frame with vis doing that? 500? 1000? I don't know, but frustum culling cuts some of that down.
So I wouldn't approach this assuming that the maps are optimal, the mod is optimal from a performance standpoint, and the Quake tools are doing a good job.
Arcane Dimensions is a great experience, but there is plenty of room for performance.
#2656 posted by Baker on 2017/01/21 12:45:13
2nd opinion on "r_lockpvs 1" using DirectQ, just to make sure.
http://quakeone.com/markv/media/ad_mountain_r_lockpvs_directq.png
And let's throw ARWOP Roman1 into the mix which isn't Arcane Dimensions ....
http://quakeone.com/markv/media/arwop_roman1_r_lockpvs.png
^^^ With vis generating results like the above, well --- let's just say the wide open outdoors map thing sure doesn't vis very well.
Vis On Big Maps
Is a struggle.
In the old days mappers used water volumes to control vis as they were vis blockers. I asked ericw to add a vis brush to perform this task instead when compiling. Don't think it was ever implemented.
#2658 posted by Baker on 2017/01/21 13:13:52
Vis doesn't do its job all that well, not even on many of the original Quake maps.
But sometimes it does.
But the real problem is even more insidious.
Vis is used to determine what entities Quake thinks you can see!
If it is drawing damn near the whole map, it's also drawing all those spritey particles and perhaps most of the monsters in the map.
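A simplified sketch of the kind of test involved, assuming the PVS is a decompressed bit vector with one bit per leaf and that each entity records the leafs it touches. The struct, the MAX_ENT_LEAFS value and the function names are illustrative, not copied from any particular engine:

```c
#include <stdint.h>

#define MAX_ENT_LEAFS 16 /* illustrative cap on leafs tracked per entity */

typedef struct {
    int num_leafs;
    int leafnums[MAX_ENT_LEAFS];
} ent_leafs_t;

/* Tests one leaf's bit in a decompressed PVS bit vector. */
static int leaf_in_pvs(const uint8_t *pvs, int leafnum)
{
    return (pvs[leafnum >> 3] & (1 << (leafnum & 7))) != 0;
}

/* An entity is sent to the client if any leaf it touches is in the PVS. */
static int entity_in_pvs(const uint8_t *pvs, const ent_leafs_t *ent)
{
    int i;
    for (i = 0; i < ent->num_leafs; i++)
        if (leaf_in_pvs(pvs, ent->leafnums[i]))
            return 1;
    return 0;
}
```

When vis leaves nearly every bit set, as in the ad_mountain screenshot, this test passes for almost every entity in the map, so they all get sent and drawn.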
@mh ----
#2659 posted by Baker on 2017/01/21 13:20:34
In Mark V, if I type sv_cullentities 2 in the console.
I get a significant frames per second boost. 15-20% on ad_mountain.
sv_cullentities 2 is a very strict "is the entity visible" check on the server side against all entities. It is otherwise known as "anti-wallhack".
It's also somewhat CPU-expensive, but I guess it is saving several hundred entities from being drawn at all so it is paying off here.
That Ad_mountain Pic
#2660 posted by Kinn on 2017/01/21 13:25:06
Could that be detail brush shenanigans? I remember a discussion once where it was concluded that detail brushes can sometimes mess up the PVS if they totally cover a world face.
Baker
Try it in id1 Zendar and then ad_zendar?
I recall sock doing some hint brush fuckery, maybe that's the culprit.
#2662 posted by mh on 2017/01/21 14:33:54
I wouldn't read too much into ARWOP behaviour, ARWOP is very much a special case.
The AD screenshots definitely indicate that something different is going on too. I'm going to do a bunch of testing on that and see can I figure what it is.
Re: VIS I'm amazed that no Q1 VIS tool has implemented area portals. Being able to lob off huge chunks of geometry just by closing a door is a win. I'd love to see Q1 VIS moving to legacy; engines can still load & use it for compat but let's standardize on Q2 VIS for the future.
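The general idea behind area portals can be sketched as a flood fill: the map is split into areas, doors own portals between areas, and anything in an area that can't be reached through open portals is skipped. This is a simplified illustration under those assumptions, not the Q2 code, and the names are hypothetical:

```c
typedef struct {
    int area_a, area_b; /* the two areas this portal connects */
    int open;           /* 1 while the door is open, 0 when closed */
} areaportal_t;

/* Marks reachable[i] = 1 for every area connected to `start` through
   open portals and 0 for the rest. `reachable` must hold num_areas ints. */
static void flood_areas(const areaportal_t *portals, int num_portals,
                        int start, int *reachable, int num_areas)
{
    int i, changed;

    for (i = 0; i < num_areas; i++)
        reachable[i] = 0;
    reachable[start] = 1;

    /* Repeat until a full pass over the portals adds no new area. */
    do {
        changed = 0;
        for (i = 0; i < num_portals; i++) {
            const areaportal_t *p = &portals[i];
            if (!p->open)
                continue;
            if (reachable[p->area_a] && !reachable[p->area_b]) {
                reachable[p->area_b] = 1;
                changed = 1;
            } else if (reachable[p->area_b] && !reachable[p->area_a]) {
                reachable[p->area_a] = 1;
                changed = 1;
            }
        }
    } while (changed);
}
```

With three areas in a row and the second door closed, area 2 comes back unreachable, so its geometry and entities can be dropped no matter what the PVS says.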
Hypothetical Question
#2663 posted by Kinn on 2017/01/21 15:26:29
is there a different culling method that could potentially be written into a quake engine that would be better suited to big open maps?
Vising Is Often On The Mapper
designing as intelligently as possible. Don't make giant box maps (something I need to work on myself).
However it would be nice to have some extra toys to work with.
#2665 posted by mh on 2017/01/21 17:02:28
Big open maps and culling are not the problem. The current culling is perfectly capable of handling them, a minor optimization to R_RecursiveWorldNode makes it a little better (software Quake has the optimization, GL doesn't).
The problem is in the renderer and server processing. QuakeSpasm now has the necessary renderer work to be able to handle big scenes. It could go faster by bumping the hardware requirements to GL 3.x and implementing array textures for lightmaps, as well as putting in some instancing for MDLs, but that might be too intrusive a change to the codebase.
Server processing is a problem and QC is just too slow for large entity counts. That's the next tree that has to fall.
Vis
#2666 posted by ericw on 2017/01/21 17:44:55
I agree with mh, it still works impressively well on big, open maps, and it matters a lot on AD maps. ad_tfuma was fastvised right up until release, because Fifth and I both assumed vis would take forever and not be able to optimize much, since it's more or less a giant box with some underground areas. Just before release, one of the testers complained of low fps, so I tried full-vising it - took only 20 minutes (8 threads), and the wpolys visible from the start position were cut in half (30k to 14k iirc).
QS's brush model renderer was the first renderer thing I changed (this was ~2014 for RRP, ijed's map had some complex bmodels), they're not merged with the world, but each is drawn using the world renderer.
QS uses the Fitz sky renderer unmodified. (same for liquids).
GPU lightmaps: blending the 4 lightmaps per face sounds like a straightforward good idea. I'm wondering if it will be difficult to do the moving dynamic lights from rockets/etc. IMHO they need to be kept at the low-res of the lightmaps - when I was experimenting with high res lightmaps and LIT2, I remember noticing that rocket trails really stuck out like a sore thumb when they were over a high-res lightmap. @mh did you do any experiments with this part of GPU lightmaps?
GPU Lightmaps
#2667 posted by mh on 2017/01/21 17:59:19
I draw dynamics as additive blended extra passes over the scene. Full per-pixel quality. TBH it doesn't bother me but maybe this is one of those things that annoys different people in different ways.
I guess you could use a low res attenuation map texture but I just do the maths.
Optimizations: Eric Lengyel's scissor test optimization is mostly designed for shadows but helps here too. Frustum culling the dlight. Only light surfaces that were drawn in the main pass.
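Frustum culling a dlight amounts to a sphere-versus-planes test. A minimal sketch, assuming frustum planes whose normals point into the view volume; plane_t and cull_sphere are illustrative names here, not necessarily what any engine calls them:

```c
typedef struct {
    float normal[3]; /* unit normal, pointing into the view volume */
    float dist;      /* plane equation: dot(normal, p) - dist = 0 */
} plane_t;

static float dot3(const float *a, const float *b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* Returns 1 if the sphere lies entirely behind at least one frustum plane
   (the light can be skipped), 0 if it may touch the view volume. */
static int cull_sphere(const plane_t *frustum, int num_planes,
                       const float *origin, float radius)
{
    int i;
    for (i = 0; i < num_planes; i++)
        if (dot3(origin, frustum[i].normal) - frustum[i].dist <= -radius)
            return 1;
    return 0;
}
```

The test is conservative: a light whose sphere straddles a plane is kept, so nothing visible gets dropped.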
I also add dynamics to BSP brush models using this scheme.
Light styles: I actually use 3 textures, one for each of R, G and B and encode the styles into the color channels. Animation is a 4-component DotProduct per channel and combine them to the final color.
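A CPU-side sketch of that arithmetic, as one reading of the scheme: each of the three textures holds one color's contribution from the face's four light styles in its four channels, and the current style values act as weights. In practice this would run in a shader; the types and names below are illustrative:

```c
typedef struct {
    float s[4]; /* one value per light style slot on the face */
} vec4_t;

static float dot4(const vec4_t *a, const vec4_t *b)
{
    return a->s[0] * b->s[0] + a->s[1] * b->s[1] +
           a->s[2] * b->s[2] + a->s[3] * b->s[3];
}

/* red_tex, green_tex and blue_tex are the texels sampled from the three
   lightmap textures; style_scale holds the current animated value of each
   of the face's four styles. One dot product per color channel. */
static void combine_styles(const vec4_t *red_tex, const vec4_t *green_tex,
                           const vec4_t *blue_tex, const vec4_t *style_scale,
                           float out_rgb[3])
{
    out_rgb[0] = dot4(red_tex, style_scale);
    out_rgb[1] = dot4(green_tex, style_scale);
    out_rgb[2] = dot4(blue_tex, style_scale);
}
```

Animating a style then only means changing the four style_scale values per frame. No lightmap texels are touched, which is why lightstyles come out free.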
@mh - Just So You Have All The Information.
#2668 posted by Baker on 2017/01/21 21:26:33
Another side effect, not related to rendering, of the vis problems combined with spritey particles is huge demo sizes.
As you know, the vis information is used to determine what entities the server side thinks are visible to you.
If you record a demo in most of the Arcane Dimensions maps, the demo is going to be massive after 30 minutes of play (600 MB).
I ended up disabling the autodemo feature by default in Mark V because recording a demo in Arcane Dimensions became a performance issue ...
... Which users perceive as "this engine is choppy", hehe.
So anyway, the full deck of cards.
The very nice thing about Arcane Dimensions is that it is open source. Sock is #1 in my book because he's always been open source.
Which also means it is possible for new optimizations to theoretically be written eventually to shift some burdens from the QuakeC to the engine (particles).
Arcane Dimensions merely exposes things that the map compile tools and the engine particle systems could handle better.
Ad+demos
#2669 posted by Spike on 2017/01/22 00:00:27
particles - qss uses fte's particle system to support the effectinfo stuff that sock used in ad, as a result qss should have much more sane demo sizes with ad 1.5.
choppy - use a thread to perform the disk writes, then you don't get stalls from flushing happening on the main thread.
making that change reportedly solved multi-second stalls with fte on linux.
size - gzip it as you go (you don't rewind, right?...).
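A rough sketch of the threaded write idea using POSIX threads: the main thread copies each demo chunk into a small queue and a worker thread does the actual fwrite. This is not how fte does it. The demo_writer_* names, queue size and chunk size are all made up for illustration:

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define QUEUE_SLOTS 64
#define CHUNK_SIZE 1024

typedef struct {
    char data[QUEUE_SLOTS][CHUNK_SIZE];
    size_t len[QUEUE_SLOTS];
    int head, tail, count; /* ring buffer state, guarded by lock */
    int done;              /* set when no more chunks will arrive */
    FILE *file;
    pthread_mutex_t lock;
    pthread_cond_t changed;
    pthread_t thread;
} demo_writer_t;

/* Worker thread: pops chunks and writes them, so any stall from flushing
   happens here rather than on the main thread. */
static void *demo_writer_main(void *arg)
{
    demo_writer_t *w = arg;
    char chunk[CHUNK_SIZE];
    size_t len;

    for (;;) {
        pthread_mutex_lock(&w->lock);
        while (w->count == 0 && !w->done)
            pthread_cond_wait(&w->changed, &w->lock);
        if (w->count == 0) { /* done and fully drained */
            pthread_mutex_unlock(&w->lock);
            break;
        }
        len = w->len[w->tail];
        memcpy(chunk, w->data[w->tail], len);
        w->tail = (w->tail + 1) % QUEUE_SLOTS;
        w->count--;
        pthread_cond_broadcast(&w->changed);
        pthread_mutex_unlock(&w->lock);
        fwrite(chunk, 1, len, w->file); /* slow part, outside the lock */
    }
    return NULL;
}

static int demo_writer_start(demo_writer_t *w, const char *path)
{
    memset(w, 0, sizeof(*w));
    w->file = fopen(path, "wb");
    if (!w->file)
        return -1;
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->changed, NULL);
    if (pthread_create(&w->thread, NULL, demo_writer_main, w) != 0) {
        fclose(w->file);
        return -1;
    }
    return 0;
}

/* Called from the main thread. Blocks only when the queue is full. */
static int demo_writer_push(demo_writer_t *w, const void *data, size_t len)
{
    if (len > CHUNK_SIZE)
        return -1; /* sketch only: caller must split larger messages */
    pthread_mutex_lock(&w->lock);
    while (w->count == QUEUE_SLOTS)
        pthread_cond_wait(&w->changed, &w->lock);
    memcpy(w->data[w->head], data, len);
    w->len[w->head] = len;
    w->head = (w->head + 1) % QUEUE_SLOTS;
    w->count++;
    pthread_cond_broadcast(&w->changed);
    pthread_mutex_unlock(&w->lock);
    return 0;
}

/* Signals the worker, waits for it to drain the queue, closes the file. */
static void demo_writer_stop(demo_writer_t *w)
{
    pthread_mutex_lock(&w->lock);
    w->done = 1;
    pthread_cond_broadcast(&w->changed);
    pthread_mutex_unlock(&w->lock);
    pthread_join(w->thread, NULL);
    fclose(w->file);
    pthread_cond_destroy(&w->changed);
    pthread_mutex_destroy(&w->lock);
}
```

Since the demo is only ever appended to, the same worker would also be a natural place to compress chunks before writing, which is the "gzip it as you go" point.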
#2669
#2670 posted by topher on 2017/01/22 00:16:09
damn
the demo that i recorded for ad_magna is 2gbytes
i checked with r_showtris 1 and no particles-entities, the demos should be smaller