 @spike
#617 posted by Baker on 2016/12/07 03:05:57
Yeah, eventually I do have something like a "surface effect texture" planned in my head for possible surface effects.
Might as well ask your thoughts on this question ...
Although not soon, I would probably like to use 4-8 QME 3.1 bit flag slots, and would like to avoid any possible conflict with what FTE uses.
One example might be to indicate additive blending.
/I have not put much thought into this lately, but while discussion about future mapping enhancements ensues ... it's a fairly good time to bring up those thoughts.
 Modelflags
#618 posted by Spike on 2016/12/07 03:42:25
hexen2 defines modelflags up to and including 23.
1<<24 is the first undefined one as far as fte is concerned.
Not that fte implements all of the hexen2 ones, or that I'm even aware of any models that use them... but hey.
that said, for additive, you're probably better off sticking with EF_ADDITIVE=(1<<5). Yes, this can be used by mappers, and does not necessitate any wire change (read: the protocol uses a previously-unused bit, which permits backwards compatibility on the assumption that old implementations can safely ignore it).
maybe some of the other bits you're thinking of are similarly already implemented in another engine.
 Intel Video Identification Bug
#619 posted by mh on 2016/12/07 18:07:12
if (!strcmp(gl_vendor, "Intel"))
I see that Mark V has inherited this from the original code too.
The D3D equivalent (which is read from the driver, so I don't have control over it) is actually "Intel(R) HD Graphics", so this test will incorrectly fail to identify it.
Strictly speaking this is also a bug in the GL case because Intel may change their vendor string.
Change it to:
if (strstr(gl_vendor, "Intel"))
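A minimal sketch of why the substring test is the right one (the helper name and vendor strings here are illustrative, not the engine's actual code):

```c
#include <string.h>

/* strcmp only matches the exact string "Intel"; strstr matches any
   vendor string that contains it, which also covers the D3D-style
   "Intel(R) HD Graphics" form and future Intel vendor strings. */
static int is_intel_vendor (const char *vendor)
{
    return strstr (vendor, "Intel") != NULL;
}
```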
#620 posted by mh on 2016/12/07 20:54:15
dx8_mark_v -width 750 -height 550
= borked up screenshots
Fixed.
This is actually a wrapper bug, so my apologies for my previous misdiagnosis.
@Baker: in the glReadPixels implementation, "case GL_RGB" should also be using "srcdata[srcpos * 4..." because the source data will be 4-byte, even if the dest is 3.
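The fix mh describes can be sketched like this (names are illustrative, not the wrapper's actual variables): the source surface is always 4 bytes per pixel, so the source index must be scaled by 4 even when the destination is 3-byte RGB.

```c
#include <stddef.h>

/* Sketch of the corrected GL_RGB readback path: source pixels are
   4 bytes (RGBA-style), destination pixels are 3 bytes (RGB), and the
   alpha byte is simply dropped. */
static void copy_rgba_to_rgb (const unsigned char *srcdata, unsigned char *dstdata, size_t numpixels)
{
    for (size_t i = 0; i < numpixels; i++)
    {
        dstdata[i * 3 + 0] = srcdata[i * 4 + 0];
        dstdata[i * 3 + 1] = srcdata[i * 4 + 1];
        dstdata[i * 3 + 2] = srcdata[i * 4 + 2];
        /* srcdata[i * 4 + 3] is alpha and is not copied */
    }
}
```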
I may have mentioned a while back that there are advantages to going native, and screenshots are one such.
Baker has implemented PNG screenshots, so in the native GL code he's doing a glReadPixels, then converting the memory buffer to PNG format (presumably via a statically linked libpng or maybe stb_image) and saving it out.
A native D3D version wouldn't do half of that work. Instead it could just use D3DXSaveSurfaceToFile on the backbuffer. Give it a filename, specify the type, boom, done, three lines of code.
I'm going to add some native exports to the new wrapper for functionality such as this.
#621 posted by Gunter on 2016/12/07 21:36:10
The DX stuff mh is doing sounds really great.
I don't understand most of it, but I hear things like "improved performance." And bug fixes are always good.
And if I'm getting the gist of things regarding more rendering passes, this might allow some issues to be addressed:
Fullbright textures should not be affected by contrast -- it makes them ugly.
Screen Blends should not be affected by gamma or contrast -- I have found this is the main thing that makes them far too intense. When I use vid_hardwaregamma 0 and various values for txgamma, the screen blends look perfect, though if I mess with the contrast slider, it makes the blends too intense again.
So yeah, if that's a possibility, the screen blends should be drawn last so they are not affected by gamma or contrast.
But I have no real understanding of this low-level 3D rendering stuff. Though it sounds like there will be a great benefit from mh's work.
 @mh
#622 posted by Baker on 2016/12/07 21:49:24
Mark V applies gamma and contrast to screenshots (only) when applicable.
For instance
1) If you are using hardware gamma/contrast, it will adjust the screenshot accordingly.
2) If you are not using hardware gamma/contrast, it will not apply it to the screenshot.
So depending on situation, writing directly to file is not necessarily desirable because screenshots could be, for instance, too dark/etc.
#623 posted by Gunter on 2016/12/07 23:49:46
Hm, there's an issue that looks similar to the previous QMB lighting texture thing with txgamma, but it appears whether or not txgamma is being used, wherever there is fog.
I first noticed it at very long range when I was zooming in, because I use very light fog, but if you set the fog much denser, like .5 and then fire the lightning gun, you will see the bad effect.
#624 posted by mh on 2016/12/08 00:14:29
Mark V applies gamma and contrast to screenshots (only) when applicable.
Ahhh, understood.
By the way, here's a sneak preview of something I just did: http://i64.tinypic.com/o6flm1.jpg
This was actually taken with the GL version, just to demonstrate that there's no shader trickery or suchlike going on.
#626 posted by dwere on 2016/12/08 04:52:19
Mark V seems to struggle with complex maps that are smooth in QS. On my questionable system, I mean.
jam2_mfx.bsp is a good example. Right at the start, looking towards the palace produces a very noticeable slowdown.
Fitzquake v0.85 also has this problem.
 @dwere - Vertex Arrays/ Water Warp/ IPad Version ...
#627 posted by Baker on 2016/12/08 05:31:11
@mh - Haha, your water warp. I suspected you would do that ;-)
---------------
@dwere
Especially on older hardware, vertex arrays help achieve a more reliable 72 frames per second in the BSP2 era.
I hadn't implemented them yet because there were many other things on the to-do list. I'm prototyping an iPad/iPhone version, which uses OpenGL ES, which requires vertex arrays, so I actually have to implement vertex arrays here in an hour or so. I'm still sticking with "that's it for 2016", but version 1.3 will have it.
---
@gunter - You are probably right on blending. I'm hoping that MH will provide HLSL shader option for gamma/contrast in DX9 ... everyone has their wish list ;-) btw ... I still hate your computer, but I sure appreciate all the compatibility testing it has helped provide.
iPhone/iPad - 2017
But I'm making a prototype iPad/iPhone version right now, which will have controls similar to Minecraft on the iPad, which is very playable.
(Android is more of a pain because of its very crude development tools, which are like banging rocks together to make fire. iPhone development tools have always been very nice.)
/Any new builds will be in 2017, and I'm unsure where an iPhone/iPad version falls in priority. Now that I have a stable release and zero issues outstanding, playing around is a bit more leisurely. May upload a video later tonight after I get it initially running ...
#628 posted by mh on 2016/12/08 07:35:12
I couldn't not do water warp.
Performance. Vertex arrays help with big maps, but they're only part of the story. What really helps is draw call batching, and vertex arrays are just a tool that allows you to do batching.
glBegin/glEnd code is fine for ID1 maps but as maps get bigger this kind of code gets slower and slower.
Toss in a few dynamic lights and/or animated lightstyles (which stock Fitz also handles very poorly) and it gets worse.
Batching is all about taking a big chunk of work and submitting it in a single large draw call, instead of several hundred tiny draw calls. Each draw call has a fixed overhead, irrespective of how much work it does, and that overhead is typically on the CPU. So if you have, say, 1000 surfaces all of which have the same texture and lightmap, drawing them with a single call will have 1/1000 the CPU overhead that drawing them with 1000 calls would.
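The 1000-surfaces arithmetic above can be sketched as a toy model (texture "ids" stand in for real GL state; the function name is mine, not the engine's): sorting surfaces by texture and merging consecutive same-texture runs collapses hundreds of draw calls into a handful.

```c
#include <stddef.h>

/* Toy sketch of draw-call batching: given surfaces already sorted by
   texture, a new draw call is needed only when the texture (i.e. the
   bound state) changes between consecutive surfaces.  Returns how many
   draw calls would be issued. */
static int count_draw_calls (const int *surface_textures, size_t numsurfaces)
{
    int calls = 0;
    for (size_t i = 0; i < numsurfaces; i++)
    {
        if (i == 0 || surface_textures[i] != surface_textures[i - 1])
            calls++; /* state change: flush the batch, start a new draw */
    }
    return calls;
}
```

With 1000 surfaces sharing one texture this returns 1, which is exactly the 1/1000 CPU-overhead reduction described above.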
Stock Fitz would also benefit from lightmap update batching. Again it's the difference between "few large updates" versus "lots of tiny updates" with the former being more efficient. Stock Fitz also uses GL_RGB which compounds the problem by forcing a format conversion in the driver. This stuff is mostly hidden on modern hardware, but you can still find devices (and some scenes in maps) where you get an unexpected nasty surprise.
Ironically, one way to make stock Fitz run faster can be to disable multitexture. Try it - pick a scene where it gets the herky-jerkys, then compare with running with -nomtex. This will cause it to have fewer state changes between draw calls, so the driver can optimize better, as well as batch up its lightmap updates (for the world, not bmodels, which are still slow). Depending on the scene, the extra passes might turn out to be a lower workload.
If the engine itself implemented all of this batching then running with -nomtex would not be necessary.
The D3D9 wrapper takes the supplied glBegin/glEnd calls and attempts to be somewhat aggressive about batching them up. It converts everything to indexed triangle lists and concatenates multiple draw calls that don't have state or texture changes between them. It also attempts to merge multiple lightmap updates.
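The quad-to-indexed-triangle-list conversion mentioned above can be sketched as follows (a simplification: the real wrapper also tracks state changes and vertex data; the function name is mine). Each quad of vertices {0,1,2,3} becomes the two triangles {0,1,2} and {0,2,3}, so n quads yield 6n indices that can all be submitted in one draw.

```c
#include <stddef.h>

/* Convert n quads (as submitted via glBegin(GL_QUADS)) into an indexed
   triangle list.  "indices" must have room for numquads * 6 entries.
   Returns the number of indices written. */
static size_t quads_to_triangle_indices (size_t numquads, unsigned short *indices)
{
    size_t n = 0;
    for (size_t q = 0; q < numquads; q++)
    {
        unsigned short first = (unsigned short) (q * 4);
        indices[n++] = first; indices[n++] = first + 1; indices[n++] = first + 2;
        indices[n++] = first; indices[n++] = first + 2; indices[n++] = first + 3;
    }
    return n;
}
```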
None of this is as efficient as if the engine did it itself, of course. Going into the actual GL code and doing it right from the get-go is always going to give the best results.
 Warping The Water
Well, I wouldn't care whether "authentic" waterwarp is implemented into the DirectX or rather OpenGL build. But the one that would get it would be my personal default. :P
 @Johnny Law
#630 posted by Baker on 2016/12/09 10:44:21
I'm sizing up Requiem to see what unique things it adds for likely addition ...
I know it can create items (interesting idea), for instance. jdhack had some interesting ideas in there.
A question for you, if you know ...
I can't get Requiem to run on Linux, it says "libGL.so.1 not found". Engines like ezQuake run fine for me on Ubuntu Linux or even super old FitzQuake 0.80 SDL. Could it possibly be expecting a 32-bit .so ?
If you happen to know ...
 @nightFright
#631 posted by mh on 2016/12/09 11:04:32
Well, I wouldn't care whether "authentic" waterwarp is implemented into the DirectX or rather OpenGL build. But the one that would get it would be my personal default.
What if both were able to get it? :)
 The Eternal Conflict
That would mean you and Baker found a way to solve your epic "conflict" regarding its implementation? Sounds like a great X-Mas gift to me, actually...!
#633 posted by Baker on 2016/12/09 14:36:12
I wouldn't call it a conflict, hehe.
The DirectX version implementing DirectX features is just natural.
The OpenGL version remaining at 1.2 for broad hardware compatibility is not something very bloody likely to stop MH.
To say MH is good at rendering is like saying Isaac Newton was good at calculus or that Einstein was pretty okay at physics ;-)
 About MH ...
#634 posted by Baker on 2016/12/09 14:40:12
There's assembly language in his shaders in the RMQ engine.
 @Baker - Gamma And Contrast
#635 posted by mh on 2016/12/09 15:11:31
Currently going through the Mark V code to figure out how it implements gamma and contrast in the GL renderer.
To be honest, I see absolutely nothing in either that wouldn't work in D3D right now; even version 8.
Gamma just sets adjusted gamma ramps, which will also work in D3D. D3D does have its own gamma functions too, but you're better off using the native Windows SetDeviceGammaRamp/GetDeviceGammaRamp stuff (in particular, the D3D functions don't work at all in windowed modes, whereas the native functions do).
Contrast is just a load of blended quads over the screen. The only thing I see in there that may be a problem with D3D8 is commented-out code enabling scissor test. D3D8 doesn't have scissor test but D3D9 does.
For D3D9 I'm going to do something different.
I'm going to do shader-based gamma and contrast using render-to-texture. This is achievable with D3D shader model 2 shaders, broadly equivalent to OpenGL 1.5 with ARB assembly shaders so compatibility remains good. It will enable gamma and contrast in windowed modes without affecting the entire display, and it won't require gamma-and-contrast-adjusting screenshots.
The interface will be 1 function: BOOL D3D_SetupGammaAndContrast (float gamma, float contrast), which it will be appropriate to call in GL_BeginRendering. Returns TRUE if it was able to do its thing (or if gamma and contrast are both 1), FALSE otherwise, in which case you need to decide on a fallback - either route it through the GL codepath (which should also work) or do nothing. Everything else will be automagic.
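The fallback logic described above might look something like this on the caller's side (D3D_SetupGammaAndContrast is stubbed out here; the real one would live in the wrapper, and the surrounding names are mine, not Mark V's actual code):

```c
/* Sketch of the proposed GL_BeginRendering hookup: try the shader path
   first, and fall back to the existing GL-style gamma/contrast path if
   it reports failure. */
typedef int BOOL;
#define TRUE  1
#define FALSE 0

/* Stub standing in for the wrapper export; here it pretends shader
   model 2 is unavailable so the fallback path is exercised. */
static BOOL D3D_SetupGammaAndContrast (float gamma, float contrast)
{
    (void) gamma; (void) contrast;
    return FALSE;
}

static int used_gl_fallback;

static void BeginRendering_GammaSetup (float gamma, float contrast)
{
    if (!D3D_SetupGammaAndContrast (gamma, contrast))
        used_gl_fallback = 1; /* route through the GL codepath instead */
}
```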
#636 posted by Baker on 2016/12/09 16:59:39
Likely future scheme, not that this would have any impact on coding anyway ...
vid_hardwaregamma (*) - following the FitzQuakian cvar scheme that I rather like such as r_lerpmodels 0 (off), 1 (best), 2 (always)...
vid_hardwaregamma (or whatever name becomes)
0 - Never. Use best available non-hardware method
1 - Windowed mode uses non-hardware method (looks better on desktop), fullscreen uses hardware method (faster and hardware method is also brighter, some displays tend towards the dark side no matter what without hardware gamma). Default.
2 - Hardware method always.
(*) bad name because also does contrast?
 GL_RGBA
#637 posted by Baker on 2016/12/09 17:28:03
Also: you may notice the code is biased towards GL_RGBA and I have to switch the bytes around to BGRA for various operations. It isn't actually an oversight or an inefficiency I didn't correct, but rather that OpenGL ES only has GL_RGBA.
Just wanted to point that out because I know you may see that code and think "This is so wrong."
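The swizzle in question is just a red/blue byte swap per pixel, something like this (the helper name is mine; Mark V's actual code may do it during another pass):

```c
#include <stddef.h>

/* Swap RGBA pixel data to BGRA in place: bytes 0 (red) and 2 (blue)
   trade places; green (byte 1) and alpha (byte 3) stay put. */
static void rgba_to_bgra_inplace (unsigned char *pixels, size_t numpixels)
{
    for (size_t i = 0; i < numpixels; i++)
    {
        unsigned char tmp = pixels[i * 4 + 0];
        pixels[i * 4 + 0] = pixels[i * 4 + 2];
        pixels[i * 4 + 2] = tmp;
    }
}
```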
/The video/input/system code long ago was entirely rewritten in a way to support devices. In some of the files, there is living device code from back in 2014.
 "libGL.so.1 Not Found"
#638 posted by Joel B on 2016/12/09 20:48:24
Hmm.
So, my reQuiem test builds were done on CentOS 6.4. Looking at that setup now, ldd says that my reQuiem-debug.glx executable is using /usr/lib64/libGL.so.1.
rpm -qf on that file shows that it came from the package mesa-libGL-9.0-0.8.el6_4.3.x86_64
I don't remember now if that was something that I explicitly installed for reQuiem's benefit.
#639 posted by mh on 2016/12/10 04:51:35
Also: you may notice the code is biased towards GL_RGBA and I have to switch the bytes around to BGRA for various operations. It isn't actually an oversight or an inefficiency I didn't correct, but rather that OpenGL ES only has GL_RGBA.
This doesn't actually matter at all in D3D aside from getting the byte ordering right, because you're writing the data directly to the texture rather than relying on the driver to not screw up.
#640 posted by Spike on 2016/12/10 10:28:31
writing device memory using bytes instead of a cacheline-aligned memcpy will be slower, but whatever. modern apis just have you write it into ram and have the gpu itself move it onto the gpu's memory so there's no issues with uncached memory or whatever.
either way, d3d10+(eg gl3.0)/vulkan hardware has proper RGBA texture uploads so its not like modern gpus care. older gpus/apis will still need some sort of conversion but its okay to be lazy and submit only the lightmaps as bgra. streaming is the only time it really matters. oh noes! loading took an extra blink's duration! *shrug*
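Spike's "cacheline-aligned memcpy" point can be sketched like this (names are mine; "pitch" is the destination row stride in bytes, which on real hardware may be larger than width * bytes-per-pixel): one memcpy per row instead of per-byte writes into (potentially uncached) device memory.

```c
#include <string.h>
#include <stddef.h>

/* Copy a rectangle of pixel data into a locked texture one row at a
   time.  Both buffers are addressed by their own row pitch, so the
   destination may have padding at the end of each row. */
static void upload_rect (unsigned char *dst, size_t dstpitch,
                         const unsigned char *src, size_t srcpitch,
                         size_t widthbytes, size_t height)
{
    for (size_t y = 0; y < height; y++)
        memcpy (dst + y * dstpitch, src + y * srcpitch, widthbytes);
}
```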
 Compiling Requiem On Linux
#641 posted by Baker on 2016/12/10 12:06:24
Ok .. first snag ...
#include <sys/cdefs.h> file not found
Solved with: sudo apt-get install -y libc6-dev-i386
Then next issue ...
fatal error: X11/extensions/xf86dga.h: No such file or directory
Does ... sudo apt-get install libxxf86vm-dev -y
But is already installed.
Goes to /usr/include/X11/extensions ... no such file as xf86dga.h. Slight Googling turns up ... "xf86dga.h is obsolete and may be removed in the future."
Looks like the future is now. That same page I Googled has a note warning to include <X11/extensions/Xxf86dga.h> instead.
Don't have one of those sitting in /usr/include/X11/extensions either. Hmmm. Hope it's not a brick wall.
@johnny - I'm posting this for informational purposes. I never expect anyone in particular to assist, just fyi. I'm hoping someone reading this thread that knows what the above could be about may chime in.