Mark V - Release 1.00
http://quakeone.com/markv/

* Nehahra support -- better and deeper (link)
* Mirror support, "mirror_" textures (video)
* Quaddicted install via console (i.e. "install travail")
* Full external texture support, DP naming convention
* Enhanced dev tools: texturepointer (video), inspector (video)
* IPv6 support, enhanced server capabilities
* Enhanced co-operative play (excels at this!)
* Software renderer version (WinQuake)
* "Find" information command (ex. type "find sky")

Thanks to the beta testers! NightFright, fifth, spy, gunter, pulsar, johnny law, dwere, qmaster, mfx, icaro, kinn, adib, onetruepurple, railmccoy

And thanks to the other developers who actively provided advice or assistance: Spike (!), mh, ericw, metlslime and the sw guys: mankrip and qbism.

/Mac version is not current yet ...; Linux will happen sometime in 2017
 
dx8_mark_v -width 750 -height 550

= borked up screenshots


Fixed.

This is actually a wrapper bug, so my apologies for my previous misdiagnosis.

@Baker: in the glReadPixels implementation, "case GL_RGB" should also be using "srcdata[srcpos * 4..." because the source data will be 4-byte, even if the dest is 3.
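A minimal sketch of that fix, with illustrative names (CopyPixels_RGB, srcdata, dstdata are not the wrapper's actual identifiers): because the wrapper's source buffer is always 4 bytes per pixel, the GL_RGB path must still step the source index by 4 even though the destination is 3 bytes per pixel.

```c
#include <assert.h>
#include <stddef.h>

/* GL_RGB readback from a 4-byte-per-pixel source: the source stride
   is 4 (srcpos * 4), only the destination stride is 3. Using
   srcdata[srcpos * 3 ...] here was the bug. */
static void CopyPixels_RGB (const unsigned char *srcdata, unsigned char *dstdata, size_t numpixels)
{
    for (size_t srcpos = 0; srcpos < numpixels; srcpos++)
    {
        dstdata[srcpos * 3 + 0] = srcdata[srcpos * 4 + 0]; /* R */
        dstdata[srcpos * 3 + 1] = srcdata[srcpos * 4 + 1]; /* G */
        dstdata[srcpos * 3 + 2] = srcdata[srcpos * 4 + 2]; /* B */
        /* srcdata[srcpos * 4 + 3] is alpha/padding and is skipped */
    }
}
```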

I may have mentioned a while back that there are advantages to going native, and screenshots are one such.

Baker has implemented PNG screenshots, so in the native GL code he's doing a glReadPixels, then converting the memory buffer to PNG format (presumably via a statically linked libpng or maybe stb_image) and saving it out.

A native D3D version wouldn't do half of that work. Instead it could just use D3DXSaveSurfaceToFile on the backbuffer. Give it a filename, specify the type, boom, done, three lines of code.

I'm going to add some native exports to the new wrapper for functionality such as this. 
 
The DX stuff mh is doing sounds really great.

I don't understand most of it, but I hear things like "improved performance." And bug fixes are always good.

And if I'm getting the gist of things regarding more rendering passes, this might allow addressing of some issues:

Fullbright textures should not be affected by contrast -- it makes them ugly.

Screen Blends should not be affected by gamma or contrast -- I have found this is the main thing that makes them far too intense. When I use vid_hardwaregamma 0 and various values for txgamma, the screen blends look perfect, though if I mess with the contrast slider, it makes the blends too intense again.

So yeah, if that's a possibility, the screen blends should be drawn last so they are not affected by gamma or contrast.

But I have no real understanding of this low-level 3D rendering stuff. Though it sounds like there will be a great benefit from mh's work. 
@mh 
Mark V applies gamma and contrasts to screenshots (only) when applicable.

For instance

1) If you are using hardware gamma/contrast, it will adjust the screenshot accordingly.
2) If you are not using hardware gamma/contrast, it will not apply it to the screenshot.

So depending on situation, writing directly to file is not necessarily desirable because screenshots could be, for instance, too dark/etc. 
 
Hm, there's an issue that looks similar to the previous QMB lighting texture thing with txgamma, but it appears whether or not txgamma is being used, wherever there is fog.

I first noticed it at very long range when I was zooming in, because I use very light fog, but if you set the fog much denser, like .5 and then fire the lightning gun, you will see the bad effect. 
 
Mark V applies gamma and contrasts to screenshots (only) when applicable.

Ahhh, understood.

By the way, here's a sneak preview of something I just did: http://i64.tinypic.com/o6flm1.jpg

This was actually taken with the GL version, just to demonstrate that there's no shader trickery or suchlike going on. 
 
awesome! Classic water! 
 
Mark V seems to struggle with complex maps that are smooth in QS. On my questionable system, I mean.

jam2_mfx.bsp is a good example. Right at the start, looking towards the palace produces a very noticeable slowdown.

Fitzquake v0.85 also has this problem. 
@dwere - Vertex Arrays / Water Warp / iPad Version ... 
@mh - Haha, your water warp. I suspected you would do that ;-)
---------------

@dwere

Especially on older hardware, vertex arrays help achieve a more reliable 72 frames per second in the BSP2 era.

I hadn't implemented them yet because there were many other things on the to-do list. I'm prototyping an iPad/iPhone version, which uses OpenGL ES, which requires vertex arrays, so I actually have to implement them here in an hour or so. I'm still sticking with the "that's it for 2016", but version 1.3 will have them.

---
@gunter - You are probably right on blending. I'm hoping that MH will provide an HLSL shader option for gamma/contrast in DX9 ... everyone has their wish list ;-) btw ... I still hate your computer, but I sure appreciate all the compatibility testing it has helped provide.

iPhone/iPad - 2017

But since I'm making a prototype iPad/iPhone version right now, it will have controls similar to Minecraft on the iPad, which is very playable.

(Android is more of a pain because of its very crude development tools, which are like banging rocks together to make fire. iPhone development tools have always been very nice.)

/Any new builds will be 2017, and I'm unsure where an iPhone/iPad version falls in the priorities. Now that I have a stable release and zero issues outstanding, playing around is a bit more leisurely. May upload a video later tonight after I get it initially running ... 
 
I couldn't not do water warp.

Performance. Vertex arrays help with big maps, but they're only part of the story. What really helps is draw call batching, and vertex arrays are just a tool that allows you to do batching.

glBegin/glEnd code is fine for ID1 maps but as maps get bigger this kind of code gets slower and slower.

Toss in a few dynamic lights and/or animated lightstyles (which stock Fitz also handles very poorly) and it gets worse.

Batching is all about taking a big chunk of work and submitting it in a single large draw call, instead of several hundred tiny draw calls. Each draw call has a fixed overhead, irrespective of how much work it does, and that overhead is typically on the CPU. So if you have, say, 1000 surfaces all of which have the same texture and lightmap, drawing them with a single call will have 1/1000 the CPU overhead that drawing them with 1000 calls would.

Stock Fitz would also benefit from lightmap update batching. Again it's the difference between "few large updates" versus "lots of tiny updates" with the former being more efficient. Stock Fitz also uses GL_RGB which compounds the problem by forcing a format conversion in the driver. This stuff is mostly hidden on modern hardware, but you can still find devices (and some scenes in maps) where you get an unexpected nasty surprise.

Ironically, one way to make stock Fitz run faster can be to disable multitexture. Try it - pick a scene where it gets the herky-jerkys, then compare with running with -nomtex. This will cause it to have fewer state changes between draw calls so that the driver can optimize better, as well as batch up its lightmap updates (for the world, not bmodels, which are still slow). Depending on the scene, the extra passes might turn out to be a lower workload.

If the engine itself implemented all of this batching then running with -nomtex would not be necessary.

The D3D9 wrapper takes the supplied glBegin/glEnd calls and attempts to be somewhat aggressive about batching them up. It converts everything to indexed triangle lists and concatenates multiple draw calls that don't have state or texture changes between them. It also attempts to merge multiple lightmap updates.

None of this is as efficient as if the engine did it itself, of course. Going into the actual GL code and doing it right from the get-go is always going to give the best results. 
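The indexed-triangle-list conversion described above can be sketched roughly like this; EmitFanIndices and its parameters are hypothetical, not the wrapper's actual code. Each polygon/fan of n vertices becomes n-2 triangles, and because every surface's indices reference one shared vertex array, many surfaces can be concatenated into a single index buffer and submitted with one draw call.

```c
#include <assert.h>

/* Convert one fan-ordered surface of numverts vertices into
   triangle-list indices appended at 'indices'. 'firstvert' is the
   surface's starting offset in the shared vertex array. Returns the
   number of indices written: 3 * (numverts - 2). */
static int EmitFanIndices (unsigned short *indices, int firstvert, int numverts)
{
    int n = 0;
    for (int i = 2; i < numverts; i++)
    {
        indices[n++] = (unsigned short) firstvert;            /* fan centre */
        indices[n++] = (unsigned short) (firstvert + i - 1);
        indices[n++] = (unsigned short) (firstvert + i);
    }
    return n;
}
```

Calling this in a loop over surfaces that share a texture and lightmap, then issuing one glDrawElements over the accumulated indices, is the "few large draw calls" pattern the post describes.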
Warping The Water 
Well, I wouldn't care whether "authentic" waterwarp is implemented into the DirectX or rather OpenGL build. But the one that would get it would be my personal default. :P 
@johnny Law 
I'm sizing up Requiem to see what unique things it adds for likely addition ...

I know it can create items (interesting idea), for instance. jdhack had some interesting ideas in there.

A question for you, if you know ...

I can't get Requiem to run on Linux, it says "libGL.so.1 not found". Engines like ezQuake run fine for me on Ubuntu Linux or even super old FitzQuake 0.80 SDL. Could it possibly be expecting a 32-bit .so ?

If you happen to know ... 
@nightFright 
Well, I wouldn't care whether "authentic" waterwarp is implemented into the DirectX or rather OpenGL build. But the one that would get it would be my personal default.

What if both were able to get it? :) 
The Eternal Conflict 
That would mean you and Baker found a way to solve your epic "conflict" regarding its implementation? Sounds like a great X-Mas gift to me, actually...! 
 
I wouldn't call it a conflict, hehe.

The DirectX version implementing DirectX features is just natural.

The OpenGL build remaining at 1.2 for broad hardware compatibility is not something that's very bloody likely to stop MH.

To say MH is good at rendering is like saying Isaac Newton was good at calculus or that Einstein was pretty okay at physics ;-) 
About MH ... 
There's assembly language in his shaders in the RMQ engine. 
@Baker - Gamma And Contrast 
Currently going through the MarkV code to figure how it implements gamma and contrast in the GL renderer.

To be honest, I see absolutely nothing in either that wouldn't work in D3D right now; even version 8.

Gamma just sets adjusted gamma ramps, which will also work in D3D. D3D does have its own gamma functions too, but you're better off using the native Windows SetDeviceGammaRamp/GetDeviceGammaRamp stuff (in particular the D3D functions don't work at all in windowed modes, whereas the native functions do).

Contrast is just a load of blended quads over the screen. The only thing I see in there that may be a problem with D3D8 is commented-out code enabling scissor test. D3D8 doesn't have scissor test but D3D9 does.

For D3D9 I'm going to do something different.

I'm going to do shader-based gamma and contrast using render-to-texture. This is achievable with D3D shader model 2 shaders, broadly equivalent to OpenGL 1.5 with ARB assembly shaders so compatibility remains good. It will enable gamma and contrast in windowed modes without affecting the entire display, and it won't require gamma-and-contrast-adjusting screenshots.

The interface will be one function: BOOL D3D_SetupGammaAndContrast (float gamma, float contrast), which it will be appropriate to call in GL_BeginRendering. Returns TRUE if it was able to do its thing (or if gamma and contrast are both 1), FALSE otherwise, in which case you need to decide on a fallback - either route it through the GL codepath (which should also work) or do nothing. Everything else will be automagic. 
 
Likely future scheme, not that this would have any impact on coding anyway ...

vid_hardwaregamma (*) - following the FitzQuakian cvar scheme that I rather like such as r_lerpmodels 0 (off), 1 (best), 2 (always)...

vid_hardwaregamma (or whatever name becomes)

0 - Never. Use best available non-hardware method

1 - Windowed mode uses non-hardware method (looks better on desktop), fullscreen uses hardware method (faster and hardware method is also brighter, some displays tend towards the dark side no matter what without hardware gamma). Default.

2 - Hardware method always.

(*) bad name because also does contrast? 
GL_RGBA 
Also: you may notice the code is biased towards GL_RGBA and I have to switch the bytes around to BGRA for various operations. It isn't actually an oversight or an inefficiency I didn't correct, but rather that OpenGL ES only has GL_RGBA.

Just wanted to point that out because I know you may see that code and think "This is so wrong."

/The video/input/system code was entirely rewritten long ago in a way to support devices. In some of the files, there is living device code from back in 2014. 
"libGL.so.1 Not Found" 
Hmm.

So, my reQuiem test builds were done on CentOS 6.4. Looking at that setup now, ldd says that my reQuiem-debug.glx executable is using /usr/lib64/libGL.so.1.

rpm -qf on that file shows that it came from the package mesa-libGL-9.0-0.8.el6_4.3.x86_64

I don't remember now if that was something that I explicitly installed for reQuiem's benefit. 
 
Also: you may notice the code is biased towards GL_RGBA and I have to switch the bytes around to BGRA for various operations. It isn't actually an oversight or an inefficiency I didn't correct, but rather that OpenGL ES only has GL_RGBA.

This doesn't actually matter at all in D3D aside from getting the byte ordering right, because you're writing the data directly to the texture rather than relying on the driver to not screw up. 
 
writing device memory using bytes instead of a cacheline-aligned memcpy will be slower, but whatever. modern apis just have you write into ram and have the gpu itself move it into the gpu's memory, so there are no issues with uncached memory or whatever.
either way, d3d10+ (eg gl3.0) / vulkan hardware has proper RGBA texture uploads so it's not like modern gpus care. older gpus/apis will still need some sort of conversion but it's okay to be lazy and submit only the lightmaps as bgra. streaming is the only time it really matters. oh noes! loading took an extra blink's duration! *shrug* 
Compiling Requiem On Linux 
Ok .. first snag ...

#include <sys/cdefs.h> file not found

Solved with: sudo apt-get install -y libc6-dev-i386

Then next issue ...

fatal error: X11/extensions/xf86dga.h: No such file or directory

Does ... sudo apt-get install libxxf86vm-dev -y

But is already installed.

Goes to /usr/include/X11/extensions ... no such file as xf86dga.h. Slight Googling turns up ... "xf86dga.h is obsolete and may be removed in the future."

Looks like the future is now. See the note on that same page warning to include <X11/extensions/Xxf86dga.h> instead.

Don't have one of those sitting in /usr/include/x11/extensions either. Hmmm. Hope is not brick wall.

@johnny - I'm posting this for informational purposes. I never expect anyone in particular to assist, just fyi. I'm hoping someone reading this thread that knows what the above could be about may chime in. 
Guys 
I'm working on my quakespasm-irc engine thingy, and expanding it with more streamer-features.

One thing that I've been wanting to add are the joequake demo features. Rather than reinvent the wheel, and being that quakespasm and mark v share a bunch of code already, I was wondering if I could have a look at the mark v code to steal... uh, borrow from. 
BGRA 
GL_BGRA was really only ever significant as an Intel performance fix, and even then it also needed GL_UNSIGNED_INT_8_8_8_8_REV (which was probably the most annoying GLenum to type) in order to get the fix; without both it still ran slow.

Both NV and AMD also ran slower without these (with BGRA being by far the most important), but insignificantly so; Intel was catastrophically slower.

This is trivially easy to benchmark. Just do a bunch of glTexSubImage calls in a loop and time them. Adjust parameters and compare.

Both GL_BGRA and GL_UNSIGNED_thing are core since OpenGL 1.2, with the latter being adopted from GL_UNSIGNED_INT_8_8_8_8_EXT in the old GL_EXT_packed_pixels extension. So if you're targeting GL 1.2 you can quite safely use them without compatibility concerns.

Since Microsoft did the world a favour by forcing the hardware vendors to standardise on RGBA in the D3D10 era, I don't believe that any of this stuff is even important for Intel any more.

Basically if it's less than 10 years old it probably has good support for RGBA (if less than 5 make that definite) so you can really just use RGBA and no longer worry about this stuff.

I obviously don't speak for mobile hardware, where the rules may be different, and anyway there are far more interesting formats such as RGB10A2 which lets you do a 4x overbright blend without sacrificing bits of precision and with only 2 unused bits per texel. I never formally benchmarked this format but tests ran well.

What's more important about FitzQuake is that it uses GL_RGB for lightmap uploads. Even in the days of robust RGBA support, that's always going to force a format conversion in the driver. Combine that with hundreds of tiny updates (rather than few large ones) and FitzQuake can still chug down to single digit framerates even on some modern hardware.

No amount of BGRA can fix that, and here's where I believe FitzQuake has done the community a disservice. There are lots of interesting things that mappers can do with dynamic lights and animated lightstyles, but because FitzQuake performed so poorly with them I suspect that much of that early potential was never realized.

NV and AMD both suffer from this, but if all you ever benchmark is timedemo demo1 (or map dm3) with gl_flashblend 1 you'll probably never even notice. Intel suffers from this AND needs BGRA/UNSIGNED_etc.

Again it's trivially easy to demonstrate the perf difference, but to robustly fix it in the engine requires more rearchitecting than I'm willing to do within the scope of my current work. 
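One way an engine can sidestep the GL_RGB driver-side conversion described above is to do the expansion itself, once, and upload with GL_RGBA/GL_UNSIGNED_BYTE so the driver gets a straight copy. This is a hypothetical sketch under that assumption, not FitzQuake or Mark V code.

```c
#include <assert.h>
#include <stddef.h>

/* Expand a 3-byte-per-texel lightmap block to 4-byte RGBA engine-side,
   so the subsequent glTexSubImage2D upload needs no format conversion
   in the driver. The alpha byte is unused padding. */
static void Lightmap_RGBtoRGBA (const unsigned char *rgb, unsigned char *rgba, size_t numtexels)
{
    for (size_t i = 0; i < numtexels; i++)
    {
        rgba[i * 4 + 0] = rgb[i * 3 + 0];
        rgba[i * 4 + 1] = rgb[i * 3 + 1];
        rgba[i * 4 + 2] = rgb[i * 3 + 2];
        rgba[i * 4 + 3] = 255; /* padding */
    }
}
```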
@shamblernaut 
If you look around in the Quaddicted engines directory, you can find older Mark V versions like this one where I marked things very cleanly ...

... for ease of porting most of the features very easily to Quakespasm.

Current Mark V isn't structured like that for many reasons, including that the WinQuake software renderer has been combined into Mark V, but the source is on the Mark V page. 
Website copyright © 2002-2024 John Fitzgibbons. All posts are copyright their respective authors.