Find-tuned Looping
#1176 posted by yesterday on 2013/10/20 23:47:14
There's also a pattern at the end for writing find loops, so you don't have to repeat your find code at all. Whether it's an actual improvement or not is up for discussion...
How does it compare to using a for loop to find, instead?
#1177 posted by Preach on 2013/10/21 08:58:43
How does it compare to using a for loop to find, instead?
It generates essentially the same code, so it's just a trade-off in human terms. The for loop is a bit more explicit on what's going on, but the while loop is shorter and better from the Don't Repeat Yourself point of view.
Qccs
#1178 posted by Spike on 2013/10/21 18:52:03
your two loops are not equal. the for loop will ensure initialisation.
the while loop will run slightly faster - the way its written will consume one less instruction (additionally the progs will be a smidge smaller due to the reduced repeats).
modern qccs (including recent fteqcc builds) will be able to optimise more agressively if all locals are initialised before use, reducing the number of temps used (because initialised locals will not have undefined values when recursion or overlapping is used). Plus its good practise if you ever write C code. :P
Also, fteqcc likes to warn about unintended assignments within conditionals, which can be fixed with an otherwise redundant extra parenthesis (consistant with gcc).
Put that all together and the ideal loop is something like:
for(candidate = world; (candidate = find(candidate,classname,"monster_ogre")); )
{
foo(candidate);
}
Yeah, it costs an extra vm cycle, and an extra instruction to go with it, but it potentially saves as many 'globals' as you have locals in your function, including localised temps (assuming the qcc doesn't detect any other uninitialised locals).
It definitely saves a few headaches too, when you use the same variable in multiple loops in the function.
Of course, if you want compatibility, the for loop is an extension. Only while and do while loops are supported in all qccs.
A View Into The Optimisation Process
#1179 posted by Preach on 2013/10/21 22:58:13
That's interesting to know, thanks! So if I explicitly initialise a local I get can get a reduction in stack size. Is that at the cost of additional length in function size? I think it's right that the extra instruction doesn't cost execution time because an extra temp would also mean an extra zeroing-out. Guess there's no benefit if you're not under pressure for temporary variables in that function?
The ability to suppress the warning is a massive boon though! I'm back down to zero warnings and blissful silence...
Random Query
#1180 posted by Preach on 2013/10/21 23:49:18
Are there any occasions where the optimising code takes advantage of "leaf functions" i.e. functions which call no other functions? It seems like a category of function which would be easy to identify at compile time, and where you could go very aggressive in terms of reusing variables as you don't have to worry about recursion or anything needing to be restored later...
#1181 posted by Spike on 2013/10/22 00:35:41
QC stack:
when a function is called, the current locals are copied off to some hidden stack. the 'globals' (globals is definitely the wrong word to use here... in this post this is the only meaning of the word 'global') that hold the locals are not overwritten automatically when the function then runs.
when the function returns, the globals that are used for the function's locals are reset to their original values, thus they are cleared or so again ready for the next time its called.
if you have a recursive function, those locals will thus be set to the value they had in the parent (even if its not the direct parent). which is probably not what you want.
the optimisation the qcc is trying to use here is to overlap all locals from all functions using the same set of globals. this means you don't waste globals on other things. the engine still does the same amount of archiving (to preserve the caller's locals), just it deals with the same block every time.
the requirement that things are manually initialised avoids bugs with legacy code where unitialised variables are expected to be 0 (despite there being cases in vanilla where they are not - recursion).
Where the function looks like it has at least one uninitialised (read-before-write) local, the function's locals are given a unique non-shared location for all of its locals, as the engine supports only a single globals block for the locals stack.
While its possible that better cache utilisataion might make the code faster, the reality is that the actual cpu will need to execute more instructions than it previously would have.
Wow
#1182 posted by Preach on 2013/10/22 09:14:36
That's a fascinatingly weird glitch I'd never heard of before - I'd held the zero initialisation as being a fundamental property of the language! For those who like demonstration code, the following shows off the important properties:
void() Recurse =
{
��local float delta;
// note we use delta here uninitialised on the
// RHS before the assignment happens
��delta = delta + 1;
��dprint("IN: ",ftos(delta),"n");
��if(delta < 5)
����Recurse();
��dprint("OUT: ",ftos(delta),"n");
}
This outputs:
IN: 1
IN: 2
IN: 3
IN: 4
IN: 5
OUT: 5
OUT: 4
OUT: 3
OUT: 2
OUT: 1
The value of delta is retained between calls as we recurse deeper onto the stack, but is restored to the value the caller had when we return to her. No doubt someone can do something creative with this, I'm gonna go audit all my code for correctness again...
Sidenote
#1183 posted by Spike on 2013/10/22 12:40:33
the prerelease had no local stack. Essentually it had no locals at all.
carmack really loved his globals...
Using fteqcc, try:
#pragma warning enable F302
Might have false positives - but let me know if you do spot any false negatives.
Yeah
#1184 posted by ijed on 2013/10/22 13:42:29
I assumed '0' was correct behaviour.
What would coders expect to be correct for an undefined variable? A null value?
#1185 posted by Spike on 2013/10/22 14:21:39
I don't get what you're asking.
An undefined variable would be a compile-time error.
I Only Dabble
#1186 posted by ijed on 2013/10/22 15:13:32
In code.
When defining a local value in a function, it defaults to 0 until I set it, correct?
I might be missing the gist of the post entirely.
#1187 posted by Spike on 2013/10/22 16:28:24
The QCVM initialises locals on return rather than on entry. This means that locals *typically* default to whatever the qcc assigned for them, with 0/null/world if it was uninitialised.
The exceptions are when recursion is involved, in which case it starts with the value it has in the earlier instance of the function on the stack (even when its not the direct caller).
Or when the qcc bugs out and tries to reduce local use even when its unsafe to do so, in which case the locals will have pseudo-random values depending upon the values in the same slot in the caller.
Note that:
float a=5;
is technically an initialisation and not an assignment. This means that if you stored something else into a in a parent function, the value of a will NOT start as 5.
Note that vanilla qcc also then considered 'a' to be a constant and then merges it with other constants with the same value, which can mess stuff up quite a lot and is just unsafe however you look at it.
FTEQCC will internally use assignments instead of initialisers if the local is a variable and not a const. If you want the broken behaviour back, you can just define the local as 'local const float a=5;' instead.
Assuming var and using assignments instead of initialisers also allows 'float speed = vlen(self.velocity);' as valid and well-defined code, which is the main reason for the behaviour change. :P
Ah
#1188 posted by ijed on 2013/10/22 17:21:34
I use FTEQCC and also the broken format, just because that's what I've learned from looking at other code.
#1189 posted by Spike on 2013/10/22 22:55:41
yeah, that last paragraph was to say 'except in fteqcc, initialised locals work as expected', so no worries.
might not be 'standard' (if such a thing even still exists), but its more useful and intuitive.
An Open QC Question...
#1190 posted by Preach on 2013/10/25 00:44:35
I have a suspicion that the answer is going to be no going into this one, but here goes. Does anyone know of an engine-neutral trick to distinguish BSP models from non-BSP models in QC. Apart from the obvious way of trying to make them SOLID_BSP and checking to see if you've crashed...
I can think of a few heuristics that might imply BSP models based on the maxs and mins, but none that would be anywhere close to reliable enough. Mins and maxs have also varied in the past between different custom engines, so it doesn't seem very safe.
You might try incredibly ugly hacks in QC to increment the pointer to the modelpath string one character at the time. By running a few string comparisons at each step you should eventually get a match to one of the file extensions. Apart from the danger of making a mistake in the pointer arithmetic and causing a segmentation fault, this method would fall foul of someone who lied about the extension of their model!
It turns out Quake engines don't actually rely on the extension for determining the model format - instead they read the magic number at the start of the file. This is a piece of good fortune for people who create md3 replacement models for advanced engines, as they can match the names of the original files yet use the replacement format.
I did scour the codebase a little while looking for any places the crucial difference between a .mdl and .bsp format model poke their heads out, but I couldn't see anything. Any thoughts?
Make A Working Model?
#1191 posted by yesterday on 2013/10/25 01:59:53
About the only thing I can think of is testing the model for crashes while not running in a live environment, but that's hardly a helpful solution.
I'm wondering now why this might matter, though. About the only plausible guess I have is that you're trying to make Quoth's .mdl box items lit properly.
@preach
#1192 posted by Spike on 2013/10/25 02:49:42
Vanilla NQ code will *always* set mdl format's size to '-16 -16 -16' '16 16 16'. A poor choice, but lovely and consistant. Failure to do so still breaks mods.
This size is distinct from all vanilla .bsp models, which have a mins of '0 0 0'.
Sadly, third-party engines fix that (no more dodgy frustum culling), but gain qc bugs as a result.
In DP, you'd need to set sv_gameplayfix_setmodelrealbox to 0 first. Its now the default but wasn't in slightly older versions.
In QW, setmodel *only* does setsize for actual bsp models. mdl models are not actually supported/loaded by QW servers. Sadly it uses the extension to see if it should try to load a bsp... But, if its foo.bsp you can use this to check its actual type.
There's a couple of extensions lying around like DP_QC_GETSURFACE's getsurfacenumpoints(ent,0) that will return 0 for non-bsp formats, until it gets extended to work with other formats too... :s
Other than that, a model format is a model format. QC defines the physical size of the object (to avoid bugs), and the model itself provides just the mesh.
The only time you'd need to be able to tell the difference is when you have something like func_trains which expect bsp objects. I suggest you use spawnflags or something to say 'non-solid' instead. You'd need to modify the qc code to add a non-inline precache anyway.
Thanks
#1193 posted by Preach on 2013/10/25 10:25:11
The only time you'd need to be able to tell the difference is when you have something like func_trains which expect bsp objects. I suggest you use spawnflags or something to say 'non-solid' instead.
That's exactly the scenario I had in mind. It seems like it would be annoying to users that even though you've obviously specified a non-BSP model, you also have to manually tell it "this is a non-BSP model" again. It would be nice to have the mod take the hassle out, or at least practice look-before-you-leap, but I suppose the crash is instant at least so people would catch and fix places where they forgot more or less instantly.
I have a plan in mind now that's slightly neater than a spawnflag...watch this space.
#1194 posted by Spike on 2013/10/25 22:56:28
some engines support non-bsp objects as SOLID_BSP pushers (even if its just a bbox). A non-solid flag should thus not be implied by the model type alone.
You wouldn't be saying 'this is a non-bsp model', you'd be saying 'this isn't solid', which might apply for evil trains too (mwahaha). The fact that the engine doesn't support solid trains isn't the mod's fault, its the engine's fault. That said, the mod probably should take efforts to ensure the bbox size is also tweakable if its a solid mdl.
Fair Enough
#1195 posted by Preach on 2013/10/25 23:06:01
I think the plan will work just fine with that - people will be able make something that works in all engines or opt for taking advantage of a new engine's features - if they don't mind they've excluded classic engines in the process.
BSP Tree Traversal
#1196 posted by ALLCAPS on 2013/10/29 21:37:15
I just want to make sure I'm doing this correctly, before I begin my bug-hunt.
To find the leaf the player is in, you take the root node, and see what side of that node's plane the player is on. If they're on the positive side, do the same using node children[0], if they're on the negative side, do the same using children[1]. During this "walking" you'll come to a point where the child you're told to check is a negative number. Discard the negative sign and add one. This is the leaf the player/camera is in, and using that leaf's PVS will render all leafs visible from that point.
Is this right, or am I making a poor assumption or oversimplification somewhere?
@allcaps
#1197 posted by Spike on 2013/10/30 02:48:06
Discard the negative and subtract one, if you're using that order of operations.
-1 maps to leaf 0, not leaf 2.
Remember to decompress your pvs.
Roger
#1198 posted by ALLCAPS on 2013/10/30 06:33:00
Got my bsp walking working. I can tell what leaf the camera is in now. Struggled for hours, then on a whim tried inverting the plane normal, and it now works as it should.
Nearly 100% of the stumbles I've run into have been in converting Quake-space to Unity-space, and the trend continues.
Working on decompressing and using PVS now. So close I can taste it. I keep saying I need to work on lightmaps, but this is more interesting. Lightmaps should (should) be pretty trivial, so blazing through that will hopefully be my reward for getting PVS working.
Conflicting Documentation
#1199 posted by ALLCAPS on 2013/10/30 08:41:05
I'm reading all over trying to get a handle on reading and decompressing PVS data, but I'm having a hell of a time.
I have some embarrassingly elementary questions.
I read out the bytes in the vis data lump, and for the tiny map I'm using to debug this renderer, I get this:
255 10 255 2 207 10 207 10 243 2 243 2 255 10 255 10 0 1 15 255 15 0 1 7 205 11
What is the correct method for decompressing this? I'm not even sure I'm reading it out correctly, because looking at this it looks like there's going to be way too much data. This map only has 13 leafs, so something is wrong. Should I be reading these as shorts/ints instead of raw bytes?
This is an area where examining bspfile.h hasn't really helped, as there's no lump defined for this.
#1200 posted by metlslime on 2013/10/30 09:37:54
look at Mod_DecompressVis in the source, that's where quake decompresses it.
Basically non-zero bytes are uncompressed. Every run of zeros is run-length encoded, a zero followed by a second byte which is a count.
And once uncompressed, there is 1 bit for every leaf.
|