Efficiency
How is this "least efficient"? I don't see a more efficient way to generate triangles from convex polygons.
#1121 posted by ALLCAPS on 2013/10/14 19:41:17
Honestly I just assumed it was the least efficient/elegant way since it was the first solution I thought of. Probably because it's simple. I just figured there'd be a more complex, "better" way.
#1122 posted by Spike on 2013/10/14 20:11:26
each edge has two sides. pick one vertex from each edge based upon the side of the edge you're using. side is determined by whether the edge index is negative or not. this will get you a convex polygon (aka: a triangle fan). more modern renderers can trivially generate triangles from that.
the whole thing is just triangle fans. software kept it like that because its easier to clip+frustum cull, which is part of how it managed to avoid all overdraw from bsps.
remember, these face polygons are never concave. there's no holes or anything. its pretty trivial because of that.
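A minimal sketch of the lookup described above, assuming the usual Quake 1 layout where each face owns a run of signed "surfedge" indices and each edge stores two vertex indices. The names edge_t, face_vertices and fan_triangles are made up for illustration, and the winding order may need flipping depending on the renderer.

```c
#include <stddef.h>
#include <stdint.h>

/* One edge: two indices into the vertex list (assumed layout). */
typedef struct { uint16_t v[2]; } edge_t;

/* Walk a face's surfedges and collect one vertex index per edge.
 * A negative surfedge means the edge is used backwards, so its
 * second vertex is taken instead of its first. Returns the count. */
static size_t face_vertices(const int32_t *surfedges, size_t first, size_t count,
                            const edge_t *edges, uint16_t *out)
{
    for (size_t i = 0; i < count; i++) {
        int32_t se = surfedges[first + i];
        out[i] = (se >= 0) ? edges[se].v[0] : edges[-se].v[1];
    }
    return count;
}

/* Fan-triangulate a convex polygon with n vertices: (0,1,2), (0,2,3), ...
 * Writes 3*(n-2) polygon-local indices and returns the triangle count. */
static size_t fan_triangles(size_t n, uint16_t *tris)
{
    if (n < 3)
        return 0;
    for (size_t i = 1; i + 1 < n; i++) {
        *tris++ = 0;
        *tris++ = (uint16_t)i;
        *tris++ = (uint16_t)(i + 1);
    }
    return n - 2;
}
```

This only works because the faces are guaranteed convex; a concave polygon would need a real triangulator.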
Efficiency Of Triangle Fans
#1123 posted by Preach on 2013/10/14 22:04:35
Strictly speaking there may be an inefficiency there in rendering those triangle fans. According to a half-remembered article I read a while back, long, thin triangles are slightly less efficient to render than evenly proportioned ones. I'd hazard a guess that's due to better cache-coherence properties of the latter.
I wouldn't worry about it though, primarily because it's a tiny difference. Also because almost all the polygons in quake maps will be 6 sides or less, so there's not a great deal of difference between the best and worst choices.
#1124 posted by JneeraZ on 2013/10/15 00:58:42
This is where I've had the nagging thought that I'll bet modern engines could just create large, texture sorted buckets of triangles representing the entire level and just throw it at the video card. Odds are that on even a mid-level machine, it would run fine.
Keep the BSP for collision checks and line-of-sight stuff, but in terms of raw rendering I wonder if parsing through the VIS data is actually a detriment these days.
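A rough sketch of the "texture sorted buckets" idea, not taken from any engine: sort the face list by texture so faces sharing a texture sit next to each other, then count how many texture switches a whole-level draw would need. The draw_face_t struct and its fields are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical per-face draw record. */
typedef struct {
    int texture;      /* texture id used by this face */
    int first_index;  /* where the face's triangles start in an index buffer */
    int index_count;  /* how many indices the face uses */
} draw_face_t;

/* qsort comparator: order faces by texture id. */
static int by_texture(const void *a, const void *b)
{
    const draw_face_t *fa = a;
    const draw_face_t *fb = b;
    return (fa->texture > fb->texture) - (fa->texture < fb->texture);
}

/* Sort faces so equal textures are adjacent and return the number
 * of batches (one per distinct texture) needed to draw everything. */
static int sort_into_batches(draw_face_t *faces, size_t n)
{
    if (n == 0)
        return 0;
    qsort(faces, n, sizeof *faces, by_texture);
    int batches = 1;
    for (size_t i = 1; i < n; i++)
        if (faces[i].texture != faces[i - 1].texture)
            batches++;
    return batches;
}
```

For a static level the sort only has to happen once at load time, which is a large part of why the brute-force approach looks attractive.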
You Know...
#1125 posted by Spiney on 2013/10/15 01:20:53
Brute forcing winning over the elegance of a BSP traversal algorithm always felt like an aesthetic unfairness to me.
Not that I mind scrapping the vis times.
Almost Reads Like A Carmack Tweet
#1126 posted by Spiney on 2013/10/15 01:23:22
I need to get some sleep
BSP
#1127 posted by Preach on 2013/10/15 08:42:09
Remember that visibility tests aren't used exclusively by rendering - they're also used by the ai as the first step of deciding if the player can be seen or not. So you will take a double hit on performance as every monster in the level starts doing traceline visibility tests every frame.
Of course, it's pretty easy to test the hypothesis that vis is unnecessary - just build a map, saving a copy of the un-vised BSP file as you go. Then compare performance across the two files. You could even run vis at all the different levels and have multiple points of data. I don't know if there's any hard data on how much of a fps gain you get from making vis more accurate; it might be that the highest levels don't provide a good return on investment.
TrenchBroom Does The Brute Force Thing
It just throws texture-sorted triangles at the GPU, and it's pretty fast. The only drawback is that you have to reupload a lot of data to the GPU when the user changes the geometry, but since usually only a few brushes are selected at a time, just uploading the selected geometry is fast enough.
#1129 posted by JneeraZ on 2013/10/15 13:28:07
Preach - You'd need an engine change to test it properly. If it's traversing the BSP tree to gather up the triangles every frame then it's still eating overhead that a bulk renderer wouldn't have.
I wonder about the hit there though, on your monster example. Line traces in a BSP are extremely fast. But pre-computed VIS data is always going to be faster there. The problem is that it would take a lot of work and engine coding to come up with a definitive test. :P
Sleep - That's what ToeTag did as well. I think there was a basic cull for stuff entirely behind you, but that was about it. Everything else got chucked at the card, sorted by texture. I never really saw it slow down at all...
UQuake Performance
#1130 posted by ALLCAPS on 2013/10/15 19:05:30
uQuake leaves all that up to Unity to handle, it discards the bsp tree itself, the vis data, nodes, planes, all that jazz. Unity decides per GameObject what should be rendered and what shouldn't, and with each face/patch its own GameObject it can cull invisible regions and polys very well. In essence I'm throwing the entire level at the engine at once, but the engine is good at throwing only the parts that can be seen right now at the video card. Works well even on an Ouya/HP Touchpad, which are some Android devices with less than amazing GPUs. The Ouya version of uQuake renders an empty level faster than the proper port of OpenArena, even, which I'm pretty sure uses the vis-and-related data to do traditional bsp stuff, as it's a source port.
I am hitting some issues with parsing Quake1 .bsp, though. Perhaps my understanding of variable type and size is not correct, but the file specs I linked above say that a face is thus:
u_short plane_id;
u_short side;
long ledge_id;
u_short ledge_num;
etc...
u_char light[2];
long lightmap;
Is an unsigned short int not two bytes? Is a long not eight bytes, and a char one? I try to read the data out using that assumption and I get garbage. Looking at the offset where faces start, it looks like either the specs are not right, or a short is one byte. I did notice that the BSP version in the file is 29 while the spec here is for version 28.
.h
#1131 posted by ALLCAPS on 2013/10/15 21:28:52
comparing with Quakespasm's structs in bspfile.h it looks like they are different. The doc at gamers.org shows a different number of fields than the struct in Quakespasm, with different types as well.
Unity
#1132 posted by Kinn on 2013/10/15 23:39:22
and with each face/patch its own GameObject
Seriously? o_O And Unity handles that ok?
I've been doing similar stuff (parsing a doom3 .proc file into unity meshes) - the great thing about .proc files is that the surfaces are already grouped into the areas that are created by doom 3's visportals. I typically create just a single combined mesh for each of these areas. If I want more granularity to the meshes I'll just stick in more visportals.
Oh Yeah
#1133 posted by ALLCAPS on 2013/10/16 01:11:51
Unity can handle a ton (10k+) of simple gameobjects without issue. I'm not sure how well it'd work if each gameobject had logic/tags/scripts attached, but just using them to render meshes has almost no overhead. I'm pretty sure Unity groups batches of meshes into drawcalls anyway. I've loaded some Return to Castle Wolfenstein SP levels that ended up being about 10k gameobjects, and only got slowdown if I selected worldspawn (so, every single face in the level) in the editor with the game unpaused. Rendering the level was still working great.
Hmmm
#1134 posted by Kinn on 2013/10/16 22:14:04
I guess unity has ways of optimising static gameobjects - I'll have to find out more about what it's actually doing.
On another note - how are you handling the map collision?
"is A Long Not Eight Bytes"
#1135 posted by mwh on 2013/10/17 00:39:06
Probably 4 bytes? Quake was released >5 years before the first amd64 cpu after all :)
#1136 posted by necros on 2013/10/17 01:01:42
when i was parsing mdl files, ints were 4 bytes so longs probably are 8.
Kinn
#1137 posted by ALLCAPS on 2013/10/17 04:36:45
Every GameObject has a mesh collider on it! Each object gets:
- A material that's created at runtime using the texture I ripped from pak0.pk3 and the lightmap decoded from the .bsp, using a shader I found on the Unity forums.
- A mesh that's generated at run-time using the verts and tris from that face's entry in the .bsp. I call .Optimize and .RecalculateBounds on each mesh as it's generated. I'm not sure if this is needed or not.
- A mesh collider, which is the slowest type of collider Unity has, but since each face is typically small, and is sure to be convex, it's actually pretty fast.
I had planned to implement the bsp tree to help reduce the number of gameobjects, but it looks like Unity does a fine job on its own. I've run a bunch of the stock Q3 maps and some pretty sizable custom ones on both a crappy laptop with a Core Duo and ancient Intel graphics, and on the Ouya, which has a terrible Tegra3, and it runs very well on both. I suspect Unity makes its own judgement on what objects need to be rendered and what colliders should be checked against, and it seems to scale very well.
Int+long
#1138 posted by ALLCAPS on 2013/10/17 06:35:37
From QuakeOne.com forums:
byte+char are both 1 octet, short is 2 octets, int+long are both 4 octets, long long is never encountered.
Octets in this context being equal to one byte.
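With those sizes a face record comes to 20 bytes. Here is a sketch of reading one from a raw buffer using fixed-width types, so the result doesn't depend on what the compiler thinks a long is. The field names follow the spec quoted earlier in the thread; the three fields hidden behind the "etc..." (texinfo_id, typelight, baselight) are filled in from memory of that spec and should be treated as an assumption, as should the helper names.

```c
#include <stdint.h>

/* Face record, 20 bytes on disk, little-endian (layout assumed, see above). */
typedef struct {
    uint16_t plane_id;
    uint16_t side;
    int32_t  ledge_id;    /* first surfedge of the face */
    uint16_t ledge_num;   /* number of surfedges */
    uint16_t texinfo_id;  /* assumed: elided in the quoted spec */
    uint8_t  typelight;   /* assumed: elided in the quoted spec */
    uint8_t  baselight;   /* assumed: elided in the quoted spec */
    uint8_t  light[2];
    int32_t  lightmap;    /* offset into the lightmap data, -1 for none */
} face_t;

enum { FACE_DISK_SIZE = 20 };

/* Read a little-endian 16-bit value byte by byte. */
static uint16_t rd_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a little-endian 32-bit signed value byte by byte. */
static int32_t rd_i32(const uint8_t *p)
{
    uint32_t u = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return (int32_t)u;
}

/* Decode one face from FACE_DISK_SIZE raw bytes. */
static face_t parse_face(const uint8_t *p)
{
    face_t f;
    f.plane_id   = rd_u16(p + 0);
    f.side       = rd_u16(p + 2);
    f.ledge_id   = rd_i32(p + 4);
    f.ledge_num  = rd_u16(p + 8);
    f.texinfo_id = rd_u16(p + 10);
    f.typelight  = p[12];
    f.baselight  = p[13];
    f.light[0]   = p[14];
    f.light[1]   = p[15];
    f.lightmap   = rd_i32(p + 16);
    return f;
}
```

Reading byte by byte avoids both struct padding and host endianness issues; face i would start at the face lump offset plus i * FACE_DISK_SIZE.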
ALLCAPS
#1139 posted by Kinn on 2013/10/17 11:13:03
Ah ok, so your collision mesh is just a duplicate of the render mesh essentially.
I was just wondering if you'd found a way to somehow process the bsp collision data, so you could take advantage of clip brushes etc, which would simplify the collision geometry somewhat.
#1140 posted by ALLCAPS on 2013/10/17 15:49:18
In theory you could just skip adding mesh colliders to the faces as you made them, and then create meshes from the bsp brushes in the map, add a collider to them, and skip adding a renderer, so you have an invisible clipping brush. That would be needed for sure on some maps, where simplified collision stops the player getting hung up on detail in the map.
Progress!
#1141 posted by ALLCAPS on 2013/10/17 19:41:35
I have it mostly working. Geometry is recreated, and texture coords are calculated correctly. Or at least I think they're correct, they scale and move like I expect them to playing around with them in trenchbroom.
Getting verts for each face was a little confusing at first with the double-lookup method it requires, but it wasn't that difficult. Same for texture coords; a little confusing at first, but simple in the end.
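For reference, a sketch of the usual Quake 1 texture coordinate calculation, assuming each texinfo entry carries two 4-float vectors (an S axis and a T axis, each with an offset in the fourth slot). The struct and function names here are invented for the example.

```c
/* A vertex position in map units. */
typedef struct { float x, y, z; } vec3_t;

/* Texture projection: xyz axis plus offset for S and for T (assumed layout). */
typedef struct {
    float s[4];
    float t[4];
} texaxes_t;

/* Project a vertex onto the S and T axes, add the offsets, then divide
 * by the texture size to get normalized UVs for a modern renderer. */
static void face_uv(vec3_t p, const texaxes_t *ti,
                    float tex_w, float tex_h, float *u, float *v)
{
    *u = (p.x * ti->s[0] + p.y * ti->s[1] + p.z * ti->s[2] + ti->s[3]) / tex_w;
    *v = (p.x * ti->t[0] + p.y * ti->t[1] + p.z * ti->t[2] + ti->t[3]) / tex_h;
}
```

Values outside 0..1 are normal here since the textures tile; whether V needs flipping depends on the target engine's texture origin.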
As a stress-test I loaded up Sock's Ivory Tower, and the result is a CRAZY 32k GameObjects!
THIRTY. TWO. THOUSAND. OBJECTS.
http://imgur.com/sNHWsmf
Marking the objects as static lets Unity batch them together, resulting in only about 2-5 draw calls at any given moment. It's very slow to start up, likely because right now I am making a ton of Lists as I make each object, which was helping me squash bugs. Now that I have the process down I think I'll be able to simplify it and speed up startup.
Getting there!
Nice!
#1142 posted by ijed on 2013/10/17 19:55:29
Congrats on that.
Qunity Quanity Quakity Quakey...
About collisions, I thought the whole saving of visual / hull collision was marginal anyway by modern tech? Does / would it have any discernible impact on modern machines?
@ALLCAPS
#1143 posted by sock on 2013/10/17 20:32:21
If you really want something to stress test your unity map converter program I recommend you try my Map on the Edge of Forever. All of the source files are freely available, it has full HD textures and new environment sounds with spoken dialogue.
If you can get some simple game logic (pushing/shooting buttons) and Q3 style shaders working it would easily run standalone in Unity!
#1144 posted by ALLCAPS on 2013/10/17 21:02:34
@ijed
Really there's no performance need to have simplified colliders anymore, at least not using Unity. The main benefit would be to stop players from getting stuck on detailing. Players love to glide along walls and assume they'll just slide along the wall without getting hung up on the grates and torches.
@sock
Would love to have larger maps like this working well. I'm going to need some optimization before I get big maps working well.
I replaced all of my Lists with arrays in the object generator, but startup performance with large maps didn't increase much. Premature optimization is the bane of many a project, though, so I think I'll focus on getting smaller maps (like my lame remake of dm_stalwart) working with textures and lightmaps before I start looking to juice performance.