A Word Of Caution
#68 posted by negke on 2009/11/05 14:25:49
As it turns out, this multithreaded vis isn't safe. The problem lies with the autosave feature:
Each thread processes one portal at a time, which is why the fullvis gets faster overall the more threads a machine has. However, as we know, not all portals take the same amount of time. Especially towards the end, their processing time increases greatly, and some of them can even take days. The autosave option is geared for single-threaded processing, so when saving the state it doesn't take into account which portals are done and which are still being processed on the other threads. Therefore, pausing the process (ctrl+c) will leave these portals (the ones that were being processed but couldn't be finished yet) open, so to speak. This will eventually lead to a situation where vis reaches the end of the processing line but can't wrap up the entire thing properly, because there are still those 'undone' portals in between. It will then stop with a "portal not done" warning and the state file will be corrupted.
This means the multithreaded vis tool can only be used (more or less) safely if one lets the process run from start to finish. Maps that take longer and thus have to rely on the autosave function should not be fullvised with this tool.
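To make the failure mode concrete, here is a rough sketch in Python. It is not taken from the vis source: the three portal states and all function names are assumptions for illustration only. It models a save that writes in-flight portals out as-is, and a resume that only picks up portals that were never started.

```python
# Hypothetical model of the failure; states and names are illustrative,
# not taken from the actual vis source.
NONE, WORKING, DONE = "none", "working", "done"

def naive_save(status):
    """Single-thread-style autosave: writes every portal's state as-is."""
    return list(status)

def resume(saved):
    """Resume only picks up portals that were never started."""
    status = list(saved)
    for i, s in enumerate(status):
        if s == NONE:
            status[i] = DONE  # pretend the portal was processed after resuming
    return status

def leftover(status):
    """Portals that would end up as 'not done' at the end of the run."""
    return [i for i, s in enumerate(status) if s != DONE]

# Four portals: portals 1 and 2 are mid-flight on other threads when ctrl+c hits.
interrupted = [DONE, WORKING, WORKING, NONE]
final = resume(naive_save(interrupted))
print(leftover(final))  # [1, 2]
```

In this toy model the two portals that were in flight at save time are never revisited, which matches the 'undone portals in between' described above.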
#69 posted by JneeraZ on 2009/11/05 15:30:35
Oh. OK, that explains why I wasn't able to VIS your map for you then (my resumes kept failing). Damn.
#70 posted by JneeraZ on 2009/11/05 15:33:24
Well, the source code is on Quaketastic so I'll see about making some time to look into it. It shouldn't be too difficult.
The solution, probably, is to turn off the threading when the autosave is ready to go, wait until the last portal finishes, save, and then start them all up again. We'll see...
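A minimal sketch of that pause, save, resume idea, written in Python rather than the tool's own language. The class, its methods and the sleep standing in for portal processing are all hypothetical, not taken from the vis source.

```python
import threading
import time

class PausableVis:
    """Toy model: worker threads claim portals; save() pauses them first."""

    def __init__(self, num_portals):
        self.status = ["none"] * num_portals
        self.cond = threading.Condition()
        self.paused = False   # True while a save is pending
        self.active = 0       # number of portals currently being processed

    def _claim(self):
        """Claim the next unstarted portal, or return None when none is left."""
        with self.cond:
            while self.paused:
                self.cond.wait()
            for i, s in enumerate(self.status):
                if s == "none":
                    self.status[i] = "working"
                    self.active += 1
                    return i
            return None

    def worker(self):
        while True:
            i = self._claim()
            if i is None:
                return
            time.sleep(0.002)  # stand-in for the real portal processing
            with self.cond:
                self.status[i] = "done"
                self.active -= 1
                self.cond.notify_all()

    def save(self):
        """Stop handing out portals, wait for in-flight ones, then snapshot."""
        with self.cond:
            self.paused = True
            while self.active > 0:
                self.cond.wait()
            snapshot = list(self.status)
            self.paused = False
            self.cond.notify_all()
        return snapshot

def run_demo(num_portals=40, num_threads=4):
    vis = PausableVis(num_portals)
    threads = [threading.Thread(target=vis.worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    snapshots = []
    while any(t.is_alive() for t in threads):
        snapshots.append(vis.save())
        time.sleep(0.005)
    for t in threads:
        t.join()
    return snapshots, vis.status
```

No snapshot taken this way contains a half-finished portal. The obvious cost is that every save has to wait for the slowest portal still in flight, which hurts when a single portal takes days.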
Thread Saving
#71 posted by Tuna on 2010/01/11 14:00:54
Ahoi,
Aroused by Spirit's coding bounties over at quaddicted.com, I thought I might take a look at this one.
Honestly, I haven't really dived into the code much. However, judging by the comments, I thought I'd try some blind thread synchronization upon saving.
Here is my first appalling version:
http://user.cs.tu-berlin.de/~tuna/WVis_thread_test.zip
I'm not even sure how to trigger the issue described. I tried ctrl-c-ing and continuing with the unmodified version and I didn't encounter the "warning" message.
If you have precise instructions on how to make it fail, please let me know.
Also, feedback on whether the issue has changed or improved with my version is highly appreciated.
#72 posted by JneeraZ on 2010/01/11 14:48:42
Hooray! Thanks man. I took a few shots at it but always came up short.
Awesome
#73 posted by Spirit on 2010/01/11 16:52:38
Could somebody explain how to properly test this?
#74 posted by JneeraZ on 2010/01/11 16:53:36
I think just get a map that takes a long time to VIS and stop/start it repeatedly.
#75 posted by Spirit on 2010/01/11 17:40:34
I tried gmsp3tw and it worked very well.
#76 posted by JneeraZ on 2010/01/11 17:42:05
But can you get a failure with the old version? The old version worked most of the time - until it didn't. :)
Yes
#77 posted by Spirit on 2010/01/11 18:24:14
I set the -savetime to 10 and interrupted it a lot. Got the error in return at the end.
Yes
#78 posted by Spirit on 2010/01/11 18:24:14
I set the -savetime to 10 and interrupted it a lot. Got the error in return at the end.
Tuna
#79 posted by megaman on 2010/01/12 12:03:53
Release the src, please.
#80 posted by Tuna on 2010/01/12 12:26:00
I wanted to wait for more feedback - or at least for no negative results. I will send the patch to Spirit later today or tomorrow when I have access to the machine again. I hope he can review it and decide whether it qualifies for the bounty or not :-)
Test It With Coag3_negke
#81 posted by negke on 2010/01/12 13:08:48
Yay For Tuna
#82 posted by Spirit on 2010/01/17 14:27:21
#83 posted by Spirit on 2010/01/17 17:11:19
#84 posted by Tuna on 2010/01/18 09:35:42
Call for testing! Please re-download the test version from my location above. It's a newer version which should be more efficient and hopefully also has the bug fixed.
#85 posted by Zop on 2010/01/18 10:22:15
When I use -nosave, I don't get %/time interval updates from the "full" part of the vis process.
#86 posted by Tuna on 2010/01/18 12:09:33
Updated the .zip file. I think this bug was also in the original version?
Now there is also a chance that a corrupted .vis file gets repaired when it is loaded.
Yes
#87 posted by negke on 2010/01/18 12:52:15
It indeed loads my corrupted state file and seems to be continuing the process normally.
Tuna
#88 posted by negke on 2010/01/18 15:41:07
Is this the 'idle' or the 'reset' version?
#89 posted by Tuna on 2010/01/18 16:13:10
This should be the 'right' version: No thread synchronization is done. Instead, undone portals are marked as such upon saving.
If no bugs are found this should become the recommended version.
In the end the fix could probably be done by changing one line. But I think it's OK to fix some other things while we are at it.
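For readers who want the idea spelled out, here is a small Python sketch of what marking undone portals on save, and repairing a bad state on load, could look like. This is a guess at the shape of the fix, not the actual patch: the state names and both functions are hypothetical.

```python
# Hypothetical sketch; the real state file format and status values may differ.
NONE, WORKING, DONE = "none", "working", "done"

def snapshot_for_save(status):
    """Write in-flight portals as 'not started' so a resume redoes them."""
    return [NONE if s == WORKING else s for s in status]

def repair_on_load(saved):
    """Treat anything that is not 'done' as 'not started' when loading."""
    return [s if s == DONE else NONE for s in saved]

# Saving while portals 1 and 2 are still being processed on other threads:
print(snapshot_for_save([DONE, WORKING, WORKING, NONE]))
# ['done', 'none', 'none', 'none']

# Loading an older state file that still contains half-finished portals:
print(repair_on_load([DONE, WORKING, DONE, WORKING]))
# ['done', 'none', 'done', 'none']
```

With this approach nothing has to wait on a save. The only work lost is whatever the in-flight portals had done so far, which gets redone after the resume.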
#90 posted by JneeraZ on 2010/01/18 16:20:33
Nice work, tuna!
I See.
#91 posted by negke on 2010/01/18 16:55:10
I'm currently testing it on my coag map. The ultimate goal would be to get all portals done except for the long one, thus making the PVS data as complete as possible. Problem is that I can't tell what's going on as the -verbose display isn't as clear as in single-thread mode. Or dunno..
#92 posted by Tuna on 2010/01/18 17:47:01
Hm. Not sure what the verbose messages are supposed to output. In theory, just keep an eye on the CPU usage. When the "Full" vis step is only using one core, you are there. It might be difficult to spot when you are on a single-core CPU. If the task manager displays the threads per process, you might keep an eye on those.