MTR Psi Testing
This is a "lab notes" page for my (MikeTheRed) Psi Testing, if anybody wants all the gory details.
Conventions: Numbers such as 95/16 always mean, e.g., psi strength 95, psi skill 16. MC = psionic mind control.
- 1 Background
- 2 Test 1: Basics
- 3 Test 2: Constants and Multiplication
- 4 Test 3: Focus on the Constant
- 5 Test 4: Varying Attack and Defense Strengths at Low MC Percent
- 6 Test 5: Varying Attack and Defense Strengths, including Panic testing
- 7 Test 6: Finding the ceiling
- 8 Test 8: Distance
- 9 Test 9: Fine Tuning
- 10 Psi Equation Finalized
- 11 Psi Testing Tips
The equations governing psionic success have long been a subject of mystery and debate. If they were better understood, players could compare their soldiers' stats against alien stats and know exactly how much Psionic Strength and Psionic Skill is needed, both as a minimum and at the point where a soldier is maxed out versus all aliens.
The Official Strategy Guide supplies psionic equations, but while they are very intriguing, they make no mathematical sense, as if the math operators are typo'd or something.
As part of my previous Experience testing ca. October 2005, I found that a 95/16 soldier directly next to a 25/0 muton had a MC success rate of 49% (1192/2420=49.26%), and that a 95/44 never failed. But I soon began to directly edit UNITREF.DAT experience counters for my tests, and thus didn't pursue in-game psi tests any more.
Ethereal Cereal's equations (inset):
Attack Strength (AS) = psi str * psi skill / 50
Defense Strength (DS) = psi str + (psi skill / 5)
Panic Attack chance = 44% + AS - DS
Mind Control Attack chance = 24% + AS - DS
For these equations, one is using the attack strength of one party versus the defense strength of the target.
Ethereal Cereal (EC) had a good number of data points, and the results appeared solid. However, they didn't agree with my one highly-tested point. His equation gives 29% for my 95/16 versus 25/0 situation, whereas I got 49% success.
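The mismatch is easy to reproduce by plugging the numbers into the inset equations. The following is only a sketch: the function names are made up for illustration, and the clamp to the 0 to 100 range is an assumption rather than something the notes confirm.

```python
# Sketch of Ethereal Cereal's inset equations.
# Function names are illustrative; the 0..100 clamp is an assumption.

EC_MC_CONSTANT = 24      # Mind Control constant, in percent
EC_PANIC_CONSTANT = 44   # Panic constant, in percent


def attack_strength(psi_str, psi_skill):
    """AS = psi str * psi skill / 50."""
    return psi_str * psi_skill / 50


def defense_strength(psi_str, psi_skill):
    """DS = psi str + (psi skill / 5)."""
    return psi_str + psi_skill / 5


def ec_chance(attacker, target, constant):
    """Percent chance for attacker (str, skill) versus target (str, skill)."""
    raw = constant + attack_strength(*attacker) - defense_strength(*target)
    return max(0.0, min(100.0, raw))  # assumed clipping to a valid percent


# 95/16 soldier versus 25/0 muton, Mind Control.
print(round(ec_chance((95, 16), (25, 0), EC_MC_CONSTANT), 1))
```

For 95/16 versus 25/0 this comes out near 29%, against the 49% actually observed in 2,420 attempts.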
One potentially important difference is that he had an alien MCing his soldiers, whereas I have my soldiers MCing aliens. Another potential difference is that he's using the WinCE Gold version, while I'm using the DOS 1.4 version (in DOSBox). I'd consider the alien-vs-human target-type proposition to be more likely, because as far as folks can tell, the programmers tried to make an exact replica of the DOS version for Windows (a few bugs notwithstanding). And the target-type proposition is testable, as well (if we have the time, fingers crossed!). Anyway, the results remained puzzling to me, but I didn't have time to do psi testing then.
Another reason it'd be good to know psi equations, is because effectiveness clearly decreases with distance. Again, how much strength and skill is needed to "totally rule", even across a large map? No one can say until the equations are deduced.
Test 1: Basics
For my test setup, I made a map with 16 mutons, and had 16 soldiers with psi amps that were directly next to and facing the mutons (zip file of it here).
For this first test, I wanted to keep it simple. I made all aliens 25/0, to compare with my earlier finding. Based on Ethereal Cereal's equations (see inset), if one attribute (Strength or Skill) is 100, the other can be up to about 50 before the MC success rate is clipped at 100%. To simplify, half my 16 guys had Strength=100, half had Skill=100, then I chose eight equally spaced points within the range that would be applied to Strength or Skill (whichever was not 100 for those 8 soldiers). I chose this "alternating 100" approach because it would test the symmetry and, if it did look symmetrical, the results could be combined to give them more power (a better correlation coefficient on a regression line).
When choosing the eight points from 0 to 50, I also took into consideration how my finding was different from EC's, by 20 points. But I screwed up here and aimed for 0 to 70+... actually my finding was lower than his, so 0 to 50 would've encompassed all concerns. Anyway, I made my soldiers be 9 to 72 in the non-100 attribute.
Very quickly I saw that the higher half of each stick of eight was MCing 100%. So I limited my testing to the lower half of each stick (one attribute = 9, 18, 27, or 36; other attribute 100). I aimed for 50 trials, but went over by one, so N=51 per soldier. Each soldier only made one MC attempt per game turn (otherwise, it's problematic to have the target right next to them). I used the psi experience counter in UNITREF.DAT to determine success, because it's much less error prone than counting manually (this work is SO tedious!). But it counts by 1 for a failed attempt and 3 for a successful attempt, so this shows the success rate:
Successes = (UR - Attempts)/2 = (UR - 51)/2
Success rate = Successes/Attempts = Successes/51
(where UR is the soldier's psi experience counter in UNITREF.DAT)
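As a sketch of that conversion, the helper below turns a counter reading back into a success count. The function name and the example counter value of 91 are hypothetical; the notes only report the resulting success counts, not the raw counters.

```python
def successes_from_counter(counter, attempts):
    """Recover successes from the psi experience counter.

    Each failed attempt adds 1 and each success adds 3, so
    counter = attempts + 2 * successes.
    """
    extra = counter - attempts
    if extra < 0 or extra % 2 != 0 or extra // 2 > attempts:
        # An impossible reading usually means the attempt count is off.
        raise ValueError("counter does not match the number of attempts")
    return extra // 2


# Hypothetical reading: 51 attempts and a counter of 91 means 20 successes.
wins = successes_from_counter(91, 51)
print(wins, f"{100 * wins / 51:.1f}%")
```

The error check reflects the point made in the notes that the counter doubles as a sanity check on the attempt count: an odd or out-of-range remainder means a turn or a soldier was miscounted.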
Given 51 attempts per soldier, the results were:
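| Soldier | Str | Skl | MC successes (from UNITREF.DAT) | Success rate | Pooled attrib. | Pooled successes | Pooled success rate |
|---|---|---|---|---|---|---|---|
| 01 | 9 | 100 | 20 | 39.2% | 9 | 39 | 38.2% |
| 02 | 18 | 100 | 29 | 56.9% | 18 | 62 | 60.8% |
| 03 | 27 | 100 | 48 | 94.1% | 27 | 96 | 94.1% |
| 04 | 36 | 100 | 51 | 100.0% | 36 | 102 | 100.0% |
| 09 | 100 | 9 | 19 | 37.3% | | | |
| 10 | 100 | 18 | 33 | 64.7% | | | |
| 11 | 100 | 27 | 48 | 94.1% | | | |
| 12 | 100 | 36 | 51 | 100.0% | | | |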
Although I suspected that the soldiers with an attribute level of 36 were maxed, I kept testing them, both to be entirely sure they were maxed, and as a check that I counted the number of attempts correctly, my UNITREF approach was working, etc.
The results do appear to be symmetrical, so they can be combined as shown in the pooled results above. N=102 for each of the four points, but the attribute=36 point is probably clipped (above the maximum) so it should be dropped from regression analysis. A regression line then drawn for success rate versus attribute level for the three points above (with the other attribute fixed at 100) gives the following for y=mx+b equations, where y is percent MC success, x is attribute level, and m is slope of line:
| Int=0? | m | b | R2 | 100% Intercept |
|---|---|---|---|---|
| no | 3.10% | 8.50% | 0.9877 | 29.5 |
| yes | 3.51% | 0.00% | 0.9682 | 28.5 |
In other words, when one attribute (psi strength or skill) is held to 100, every point of increase of the other attribute increases MC success by 3.1% (if the regression line is not forced to go through success=0 at attribute=0) or 3.5% if the y intercept is set to 0. Either approach (intercept=8.5% or forcing it to zero) is problematic, however, because the game is liable to be clipping if values fall below zero, and Attack Strength minus Defense Strength has the potential to result in negative numbers which get clipped.
The 100% success ceiling is reached when the non-100 attribute equals ~29. But that's only an estimate, because it's an estimate of a line that's not perfectly correlated. The finding of psi 100/36 giving 100% success (N=102) is in agreement, though... it's probably somewhere around 29.
The coefficient of determination (R squared) is high.
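The two regression lines can be re-derived from the three pooled points with ordinary least squares. This is a sketch only: the helper names are made up, and the original fit was presumably done in a spreadsheet, so the last digits may differ slightly.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x


def fit_through_origin(xs, ys):
    """Least squares for y = m*x with the intercept forced to zero."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Pooled points (attribute level, percent MC success); the 36 point is
# dropped because it is probably clipped at 100%.
levels = [9, 18, 27]
rates = [100 * 39 / 102, 100 * 62 / 102, 100 * 96 / 102]

m, b = fit_line(levels, rates)
m0 = fit_through_origin(levels, rates)

# Attribute level at which each line reaches 100% success.
print(f"free intercept: m={m:.2f} b={b:.2f} ceiling={(100 - b) / m:.1f}")
print(f"through origin: m={m0:.2f} ceiling={100 / m0:.1f}")
```

Both fits should land close to the slopes and 100% intercepts in the table above (about 3.1 and 3.5 percent per attribute point, with the ceiling near 29).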
Clearly, MC is more effective for me (i.e., lower attributes needed) than predicted by EC's equations. Until such time as it can be further understood, then, it appears that aliens are different somehow versus X-COM soldiers, such that they are less effective for a given psi strength and skill.
Test 2: Constants and Multiplication
For the next test, I wanted to make sure that the underlying assumptions are as we expect them to be:
- What is the constant in the MC equation? (EC believes it's 24%)
- Is multiplication correct for Attack Strength? (Or might psi skill or strength contribute in an additive or subtractive way?)
These were addressed by:
- Setting the muton targets to psi strength of 0. (They're already skill 0.) This has the effect of making any MC attack constant "stand clear" of any potential obfuscation caused by misunderstanding the Defense Strength.
- Setting soldiers' psi strength to 0, and varying their skill. If possible, I would have interlaced strength 0 and skill 0 for soldiers, but a skill of 0 disables psi capability. Anyway, setting one of the two variables in the Attack Strength equation (see Background above) to 0 should make it all equal 0, if it only uses multiplication or division (but not if it uses Skill with addition or subtraction somehow).
The soldiers were put into groups of four based on psi skill, to increase sampling and allow pooling at those points. Psi skills and results were:
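(30 attempts each)

| Soldier | Skill | Successes | Percent Success |
|---|---|---|---|
| 01 | 1 | 12 | 40.0% |
| 02 | 1 | 10 | 33.3% |
| 03 | 1 | 10 | 33.3% |
| 04 | 1 | 11 | 36.7% |
| 05 | 50 | 13 | 43.3% |
| 06 | 50 | 12 | 40.0% |
| 07 | 50 | 13 | 43.3% |
| 08 | 50 | 11 | 36.7% |
| 09 | 100 | 12 | 40.0% |
| 10 | 100 | 16 | 53.3% |
| 11 | 100 | 15 | 50.0% |
| 12 | 100 | 17 | 56.7% |
| 13 | 255 | 11 | 36.7% |
| 14 | 255 | 9 | 30.0% |
| 15 | 255 | 14 | 46.7% |
| 16 | 255 | 15 | 50.0% |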
This can be summarized:
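| Psi Skl | Successes | N | Min | Ave | ± SDs | Max |
|---|---|---|---|---|---|---|
| 1 | 43 | 120 | 33.3% | 35.8% | 3.2% | 40.0% |
| 50 | 49 | 120 | 36.7% | 40.8% | 3.2% | 43.3% |
| 100 | 60 | 120 | 40.0% | 50.0% | 7.2% | 56.7% |
| 255 | 49 | 120 | 30.0% | 40.8% | 9.2% | 50.0% |
| Overall | 201 | 480 | 30.0% | 41.9% | 7.7% | 56.7% |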
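The summary rows can be rebuilt from the per-soldier success counts. This sketch assumes the SDs column is the sample standard deviation of the individual soldiers' success rates, which is what reproduces the listed figures; the names are illustrative.

```python
from statistics import mean, stdev

ATTEMPTS = 30  # MC attempts per soldier in Test 2

# Per-soldier success counts, grouped by soldier psi skill.
successes_by_skill = {
    1: [12, 10, 10, 11],
    50: [13, 12, 13, 11],
    100: [12, 16, 15, 17],
    255: [11, 9, 14, 15],
}


def summarize(counts, attempts=ATTEMPTS):
    """Pool one group and describe the spread of individual soldier rates."""
    rates = [100 * c / attempts for c in counts]
    return {
        "successes": sum(counts),
        "n": attempts * len(counts),
        "min": min(rates),
        "ave": mean(rates),
        "sd": stdev(rates),  # sample SD across soldiers (assumed)
        "max": max(rates),
    }


for skill, counts in successes_by_skill.items():
    s = summarize(counts)
    print(skill, s["successes"], s["n"], f"{s['ave']:.1f} +/- {s['sd']:.1f}")

everyone = [c for counts in successes_by_skill.values() for c in counts]
overall = summarize(everyone)
print("Overall", overall["successes"], overall["n"], f"{overall['ave']:.3f}")
```

The overall average is where the 41.875% figure for the constant comes from (201 successes in 480 attempts).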
There is considerable variation; more samples sure would help. But this work is SO tedious!
Although the success rate for the 1-50-100 skill progression hints at a trend, Skill=255 is back down to the level of Skill=50 (and the single lowest soldier success rate, a 9, is in this group). I see no clear difference between the groups, especially if you consider that Skill=255 should be MUCH higher than the lower groups, if Skill were indeed influencing the groups. So, while the data is messy, I buy that it's a multiplicative equation, at least insofar as Skill is concerned.
The best guess at the constant is 41.875%, although it could easily be 40% or something else, with so much variability present. Notice how Test 1 could be taken to hint that the constant is 33.5 (alien psi strength of 25 plus b intercept of 8.5). However, there is considerably more error in an extended regression line... there's a fair amount of variability in the slope, which gets compounded by extending it down to zero. In any event, it also hints at a fairly high constant.
As things become clearer, the data from Test 1 may be able to be pooled with other data in order to pin down the constant better, but for now, 41.875 is the best guess.
Test 3: Focus on the Constant
As I thought about what to do next, it occurred to me that I didn't really like/trust that high constant (42%). So I decided to test it more.
3a: Double-check fails
If the constant is 42%, then mutons at 25/0 should get MCed at least some, even if soldiers are set to Strength 0. Specifically, the equations predict it should be approx. 17% of the time (~42% - DS 25%). So I set half my soldiers to 0/1 and the other half to 0/255, thinking to also double-check whether multiplication matters, as I went.
But it quickly became clear that something was wrong. Not a single MC worked, in a total of 147 MC attempts (split about evenly between the two groups). At 17%, I should've seen ~25 successful MCs. So something is wrong with the equation (or my understanding of it, or my testing). So now I'm backing up to see what's the lowest level for a muton's psi strength, where I am able to MC some. This might give important insights into the equation(s).
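To put a number on how surprising zero successes is, here is a small sketch that treats every attempt as an independent trial with the predicted 17% chance. The independence assumption and the function name are mine, not something established by the tests.

```python
def prob_zero_successes(p, attempts):
    """Chance of seeing no successes at all in independent attempts."""
    return (1 - p) ** attempts


PREDICTED_RATE = 0.17   # ~42% constant minus Defense Strength of 25
ATTEMPTS = 147          # total MC attempts in Test 3a

expected = PREDICTED_RATE * ATTEMPTS
p_none = prob_zero_successes(PREDICTED_RATE, ATTEMPTS)

print(f"expected successes: {expected:.0f}")
print(f"chance of zero successes: {p_none:.1e}")
```

Under those assumptions the chance of a clean zero is on the order of one in a trillion, so bad luck can be ruled out as the explanation.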
3b: Playing with muton Psi Strength, with soldiers at Psi Strength 0
First off: Did setting mutons to strength 0 (in Test 2) somehow totally muck up the equations? To test this, I'm setting some mutons to 1/0, and some to 12/0 (halfway to 25/0). Soldiers will stay at half 0/1 and half 0/255.
These results show that:
- For one thing, MCs ARE seen, for mutons at both 1/0 and 12/0. So Test 3a must have hit a "ceiling" - somewhere between 12/0 and 25/0, soldiers can no longer MC when at psi strength=0. This also means that 0/0 mutons don't do weird things due to being 0/0 (read on)...
- Soldier psi skill continues to appear to not matter if their psi strength is 0: at skill=1, MC% is 31.1% ± 12.0% (N=280); at skill=255, MC% is 34.3% ± 15.0% (N=280). While the MC% is a little higher at skill=255, there's a huge overlap (i.e., little or no difference), whereas psi skill itself has changed in the extreme (from 1 to 255). Further, one of the soldiers with skill=255 actually had the lowest MC% value (10.0% vs 17.5% for skill=1; N=40 per individual soldier). At this point, I'm pretty satisfied that MC Attack uses multiplication, at least insofar as psi skill is concerned. Probably psi strength, too, but you can't make psi skill go to 0 to directly test that.
- Conversely, muton psi strength definitely makes a difference. With mutons at 1/0, we see MC% 44.3% ± 5.1% (range 37.5 to 52.5%, N=280), and at 12/0, MC% is 21.1% ± 5.9% (range 10.0 to 27.5%, N=280). Notice how even the extremes do not overlap - the 12/0 max is 27.5%, and the 1/0 min is 37.5%. Also note how the percent chance is decreasing as 25/0 is approached. The 41.9% ± 7.7% seen with a 0/0 muton (Test 2) is also roughly in agreement with this (it's close to the 44.3% ± 5.1% seen with 1/0).
At first glance, the results seem to be suggesting that muton Defense Strength is increasing by roughly 2 times their psi strength. I say this based on the equation:
MC% = Constant - Defense Strength (Attack Strength is presumably zero when soldiers are Strength 0)
If MC% goes from 40+ to 0 on the way from muton psi strength 0 to 50, but the actual muton psi strength is only 25... you get the idea. A regression line through the three(!) points (0/0, 1/0, and 12/0) gives MC% = -1.8895*Strength +43.932, where Strength = muton Psi Strength (R2=.9715). Although it's only based on three points so far, it looks intriguingly like:
MC% = 44 - 2 * Muton Psi Strength (when Attack Strength is zero)
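The three-point fit can be reproduced with a plain least-squares line. This sketch uses the rounded averages quoted in the text (41.875%, 44.3%, and 21.1%), so the slope and intercept come out close to, but not digit-for-digit equal to, the -1.8895 and 43.932 given above; the helper name is illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x


# (muton psi strength, observed MC%) with soldiers at psi strength 0.
muton_strength = [0, 1, 12]
mc_percent = [41.875, 44.3, 21.1]

slope, intercept = fit_line(muton_strength, mc_percent)
print(f"MC% = {slope:.3f} * Strength + {intercept:.3f}")

# Muton strength at which the fitted line reaches 0% MC chance.
print(f"zero crossing near strength {-intercept / slope:.1f}")
```

The zero crossing of the fitted line falls in the low 20s, which fits the observation that 25/0 mutons could not be MCed at all by strength-0 soldiers.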
Perhaps this new wrinkle will help explain some of the differences seen.
Ethereal Cereal's comments
Given the formulas I discovered, with Psi str = 0 (Attack Strength = 0), base MC chance is 24 - Defense strength. So Mutons, at Psi str 25, give chance = -1%, which would explain the ceiling you experienced.
I used your savefile and hacked the soldiers to Str 0 and the Mutons to Str 21. I was able to succeed in 6 of 160 MC attempts: 3.75%, slightly above the 3% my formula would predict. I used a simple testing method: I made sure each of the 16 flanking soldiers MCed exactly twice, and counted the number of controlled Mutons at the end of each turn. That might be a faster method to use than yours while still being fairly error-proof.
(Re-reading your above text, I see that MCing once per soldier is better -- necessary, even -- when success rates are high. But counting controlled Mutons once per turn may be faster than counting Psi Skill counters.)
I suggest you set all soldiers' skill to 50 and str to 51 (or any combo that will produce 2550 when multiplied) and see if you get 50% success rates when MCing Mutons of psi str 25.--Ethereal Cereal 21:03, 3 September 2006 (PDT)
Okay. I took your savegame, hacked all the soldiers to 255 TUs, Psi Str 50, Skill 77, and all the Mutons to Psi Str 100, Skill 0. media:AS77DS100.zip
In 640 MC attempts, I had 13 successes, a 2% success rate, where my formula predicts 1%. When I upped the Mutons' Psi Str to 101, I had 0 successes in 480 attempts. The 1% vs. 2% discrepancy might be attributable to the as-yet-undiscovered distance portion of the equation.
Your formula (MC chance = 42% + AS - 1.75xDS) would predict a 42 + 77 - 175 = -56% chance. Test out the above savegame, tell me what results you experience.
I suggest at some point switching your testing to Panics, as you can repeatedly panic units, unlike MCs. Try 50/31 vs. 25/0 for a predicted 50% rate with Panics.--Ethereal Cereal 16:08, 4 September 2006 (PDT)
Good idea on the Panics, Ethereal... let me finish my current round (4d) with MCs, then switch to Panics, I guess.
Could you do me a BIG favor... try exactly my Test 2, so we can make sure there's not anything different going on for your setup that's different from mine? That would be mutons 0/0 and soldiers strength 0 (and skill shouldn't matter, but why not do half skill=1, half skill=255). I'd really appreciate it, just so we know you get that same 42% (or not). Don't you think your formula would predict 24% there? Try to get at least 200 N, but 400 or more would be really nice. It's real grinding work, though... I love the low MC% approach!
I'll likewise do a direct test of your 50/77 soldiers versus 100/0 mutons savegame, to see what I get.
I recently updated my testbed savegame (same link as it's been, at the beginning of Test 1)... that earliest version had oddities like, Mutons were not facing their same-numbered soldier (if you want to do specific "1 on 1" testing), and even, one of the mutons had a fatal wound that led to him dying after 30 turns or so, lol.
BTW, I didn't realize that quote marks used for italics needed to be closed at the end of a line. I figured they operated like "line specific comments" versus "global comments" in SAS programming (or whatever). I double-checked and you're absolutely right. I'll stop doing that. I wonder if our author guides point this out? Unfortunately, I've already littered many of my posts with non-closed quotes. Anyway. I'll delete this particular paragraph after you have a chance to see it.
---MikeTheRed 17:40, 4 September 2006 (PDT)
3d: Finalizing results for Soldiers at Psi Strength 0
Thanks for the comments, EC. Your data fits right into the graph I've made (inset).
I've done more testing including Test 3c (not shown per se) and 3d. 3c put more points on the graph for a 6/0 muton (MC% 25.8% ± 11.3%, range 10.0% to 45.0%, N=240) and a 19/0 muton (MC% 12.5% ± 3.2%, range 7.5% to 17.5%, N=280). Then I added your point at 21/0 (MC% 3.75, N=160) and, using your cogent comment about multiple MCs, I jacked the soldiers to 255 TUs (allows 10 MCs per turn) and set half the mutons to 23/0 and the other half to 24/0. With the expected MC% very low, I could go to town with the testing. At 23/0, I saw MC% 1.6% ± 1.3%, range 0.0 to 3.4%, N=704. The 24/0 mutons never got MCed (N=704), so that looks real solid.
The 6/0 data point seems low, but what can you say... that's sampling for you.
Doing multiple MCs per turn when the expected rate is very low seems like a great idea, and testing went much faster. When you only do one MC per turn, a large percent of your time is spent simply selecting each soldier, psi amp, and target... this really speeds it up. But in practice, I wonder if it might be problematic. Several times, my soldiers got more than one MC in a turn, which made me use other nearby mutons... but if the success rate is very low, and the decrease due to distance is also real low, it's conceivable that these two might collide and e.g. farther mutons may get what very little percent chance of success they have, obviated by the additional distance.
In order to control for this, one could stop MCing with a particular soldier, if they have a success. Then keep track of how many times they didn't do "all" their MCs, and in the end, subtract those missed chances from the number of attempts for the soldier. This is one way to avoid a potential problem, and still lets you move pretty fast (much more often than not, they don't have any success even with ten attempts, when rates are this low).
All in all, I like the speed boost due to testing "low ends" of expected ranges. So perhaps I can think of ways to use this in my next tests.
Test 3 Summary
It seems pretty solid that the MC Attack equation is using multiplication, at least insofar as the soldiers' psi skill is concerned. No clear effect due to varying psi skill from 1 to 255 was ever seen, when soldiers' psi strength was pegged at zero.
It also seems clear that Defense Strength actually works as a multiple (probably target psi strength times 1.75) instead of being used "straight up" as the initial belief was.
Finally, the constant for MC Attack chance appears to be ~42.
The MC equation can therefore be tentatively modified at this point to be:
Mind Control Attack chance = 42% + AS - 1.75xDS
Although in truth at this point it can't be discerned whether that 1.75 should be placed in the MC equation per se, or directly in the DS equation. The difference being that it would affect Panic success also, if placed in the DS equation.
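A quick sketch of the tentative equation, checked against the strength-0 soldier results from Test 3. The function names are made up, and the value is deliberately left unclipped so that negative predictions stay visible; here the 1.75 is applied inside the MC equation, which is only one of the two placements discussed above.

```python
def attack_strength(psi_str, psi_skill):
    """AS = psi str * psi skill / 50."""
    return psi_str * psi_skill / 50


def defense_strength(psi_str, psi_skill):
    """DS = psi str + (psi skill / 5)."""
    return psi_str + psi_skill / 5


def tentative_mc_chance(attacker, target):
    """Tentative Test 3 equation: 42 + AS - 1.75 x DS (percent, unclipped)."""
    return (42
            + attack_strength(*attacker)
            - 1.75 * defense_strength(*target))


# Soldiers at 0/1 versus mutons of increasing psi strength (skill 0).
for muton_str in (0, 12, 19, 23, 24, 25):
    chance = tentative_mc_chance((0, 1), (muton_str, 0))
    print(f"muton {muton_str}/0: {chance:.2f}%")
```

With these numbers the predicted chance reaches zero at muton strength 24, which lines up with the 24/0 mutons that never got MCed in 704 attempts and the 1.6% seen at 23/0.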
Ethereal, I'm not sure why you are still calling the constant 24%, when it seems pretty clear that it's higher. (Compare the MC% that 24% gives for AS=0 and DS=0, versus the inset graph.) Let me know if I'm missing something!
Next I guess I will probe the lower edges of the Attack Strength and Defense Strength equations, to verify that they operate as believed. I could shoot for 50% like you suggest EC, but I want to go with the faster low-success approach for now. (It's such a refreshing change!) FWIW, I still prefer using the Unitref psi experience counter, because it also lets you know if you somehow messed up and, e.g., skipped a soldier during testing, or got your number of attempts wrong because of running on for one more (or one less) turn than you wanted. Also, counting MCed aliens every turn introduces some time spent every turn (with a slight potential for error in writing down findings), whereas the Unitref check only happens once. It's not hard to simply leave EDIT running in the target directory, then re-open Unitref again (shrug).
Test 4: Varying Attack and Defense Strengths at Low MC Percent
For these tests, I will vary soldiers' psi strength and skill, and alter mutons' psi strength so that a very low MC% is expected (1 to 3%). Muton psi skill will be kept at 0. A low expected MC% lets me MC much faster.
Starting now, I am skipping the remaining potential MCs for soldiers who successfully MC that turn. I'm noting down how many they skip (i.e., had left to do), and later comparing notes against Unitref counts (subtracting out how many they skipped versus the total potential number), as usual. Skipping remaining turns ensures that no problems creep in due to turning to farther-away mutons, which may confound the results, especially when deliberately aiming for very low success rates.
Otherwise, for these tests, soldiers have 255 TUs and can attempt 10 MCs each turn, which makes for speedy testing.
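The bookkeeping for skipped attempts can be sketched as follows. All the numbers in the example are hypothetical, and the function name is made up; the only facts taken from the notes are the 10 attempts per turn and the counter adding 1 per failure and 3 per success.

```python
def mc_result(turns, attempts_per_turn, skipped, counter):
    """Return (successes, attempts, percent) for one soldier.

    skipped is the number of attempts the soldier did not make because
    he stopped for the turn after a successful MC.
    """
    attempts = turns * attempts_per_turn - skipped
    extra = counter - attempts  # each success adds 2 beyond the attempt
    if extra < 0 or extra % 2 != 0:
        raise ValueError("notes and counter disagree; recheck the skips")
    successes = extra // 2
    return successes, attempts, 100 * successes / attempts


# Hypothetical soldier: 10 turns of up to 10 MCs, 7 attempts skipped,
# and a final psi experience counter of 97.
print(mc_result(10, 10, 7, 97))
```

If the skip notes and the counter cannot be reconciled, the helper raises an error instead of returning a rate, which mirrors the cross-check against Unitref described above.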
4a: A stab in the dark
I used my equation above, and pulled soldiers at 10/100 or 100/10 out of the air, versus mutons at 35/0. This should have produced a 1% MC rate.
But testing quickly showed that this produced a higher rate than expected. Specifically, 7.5% ± 5.8% (N=88). This is too high for speedy low-end testing, so I stopped after only one round. Back to the drawing board.
My formula predicts 9% success rate for the above figures.--Ethereal Cereal 16:10, 4 September 2006 (PDT)
4b: Approximating the low end
I kept the same soldier setup, but upped all mutons to 39/0. This still produced a rate that was a little high, 5.6% ± 4.0% (range 0.0 to 10.0%, N=223). So I stopped after two rounds. I'll increase the mutons' psi strength a little more.
4c: Low end found
I increased mutons to 43/0, and kept soldiers at half 100/10, half 10/100. This produced only 25 MCs in 1,492 tries, or 1.68%. That's a nice low rate.
10/100 soldiers MCed 2.35 ± 1.25% of the time (N=732), while 100/10 did 1.05 ± 1.07% of the time (N=760). This might or might not be a real difference. I will move on to testing the "farther reaches" of this low MC% level with mutons at 43/0, and changing the soldiers so they retain a strength-times-skill product of 1,000, but vary psi strength versus skill as much as possible. Since 255 is the limit for those bytes, this means 4/250 and 250/4. This should bring out any non-multiplicative difference that this first test hints at (2.35 versus 1.05%) by pushing strength and skill to opposite extremes.
If groups are combined, the results are 1.68 ± 1.26% (N=1,492).
For the record: My standard deviations are for individual soldiers' results. Sometimes there are 3 or 4 soldiers in a group; sometimes 8 (as for here).
EC, I just now saw your note. In checking, I see that your equation would produce 1% here. But mine predicts -13% (ouch). At this point, I don't know what to make of my findings when soldiers and mutons are zeroed. Is there some way the math can be treated to encompass both findings? See also my reply to your "Test 2".
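To see where the two candidate equations agree and disagree, the sketch below lines up their predictions against the observed rates reported so far. The function names are illustrative, and predictions are left unclipped so negative values show up as such.

```python
def attack_strength(psi_str, psi_skill):
    return psi_str * psi_skill / 50


def defense_strength(psi_str, psi_skill):
    return psi_str + psi_skill / 5


def ec_mc(attacker, target):
    """Ethereal Cereal: 24 + AS - DS."""
    return 24 + attack_strength(*attacker) - defense_strength(*target)


def mtr_mc(attacker, target):
    """MikeTheRed (tentative): 42 + AS - 1.75 x DS."""
    return (42 + attack_strength(*attacker)
            - 1.75 * defense_strength(*target))


# (label, soldier str/skill, muton str/skill, observed MC%)
SCENARIOS = [
    ("Test 2", (0, 1), (0, 0), 41.9),
    ("Test 4a", (10, 100), (35, 0), 7.5),
    ("Test 4b", (10, 100), (39, 0), 5.6),
    ("Test 4c", (10, 100), (43, 0), 1.68),
    ("EC savegame", (50, 77), (100, 0), 2.0),
]

for label, soldier, muton, observed in SCENARIOS:
    print(f"{label:12s} EC={ec_mc(soldier, muton):7.2f} "
          f"MTR={mtr_mc(soldier, muton):7.2f} observed={observed}")
```

Laid out this way, the 24% equation tracks the low-percent results with non-zero strengths, while only the 42% equation comes near the zeroed-out Test 2 result.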
4d: Finishing out, Attack Strength as a simple multiplier: Confirmed
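| Soldier Str | Soldier Skl | Muton Str | Min | Ave ± SDs | Max | N |
|---|---|---|---|---|---|---|
| 4 | 250 | 43 | 0.00% | 1.43 ± 1.23% | 3.57% | 1,049 |
| 250 | 4 | 43 | 0.71% | 1.71 ± 0.92% | 2.86% | 1,053 |
| Overall | | | 0.00% | 1.57 ± 1.06% | 3.57% | 2,102 |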
As can be seen, soldiers at 4/250 are pretty much identical to soldiers at 250/4, and also identical to 4c's results, 1.68 ± 1.26% (N=1,492). If data from 4c and 4d are pooled, we get a combined average of 1.61% (58/3594), with an SDs of approx. 1.0%.
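Pooling here just means adding up successes and attempts before dividing, rather than averaging the two percentages. A minimal sketch follows; the function name is made up, and the 33 successes for 4d are inferred from the pooled total of 58 minus 4c's 25.

```python
def pool(*samples):
    """Combine (successes, attempts) pairs into one overall rate."""
    successes = sum(s for s, _ in samples)
    attempts = sum(n for _, n in samples)
    return successes, attempts, 100 * successes / attempts


test_4c = (25, 1492)
test_4d = (33, 2102)   # inferred: 58 pooled successes minus 25 from 4c

pooled = pool(test_4c, test_4d)
print(pooled[0], pooled[1], f"{pooled[2]:.2f}%")
```

Weighting by attempts this way keeps the larger 4d sample from being under-counted relative to 4c.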
Altogether, I think these results provide two important confirmations:
- Psi strength and skill are used in a straight multiplicative way when figuring Attack Strength. Neither is disproportionately favored, as psi strength seems to be for Defense Strength.
- The lower end (and presumably higher, too) of MC Attack chance forms an isotonic level (the same chance everywhere), for a given value of strength times skill. Here, it was always the same for strength times skill equals 1,000, regardless of whether formed by 4/250, 10/100, 100/10, or 250/4.
Next I'll run EC's scenario (Under his Test 2, above) to check that we get the same results.
Test 5: Varying Attack and Defense Strengths, including Panic testing
In this section, I'll be trying various AS and DS values to probe the "edges" of the MC space, and also transitioning to Panic attacks. Like Ethereal said, this will make testing easier, especially at high success rates... targets can take multiple Panics, making for high speed whether the success rate is high or low.
5a: Replicated Ethereal's Test 2
As per Ethereal's request, I tested 50/77 soldiers versus 100/0 mutons. My result is 2.16 ± 2.33%, range 0.00 to 8.47%, N=880. EC got 2.03% (13/640), a practically identical result. The weighted average of our results is 2.11% ((13+19) / (640+880)). Of course, this is nowhere near the -56% predicted by my 42%+AS-1.75*DS (which would actually show up as 0.0%, but anything greater than zero means it's way off). So:
- There's an important confirmation that our setups are in sync (IOW, it's now highly unlikely that DOS 1.4 vs. WinCE Gold edition makes any difference).
- Clearly there's something wrong with my equation, which was based on MCer and MCee having 0 psi strength. But what could be happening at that low end? I don't have any reason to think my results are incorrect. Ethereal, I hope you can find the time to replicate my Test 2, so we know it's not something weird on my end. If you want to, do it with Panic attacks... we're sure to pin down the difference in constants between MC and Panic (it's probably 20% anyway), so that should make testing a lot easier for you.
By the way: If/when we want to test Mutons with psi skill greater than 0, we can give them 24 or less TUs so they can't MC us and mess everything up.
Well, dammit. I did a 0/1 vs. 0/0 panics-only test, and I got 258/320 successes using your Psi experience counter method: 80.625%, where my formula would predict 44%. Did I screw up my data collection? Does the Psi counter do something strange we're not aware of?
You're right, testing in this fashion is tedious. I'm gonna propose one last test: 91/50 vs. 25/0 Mutons, all MCs. It's not good for mass data collection, but if my formula is correct, only about 1 in 10 MCs should fail. Just eyeball the results in-game for a few rounds.--Ethereal Cereal 22:33, 4 September 2006 (PDT)
5b: Comparing the Panic Attack chance
In order to use Panic instead of MC - and just to test it in general - I'll compare it to the findings for 4c and 4d, which give a nice big N for soldiers at 4/250 or 250/4 and mutons at 43/0.
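| Soldier Str | Soldier Skl | Muton Str | Min | Ave ± SDs | Max | N |
|---|---|---|---|---|---|---|
| 4 | 250 | 43 | 33.33% | 36.77 ± 3.04% | 42.50% | 960 |
| 250 | 4 | 43 | 31.67% | 35.42 ± 3.65% | 40.83% | 960 |
| Overall | | | 31.67% | 36.09 ± 3.32% | 42.50% | 1,920 |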
As expected, soldiers at 4/250 seem identical to soldiers at 250/4. So there is probably the same isotone and strength-or-skill equivalency for Panicking as there is for MC.
What's unexpected is that the constant is not 20% higher than the 1.61% ± ~1.0% (58/3594) seen for MC with the same psi settings. The difference is ~34.5%. Ethereal, how much did you test Panicking back in May?
- The way I tested it was to have an Ethereal trapped by Psi-strong troops psi a single adjacent psi-weakling. I ran several tests to determine the 1%/0% chance threshold for both MCs and panics. I never saw a result which suggested anything other than 20% difference -- at least at the threshold. The irregularities that are showing up here might be indicative that the formula isn't linear above the 1% mark, although every test I ran with varying AS and DS combos confirmed the formula correct at the 1% mark.--Ethereal Cereal 10:39, 5 September 2006 (PDT)
If the difference is a constant of 34.5%, my approach (42+AS-1.75*DS) would predict ~76.5% Panic Success for 0/1 soldiers versus 0/0 mutons. Your equation (24+AS-DS) would predict ~58.5%. At first glance, I'd say your results support my equation, at least for the 0/1 and 0/0 scenario... but we also see that my equation fails in other scenarios. It's a head scratcher, but I'm sure we'll figure it out.
Some notes on Panicking
While doing testing for 5b, I made notes about how often Berserk or Panic was seen. By the third turn, every single muton was freaking out; only once in 10 turns did one of the 16 mutons not freak. (I tested for 12 turns, but started counting Berserk rates on the third turn.) So they were undoubtedly pegged at Morale 0, or close to it. A check of Unitref after ending the last test turn showed all 16 mutons at Morale 15, as expected after one "recovery" freakout. There were 55 Berserks out of the 159 freakouts, for a rate of 34.6% Berserk versus Panic.
- For more on Berserk versus Panic percents, I summarized all my data on the Berserk page
If 36.09% is the correct Panic percent from test 5b, the chance of at least one successful psi Panic attack being performed by soldiers doing 10 attacks is:
1 - (1 - 36.09%)^10 = 1 - 63.91%^10 = 1 - 1.14% = 98.86%
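For reference, the chance of at least one success in n attempts is one minus the chance that every attempt fails, i.e. 1 - (1 - p)^n. The sketch below assumes the attempts are independent; the function name is illustrative.

```python
def at_least_one_success(p, attempts):
    """Chance of one or more successes in independent attempts."""
    return 1 - (1 - p) ** attempts


PANIC_RATE = 0.3609   # panic success rate from test 5b
ATTEMPTS = 10         # panic attempts per soldier per turn

chance = at_least_one_success(PANIC_RATE, ATTEMPTS)
print(f"{100 * chance:.2f}%")
```

With a 36.09% rate and 10 attempts this works out to a little under 99%, so a muton escaping a whole turn without a single successful panic is rare but not impossible.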
Mutons have 80 Bravery, so successful psi Panic attacks immediately reduce their Morale by 30 (110-Bravery). The expected average number of successful attacks per turn in the current scenario (10 attempts/turn) is simply 10 times 36%, or an expected average of 3.6 successful panic attacks. 3.6 attacks would reduce mutons' Morale by 108 (3.6x30) or, in other words, one turn of 10 attacks at a 36% rate knocks a muton to 0 Morale, on average. By two turns, Morale at or near 0 is practically guaranteed.
Mutons (and all other units) recover 15 Morale points after undergoing Panic/Berserk. However, if soldiers are averaging 3.6 successful attacks, Mutons will always be at or near zero after one or two turns to knock out their starting Morale. (All units start combat with 100 Morale.)
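The morale arithmetic can be sketched the same way. The names are made up, and flooring Morale at 0 is an assumption for this sketch; the 110 minus Bravery drop and the starting Morale of 100 come from the text above.

```python
STARTING_MORALE = 100


def morale_drop_per_panic(bravery):
    """Morale lost per successful psi Panic attack: 110 - Bravery."""
    return 110 - bravery


def expected_morale_after_turn(p, attempts, bravery, morale=STARTING_MORALE):
    """Average Morale after one turn of panic attempts, floored at 0."""
    expected_hits = p * attempts
    loss = expected_hits * morale_drop_per_panic(bravery)
    return max(0.0, morale - loss)


MUTON_BRAVERY = 80
print(morale_drop_per_panic(MUTON_BRAVERY))                  # per success
print(0.36 * 10 * morale_drop_per_panic(MUTON_BRAVERY))      # per turn
print(expected_morale_after_turn(0.36, 10, MUTON_BRAVERY))   # after 1 turn
```

An average loss of about 108 against a starting Morale of 100 is why a single turn of ten attempts is usually enough to bottom a muton out.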
When I worked on Experience counters, I didn't study the psi Panic attack counter in depth (Unitref). I only checked briefly, and otherwise always worked with Mind Control (or just directly edited the psi experience counter). I'll take a minute to double-check that the psi experience counter is adding 1 for unsuccessful and 3 for successful panic attacks.
5c: Panic attacks work as expected
I did one round of testing psi panic attacks (16 soldiers X 10 attempts each) using the same stats as 5b (250/4 versus 43/0). Mutons were set to Morale 255, and I verified that their Bravery is 80. Soldiers and mutons are paired (Soldier 1 faces Muton 1, etc.), so it was easy to check soldiers' psi experience counter (Unitref) versus their paired muton's Morale.
Results were exactly as expected. A failed attempt adds 1, and a successful Panic attack adds 3, to Unitref. Every successful attack (per Unitref) was precisely mirrored by a 30-point drop in Morale for the respective paired muton. 'Nuff said about that.
The success rate was comparable to 5b, albeit for a much smaller sample: 64/160 successes equals 40.0% (range: 1 to 7 successes for the 16 soldiers). Pooling this data with 5b's gives a slightly higher combined estimate of 36.39% for the panic success rate at 250/4 versus 43/0 (equals (693+64) / (1920+160) ).
5d: Follow-up for Ethereal
I just ran your suggestion for your 91/50 vs. 25/0 Mutons, all MCs. In 7 turns (1 MC/soldier; 7x16 = N of 112), I got 100% MCs across the board. Never a single failure. If I'm doing your equation right, it should've been ~70% successful; clearly it's higher than that, if not actually pegged at 100%.
I just saw your note in 5b about testing the 0%/1% border... I hadn't revisited your old notes in depth, and didn't pick up on that. Maybe that's the problem, then... is it true that my results agree with yours at low percents, but not at higher percents? Hmm.
And I wonder if letting aliens do psi'ing may automate things more. Another hmm. One doesn't have to have Ethereals; the mutons in my savegame will psi all they can, if given the chance.
For right now, I want to run 0/x vs. 0/0 with Panics, just to take another look at that low end (and whether I keep seeing a big difference in the constant).
---MikeTheRed 11:00, 5 September 2006 (PDT)
I ran it myself: 48 tries, 0 failures. It should be 90%: 91 - 25 + 24. I did really extensive testing for my formula back in May at the 0/1% border and I'm positive it's right for all AS/DS at the floor. The problem now is finding the ceiling, which has not been consistent with my formula. I think if we nail down the ceiling next, we'll have an easier time figuring out the points in between.--Ethereal Cereal 11:21, 5 September 2006 (PDT)
- Erk, I had put a constant where I originally put a formula. Right, I get your 90%. --MikeTheRed
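For reference, the 90% figure follows from Ethereal Cereal's floor formula (AS = str * skill / 50, DS = str + skill / 5, MC chance = 24 + AS - DS). A small Python sketch, with illustrative function names and no clamping to 0-100%:

```python
def attack_strength(strength, skill):
    """AS = psi strength * psi skill / 50."""
    return strength * skill / 50


def defense_strength(strength, skill):
    """DS = psi strength + psi skill / 5."""
    return strength + skill / 5


def mc_chance_floor_formula(attacker, defender):
    """Ethereal Cereal's MC formula, in percent: 24 + AS - DS (not clamped)."""
    return 24 + attack_strength(*attacker) - defense_strength(*defender)


print(mc_chance_floor_formula((91, 50), (25, 0)))            # 90.0
print(round(mc_chance_floor_formula((95, 16), (25, 0)), 1))  # 29.4
```

The second line is the 95/16 vs. 25/0 case from the Background section, where this formula gives about 29% against an observed 49%.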
Here's a test idea: a single muton with Psi Skill and 25-ish TUs, one psi weakling boxed in opposite him, dozens of turns with differing AS/DS to quickly home in on that ceiling. The only problem is we can't compel the muton to do only MC or panics.--Ethereal Cereal 11:27, 5 September 2006 (PDT)
5e: Re-Test of 0/x vs. 0/0 using Panic
I ran this 10 turns (N=1,600) to try to really nail down this constant. (Panics are so much easier!)
Soldier      Muton
Str   Skl    Str    Min       Ave ± SD         Max       N
0     1      0      71.00%    77.88 ± 3.87%    83.00%    800
0     255    0      77.00%    79.38 ± 1.92%    83.00%    800
Overall             71.00%    78.63 ± 3.05%    83.00%    1,600
(I still run e.g. 0/1 vs. 0/255 because it shouldn't matter yet it's got to be something... and once again, there's no apparent difference.)
If the PA% goes to 0 at DS 44 - which we have every reason to expect, given how much we trust your equations at the low end - it means we have two ends of a line of Panic chance vs. Defense Strength: x1, y1 = 0, 78.63, and x2, y2 = 44, 0. This lets us make another slope equation like the one done for MC:
PA chance = 78.625 - 1.787 * DS (when AS is zero)
It's very similar to the Test 3 (MC) graph, with its slope of -1.75. For me, my two constants differ by 78.625 - 42.013 = ~36.6. Compared to your difference of 20, it (36.6/20) differs by 1.83, i.e., ~1.75 (maybe 1.8... whatever). In any event, I keep seeing this slope.
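The slope and the ratio of constants can be re-derived with a short Python sketch. It only fits a straight line through the two endpoints given above; the helper name is illustrative.

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the straight line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


# Panic chance vs. DS with AS = 0: 78.625% at DS 0, assumed 0% at DS 44
slope, intercept = line_through((0, 78.625), (44, 0))
print(round(slope, 3), intercept)  # -1.787 78.625

# Difference between the Panic and MC constants, relative to the 20 seen at the floor
print(round((78.625 - 42.013) / 20, 2))  # 1.83
```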
I'm not sure entirely where I'm going with this... just musing out loud; it should all come together sooner or later.
Working with aliens still sounds problematical to me. Can you think of any way around the fact we can't dictate whether it will MC or Panic attack? (Maybe it will MC if set so that he's way over 100% chance for Panic, but near the ceiling for MC, for his target?) One problem is that you may need to surround the alien with several identical psi-weak soldiers, so that he always has more targets of the same stats, at the same distance. Certainly the 4 tiles directly next to the alien could be occupied, probably also the 4 corners. (For Explosions, they're considered 1 tile away.) One could also keep track of which soldiers were right next to the alien, and which were on the corners, to see if there's a difference. (Or just do 4 guys, shrug.) Also the alien's TUs can be played with to limit how many attempts they can make. I guess they could all be boxed in, in the X-COM craft. Hmm. Maybe you can play with that a little? Send me a savegame if you get it workable? I'll be looking for the ceiling... this will probably highlight whatever's giving us trouble. But if we find a workable approach to letting aliens psi, it may both make testing a lot easier, and show whether aliens do work the same way (or not). Ultimately this is important because we're not only concerned about which aliens we can panic or MC - we're also concerned about which stats mean we can no longer be panicked/MCed by them.
- My formula works for the latter case -- it does correctly predict the 0% point, for soldiers vs. aliens and vice-versa. (If memory serves, based on my formula and testing, soldiers need about 90/100+ or 100/50+ to have complete immunity from adjacent superhuman Ethereal Commanders. 80/50 or 90/0 is enough to eliminate all MCs and most panics.) But we're also looking for the 100% point (and everything in between), and I'm pretty much convinced my formula is only correct for the floor. Incidentally, for the floor, my tests pretty solidly showed a 20% difference between MCs and panics.
Test 6: Finding the ceiling
As shown above, Ethereal and I seem to agree about psi chances when they are low, but not when they are high. The next tests will therefore look for the ceiling. This will be done mainly with Panic, because it's so much easier to test with. But once something becomes apparent, we'll check against MC.
(EC: Feel free to rework the wiki Titling in all of Test 6)
6a: 50/50 vs. 25/0. MC succeeded 35 out of 40 times. I think I'll try 10/250 vs 25/0 next to see if I get a similar success rate, which would confirm AS is proportional to str*skill at the ceiling as well as at the floor. Then I'll try it vs. 0/125 to confirm if skill is truly 1/5 as good as str for defense purposes.
6b: 10/250 vs. 25/0. MC succeeded 35 out of 40 times. Strong indication AS is proportional to str*skill at all success percentages.
6c: 52/50 vs. 25/0. 35 out of 40. Sample set's too small. ;-)
6d: 57/50 vs. 25/0. 80 out of 80 successes. I would have expected some failures, but the sample set's still pretty small. I might add a 55/50 test later.
6e: 10/250 vs. 0/125. 34 out of 40. Pretty strong indication Psi Skill is 1/5 Psi Str for defense purposes.
That's my batch for tonight.
--Ethereal Cereal 16:39, 5 September 2006 (PDT)
Ok, I think I've got it all figured out. Let me check a few more things to make sure it's reliable everywhere.
I had Monday and Tuesday off work, but am working the rest of the week, so it won't come as fast. Any experiments you can do with aliens doing the psi are important for ensuring that they work the same (or not)... I am only using soldiers to perform psi. (Are your results above done with soldiers or aliens psi'ing? If aliens, can you find a way to make them only MC or only Panic? They will probably Panic only, if the only viable targets can be Panicked, i.e., the targets can just barely be Panicked, but can't be MCed. That's one way to force the issue.)
---MikeTheRed 11:17, 6 September 2006 (PDT)
- Well, previously I did all my testing at the "just barely" threshold (aliens on X-COM), and my results were always consistent with the formula I discovered above. It's results above 1% that have (to my surprise) behaved otherwise. As we've noted, it's hard to compel aliens to MC or panic only, which makes it hard to discern whether a psi attempt was successful, and therefore to determine success rates above 1%. At 1% it's trivial -- it's clear when they've succeeded, or not.--Ethereal Cereal 11:25, 6 September 2006 (PDT)
Ok, I've gotten everything pinned down very solidly, including distance. The final equations tie every single one of our findings together elegantly. Will do one final torture test of everything tonight (MC, diagonally all the way across map, soldiers elevated by four, Defense Strength uses target psi skill heavily, etc. etc.), then post everything up. Although I haven't tried anything near this complex, I've predicted the needed Attack and Defense Strengths for 0% to 100% and expect the results to be precisely bang on. Maybe. :) ---MikeTheRed 12:08, 8 September 2006 (PDT)
- Excellent. I look forward to it.--Ethereal Cereal 17:09, 8 September 2006 (PDT)
Finally taking to heart the idea that the higher end is the problem, I did range-finding for the ceiling. At this point we were both moving fast.
I defined "lines" within the psi space, lines defined by a low end past which all attempts failed, and a high end past which all attempts succeeded. To spare the details (and individual breakouts), the inset shows a lot of subsequent work...
6a) I first tried soldiers at 100/10 (AS 20) vs. mutons at 0/0, 10/0, 20/0, and 30/0 (160 each, total N 640; this is using Panics). This indicated a line of success near the upper end of 0% to 100% (0/0 not shown on graph; it's above 100%).
6b) I extended this line with mutons at 7/0, 40/0, 50/0, and 60/0 (soldiers still at 100/10). These were intended to approach but not touch the 0% / 100% edges. N of 160 each; 640 total.
6c) Now I was pretty sure of where the 0% and 100% edges were, and tested them, starting with the lower edge. Did 4/250 or 250/4 soldiers (AS 20) vs. mutons of 8/0 or 9/0. 8/0 were 100% (N 640), 9/0 not (314/320=98.13%). Thus the absolute upper edge of Panic (100%) was found at DS 8. Repeated with a focus on the other edge (0%). Found it was 64/0 (N=320, 0% Panic). Made the graph shown.
- Note: The .5 decimals seen on the inset were due to the thought that, at first, I didn't know exactly where X-COM "drew the line"... I only knew it was, e.g., more than the specific DS (here, 8) that always gave 100% Panic and less than the DS (here, 9) that gave <100% Panic, at that time. Further work led me to think it's not "lost in the middle" of 100% vs. <100%. It's more likely to be right at the point where 100% is found, given the high-ish percents found 1 DS away. (They approach being 2% away, not something much smaller or minuscule. 2% suggests the line is close or equal to the 100% DS; read on.)
- Also note: At this point I realized how the equations might work, in general - and I was right.
6d) Noting the difference of DS 64-8=56 in the previous work, I tested the 56 "DS height" at a very different point in the possible psi space. I gave my soldiers high AS (100/80 = AS 160) and guessed (read on) that mutons might need DS 160-190 for ranging (160, 170, 180, 190 muton psi strength, 120 N each one; skill always 0). This indeed was the middle of a graph, so I moved on to end-cap testing. DS 148 (100%, N 320) and 204 (0%, N 240) were the absolute caps. Again, their delta was 56.
6e) I jumped to another line. Soldiers 250/40 (Attack Strength 200) vs. mutons 188/0 and 244/0 (244-188=56). Soldiers at 250/40 test that an extreme imbalance in strength vs. skill doesn't matter for AS (here, 200). 100% success at muton DS 188 (N 240), 0% success at 244 (N 160). Again, "DS width of 56".
- The absoluteness of line-end testing (or end-cap testing) is SO much easier than dealing with the variability in between... if you can show that a) there's a total response (or lack thereof) at one point, and b) a little response one point away, then an "end" is nailed down with a high degree of certainty. You don't need to quantify the "little response", you just need to make sure the 100% (or 0%) response has a high degree of certainty (equals high N), and that one point away, you get a response that's not absolute - and a low degree of certainty is fine for this (a low N). Then draw a line from the 100% end-cap to the 0% end-cap. It's that simple; nothing has indicated they are not straight lines. This GREATLY reduces testing time versus sampling across an entire line, with their high degree of variability.
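To put a rough number on "a high degree of certainty", here is a minimal Python sketch. It assumes (as the endcap results suggest) a straight line over a width of 56, so one point inside the 100% endcap sits about 1.8% below 100%; the function names are hypothetical.

```python
import math


def prob_all_succeed(true_rate, trials):
    """Chance of seeing zero failures in `trials` attempts if the true rate is `true_rate`."""
    return true_rate ** trials


def trials_needed(true_rate, alpha):
    """Smallest N for which a run of N straight successes has probability below alpha."""
    return math.ceil(math.log(alpha) / math.log(true_rate))


# Assumed rate one point inside the 100% endcap: 100% minus one step of 100/56 percent
one_step_below = 1 - (100 / 56) / 100

print(round(prob_all_succeed(one_step_below, 240), 3))  # about 0.013
print(trials_needed(one_step_below, 0.05))              # 167
```

In other words, a clean 240/240 would be quite unlikely if the true rate were only one step short of 100%, which is the logic behind trusting a high-N endcap.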
In my spreadsheet, Test 7 is my first shot at the governing equations, and graphs of the same. It's great for Distance =1, but let's not stop here (and let my lab notes skip a number, so the Test numbers finally align with the wiki's Table of Contents)...
Test 8: Distance
8a: Simple Test
Testing Distance can pose a logistics problem. How do you have a lot of mutons an exact distance away from a lot of soldiers, without the mutons being free to roam around? After a little thought, problem solved, as seen in this savegame. 16 mutons were put in a 4x4 square in the lower right of the map, surrounded by nine soldiers not participating in testing. Participating soldiers were put due west in the lower left of the map in a 4x4 array, with each being exactly 46 squares from their matched target in the muton array. Mutons were set to DS 45 (45/0) and, after ranging and end-cap testing, soldiers were found to Panic 100% at AS 102 (160/160 successes) and 0% at AS 46 (0/100 successes; note height of 56 again).
Based on previous testing and tentative hypotheses in my T7 equations, I knew that at a distance of 1 versus DS 45 (mutons 45/0), Panic AS 0% was 1 and AS 100% was 57. At distance 46, Panic AS 0% was 46 and AS 100% was 102. At each point (0% and 100%), what's the difference? +45 at distance 46. And all our previous tests were at distance 1. 46-45=1. The sole effect of distance is to subtract 1 per tile. Mentally you might think of this as Attack Strength decreasing by 1 for each tile's distance from the target. But in the equation, it's just subtracted, shrug.
The constants in everything we have done before are not 24 and 44. They are 25 and 45, with 1 subtracted for distance. :)
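The equation as it stands at this point in the notes can be sketched in Python as below. This is an illustration, not a confirmed reproduction of the game's code: the constants (45 Panic, 25 MC), the width of 56 and the minus-1-per-tile distance term come from the tests above, while clamping to 0-100% and the names are assumptions made for the example.

```python
PANIC_CONSTANT = 45
MC_CONSTANT = 25
WIDTH = 56  # AS (or DS) span between the 0% and 100% endcaps


def psi_chance(attack_strength, defense_strength, distance, constant):
    """Percent chance of success, clamped to the 0..100 range."""
    raw = (constant + attack_strength - defense_strength - distance) * 100 / WIDTH
    return max(0.0, min(100.0, raw))


# Endcaps found against DS 45 mutons at distance 1 and at distance 46 (Panic)
for dist, (low, high) in {1: (1, 57), 46: (46, 102)}.items():
    print(dist,
          psi_chance(low, 45, dist, PANIC_CONSTANT),
          psi_chance(high, 45, dist, PANIC_CONSTANT))
# 1 0.0 100.0
# 46 0.0 100.0
```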
8b: Torture Test
At this point, I felt I had a handle on everything there was to know: MC vs. Panic, Slope, AS vs DS, Constants, even diagonal distances and elevation. Elevation, because I presumed Height had no effect, since it doesn't for vision. Diagonals, since I previously determined they use a simple "walking TUs" equation as shown in Explosions.
So I set an ambitious experiment: My soldiers in the opposite corner of a big map from their targets (the farthest distance possible), at elevation 4, doing MCs, an imbalance in attacker psi strength and skill, and defender strength using psi skill as much as possible, while AS and DS were at the most extreme values possible that still boiled down to the simple deltas the psi equations want:
- My 4x4 squad in the upper left of the map, elevation 4, with mutons on the ground in the lower right (my soldiers moved, but mutons stayed where they were as per Test 8a)
- My Walking TUs map (to be posted) shows that the lower right to upper left corners of the map are distance 75, but for 4x4 matched soldiers-to-target (which don't allow farthest corner to farthest opposite corner, in order to speed testing), the diagonal distance in Walking TUs is 70
- Max possible Defender Strength (DS) is 255+255/5=306
- Muton targets set to 245/250 = 245+250/5 = 245+50 = 295 for a little extra room (potential flexibility). This is for their 0% (highest strength) point, of course.
- If distance subtracts 70, then Attacker Strength (AS) must be +70. 295+70=365.
- For 0% Mind Control, subtract 25. 365 -> 340 (250/68). (Subtract because we're solving for AS given a DS; the equation is rearranged.)
- The Defense Strength will be varied instead of Attack Strength, because the values needed to make the AS (250/68) don't lend themselves to precise changes.
- The lower end (0% MC) of the line should be 245/250 (DS 295), and the upper end (100% MC) should be 189/250 (DS 239). 295 - 239 equals the expected DS range of 56.
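The setup arithmetic in the list above can be checked with a short Python sketch. Variable and function names are illustrative; the distance of 70, the MC constant of 25 and the width of 56 are the values stated in these notes.

```python
def defense_strength(strength, skill):
    """DS = psi strength + psi skill / 5."""
    return strength + skill / 5


distance = 70      # Walking-TU distance for the matched 4x4 squads
mc_constant = 25   # Mind Control constant
width = 56         # DS span between the 0% and 100% endcaps

ds_zero = defense_strength(245, 250)          # DS at the 0% MC endcap
as_needed = ds_zero + distance - mc_constant  # AS that puts 0% at that DS
ds_hundred = ds_zero - width                  # DS at the 100% MC endcap

print(ds_zero, as_needed, ds_hundred)  # 295.0 340.0 239.0
print(250 * 68 / 50)                   # 340.0 (soldiers at 250/68)
print(defense_strength(189, 250))      # 239.0 (mutons at 189/250)
```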
The low end (0% MC) was confirmed as 245/250: 244/250 (DS 294) = 5.1% success (N 8/154); 245/250 (DS 295) = 0.67% (N 2/300). This is not actually 0%, of course, but very very close... I figure there must be slight rounding or truncation effects, with the high AS and DS being used.
But the high end (100% MC) eluded me. It's expected to be 189/250, and I got 100% there (N 100). But I also got 100% at 190/250 (N 60) and 191/250 (N 80). Frustrated, I jumped to 193/250 and got 4/50 (4 misses, that is), backed up to 192/250 and got 2/50, then to 191/250 and now missed twice: 2/128 (pooled N). Performing 100% MC testing across the entire map and four elevations is very tedious... I may test 190/250 more in the future, but that's good enough for now. Given how the low end was shifted a little to the right of where it was expected (245/250 did produce 2 successes in 300 attempts), the upper end may be, as well. Possibly due to truncation, again.
8c: Direct Test of Elevation
It's sad that 8b wasn't exactly as predicted, so I'm focussing on Elevation effects. If Elevation doesn't matter, this means that soldiers directly over their targets should have a distance of 0 (!) in the equation.
Soldiers at elevation 4 were put directly over targets at ground level. Mutons had DS 50 (0/250); the equation predicts endcaps of AS 5 (0%) and 61 (100%) for Panic attacks at distance 0. Soldiers had psi skill of 50, and psi strengths of 5, 6, 60, or 61. Results were bang on: AS 5 = 0/240, 6 = 2/240 (0.83%), 60 = 232/240 (96.7%), 61 = 240/240.
In truth, the line looks shifted a little, doesn't it? AS 6 just barely got a response, whereas AS 60 was considerably more than 1.8% from 100% (100/56; 100-96.7= 3.3%). Maybe elevation does have a very small effect... or maybe having DS determined by so much psi skill in Tests 8b and 8c has slightly moved it over. Note that 8b erred on the side of being lower than expected (it intruded on the lower endcap), whereas 8c found the opposite (it seemed shifted away from the lower endcap). Anyway, the equations are still extremely sound.
8d: A couple more diagonal tests
In retrospect, the distance in Test 8a was along a "straight line", leaving 8b the only test of using "walking TUs" for counting tile distance. But 8b was on a long, pure diagonal which approaches Pythagoras' theorem due to the distance (70 tiles). These two small tests attempt to confirm that the Walking method is used. They also provide a little more testing of distance:
First Trial, 1 diagonal away: The Walking TUs method differs the most from Pythagoras' theorem when a target is 1 tile away. (If a soldier is boxed in by 8 aliens, the 4 aliens on the corners are "1 diagonal away" from him.) Walking TUs predicts it is 1 tile away. Pythagoras predicts it is 1.4 tiles away (the square root of 2). Thus, this is the best place to try to demonstrate which of the two is used.
Assuming it's 1 tile away, mutons at 50/0 were used to test against soldiers using the endcap method. At distance 1, 0% Panic is AS 6 (6/50) and 100% is AS 62 (62/50). Low values were used to prevent the skewing from extreme values that might have happened in Test 8b. Results were: AS 6 = 0/240, 7 = 4/160 (2.5%), 61 = 159/160 (99.375%, gulp), 62 = 240/240. Actually the difference between 1.4 and 1.0 may be "below the level of detection", but in any case, the psi equation and the Walking TUs method hold up just fine.
Second Trial, 2 diagonals away: Another small test of diagonals is to go "up" 2 squares and "over" 2 squares to arrive at a spot directly 2 diagonals away. Walking TUs predicts it's 3 tiles away; Pythagoras, 2.83. The previous trial was repeated except to increase distance to 3 and endcaps to 8 and 64. Endcaps only increased by 2 even though the distance is 3 because the difference in their distance to the target is 2 (3-1). Results: AS 8 = 0/320, 9 = 2/160 (1.25%), 63 = 159/160 (99.375% again!), 64 = 320/320.
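To see how small the difference between the two distance models is, the sketch below compares the predicted 0% Panic endcap under each. It takes the Walking TUs distances straight from the text (1 and 3) rather than computing them, and uses the 0% endcap relation AS = DS + distance - 45; names are illustrative.

```python
import math

PANIC_CONSTANT = 45


def zero_endcap(defense_strength, distance):
    """AS at which the Panic chance reaches 0%: DS + distance - 45."""
    return defense_strength + distance - PANIC_CONSTANT


# (label, Walking TUs distance as stated in the text, tiles over/up)
for label, walking, offset in (("1 diagonal", 1, 1), ("2 diagonals", 3, 2)):
    pythagoras = math.hypot(offset, offset)
    print(label,
          zero_endcap(50, walking),
          round(zero_endcap(50, pythagoras), 2))
# 1 diagonal 6 6.41
# 2 diagonals 8 7.83
```

The two models differ by well under one point of AS in both trials, which is why endcap testing alone struggles to separate them.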
At first glance, these two small tests don't necessarily prove Walking TUs are used, although they do support that the psi equation is more than good enough for gameplay predictions. In the next section on fine-tuning the numbers involved, however, there is a little more support that Walking TUs are used instead of Pythagoras' Theorem.
Test 9: Fine Tuning
The psi equations do a great job of attack success. As more data is added, however, we see that it's slightly off in at least one situation (8b), and even if it weren't, there is still some uncertainty to pin down.
While the endcap method shows that the "effective height" for 0% to 100% success is a difference of 56, does that line with slope 100/56 cross very close to the endcaps or not? For example, in 8d's first trial (directly above), we believe the slope crossed the "AS line" at a value greater than 6 (for 0%). But did it cross at 6.1 or 6.9 or... ? The same could be asked about the other end.
These fine points may affect the constants (25 and 45) and the slope (100/56) a little. Also known as, I may as well see if I can get a little more accuracy out of the equations, before I hang up these tests forever. Plus, what happened with 8b?
9a: Toward A Refined Equation (Maybe)
Although the endcap method has proven excellent for speedily determining the psi equation to within two significant figures (that is, the attack success percent, to the left of the decimal place), the simple expedient of looking for a qualitative response at the endcaps is not enough to pin it down further. For that, we need bigger quantities. And what better place to sample, than the endcaps themselves? Determining the response points just inside the endcaps (the largest span possible) to a high degree of accuracy lets us draw a line across the entire attack-success space with more confidence, which adds strength to conclusions.
Mutons at 45/0 were used for Panic testing at Distance 1 (directly adjacent). Low soldier values were chosen: AS 1 (1/50) and 57 (57/50). A very high number of samples were taken, with the following results: AS 1: 0/2000, AS 2: 1.60% ± 0.83% (range 0.80 to 2.80%, 32/2000), AS 56: 98.05% ± 1.01% (range 98.80 to 100.00%, 1961/2000), AS 57: 2000/2000. Ranges and standard deviations (of the sample) are for individual soldiers (8 soldiers, each doing 250 Panic attacks). Also note that although I don't extend decimals later in this section, for the results in this paragraph (all but the SDs) there are no more decimals... 1 in 2000, the smallest measurable difference, equals 0.05%.
The slope of this line is 1.78611, as determined by: X range = 54 (56-2); Y range = 96.45 (98.05-1.60); slope (attack percent per AS) = 1.78611 (96.45/54). This value is extremely close to the expected 1.78571 (100/56); 100/1.78611 = 55.9876. Thus, if X-COM is using a slope composed of very simple integers, 100/56 seems to be a no-brainer shoo-in. (Given what we know about X-COM, there's no reason to believe it's not something simple.) Therefore, I'll assume the slope is defined as 100/56 from here on out... unless you want to get 10k+ samples to see if it's something else. :)
The 0% and 100% intercepts are another matter. If a line is drawn through the two points (1.60 and 98.05), it intercepts the 0% level at 1.1042 and the 100% level at 57.092. Or in other words, it does not intercept at exactly AS 1 and AS 57 - on a plot where the X axis is Attack Strength and the Y axis is Success Percent, it's actually shifted to the right by about 0.1 (i.e., the intercepts are about 1.1 and 57.1). But also remember that I tested AS 57 extensively above (N=2000), and didn't miss a single panic; I got 100% Panic. If the 100% intercept was indeed 57.092, the expected success rate at 57.0 would be 99.836%, which would result in 3.3 misses out of 2000 trials, on average.
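The slope, intercepts and expected misses quoted in 9a can be reproduced from the two measured points with a few lines of Python (a sketch of the arithmetic only; variable names are illustrative):

```python
# Measured points from 9a: (Attack Strength, success percent)
x1, y1 = 2, 1.60     # AS 2:  32/2000
x2, y2 = 56, 98.05   # AS 56: 1961/2000

slope = (y2 - y1) / (x2 - x1)               # percent per point of AS
zero_intercept = x1 - y1 / slope            # AS where the line hits 0%
full_intercept = x2 + (100 - y2) / slope    # AS where the line hits 100%

# If the 100% intercept really were ~57.09, AS 57 would fall slightly short of 100%
rate_at_57 = 100 - (full_intercept - 57) * slope
expected_misses = (100 - rate_at_57) / 100 * 2000

print(round(slope, 5))            # 1.78611
print(round(zero_intercept, 4))   # 1.1042
print(round(full_intercept, 3))   # 57.092
print(round(rate_at_57, 3))       # 99.836
print(round(expected_misses, 1))  # 3.3
```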
For more on this line equation, see 9e.
9b: Re-Test of Psi Skill
It was established long ago by Ethereal Cereal that Psi Skill divided by 5 determines how much Psi Skill contributes to Defense Strength. This test sought to determine whether this aspect - dividing Psi Skill by 5 - caused the skewing of results seen in Test 8b, the Torture Test.
Mutons at 0/250 (DS 50) were end-capped at AS 6 and 62. Distance 1, Panic testing. Qualitative results: No skewing. AS 6 = 0/240, 62 = 240/240.
This also reconfirms that psi skill is divided by 5.
9c: Approximation of Extremes
Another thing that may have thrown off test 8b, was the approximation of extreme DS and AS values. So for this test, first, an extreme DS value was chosen, then extreme AS values that would work with it. Values cannot be quite as extreme as in 8b for two reasons: Distance is 1, not 70 (-70). Panic is used, not MC (-20).
DS 304 (253/255) used for all mutons, vs. soldiers at AS 316 (200/79) or 260 (200/65). No skewing: AS 260 = 0/240, 316 = 240/240. The extreme values probably didn't cause the anomaly seen in 8b.
The cause of this very minor anomaly will remain a mystery, unless someone wants to do a lot more work. It could have been a combination of things, such as how the small amount of testing on diagonals suggested they may slightly skew the results to the right.
9d: Fractional Values
According to the equation, Psi Skill is divided by 5 for the defenders. This gives us a chance to test fractional values.
57/50 (AS 57) soldiers were used against 45/2 (DS 45.4) mutons (distance 1, Panic). The equation predicts a 99.29% success rate for these settings. But in 560 trials, there was not a single miss. 4.0 misses were predicted, which is not a particularly high number.
Muton psi skill was increased to 45/4 (DS 45.8). We should definitely see something here; the predicted rate is 98.57%. But once again, there was not a single miss... 6.9 were expected out of 480.
As a final check that nothing was wrong, their skill was increased to 45/5 (DS 46). Expected rate 98.21% (8.6 misses). Actual rate 98.54% (473/480; 7 misses). Pretty much as predicted.
Apparently, Defense Strength is truncated (psi skill fractions are dropped). DS must equal INT(PST+(PSK/5)).
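A Python sketch of the truncated Defense Strength, plus a rough check of how surprising the zero-miss runs would be without truncation. It assumes independent attempts and the Panic form (45 + AS - DS - distance) * 100/56; function names are hypothetical.

```python
def defense_strength(strength, skill):
    """DS = INT(PST + PSK/5); integer division drops the fraction."""
    return strength + skill // 5


def panic_chance(attack, defense, distance=1):
    """Panic success percent, clamped to 0..100."""
    raw = (45 + attack - defense - distance) * 100 / 56
    return max(0.0, min(100.0, raw))


def prob_no_miss(rate_percent, trials):
    """Chance of zero misses in `trials` independent attempts."""
    return (rate_percent / 100) ** trials


# With truncation: skills 2 and 4 add nothing, skill 5 adds one point of DS
for skill in (2, 4, 5):
    ds = defense_strength(45, skill)
    print(skill, ds, round(panic_chance(57, ds), 2))
# 2 45 100.0
# 4 45 100.0
# 5 46 98.21

# Without truncation, the observed clean runs would have been unlikely
print(round(prob_no_miss(panic_chance(57, 45.4), 560), 3))  # about 0.018
print(round(prob_no_miss(panic_chance(57, 45.8), 480), 3))  # about 0.001
```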
Attack Strength can also have fractional values, due to how the term has a divisor of 50. However, you have to play with the PST and PSK numbers to get a desired fraction. For example, if one term (PST or PSK) is 250, then 250/50=5, and thus each increment of the other term changes AS by 5. Conversely, if one term is 5, then each change of the other term increments AS by 0.1 (5/50).
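Finding PST/PSK pairs for a desired fractional Attack Strength is easy to automate. The helper below is a hypothetical sketch that brute-forces all pairs up to 255 using exact fractions, so no floating-point rounding creeps in.

```python
from fractions import Fraction


def attack_strength(strength, skill):
    """Untruncated AS as an exact fraction: PST * PSK / 50."""
    return Fraction(strength * skill, 50)


def pairs_for(target, max_stat=255):
    """All (PST, PSK) pairs with PST >= PSK whose untruncated AS equals target."""
    target = Fraction(target)
    return [(pst, psk)
            for pst in range(1, max_stat + 1)
            for psk in range(1, pst + 1)
            if attack_strength(pst, psk) == target]


# Step size of AS when one term is held at 250 or at 5
print(attack_strength(250, 2) - attack_strength(250, 1))  # 5
print(attack_strength(5, 2) - attack_strength(5, 1))      # 1/10

# Example: every pair that yields AS 57.6
print(pairs_for("57.6"))
```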
240/12 (AS 57.6) soldiers were used against 46/0 mutons, predicted rate 99.29% (4.0 misses out of 560). Actual rate 97.86% (548/560; 12 misses a.k.a. 2.14% misses). This is a higher miss rate than expected. However, it's not a true test of whether an INT might be at work because INT(57.6) would still produce misses... AS 57 vs DS 46 has a 98.21% success rate (10.0 expected misses). Note, though, how the results are close to this.
196/12 (AS 47.04) soldiers versus 36/0 mutons; expected success rate is 98.29% (8.2 expected misses in 480). These values are just barely over a vertical distance of 55 (55.04). Results: 97.92% (470/480); the 10 misses are a little high. As can be seen, there are definitely misses.
On to the high end, just under 100%: 208/12 (AS 49.92) versus 38/0, a 99.86% rate (0.67 expected misses in 480). Actual rate: 98.75% (474/480), a 1.25% miss rate. Again, it's high.
Wait, doi - I was checking for truncation at the high endcap, when AS truncation (for a qualitative yes/no response) should be checked at the low endcap (the opposite of DS)... truncating a fraction of the high end still leaves you under the high (100%) endcap, where you'll still get some misses.
On to the low cap, then: Mutons 45/0 vs. soldiers 5/18 (AS 1.8): NO HITS out of 480 tries. Without truncation, 1.43% (6.9) hits were expected. I've chosen numbers much like the well-tested DS 45 vs. AS with endcaps of 1 to 57, used for the big test in 9a... and we see that AS 1.8 is getting truncated to AS 1, the 0% endcap.
A more extreme test, 45/0 vs. soldiers 9/11 (AS 1.980!), again NO HITS out of 480. Without truncation, we expected 1.75% hits (8.4/480).
Finally, a quick check that nothing's wrong with my setup: 45/0 vs. 5/20 (AS 2.0), expect 1.79% (5.7/320): Got 0.625% (2/320). Oh well, at least the major test is borne out (there were some hits), even if that does seem low.
So, Attack Strength is truncated, too.
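In Python terms, the truncated Attack Strength is just an integer division. A minimal sketch with an illustrative function name, run over the three soldier settings tested above:

```python
def attack_strength(strength, skill):
    """AS = INT(PST * PSK / 50); integer division drops the fraction."""
    return strength * skill // 50


for pst, psk in ((5, 18), (9, 11), (5, 20)):
    print(pst, psk, pst * psk / 50, attack_strength(pst, psk))
# 5 18 1.8 1
# 9 11 1.98 1
# 5 20 2.0 2
```

Against DS 45 at distance 1, AS 1 is the 0% endcap, which matches the two runs with no hits and the one run with a few.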
Next up: Does truncation occur on the individual AS and DS terms, or is the whole equation truncated after doing the AS and DS math?
Fraction versus Fraction
This can only be tested in one way: The DS fraction must be greater than the AS fraction. The other way around, one can't tell the difference between individual term truncation or overall truncation (the fractions are cancelling each other). The example I'll test with is:
Terms: AS 2.1 (5/21), DS 45.8 (45/4), Distance 1, Panic Constant (PC) 45
Equation: PC + AS - DS - Dist
A) Without any truncation: 45 + 2.1 - 45.8 - 1 = 47.1 - 46.8 = +0.3; * 100/56 = 0.54% success rate
B) Individual truncation: 45 + 2 - 45 - 1 = 47 - 46 = +1; * 100/56 = 1.79% success rate
C) Overall truncation: same as A except INT(+0.3) = 0; * 100/56 = 0.00% success rate
If 2.1 vs. 45.8 shows any response, individual terms are truncated (B); if not, it's truncated overall (C). We already know it can't be A. And the winner is...
Equation B. I suppose. This one sCaReD me... I took a look at my results at 480 trials and at that point it was 2/480 (0.42%), which looked like a vote for A! I sampled 480 more and got 10 this time (2.08%; pooled 12/960 = 1.25%). So I guess I just got low sampling at first, by chance, and B is it. Individual terms are truncated.
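The three candidate equations can be compared side by side with a short Python sketch. It uses exact fractions to avoid floating-point trouble near whole numbers; the function names and the clamping to 0-100% are assumptions made for the example.

```python
import math
from fractions import Fraction


def rate(delta):
    """Convert a (constant + AS - DS - distance) delta to a clamped percent."""
    return max(0.0, min(100.0, float(delta * 100 / 56)))


def scenario_rates(pst, psk, def_pst, def_psk, distance=1, constant=45):
    """Return Panic percent under (A) no, (B) per-term, (C) overall truncation."""
    as_raw = Fraction(pst * psk, 50)
    ds_raw = def_pst + Fraction(def_psk, 5)
    a = rate(constant + as_raw - ds_raw - distance)
    b = rate(constant + math.floor(as_raw) - math.floor(ds_raw) - distance)
    c = rate(math.floor(constant + as_raw - ds_raw - distance))
    return a, b, c


# AS 2.1 (5/21) vs. DS 45.8 (45/4), distance 1
print([round(r, 2) for r in scenario_rates(5, 21, 45, 4)])  # [0.54, 1.79, 0.0]
```

The pooled result of 12/960 (1.25%) sits between A and B, which is why a repeat test is worthwhile.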
This makes sense, knowing X-COM's style... it doesn't like carrying forward fractional values much, if at all. Note that with AS and DS this way, they might even compute it once at the beginning of combat for each soldier (since psi attributes and therefore AS and DS won't change in a given combat). Then the rest of the math is simple for each psi attack (except for the slope, anyway). Actually, doing all the math for each attack doesn't truly add that much time, even for X-COM.
For the record: I highly doubt it's stored, if it is computed up front. 1) Directly editing PST and PSK in Unitref does change AS and DS, and 2) there's nothing in Unitref unaccounted for that looks anything like this. Whatever it is would (probably) have to be a Word (two bytes) since both AS and DS can be higher than 255 (at least in hacked games), and the psi equation still works fine. Non-hacked soldiers are unlikely to ever get that high, though (a 100/100 soldier is AS 200 and DS 120).
A repeat test was done, with DS lowered from 45.8 to 45.2 (45/1). This should replicate the above experiment exactly... the DS fraction only has to be the tiniest bit higher than the AS fraction (0.1, from AS 2.1) to make for the possible 1 (Scenario B) or 0 (Scenario C) result. Findings: 1.46% success (7/480). Confirmation that individual terms are truncated.
Readers might wonder if I'm obsessing about the psi equation (smile). But the fact is that with the variability involved, and potential that the equations are actually more complex, any particular conclusion needs more than one test. It could have been a confluence of odd things in the equation, chance that a result came up the way it is, or even plain old mistakes by me. In science, findings are not considered confirmed unless they are replicated. Preferably by an entirely different lab... but I don't see that happening with all these fine points, any time soon. :P
In that spirit, another take on the above scenarios: AS 2.9 (5/29) vs. DS 45.2 (45/1). Deliberately chosen as a counter-example / test expected to fail. (It can't separate B or C, because the AS fraction is higher than DS's.) If B or C is true, the expected rate is 1.79%; if A is true, 3.04%. Results: 2.08% (10/480). B or C are much closer to the results (a difference of ~0.3%) than A (~1.0%). Another confirmation that truncation (in general) occurs.
And that's enough of that.
Many things have been revealed, and much is known about the psi equations. One final thing that continues to bother me is that results repeatedly seem slightly off from what is predicted.
Consider the well-tested line from 9a. A lot of data was collected in an effort to pin down the equation with precision, but something a little funky was seen. The results seem shifted to the right by a small amount, approximately AS 0.1. Yet testing at the 100% endcap (AS 57) with an N of 2000 did not produce a single miss. It's almost as if X-COM first decides the endcaps, then backs up and does the equation again, slipping in a little more AS.
In many other tests, it was also my impression that other lines were skewed to the right a little. Yet it's hard to be certain, with the variability involved. Further, not all tests were conducted in the same way... for some results something might have been skewed a little to the left (or up or down), but it might (might!) be because the test (and resultant equation) was looking at things a different way. For example, a DS line shifted to the left is the same as AS being shifted to the right.
Keep in mind that although we've made an "elegant" final equation, it doesn't mean X-COM uses that precise form. It might (might!) already have, e.g., some or all of the slope multiplied into the other numbers, which then leads to a tiny truncation event which adds 0.1. Also notice that shifting the line to the right by 0.1 would be the same as adding 5 to AS prior to dividing by 50, or 0.5 to DS PSK prior to dividing by 5. Finally, the +0.1 to the line raises a flag over how programmers (or mathematicians) sometimes put tiny values into an equation, in order to make sure something's not zero (or whatever).
Or for an entirely different take: There might be truncation of the percent success value, and/or it may somehow change the discrete value range within which X-COM makes a random roll, with the net effect being a little shifting. If so, the truncation would probably have to apply to the percent chance of failing in order to move the line to the right, because it causes a slightly increased chance of success.
Just how any of these might work into the psi equations (if at all) stumps me for now.
At this point, I'm not certain how to study this more. Simply collecting more data on the Test 9a line doesn't seem particularly productive. It probably won't do anything except lend a tiny bit more precision, at the cost of lengthy testing.
One thing that could potentially be done is to harvest all possible quantitative results from all these psi tests, and turn them into "standardized" datapoints that essentially become more data collected along the DS 45 versus AS 1 to 57 line of Test 9a. And then see if more power or conclusions or whatever can be drawn, and/or whether problems become apparent (where would Torture Test 8b fit?). But the tests suffer somewhat from having different Ns; I may practically have to pull out SAS or something to properly use it all.
I'll think about it. If anyone has any ideas, please jump in.
In the meantime, the psi equations are in great shape. Maybe I'll be able to figure out a little more precision, but ultimately that's all it will be.
9e: Another Fine Line
I wasn't sure what to do, so I repeated 9a at another place on the line - again, with 2,000 samples per point. I changed DS from 45 to 53, and sampled from two points inside the endcaps, instead of one. Does changing the DS change things? Are different points (out of the 55 in the "vertical" space) radically different? Will the +0.1 offset continue? Etc.
In 9a, I was just "inside" the endcaps. At DS 45 (45/0), the Panic endcaps are AS 1 and 57 (Distance=1). One point inside the endcaps was AS 2 and 56, with results of 1.60% and 98.05% respectively (N=2,000 each). It led to a line with slope 1.786111, with endcaps of approx. 1.1 and 57.1.
To switch things a little, I went for DS 53 (53/0) and two points inside the endcaps. For DS 53, endcaps are 9 and 65; two inside is 11 and 63. Results were 3.80% ± 0.57% (range 3.20 to 4.80%, 76/1999) and 96.35% ± 1.68% (92.80 to 98.40%, 1926/1999). As before, mins, maxxes, and SDs are for the 8 soldiers in each group (N=250 each except one in each group=249). This led to a line of slope 1.7798 and intercept of +9.2404. Unlike 9a, this did not lead to a line that intercepted at approx. +0.1 above both expected endcaps; instead, the lower endcap intercepts -0.1351 and the higher endcap +0.0508 versus the expected 9 and 65. So these results differ.
These two lines can be shown in a table, together with the line resulting from combining the four points. For these lines, 1 is subtracted from the AS endcaps for 9a, and 9 from the endcaps for 9e, in order to "standardize" AS values - make them both be direct AS values relative to a DS of 44, where AS 0 is the 0% endcap and AS 56 is the 100% endcap, for all lines:
Test    Slope      0% Int.    100% Int.
9a      1.786111   +0.1042    56.0918
9e      1.779808   -0.1351    56.0508
9a&e    1.783078   -0.0133    56.0694
Int. = Intercept, where 0 is the expected 0% Intercept and 56 is the expected 100% Intercept.
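The slopes and intercepts in the table can be reproduced from the four measured points. The sketch below is only my reconstruction: it assumes an ordinary least-squares fit (which for two points is just the line through them) and uses the standardized AS values described above. The helper names are hypothetical.

```python
def fit_line(points):
    """Least-squares line through (AS, success %) points; returns (slope, y_intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def endcaps(points):
    """Return (slope, AS where the line hits 0%, AS where it hits 100%)."""
    slope, y_int = fit_line(points)
    return slope, (0 - y_int) / slope, (100 - y_int) / slope

# Standardized AS: subtract 1 from the 9a AS values and 9 from the 9e AS values.
test_9a = [(2 - 1, 1.60), (56 - 1, 98.05)]
test_9e = [(11 - 9, 3.80), (63 - 9, 96.35)]

line_9a = endcaps(test_9a)
line_9e = endcaps(test_9e)
line_both = endcaps(test_9a + test_9e)

for name, (slope, low, high) in [("9a", line_9a), ("9e", line_9e), ("9a&e", line_both)]:
    print(f"{name:5s} slope {slope:.6f}  0% int. {low:+.4f}  100% int. {high:.4f}")
```

For comparison, the slope expected from the equation is 100/56, about 1.785714.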
Now what we see is something much closer to an "absolute" intercept of 0 and 56 - although not precisely. Maybe the high end is a little over the expected 56. But the low end seems to be subject to chance. Maybe it's all chance, and the high ends just happened to both be a little high. Or maybe there is a slight "stairstep" effect such that, depending on exactly what point you're testing, things get rounded a little... which can lead to things that might seem like a pattern, if you only test e.g. two points. :)
I'm probably finished with psi testing. At this point there is only one result that doesn't fit the equation (ignoring the vagaries of chance): the misses seen at the high endcap of the Torture test. A few other things smelled funny, like the very low number of misses at the high end when testing diagonals. (By the way, diagonals should have made the targets farther away, which should have led to more misses... the opposite was seen.) And it would still take more testing to be sure odd combinations don't lead to slightly unexpected results - which would probably be the result of integer rounding occurring in a more complicated way than thought.
Be all that as it may, it's very clear that the psi equations are highly reliable at least to the left of the decimal point. The 0 to 56 "endcap" space seems very solid. All in all, I've seen enough.
If anybody else wants to test more, feel free. And PLEASE make sure I hear about it!
Next I'll move on to the Psionic_Equations page, to give practical examples.
Psi Equation Finalized
The psi equations are as follows:
Attack Strength (AS)  = INT( Psi Strength * Psi Skill / 50 )        (Attacker stats)
Defense Strength (DS) = INT( Psi Strength + ( Psi Skill / 5 ) )     (Defender stats)
Attack Success (A%)   = 100/56 * ( Constant + AS - DS - Distance )
where Constant = 25 for Mind Control
                 45 for Panic
Attack Success (A%) is a number (0 to 100), not an actual percent (0.00 to 1.00). Values less than 0 mean guaranteed failure, and greater than 100 mean guaranteed success.
The equation can be rearranged to calculate AS or DS:
AS = DS + .56A% - Constant + Distance
DS = AS - .56A% + Constant - Distance
Notice that for 100% success, .56A% equals 56 and for 0% success, it equals 0. In other words, the "response height" for going from 0% to 100% success is a difference of 56 points in the equation.
Thus you can set up a spreadsheet to compute, e.g., the 0% to 100% Attack Strength needed to MC a particular alien:
AS for 100% MC = 56 - 25 + Distance + DS = DS + Distance + 31
AS for 0% MC   =  0 - 25 + Distance + DS = DS + Distance - 25
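If you'd rather script it than use a spreadsheet, the equations above translate almost directly into Python. This is just a sketch: the function and dictionary names are mine, INT is taken to mean dropping the fraction (the same as `math.floor` for non-negative stats), and out-of-range results are clamped to 0 and 100 as described above.

```python
import math

CONSTANT = {"mc": 25, "panic": 45}

def attack_strength(psi_strength, psi_skill):
    """AS, computed from the attacker's stats."""
    return math.floor(psi_strength * psi_skill / 50)

def defense_strength(psi_strength, psi_skill):
    """DS, computed from the defender's stats."""
    return math.floor(psi_strength + psi_skill / 5)

def attack_success(attack, defense, distance, kind="mc"):
    """A% on the 0-100 scale; clamped because <0 always fails and >100 always succeeds."""
    raw = 100 / 56 * (CONSTANT[kind] + attack - defense - distance)
    return max(0.0, min(100.0, raw))

def attack_needed(defense, distance, success, kind="mc"):
    """AS required for a given A% (0-100), from the rearranged equation."""
    return defense + success * 56 / 100 - CONSTANT[kind] + distance

# Worked example: 95/16 soldier next to a 25/0 Muton (Distance 1).
soldier_as = attack_strength(95, 16)
muton_ds = defense_strength(25, 0)
mc_chance = attack_success(soldier_as, muton_ds, distance=1, kind="mc")
as_for_100 = attack_needed(muton_ds, 1, 100)
as_for_0 = attack_needed(muton_ds, 1, 0)

print(f"AS {soldier_as} vs DS {muton_ds}: MC chance {mc_chance:.2f}%")
print(f"AS needed for 0% to 100% MC: {as_for_0:.0f} to {as_for_100:.0f}")
```

For that Muton at Distance 1, the 0% to 100% MC range comes out as AS 1 to AS 57, i.e. DS + Distance - 25 and DS + Distance + 31.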
Things that don't seem to matter:
- Difference in elevation appears to have extremely little or no effect (see Tests 8b and 8c)
- Aliens use the same equation as soldiers
- Different X-COM editions (DOS 1.4 versus WinCE Gold) also don't matter
I've checked Ethereal's first tests as well as everything above using the final equation, and all results match the equation's expected outcome. Some of the confusion, such as with Test 2, was due to the fact that the slope (100/56) was not being taken into account in the proposed equations, i.e., we assumed there was a 1:1 slope, but actually it was 1:1.78. Ethereal had only ever tested the low end; that's why my one old datapoint and my other findings (above) conflicted with it when they were not at the low end. Another thing that confuses the issue is that some of my earlier results are presented in terms of e.g. varying only attacker psi skill, not Attack Strength per se. And by chance, my old datapoint happened to be +20 versus the result expected with Ethereal's low-end equations, making it look suspiciously like I had done Panics instead of MCs... the +20 was simply the difference due to the slope being higher than 1:1.
For what it's worth, that old highly-tested datapoint of mine, 95/16 vs. 25/0 producing 49.26% MC successes (1192/2420), should actually have been 51.79%. The N is high enough that this may be a real difference: 51.79 - 49.26 = 2.53%, which is about 61 extra successes I should have seen with 2420 attempts. This means about 1 in 20 of expected successes weren't seen (61/(1192+61) = 4.87% ≈ 1/20). I can only imagine that the extreme tedium of those tests plus the fact I was manually tracking it at that point must have led to quite a few errors in recording. Or maybe it was simply due to chance.
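One way to put a number on "may be a real difference" is to ask how many binomial standard deviations the old result sits from the predicted count. The sketch below is my own back-of-envelope check, not part of the original testing; it assumes each attempt is an independent roll at the predicted 29/56 chance.

```python
import math

attempts = 2420
observed = 1192
p = 29 / 56                                  # predicted MC chance: 95/16 vs. 25/0, Distance 1

expected = attempts * p                      # predicted number of successes
sd = math.sqrt(attempts * p * (1 - p))       # binomial SD of the success count
z = (observed - expected) / sd               # shortfall measured in SDs

print(f"expected {expected:.1f} successes, SD {sd:.1f}, observed {observed} (z = {z:.2f})")
```

The old result lands roughly 2.5 SDs below prediction. Chance alone would produce that only occasionally, so some recording errors remain a plausible explanation, but chance is not ruled out.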
Psi Testing Tips
In case it helps anyone, here are some tips for psi testing. (Not that it needs to be tested much any more!) Many of these were Ethereal's ideas:
- Increase TUs to 250 for the testers. This lets them do 10 psi attempts each turn. Unfortunately, nobody knows how to decrease psi amp TUs yet.
- Surround aliens with your men or box them into your craft or something, so they can't move. See this zipped savegame for an example.
- Alternatively, you can let aliens do the testing for you. Box one into e.g. your X-COM craft, with the intended soldier targets directly next to them. Such testing can go very fast, and Unitref works for aliens, too (see below). A big problem with aliens, though, is that you can't dictate whether they will do Panics or MCs, unless you have e.g. arranged psi values so that they can only just barely Panic. (Possibly they will only MC if psi values make MC very likely to be successful, but this hasn't been tested.) Ethereal, care to add anything on using aliens? I've never tried it --MikeTheRed
- Panic testing is much easier than MC testing, because you can perform multiple tests without a problem. (But you do have to use the psi experience counter Unitref to see successes; see below.)
- MC testing is pretty easy at low success rates, but can be very tedious at high rates... you can't test any more if everybody gets MCed!
- Be careful of using "farther" targets if your intended target gets MCed - you may change the expected rate due to them being farther away. Targets at the "corners" of a box around your MCer are probably ok, since they are also considered to be 1 tile away for Explosions, but this hasn't actually been tested.
- Decrease the TUs of aliens to 24 or less, to keep them from psi attacking you if they have psi skill.
- You can use the overhead map view to quickly see how many aliens were MCed.
- Alternatively, any alien that was MCed will be invisible initially at the start of the next turn, if the soldier that gets the focus at the start of the turn is not looking in their direction. For some savegames, this always works. For others, non-MCed aliens go invisible too; I'm not sure why.
- You can also use UNITREF.DAT to count both Panic and MC attempts. The Unitref counter works as follows:
- A 1 is added for a failed attempt (panic or MC)
- A 3 is added for a successful attempt
- To use the Unitref counter:
- Keep track of the number of turns you have performed testing. Always make the same number of attempts per turn (or make notes of how many you didn't do, on a soldier by soldier basis, if e.g. successful MCs don't let you make all the attempts you might have).
- When testing is done, subtract the number of attempts from the Unitref counter value (U), then divide the remainder by 2. This should always give an integer (no decimals) for the number of successes. If it's not an integer, you screwed something up... if many of them are off, you probably have the number of attempts incorrect. If only 1 or 2 are off, those soldiers probably got skipped inadvertently.
- If you are doing very extended testing, the counter will roll over (one byte only goes up to 255). But it's no big deal even if it rolls over several times. (E.g., 9a lasted 25 turns X 10 attempts = 250/soldier; the high group was X 3 for successes and thus 750 at 100% success.) You will know the correct approximate number to within a couple percent as predicted by the equation; just add enough 256's to have it make sense. The widest variation is in the middle of the range (50%); the least, at the edges. Binomial variance is p*(1-p)*N where p is the probability; standard deviation of the sample is SQRT(variance/(N-1)). You can convert SDs into confidence intervals using the lookup table at (7) here.
- Note that the Unitref approach both helps avoid mistakes in writing stuff down, as well as gives a check that you got the number of attempts correct. It also speeds up testing since you don't have to slow down to write. But it does take a little time to read and interpret.
- Put your initial test savegame in one game slot, and save it to another. This way you can backtrack if e.g. you realize you accidentally left some counts in Unitref.
- If you'll do lots of testing, have your initial savegame saved when the turn counter is n1 (e.g., 21). Then just subtract 20 to see the number of turns. In this example, you will have to perform testing for X-COM turn 30 to get an even 10 turns.
- If you put numbers in the name of your soldiers, it's easier to keep notes on them. If you also do it for your targets, then put e.g. Soldier 1 next to Alien 1, you can do "1 on 1" testing, such as pairing specific psi-value sets (AS and DS) or, e.g., comparing successful panic attack counts to the targeted alien's Morale (to verify that your counters are working right).
- Avoid having elevator/lifts or other objects continuously cycling their graphics near the place you're doing testing. They will slow down the game to a small but noticeable extent if they can also currently be seen on the screen. In other words, when you're doing ten panics in a row, each panic takes a little bit longer to do its little graphic... I would guess that it makes each attack about 30% longer. Since each attack only takes a second, this only matters to folks doing hundreds or thousands of psi attacks.
- Don't hold weapons, so your crew won't reaction-fire. If you might be MCed, drop all weapons (of course).
Offset  Value
38      Psi Skill
58      Psi Strength
85      Psi Counter
25      Base TUs
12      Current TUs
26      Base Health
13      Current Health (0=Dead)
58      Current Morale. Begins combat at 100.
59      Bravery, encoded. Use 110-(10*Value)
45      Energy used when walking/turning; set to 4 for no energy use ever
63      Start of Fatal Wounds (to make sure no one will die on you!)
68      End of Fatal Wounds
Add 1 to the Offset (starts counting at 0) to get the "Col" number (starts counting at 1) that you'll see in EDIT.
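The Unitref counter arithmetic from the tips above (+1 per failure, +3 per success, one byte that rolls over at 256) can be scripted. This is a sketch under my own assumptions: the function name is hypothetical, and the number of rollovers is guessed from the success rate the equation predicts, as the tip suggests.

```python
def decode_counter(counter_byte, attempts, expected_rate):
    """Recover the number of successes from a one-byte psi counter.

    counter_byte:  value read from UNITREF.DAT (0-255)
    attempts:      total psi attempts made by this soldier
    expected_rate: success fraction (0.0-1.0) predicted by the equation,
                   used only to decide how many 256's to add back
    """
    predicted_raw = attempts + 2 * expected_rate * attempts
    rollovers = max(0, round((predicted_raw - counter_byte) / 256))
    raw = counter_byte + 256 * rollovers
    doubled = raw - attempts               # each success adds 2 more than a failure
    if doubled < 0 or doubled % 2 != 0 or doubled // 2 > attempts:
        raise ValueError("counter and attempt count are inconsistent")
    return doubled // 2

# 250 attempts at 100% success: raw count 750, which reads back as 750 - 512 = 238.
high_group = decode_counter(238, attempts=250, expected_rate=1.0)

# 250 attempts with 4 successes: raw count 258, which reads back as 2.
low_group = decode_counter(2, attempts=250, expected_rate=0.016)

print(high_group, low_group)
```

Adding 256 never changes whether the remainder is odd or even, so the "must divide evenly by 2" check still catches a wrong attempt count after a rollover.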
- I haven't had a chance to scrutinize all of this yet, but it looks great. I'm glad to see my original testing (and formula) has held up -- it was just missing a term, the slope (and the distance, which I'm glad has proven to be a simple term). I'll be giving it a close look in the coming week. Nice work.--Ethereal Cereal 20:45, 10 September 2006 (PDT)