Soldier Name Stats
Name Sets
There are six sets of X-COM soldier names, each composed of 20 first names and 20 last names. Five of the 20 first names in each set are female (as indicated by byte 67 of SOLDIER.DAT); these are marked with an asterisk:
American Set
  First name    Last name
  Austin        Bradley
* Barbara       Bryant
  Calvin        Carr
  Carl          Crossett
* Catherine     Dodge
  Clarence      Gallagher
  Donald        Homburger
  Dwight        Horton
  Ed            Hudson
* Evelyn        Johnson
  Kevin         Kemp
  Lester        King
  Mark          McNeil
  Oscar         Miller
* Patricia      Mitchell
  Samuel        Nash
* Sigourney     Stephens
  Spencer       Stoddard
  Tom           Thompson
  Virgil        Webb

British Set
  First name    Last name
  Adam          Bailey
  Alan          Blake
* Andrea        Davies
  Arthur        Day
  Brett         Evans
  Damien        Hill
  David         Jones
  Frank         Jonlan
* Helen         Martin
  James         Parker
* Jane          Pearce
  John          Reynolds
* Maria         Robinson
  Michael       Sharpe
  Neil          Smith
  Patrick       Stewart
  Paul          Taylor
  Robert        Watson
* Sarah         White
  Scott         Wright

French Set
  First name    Last name
  Armand        Bouissou
  Bernard       Bouton
  Claude        Buchard
* Danielle      Coicaud
  Emile         Collignon
  Gaston        Cuvelier
  Gerard        Dagallier
  Henri         Dreyfus
* Jacqueline    Dujardin
  Jacques       Gaudin
  Jean          Gautier
  Leon          Gressier
  Louis         Guerin
  Marc          Laroyenne
  Marcel        Lecointe
* Marielle      Lefevre
* Micheline     Luget
  Pierre        Marcelle
  Rene          Pecheux
* Sylvie        Revenu

German Set
  First name    Last name
* Christel      Berger
  Dieter        Brehme
  Franz         Esser
  Gerhard       Faerber
* Gudrun        Geisler
  Gunter        Gunkel
  Hans          Hafner
* Helga         Heinsch
  Jurgen        Keller
* Karin         Krause
  Klaus         Mederow
  Manfred       Meyer
  Matthias      Richter
  Otto          Schultz
  Rudi          Seidler
  Siegfried     Steinbach
  Stefan        Ulbricht
* Uta           Unger
  Werner        Vogel
  Wolfgang      Zander

Japanese Set
  First name    Last name
  Akinori       Akira
  Isao          Fujimoto
  Jungo         Ishii
  Kenji         Iwahara
* Mariko        Iwasaki
  Masaharu      Kojima
  Masanori      Koyama
* Michiko       Matsumara
  Naohiro       Morita
* Sata          Noguchi
  Shigeo        Okabe
  Shigeru       Okamoto
  Shuji         Sato
* Sumie         Shimaoka
  Tatsuo        Shoji
  Toshio        Tanida
  Yasuaki       Tanikawa
  Yataka        Yamanaka
* Yoko          Yamashita
  Yuzo          Yamazaki

Russian Set
  First name    Last name
  Anatoly       Andianov
  Andrei        Belov
* Astra         Chukarin
  Boris         Gorokhova
  Dmitriy       Kolotov
* Galina        Korkia
  Gennadi       Likhachev
  Grigoriy      Maleev
  Igor          Mikhailov
  Ivan          Petrov
  Leonid        Ragulin
* Lyudmila      Romanov
  Mikhail       Samusenko
  Nikolai       Scharov
* Olga          Shadrin
  Sergei        Shalimov
* Tatyana       Torban
  Victor        Voronin
  Vladimir      Yakubik
  Yuri          Zhdanovich
The columns show the first and last names for each set of 20. A first name sitting next to a last name above does not mean the two are associated: I'm simply presenting each list sorted alphabetically, and used two columns to conserve space. Any first name within a given set is liable to be combined with any last name in that set.
Test Set
Twenty batches of 100 recruits (N = 2,000 in total) were used as a sample. Not all 2,400 possible first and last name combinations (6 sets x 20 first names x 20 last names) appeared, of course, but first and last names always came from the same set. Thus you may see an Adam Bailey, but will never see an Adam Bradley.
510 of 2,000 soldiers were female (25.50%), almost exactly the expected 500 (25%).
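To put a number on "almost exactly", here is a rough check of how far 510 is from the expected 500. It is only a sketch, and it assumes (this is my assumption, not something confirmed from the game) that each recruit's first name is drawn independently and uniformly from the 20 in its set, so that each recruit is female with probability 5/20.

```python
import math

n = 2000        # recruits in the sample
females = 510   # observed female recruits
p = 5 / 20      # assumed chance of a female first name (5 of 20 per set)

expected = n * p                        # expected number of females
std_dev = math.sqrt(n * p * (1 - p))    # binomial standard deviation
z = (females - expected) / std_dev      # distance from expected, in std devs

print(f"expected {expected:.0f}, std dev {std_dev:.1f}, z = {z:.2f}")
```

Under that assumption the observed count sits about half a standard deviation above the expected value, which is entirely unremarkable.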
No duplicate names were observed within a given batch of 100, but numerous duplicates were observed across batches. There were 969 distinct names among the 2,000 recruits, with the most-duplicated name appearing 8 times. X-COM probably uses a simple method for avoiding duplicates within a batch, such as using a random pointer into the name table (based on how many soldiers you've just recruited) and then walking through the name table (instead of repeatedly randomly sampling it). However they did it, the result is the same: no duplicates within a batch of recruits, but plenty across batches.
Times seen   Distinct names   Soldiers
    1             496            496
    2             181            362
    3             131            393
    4              93            372
    5              40            200
    6              20            120
    7               7             49
    8               1              8
                -----         ------
                  969           2000
Thus, 1,431 of the possible 2,400 name combinations (2400-969) did not appear.
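For comparison, here is a sketch of how many distinct names 2,000 recruits would be expected to produce if every one of the 2,400 combinations were equally likely on every draw. That uniform, independent model is an assumption made purely for illustration; it is not how the game is known to work.

```python
combos = 6 * 20 * 20   # 2,400 possible first/last combinations
draws = 2000           # recruits in the sample

# Chance that one particular combination is never drawn in 2,000 tries
p_never_seen = (1 - 1 / combos) ** draws

# Expected number of distinct combinations seen at least once
expected_distinct = combos * (1 - p_never_seen)

print(f"expected distinct names: {expected_distinct:.0f} (observed: 969)")
```

The uniform model predicts roughly 1,357 distinct names, well above the 969 observed. Avoiding duplicates within a batch should push the count up rather than down, so the shortfall hints that the name generator is not sampling all combinations evenly.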
Frequency by nationality for the 2,000:
Nationality     Frequency
B (British)        359
A (American)       316
F (French)         335
G (German)         365
J (Japanese)       284
R (Russian)        341
It is not known why many combinations didn't show up while others showed up multiple times, nor why there were 284 Japanese and 365 Germans when the expected value is about 333 (2000/6) for each set. Perhaps these results are due to random chance, or perhaps the name sampler has some sort of bias that makes certain combinations or nationalities more likely than others. Or maybe my 20 batches were simply not a big enough sample, particularly if the name selector does something odd when trying to avoid duplicates. For the complete dataset (including counts), see Media:X-COM Soldier Names.xls. If anyone knows how to do statistical testing for possible biases, feel free. A much larger sample (10,000 recruits?) would probably give a clearer picture... but it would require 100 recruit batches, bleh. -MTR
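One standard way to test the nationality counts is a chi-square goodness-of-fit test against equal frequencies. The sketch below uses only the Python standard library and compares the statistic with the textbook critical values for 5 degrees of freedom. It assumes recruits are independent draws, which is not strictly true given the duplicate avoidance, so treat the result as indicative only.

```python
# Observed recruits per name set ("B" is the British row of the table)
observed = {"B": 359, "A": 316, "F": 335, "G": 365, "J": 284, "R": 341}

total = sum(observed.values())      # 2,000 recruits
expected = total / len(observed)    # about 333.3 per set if all are equally likely

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected
chi2 = sum((count - expected) ** 2 / expected for count in observed.values())

# Standard chi-square critical values for 6 - 1 = 5 degrees of freedom
CRITICAL_5_PERCENT = 11.070
CRITICAL_1_PERCENT = 15.086

print(f"chi-square = {chi2:.2f}")
print("exceeds 5% critical value:", chi2 > CRITICAL_5_PERCENT)
print("exceeds 1% critical value:", chi2 > CRITICAL_1_PERCENT)
```

The statistic comes out at about 13.4, which is past the 5% threshold but short of the 1% threshold. That is suggestive of a mild nationality bias, though not conclusive, and a larger sample would settle it.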
Duplicates
While playing a fairly average game (fewer than 100 soldiers generated so far), I manually put stat strings on my soldiers' names. I currently have two Yoko Fujimotos with various stat strings. Considering that duplicate names aren't seen within a single game while testing names, but that I did see a second Yoko Fujimoto get generated after I changed the name of my original to Yoko Fujimoto-xs, I'm going to go ahead and say "The game probably just avoids duplicates by comparing the name it generates to each existing soldier's name". I predict that if you hire a female Russian soldier and change her name to "Austin Bradley", you will never see another soldier generated with the name "Austin Bradley", but you might see a new soldier with her original name. This would be very tedious to test. --Sowelu 14:15, 16 September 2008 (PDT)
- Look no further: I had a look at the code, and it does indeed check against every entry in SOLDIER.DAT and regenerate the name in case of a collision (up to ten times; after that it gives up and uses the duplicate). There is no check to see whether an entry is valid, so the names of dead soldiers should also count as taken for as long as their entries are not overwritten. Seb76 15:22, 16 September 2008 (PDT)
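The behaviour described above can be sketched roughly as follows. This is an illustration, not the game's actual code: the function name, the `MAX_REROLLS` constant, the way a set and a name are picked, and the reading of "up to ten times" as ten re-rolls after the first attempt are all assumptions.

```python
import random

MAX_REROLLS = 10  # assumption: one initial roll plus up to ten re-rolls


def generate_name(name_sets, taken, rng=random):
    """Roll a name, re-rolling while it collides with a name already on file.

    name_sets: list of (first_names, last_names) pairs, one per nationality.
    taken: every name currently stored in the roster file, including dead
           soldiers whose entries have not been overwritten.
    """
    name = None
    for _ in range(1 + MAX_REROLLS):
        # Assumption: pick a set, then a first and last name from that set
        first_names, last_names = rng.choice(name_sets)
        name = f"{rng.choice(first_names)} {rng.choice(last_names)}"
        if name not in taken:
            break  # no collision, keep this name
    # If every attempt collided, the last (duplicate) name is used anyway
    return name


# A renamed soldier no longer blocks her original name
tiny_sets = [(["Yoko"], ["Fujimoto"])]
roster = {"Yoko Fujimoto-xs"}
print(generate_name(tiny_sets, roster))
```

Because the comparison is against the stored name, renaming "Yoko Fujimoto" to "Yoko Fujimoto-xs" frees the original name for reuse, which matches the two Yoko Fujimotos reported above.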