I am trying to write a program to select a random name from the US Census last name list. The list format is
Name Weight Cumulative line
-----
I'd say an array (or a vector, if you prefer) would be best to hold them. As for the weighted pick, take the total (the last cumulative value), pick a random number between zero and that total, and choose the first name whose cumulative value is greater than or equal to it (e.g. here, <1.006 = smith, 1.006-1.816 = johnson, etc.).
P.S. it's Cumulative.
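A minimal sketch of that idea in C#, using only the two cumulative values quoted above (1.006 for smith, 1.816 for johnson). The `CumulativePick` class and its `Pick`/`PickRandom` helpers are hypothetical names for illustration; `Pick` takes the draw as a parameter so the result is easy to check by hand.

```csharp
using System;

static class CumulativePick
{
    // Returns the first name whose cumulative value is >= draw.
    // names and cumulative are parallel arrays; cumulative must be ascending.
    public static string Pick(string[] names, double[] cumulative, double draw)
    {
        for (int i = 0; i < cumulative.Length; i++)
        {
            if (draw <= cumulative[i])
                return names[i];
        }
        // Guard against a draw past the total: fall back to the final entry.
        return names[names.Length - 1];
    }

    // Draws uniformly in [0, total) and delegates to Pick.
    public static string PickRandom(string[] names, double[] cumulative, Random random)
    {
        double total = cumulative[cumulative.Length - 1];
        return Pick(names, cumulative, random.NextDouble() * total);
    }
}
```

For example, `CumulativePick.Pick(new[] { "smith", "johnson" }, new[] { 1.006, 1.816 }, 1.2)` returns "johnson", because 1.2 falls between the two cumulative values.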
The "easiest" way to handle this would be to keep this in a list.
You could then just use:
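Name GetRandomName(Random random, List<Name> names)
{
    // Scale the draw to the total weight (the last cumulative value).
    double value = random.NextDouble() * names[names.Count-1].Culmitive;
    // The first entry whose cumulative value reaches the draw is the pick.
    return names.First(name => name.Culmitive >= value);
}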
If speed is a concern, you could store a separate array of just the Culmitive values. With this, you could use Array.BinarySearch to quickly find the appropriate index:
Name GetRandomName(Random random, List<Name> names, double[] culmitiveValues)
{
    double value = random.NextDouble() * names[names.Count-1].Culmitive;
    int index = Array.BinarySearch(culmitiveValues, value);
    // A negative result means no exact match; its bitwise complement
    // is the index of the first element larger than value.
    if (index < 0)
        index = ~index;
    return names[index];
}
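As a rough sketch of how the separate array might be built and searched on its own: the `Name` shape below is an assumption (only the `Culmitive` member appears in the code above; `Text` is a hypothetical field), and `NameIndex`, `BuildCulmitiveValues` and `FindIndex` are illustrative names. `Array.BinarySearch` returns a negative number when the value is not found, so the lookup takes the bitwise complement in that case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of the Name type; only Culmitive comes from the code above.
class Name
{
    public string Text;       // assumed field holding the surname
    public double Culmitive;  // running total of the weights, ascending
}

static class NameIndex
{
    // Build the lookup array once, after the list has been loaded.
    public static double[] BuildCulmitiveValues(List<Name> names)
    {
        return names.Select(n => n.Culmitive).ToArray();
    }

    // Maps a draw to the index of the first entry whose Culmitive value is >= draw.
    public static int FindIndex(double[] culmitiveValues, double draw)
    {
        int index = Array.BinarySearch(culmitiveValues, draw);
        if (index < 0)
            index = ~index; // not found: complement is the index of the next larger value
        // Clamp in case the draw exceeds the last value.
        return Math.Min(index, culmitiveValues.Length - 1);
    }
}
```

Building the array once keeps each lookup at O(log n), at the cost of holding the cumulative values twice.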
Another option, which is probably the most efficient, would be to use something like one of the C5 Generic Collection Library's tree classes. You could then use RangeFrom to find the appropriate name. This has the advantage of not requiring a separate collection.
I've created a C# library for randomly selecting weighted items.
Some example code:
IWeightedRandomizer<string> randomizer = new DynamicWeightedRandomizer<string>();
randomizer["Joe"] = 1;
randomizer["Ryan"] = 2;
randomizer["Jason"] = 2;
string name1 = randomizer.RandomWithReplacement();
//name1 has a 20% chance of being "Joe", 40% of "Ryan", 40% of "Jason"
string name2 = randomizer.RandomWithRemoval();
//Same as above, except whichever one was chosen has been removed from the list.
Just for fun, and in no way optimal:
List<Name> Names = //Load your structure into this
List<String> NameBank = new List<String>();
foreach(Name name in Names)
    // Add one entry per 0.001 of weight, so heavier names appear more often.
    for(int i = 0; i < (int)(name.Weight*1000); i++)
        NameBank.Add(name.Name);
then:
Random random = new Random();
String output = NameBank[random.Next(NameBank.Count)];