Class WeightedStringsFromCSV

  • All Implemented Interfaces:
    java.util.function.LongFunction<java.lang.String>

    public class WeightedStringsFromCSV
    extends java.lang.Object
    implements java.util.function.LongFunction<java.lang.String>
    Samples a given field of a CSV file according to discrete probabilities. The CSV file must have a header row, which is used to find the named value and weight columns. The value column contains the string returned by the function. The weight column contains the floating-point weight (or mass) associated with the value on the same line. Weights are normalized automatically, so they do not need to sum to 1. If multiple file names are given, every file must use the same format, and all of them are read in the same way. If the first word in the filenames list is 'map', the values are not selected pseudo-randomly. Instead, input values from 0L to Long.MAX_VALUE are mapped over the values in a stable, unsorted order. In most cases you should leave out the 'map' directive to get "random sampling" of these values. This function works the same as the three-parameter form of WeightedStrings, which is deprecated in favor of this class. Use this class instead.
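    The normalization and lookup described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name WeightedSamplingSketch, the mix function (a SplitMix64-style finalizer standing in for whatever hash the real class uses), and the linear treatment of 'map' mode are all assumptions.

```java
import java.util.function.LongFunction;

// Hypothetical sketch: picks a string in proportion to its weight.
class WeightedSamplingSketch implements LongFunction<String> {
    private final String[] values;
    private final double[] cumulative; // running sum of normalized weights
    private final boolean mapMode;     // assumed stand-in for the 'map' directive

    WeightedSamplingSketch(String[] values, double[] weights, boolean mapMode) {
        if (values.length == 0 || values.length != weights.length) {
            throw new IllegalArgumentException("need exactly one weight per value");
        }
        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        this.values = values.clone();
        this.cumulative = new double[weights.length];
        double running = 0.0;
        for (int i = 0; i < weights.length; i++) {
            running += weights[i] / total; // normalize so the weights sum to 1
            cumulative[i] = running;
        }
        cumulative[cumulative.length - 1] = 1.0; // guard against rounding drift
        this.mapMode = mapMode;
    }

    // Assumed stand-in hash: scrambles the input bits deterministically.
    static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    @Override
    public String apply(long input) {
        // In map mode the input is used directly; otherwise it is hashed first.
        long bits = mapMode ? input : mix(input);
        // Scale the non-negative part of the bits onto the unit interval [0, 1].
        double unit = (bits & Long.MAX_VALUE) / (double) Long.MAX_VALUE;
        for (int i = 0; i < cumulative.length; i++) {
            if (unit <= cumulative[i]) {
                return values[i];
            }
        }
        return values[values.length - 1];
    }
}
```

    With values {"a", "b"} and weights {1.0, 3.0}, the normalized weights are 0.25 and 0.75. In map mode, inputs in the lowest quarter of the range 0L to Long.MAX_VALUE yield "a" and the rest yield "b"; without it, the same proportions hold on average but the selection looks random. Either way, the same input always produces the same output.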
    • Constructor Summary

      Constructors 
      Constructor Description
      WeightedStringsFromCSV(java.lang.String valueColumn, java.lang.String weightColumn, java.lang.String... filenames)
      Create a sampler of strings from one or more CSV files.
    • Method Summary

      Modifier and Type Method Description
      java.lang.String apply(long value)
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • WeightedStringsFromCSV

        public WeightedStringsFromCSV(java.lang.String valueColumn,
                                      java.lang.String weightColumn,
                                      java.lang.String... filenames)
        Create a sampler of strings from one or more CSV files. Each file must have a plain CSV header row as its first line.
        Parameters:
        valueColumn - The name of the value column to be sampled
        weightColumn - The name of the weight column, whose values must be parsable as a double
        filenames - One or more file names which will be read into the sampler buffer; if the first entry is 'map', values are mapped rather than pseudo-randomly sampled
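        A minimal sketch of how the two named columns might be located from the header row and read into (value, weight) pairs. The class CsvColumnsSketch and its read method are hypothetical, and the naive comma split assumes fields without quotes or embedded commas; the real class may parse CSV differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: finds two columns by header name and reads their rows.
class CsvColumnsSketch {
    /** One (value, weight) pair read from a data row. */
    static final class Entry {
        final String value;
        final double weight;

        Entry(String value, double weight) {
            this.value = value;
            this.weight = weight;
        }
    }

    static List<Entry> read(String csvText, String valueColumn, String weightColumn) {
        String[] lines = csvText.split("\\R"); // split on any line terminator
        if (lines.length == 0 || lines[0].trim().isEmpty()) {
            throw new IllegalArgumentException("CSV text has no header line");
        }
        // The first line holds the headers; use it to find the column indexes.
        List<String> headers = new ArrayList<>();
        for (String header : lines[0].split(",", -1)) {
            headers.add(header.trim());
        }
        int valueIdx = headers.indexOf(valueColumn);
        int weightIdx = headers.indexOf(weightColumn);
        if (valueIdx < 0 || weightIdx < 0) {
            throw new IllegalArgumentException("missing column in headers " + headers);
        }
        List<Entry> entries = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = lines[i].split(",", -1);
            if (fields.length <= Math.max(valueIdx, weightIdx)) {
                throw new IllegalArgumentException("row " + i + " has too few fields");
            }
            // Every weight must be parsable as a double.
            double weight = Double.parseDouble(fields[weightIdx].trim());
            entries.add(new Entry(fields[valueIdx].trim(), weight));
        }
        return entries;
    }
}
```

        Reading several files in the same way would amount to calling read once per file and concatenating the resulting lists before the weights are normalized.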
    • Method Detail

      • apply

        public java.lang.String apply(long value)
        Specified by:
        apply in interface java.util.function.LongFunction<java.lang.String>