Statistics and C# (Part 2)

In a previous post, Statistics and C# (Part 1), I introduced the concept of the sample mean as one of the measures of central tendency used in basic statistical analysis.  Included with that group of measurements are the median and mode.

Median

The median is the middle value of an ordered set of numbers if that set has an odd cardinality, or the mean of the middle two numbers if the set has an even cardinality.

Formula for odd cardinality:

\(median = v_{\frac{n+1}{2}}\) (where n is the number of values and \(v_i\) is the i-th value of the ordered set, counting from 1)

Formula for even cardinality:

\(x = \frac{n+1}{2}\) (where n is the number of values), which generates a fractional value for the index variable x.

\(median = \frac{v_{x - .5} + v_{x + .5}}{2}\)
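
As a quick worked example (the two sample sets here are made up for illustration), the position given by \(\frac{n+1}{2}\) is a whole number for an odd count and falls halfway between two positions for an even count:

```latex
% Odd cardinality: ordered set {1, 3, 5, 7, 9}, n = 5
\[ \frac{n+1}{2} = \frac{5+1}{2} = 3
   \quad\Rightarrow\quad median = \text{3rd value} = 5 \]

% Even cardinality: ordered set {1, 3, 5, 7}, n = 4
\[ x = \frac{4+1}{2} = 2.5
   \quad\Rightarrow\quad median = \frac{\text{2nd value} + \text{3rd value}}{2}
   = \frac{3 + 5}{2} = 4 \]
```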

In code, the method will have to combine and test for these two cases. Also, implicit in the definition is the notion that the set is ordered, so the code must sort the set in order to calculate the correct result.

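public static double Median(this IEnumerable<double> data)
{
    // IsNullOrEmpty() is not part of LINQ; it is assumed to be a helper
    // extension defined elsewhere in the project.
    if (data.IsNullOrEmpty()) return default(double);

    // Materialize the ordered sequence so the sort runs only once.
    var sorted = data.OrderBy(d => d).ToList();
    var count = sorted.Count;

    // Odd count: take the single middle value; even count: take the middle two.
    var isEven = count % 2 == 0;
    var middle = sorted.Skip((count - 1) / 2).Take(isEven ? 2 : 1);

    // Averaging a single value returns that value, so both cases share this line.
    var median = middle.Average();
    return median;
}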
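
For comparison, here is a minimal, self-contained sketch of the same calculation done by sorting into an array and indexing the middle element(s) directly. This is an alternative to the Skip/Take approach, not the project's code. The names `MedianSketch` and `MedianByIndex` are hypothetical, and returning `default(double)` for null or empty input is an assumption carried over from the method above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MedianSketch
{
    // Hypothetical variant: sort once into an array, then read the middle by index.
    public static double MedianByIndex(IEnumerable<double> data)
    {
        // Assumption: null or empty input yields default(double), i.e. 0.
        if (data == null) return default(double);

        double[] sorted = data.OrderBy(d => d).ToArray();
        if (sorted.Length == 0) return default(double);

        int upper = sorted.Length / 2; // 0-based index of the (upper) middle value

        // Odd count: a single middle value exists.
        if (sorted.Length % 2 == 1) return sorted[upper];

        // Even count: mean of the two middle values.
        return (sorted[upper - 1] + sorted[upper]) / 2.0;
    }

    public static void Main()
    {
        // Odd count: ordered values are 1, 3, 5, 7, 9, so the median is 5.
        Console.WriteLine(MedianByIndex(new[] { 9.0, 1.0, 5.0, 3.0, 7.0 }));

        // Even count: ordered values are 1, 3, 5, 7, so the median is (3 + 5) / 2 = 4.
        Console.WriteLine(MedianByIndex(new[] { 7.0, 1.0, 5.0, 3.0 }));
    }
}
```

Indexing avoids the second pass that Skip and Take make over the sorted data, at the cost of handling the odd and even cases in separate branches.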

Mode

The mode is the number that occurs most often in the set. If several values tie for the highest number of occurrences, the mode is the subset of the sequence containing those values. The implementation below returns an empty sequence when no value occurs more than once.

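Implementing the mode in LINQ is fairly interesting: group the values, count the members of each group, and keep the values whose count equals the maximum.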

public static IEnumerable<double> Mode(this IEnumerable<double> data)
{
    if (data.IsNullOrEmpty()) return Enumerable.Empty<double>();

    // Count how many times each distinct value occurs.
    var mode = data.GroupBy(d => d).Select(g => new
                                    {
                                        Value = g.Key,
                                        Count = g.Count()
                                    });

    var max = mode.Max(d => d.Count);
    var groupedModes = mode.OrderByDescending(d => d.Value);

    // Keep only the most frequent values; if nothing repeats (max == 1), there is no mode.
    var filtered = groupedModes.Where(d => d.Count == max && max > 1);

    var modes = filtered.Select(g => g.Value);
    return modes.ToList();
}
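
To see the tie and no-repeat behaviour in action, here is a small self-contained sketch. It swaps GroupBy for a dictionary tally, so it is an illustration rather than the project's code. The names `ModeSketch` and `ModeByTally` are hypothetical, and the sample data is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ModeSketch
{
    // Hypothetical variant: tally occurrences in a dictionary instead of using GroupBy.
    public static List<double> ModeByTally(IEnumerable<double> data)
    {
        var counts = new Dictionary<double, int>();
        if (data == null) return new List<double>();

        foreach (var value in data)
        {
            int seen;
            counts.TryGetValue(value, out seen); // seen stays 0 for a new value
            counts[value] = seen + 1;
        }

        if (counts.Count == 0) return new List<double>();

        int max = counts.Values.Max();

        // Assumption carried over from the method above: no repeats means no mode.
        if (max < 2) return new List<double>();

        // Keep every value that reaches the maximum count, largest value first.
        return counts.Where(pair => pair.Value == max)
                     .Select(pair => pair.Key)
                     .OrderByDescending(value => value)
                     .ToList();
    }

    public static void Main()
    {
        // 2 and 5 both occur twice, so both are returned (5 first, then 2).
        Console.WriteLine(string.Join(", ", ModeByTally(new[] { 1.0, 2.0, 2.0, 5.0, 5.0, 3.0 })));

        // Every value occurs once, so the result is empty and Count is 0.
        Console.WriteLine(ModeByTally(new[] { 1.0, 2.0, 3.0 }).Count);
    }
}
```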

The code for this project, NumSkull, can be found on CodePlex.