# Finding Modes Using Kernel Density Estimates

Examples of finding the mode of a univeriate distribution in R and Python.

Robert M Flight
2018-07-19

## TL; DR

If you have a unimodal distribution of values, you can use R’s `density` or Scipy’s `gaussian_kde` to create density estimates of the data, and then take the maxima of the density estimate to get the `mode`. See below for actual examples in R and Python.

## Mode in R

First, lets do this in R. Need some values to work with.

``````library(ggplot2)
set.seed(1234)
n_point <- 1000
data_df <- data.frame(values = rnorm(n_point))

ggplot(data_df, aes(x = values)) + geom_histogram()
`````` ``````ggplot(data_df, aes(x = values)) + geom_density()
`````` We can do a kernel density, which will return an object with a bunch of peices. One of these is `y`, which is the actual density value for each value of `x` that was used! So we can find the `mode` by querying `x` for the maxima in `y`!

``````density_estimate <- density(data_df\$values)

mode_value <- density_estimate\$x[which.max(density_estimate\$y)]
mode_value
``````
`` -0.04599328``

Plot the density estimate with the mode location.

``````density_df <- data.frame(value = density_estimate\$x, density = density_estimate\$y)

ggplot(density_df, aes(x = value, y = density)) + geom_line() + geom_vline(xintercept = mode_value, color = "red")
`````` ## Python

Lets do something similar in Python. Start by generating a set of random values.

``````import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

values = np.random.normal(size = 1000)

plt.hist(values)``````
``````(array([  5.,  19.,  70., 145., 233., 254., 173.,  86.,  10.,   5.]), array([-3.40466956, -2.73500426, -2.06533895, -1.39567364, -0.72600834,
-0.05634303,  0.61332228,  1.28298759,  1.95265289,  2.6223182 ,
3.29198351]), <a list of 10 Patch objects>)``````
``plt.show()`` And then use `gaussian_kde` to get a kernel estimator of the density, and then call the `pdf` method on the original values.

``````kernel = stats.gaussian_kde(values)
height = kernel.pdf(values)

mode_value = values[np.argmax(height)]
print(mode_value)``````
``0.05435206135566903``

Plot to show indeed we have it right. Note we sort the values first so the PDF looks right.

``````values2 = np.sort(values.copy())
height2 = kernel.pdf(values2)

plt.clf()
plt.cla()
plt.close()

plt.plot(values2, height2)
plt.axvline(mode_value)
plt.show()`````` ### Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

### Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/rmflight/researchBlog_distill, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

### Citation

`Flight (2018, July 19). Deciphering Life: One Bit at a Time: Finding Modes Using Kernel Density Estimates. Retrieved from https://rmflight.github.io/posts/2018-07-19-finding-modes-using-kernel-density-estimates/`
```@misc{flight2018finding,