A2oz

What is IQR in Python?

Published in Statistics 2 mins read

The Interquartile Range (IQR) is a statistical measure of the spread of the middle 50% of your data. It is calculated by subtracting the first quartile (Q1, the 25th percentile) from the third quartile (Q3, the 75th percentile): IQR = Q3 - Q1. In Python, you can calculate the IQR with libraries such as NumPy and Pandas.

Calculating IQR in Python

Here's how to calculate the IQR using Python:

1. Using NumPy:

import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Calculate the IQR using NumPy's percentile function
iqr = np.percentile(data, 75) - np.percentile(data, 25)

print(f"The IQR is: {iqr}")

2. Using Pandas:

import pandas as pd

data = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Calculate the IQR using Pandas' quantile function
iqr = data.quantile(0.75) - data.quantile(0.25)

print(f"The IQR is: {iqr}")
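Both snippets above rely on the libraries' default linear interpolation between data points, and for this sample they give an IQR of 4.5. Other quartile conventions exist and can give slightly different values on small samples. The sketch below, which assumes you want to compare the options of Pandas' `interpolation` parameter, illustrates this; the `iqr_by_method` dictionary is a name introduced here for illustration only.

```python
import pandas as pd

data = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Collect the IQR under each interpolation rule that Series.quantile supports
iqr_by_method = {}
for method in ["linear", "lower", "higher", "midpoint", "nearest"]:
    q1 = data.quantile(0.25, interpolation=method)
    q3 = data.quantile(0.75, interpolation=method)
    iqr_by_method[method] = q3 - q1
    print(f"{method:>8}: Q1={q1}, Q3={q3}, IQR={q3 - q1}")
```

If your result needs to match a textbook or another tool, check which quartile convention that source uses before comparing numbers.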

Understanding IQR

The IQR is a valuable measure for understanding the distribution of your data:

  • Outlier Detection: The IQR helps identify potential outliers. By a common rule of thumb, values below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR are flagged as potential outliers.
  • Data Variability: A larger IQR indicates greater spread or variability in the data.
  • Box Plots: The IQR is the length of the box in a box plot, visually representing the middle 50% of the data.
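The outlier rule from the list above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a definitive implementation: the helper name `iqr_outliers`, the multiplier argument `k`, and the sample values are all assumptions made for this example.

```python
import numpy as np

def iqr_outliers(values, k=1.5):
    """Return (lower_fence, upper_fence, outliers) using the k * IQR rule."""
    arr = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(arr, [25, 75])  # default linear interpolation
    iqr = q3 - q1
    lower = q1 - k * iqr  # anything below this fence is flagged
    upper = q3 + k * iqr  # anything above this fence is flagged
    outliers = arr[(arr < lower) | (arr > upper)]
    return lower, upper, outliers

# Hypothetical sample: the values 1..10 plus one extreme value
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 50]
lower, upper, outliers = iqr_outliers(sample)
print(f"Fences: [{lower}, {upper}]")
print(f"Flagged values: {outliers}")
```

A flagged value is only a candidate outlier; whether to drop, cap, or keep it depends on what the data represents.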

Practical Insights

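  • The IQR is often reported alongside the median: the median describes the center of the data, and the IQR describes the spread around it.
  • When dealing with skewed data or data containing outliers, the IQR is a more robust measure of spread than the standard deviation, because it ignores the lowest and highest 25% of values.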
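To see the robustness point in practice, the sketch below compares the IQR and the sample standard deviation before and after adding a single extreme value. The data, the value 1000, and the helper name `iqr` are assumptions chosen for illustration.

```python
import numpy as np

def iqr(arr):
    """Interquartile range using NumPy's default linear interpolation."""
    q1, q3 = np.percentile(arr, [25, 75])
    return q3 - q1

clean = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
with_outlier = np.append(clean, 1000.0)  # one hypothetical extreme value

for name, arr in [("clean", clean), ("with outlier", with_outlier)]:
    # ddof=1 gives the sample standard deviation
    print(f"{name}: IQR={iqr(arr):.2f}, std={np.std(arr, ddof=1):.2f}")
```

In this example the IQR barely moves (from 4.5 to 5.0), while the standard deviation grows by roughly two orders of magnitude.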
