The Chi-square test of independence works by comparing the observed frequencies (the frequencies observed in your sample) to the expected frequencies if there were no relationship between the two categorical variables (the frequencies expected if the null hypothesis were true). If the two variables are not independent, knowing the value of one variable helps to predict the value of the other variable.

For our example, let's reuse the dataset introduced in the article “Descriptive statistics in R”. This dataset is the well-known iris dataset, slightly enhanced. Since there is only one categorical variable and the Chi-square test of independence requires two categorical variables, we add the variable size, which corresponds to small if the length of the sepal is smaller than the median of all flowers, and big otherwise:

dat <- iris
dat$size <- ifelse(dat$Sepal.Length < median(dat$Sepal.Length), "small", "big")
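The comparison between observed and expected frequencies described above can be sketched in a few lines of base R. This is a minimal illustration rather than a definitive analysis: it rebuilds the size variable (the labels "small" and "big" follow the description in the text), and the object names observed, test, expected and statistic are chosen here for illustration only.

```r
# Rebuild the example data: iris plus a second categorical variable, size
dat <- iris
dat$size <- ifelse(dat$Sepal.Length < median(dat$Sepal.Length), "small", "big")

# Observed frequencies: contingency table of the two categorical variables
observed <- table(dat$Species, dat$size)

# Chi-square test of independence on the contingency table
test <- chisq.test(observed)

# Expected frequencies under the null hypothesis of independence
expected <- test$expected

# The test statistic sums the squared, scaled gaps between observed and expected
statistic <- sum((observed - expected)^2 / expected)

print(observed)
print(round(expected, 2))
print(test)
```

The hand-computed statistic should match test$statistic, because chisq.test() applies a continuity correction only to 2 x 2 tables and this table has three rows. A p-value below the chosen significance level leads to rejecting the null hypothesis of independence.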
This article explains how to perform the Chi-square test of independence in R and how to interpret its results. To learn more about how the test works and how to do it by hand, I invite you to read the article “Chi-square test of independence by hand”.