Common Functions Used with Factors in R

levels()

The levels() function is used to get or set the levels of a factor. Levels are the distinct categories that a factor can take.

Getting Levels 

# Create a factor
data <- factor(c("High", "Low", "Medium", "Medium", "High", "Low"))
# Get the levels of the factor
levels(data)
# Output:
# [1] "High"   "Low"    "Medium"

Setting Levels

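Assigning to levels() replaces the level labels by position: it renames levels and does not reorder them. A common use is appending a level that does not yet occur in the data. To change the order of the levels, rebuild the factor with factor(data, levels = ...).

# Append a new, currently unused level to the factor
levels(data) <- c(levels(data), "Very High")
# Print the factor with updated levels
print(data)
# Output:
# [1] High   Low    Medium Medium High   Low
# Levels: High Low Medium Very High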
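
The difference between relabeling and reordering is easy to miss, so here is a minimal sketch that contrasts the two. The variable names (sizes, renamed, reordered) are made up for illustration and are not part of the example above.

```r
# A fresh factor; default levels are sorted alphabetically
sizes <- factor(c("High", "Low", "Medium", "Medium", "High", "Low"))
levels(sizes)
# [1] "High"   "Low"    "Medium"

# Assigning to levels() relabels by position, so the values themselves change
renamed <- sizes
levels(renamed) <- c("Low", "Medium", "High")
as.character(renamed)
# [1] "Low"    "Medium" "High"   "High"   "Low"    "Medium"

# factor() with an explicit levels argument reorders and keeps the values
reordered <- factor(sizes, levels = c("Low", "Medium", "High"))
as.character(reordered)
# [1] "High"   "Low"    "Medium" "Medium" "High"   "Low"
levels(reordered)
# [1] "Low"    "Medium" "High"
```

In the first case every "High" silently became "Low", which is rarely what you want when the goal is only a different level order.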

nlevels()

The nlevels() function returns the number of levels in a factor. 

# Number of levels in the factor
nlevels(data)
# Output:
# [1] 4
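
Note that nlevels() counts declared levels, not the values that actually occur. The short sketch below, using a hypothetical ratings factor, shows the difference and how droplevels() removes unused levels.

```r
# "Medium" is declared as a level but never observed
ratings <- factor(c("Low", "High", "Low"),
                  levels = c("Low", "Medium", "High"))

n_declared <- nlevels(ratings)              # 3: counts every declared level
n_observed <- length(unique(ratings))       # 2: counts only values that occur
n_dropped  <- nlevels(droplevels(ratings))  # 2: unused levels removed first

c(n_declared, n_observed, n_dropped)
# [1] 3 2 2
```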

as.factor()

The as.factor() function converts a vector into a factor. This is useful when you want to convert a character vector or numeric vector into a factor. 

# Convert a character vector to a factor
char_vector <- c("Red", "Green", "Blue", "Green", "Red")
factor_char_vector <- as.factor(char_vector)
# Print the factor
print(factor_char_vector)
# Output:
# [1] Red   Green Blue  Green Red
# Levels: Blue Green Red
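
Converting a numeric vector works the same way, but converting back needs care. The sketch below uses a hypothetical years vector to show that as.numeric() on a factor returns the internal integer codes, so the usual route back to the original numbers goes through as.character().

```r
years <- c(2020, 2010, 2020, 2015)
year_factor <- as.factor(years)
levels(year_factor)
# [1] "2010" "2015" "2020"

# as.numeric() on a factor returns the internal integer codes, not the values
codes <- as.numeric(year_factor)
codes
# [1] 3 1 3 2

# Convert through character to recover the original numbers
recovered <- as.numeric(as.character(year_factor))
recovered
# [1] 2020 2010 2020 2015
```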

summary()

The summary() function provides a summary of a factor, showing the frequency of each level. 

# Summary of the factor
summary(factor_char_vector)
# Output:
#  Blue Green   Red
#     1     2     2

table()

The table() function creates a frequency table of the factor levels. This function is useful for seeing how many observations fall into each category. 

# Frequency table of the factor
freq_table <- table(factor_char_vector)
print(freq_table)
# Output:
# factor_char_vector
# Blue Green   Red
#    1     2     2
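
Both summary() and table() count every declared level, including levels with no observations. As a small sketch (the colors factor and its extra "Yellow" level are assumptions added for illustration), the code below shows a zero count and turns the counts into proportions with prop.table().

```r
# "Yellow" is a declared level with no observations
colors <- factor(c("Red", "Green", "Blue", "Green", "Red"),
                 levels = c("Blue", "Green", "Red", "Yellow"))

counts <- table(colors)
counts[["Yellow"]]
# [1] 0

# summary() on a factor reports the same per-level counts
level_summary <- summary(colors)

# Convert counts to proportions of all observations
shares <- prop.table(counts)
shares[["Green"]]
# [1] 0.4
```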

relevel()

The relevel() function changes the reference level of a factor. This is useful in modeling when you want to change which level is used as the baseline. 

# "Blue" is already the first (reference) level, so relevel to "Red" to see a change
relevel_factor <- relevel(factor_char_vector, ref = "Red")
# Print the releveled factor
print(relevel_factor)
# Output:
# [1] Red   Green Blue  Green Red
# Levels: Red Blue Green
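
To make the modeling point concrete, here is a minimal sketch with lm() on made-up toy data (the toy data frame and its values are assumptions, not taken from the example above). With the default treatment contrasts, the intercept is the mean of the reference level, and the other coefficients are differences from it.

```r
# Hypothetical toy data: three groups with two observations each
toy <- data.frame(
  group = factor(c("A", "A", "B", "B", "C", "C")),
  y     = c(1, 3, 4, 6, 9, 11)
)

# The default baseline is the first level, "A"
fit_a <- lm(y ~ group, data = toy)
names(coef(fit_a))
# [1] "(Intercept)" "groupB"      "groupC"

# Make "B" the baseline instead
toy$group <- relevel(toy$group, ref = "B")
fit_b <- lm(y ~ group, data = toy)
names(coef(fit_b))
# [1] "(Intercept)" "groupA"      "groupC"
```

The fitted values are identical in both models; only the parameterization, and therefore the interpretation of each coefficient, changes.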

fct_reorder()

From the forcats package, fct_reorder() reorders the levels of a factor based on another variable. This is useful when you want to order levels by some numeric summary. 

# Install and load the forcats package if not already installed
# install.packages("forcats")
library(forcats)
# Create a data frame
df <- data.frame(
  category = factor(c("A", "B", "C", "B", "A", "C")),
  value = c(10, 20, 30, 40, 50, 60)
)
# Reorder levels of 'category' based on the mean of 'value'
df$category <- fct_reorder(df$category, df$value, .fun = mean)
# Print the reordered factor
print(df$category)
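# Output:
# [1] A B C B A C
# Levels: A B C
# (The group means are A = 30, B = 30, C = 45, so the level order happens to stay the same here.)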
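
For a case where the order visibly changes, consider the sketch below. The scores data frame and its numbers are made up for illustration; it assumes the forcats package is installed.

```r
library(forcats)

# Hypothetical data: group means are A = 60, B = 15, C = 35
scores <- data.frame(
  team   = factor(c("A", "B", "C", "A", "B", "C")),
  points = c(50, 10, 30, 70, 20, 40)
)

# Ascending order of the group means
by_mean <- fct_reorder(scores$team, scores$points, .fun = mean)
levels(by_mean)
# [1] "B" "C" "A"

# Descending order with .desc = TRUE
by_mean_desc <- fct_reorder(scores$team, scores$points, .fun = mean, .desc = TRUE)
levels(by_mean_desc)
# [1] "A" "C" "B"
```

Only the level order changes; the values stored in the factor stay as they were, which is why this is mostly used to control the order of bars or panels in plots.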

fct_recode()

Also from the forcats package, fct_recode() allows you to rename the levels of a factor. 

# Recode the levels of a factor
df$category <- fct_recode(df$category,
                          "Group 1" = "A",
                          "Group 2" = "B",
                          "Group 3" = "C")
# Print the recoded factor
print(df$category)
# Output:
# [1] Group 1 Group 2 Group 3 Group 2 Group 1 Group 3
# Levels: Group 1 Group 2 Group 3

fct_collapse()

fct_collapse() is another function from the forcats package that allows you to combine levels into broader categories. 

# Collapse the levels of the factor
# (after the fct_recode() step above, the levels are "Group 1", "Group 2", "Group 3")
df$category <- fct_collapse(df$category,
                            "Group A" = c("Group 1", "Group 2"),
                            "Group B" = "Group 3")
# Print the collapsed factor
print(df$category)
# Output:
# [1] Group A Group A Group B Group A Group A Group B
# Levels: Group A Group B
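
Because the example above depends on the earlier recoding step, a self-contained sketch may be easier to experiment with. The shirt_sizes factor and the group names below are hypothetical, and the code assumes forcats is installed. Levels not mentioned in the call (here "M") are left unchanged.

```r
library(forcats)

# Hypothetical factor with five size categories
shirt_sizes <- factor(c("XS", "S", "M", "L", "XL", "S"))

# Merge the extremes into two broader groups; "M" is not mentioned and stays as-is
grouped <- fct_collapse(shirt_sizes,
                        Small = c("XS", "S"),
                        Large = c("L", "XL"))
as.character(grouped)
# [1] "Small" "Small" "M"     "Large" "Large" "Small"

nlevels(grouped)
# [1] 3
```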

fct_expand()

fct_expand() ensures that all levels specified are included in the factor, even if they are not present in the data. 

# Expand the factor to include all specified levels
df$category <- fct_expand(df$category, "Group A", "Group B", "Group C")
# Print the expanded factor
print(df$category)
# Output:
# [1] Group A Group A Group B Group A Group A Group B
# Levels: Group A Group B Group C
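
One practical reason to expand a factor is to make empty categories show up in counts. The sketch below uses a hypothetical responses factor (it assumes forcats is installed) and also shows fct_drop(), which does the opposite and removes unused levels.

```r
library(forcats)

# Hypothetical survey responses; nobody answered "Maybe"
responses <- factor(c("Yes", "No", "Yes"))

# Add "Maybe" as a level; by default new levels are appended at the end
expanded <- fct_expand(responses, "Maybe")
levels(expanded)
# [1] "No"    "Yes"   "Maybe"

# The empty category now appears in the frequency table with a count of 0
counts <- table(expanded)
counts[["Maybe"]]
# [1] 0

# fct_drop() removes levels that have no observations
levels(fct_drop(expanded))
# [1] "No"  "Yes"
```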
