Extracting Sub-Data Frames with R

Extracting Rows and Columns

You can extract sub-data frames by selecting specific rows and columns.

Extracting Rows

To extract specific rows from a data frame, you can use indices or logical conditions.

Example: Extraction by Indices 

# Create a data frame
df <- data.frame(Name = c("Alice", "Bob", "Charlie", "David"),
                 Age = c(25, 30, 35, 40),
                 City = c("Paris", "London", "Berlin", "New York"))
# Extract rows 1 and 3
subset_rows <- df[c(1, 3), ]
print(subset_rows)
# Output:
#      Name Age   City
# 1   Alice  25  Paris
# 3 Charlie  35 Berlin

Example: Extraction by Condition 

# Extract rows where Age is greater than 30
subset_age <- df[df$Age > 30, ]
print(subset_age)
# Output:
#      Name Age     City
# 3 Charlie  35   Berlin
# 4   David  40 New York
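
Row and column selection can also be combined in a single call to `[`. The sketch below rebuilds the same `df` so that it runs on its own; the variable name `older_name_city` is only illustrative.

```r
# Recreate the example data frame
df <- data.frame(Name = c("Alice", "Bob", "Charlie", "David"),
                 Age = c(25, 30, 35, 40),
                 City = c("Paris", "London", "Berlin", "New York"))

# Rows where Age is greater than 30, keeping only the Name and City columns
older_name_city <- df[df$Age > 30, c("Name", "City")]
print(older_name_city)

# Base R keeps the original row names (here 3 and 4);
# assigning NULL renumbers them from 1
rownames(older_name_city) <- NULL
print(older_name_city)
```

Resetting the row names is optional. It mainly matters when later code relies on row names matching row positions.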

Extracting Columns

To extract specific columns, you can use indices or column names.

Example: Extraction by Column Names 

# Extract the "Name" column
name_column <- df["Name"]
print(name_column)
# Output:
#      Name
# 1   Alice
# 2     Bob
# 3 Charlie
# 4   David

Example: Extraction by Indices 

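# Extract the first column
# Note: df[, 1] returns a vector, not a data frame; df[1] or df[, 1, drop = FALSE] keeps a data frame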
first_column <- df[, 1]
print(first_column)
# Output:
# [1] "Alice"   "Bob"     "Charlie" "David"
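
Because selecting a single column with `df[, 1]` simplifies the result to a vector, it can help to see the alternatives side by side. This is a minimal sketch that rebuilds `df`; the names `as_vector` and `as_data_frame` are illustrative.

```r
# Recreate the example data frame
df <- data.frame(Name = c("Alice", "Bob", "Charlie", "David"),
                 Age = c(25, 30, 35, 40),
                 City = c("Paris", "London", "Berlin", "New York"))

# A single column selected with [, j] is simplified to a vector
as_vector <- df[, 1]

# drop = FALSE keeps the result as a one-column data frame
as_data_frame <- df[, 1, drop = FALSE]

print(is.data.frame(as_vector))      # FALSE
print(is.data.frame(as_data_frame))  # TRUE
```

When the goal is a sub-data frame rather than the column's values, `drop = FALSE` or single-bracket selection such as `df["Name"]` is usually the safer choice.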

Extraction with Logical Conditions

Logical conditions allow you to extract subsets based on specific criteria.

Example: Extraction with Multiple Conditions 

# Extract rows where Age is at least 25 and City is "Paris"
subset_condition <- df[df$Age >= 25 & df$City == "Paris", ]
print(subset_condition)
# Output:
#    Name Age  City
# 1 Alice  25 Paris
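
Conditions can also be combined with `|` (logical OR) or written as set membership with `%in%`. The following sketch rebuilds `df` and uses illustrative variable names.

```r
# Recreate the example data frame
df <- data.frame(Name = c("Alice", "Bob", "Charlie", "David"),
                 Age = c(25, 30, 35, 40),
                 City = c("Paris", "London", "Berlin", "New York"))

# Rows where City is either "Paris" or "Berlin"
in_cities <- df[df$City %in% c("Paris", "Berlin"), ]
print(in_cities)

# Rows where Age is under 30 OR City is "New York"
young_or_ny <- df[df$Age < 30 | df$City == "New York", ]
print(young_or_ny)
```

Note that `&` and `|` are the vectorized operators needed here. `&&` and `||` are meant for single TRUE/FALSE values and are not suitable for row filtering.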

Extraction Using subset()

The subset() function allows you to filter data based on conditions.

Example: Extraction with subset() 

# Extract rows where Age is less than 35
subset_df <- subset(df, Age < 35)
print(subset_df)
# Output:
#    Name Age   City
# 1 Alice  25  Paris
# 2   Bob  30 London
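
subset() also takes a `select` argument, so rows and columns can be filtered in one call. A short sketch, rebuilding `df` and using illustrative variable names:

```r
# Recreate the example data frame
df <- data.frame(Name = c("Alice", "Bob", "Charlie", "David"),
                 Age = c(25, 30, 35, 40),
                 City = c("Paris", "London", "Berlin", "New York"))

# Rows where Age is less than 35, keeping only Name and City
under_35 <- subset(df, Age < 35, select = c(Name, City))
print(under_35)

# A minus sign in select drops a column instead
no_city <- subset(df, Age < 35, select = -City)
print(no_city)
```

subset() is convenient for interactive work. Inside functions, explicit indexing with `[` tends to be more predictable, because subset() evaluates column names in a non-standard way.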

Extraction Using dplyr Functions

The dplyr package provides powerful functions for manipulating and extracting subsets of data.

Example: Extraction with filter() and select() 

# Load the dplyr package
library(dplyr)
# Extract rows where Age is greater than 30 and select "Name" and "City" columns
subset_dplyr <- df %>%
  filter(Age > 30) %>%
  select(Name, City)
print(subset_dplyr)
# Output:
#      Name     City
# 1 Charlie   Berlin
# 2   David New York
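
filter() accepts several conditions separated by commas, which are combined with a logical AND, and select() can drop a column with a minus sign. The sketch below assumes dplyr is installed and rebuilds `df`; the variable name is illustrative.

```r
library(dplyr)

# Recreate the example data frame
df <- data.frame(Name = c("Alice", "Bob", "Charlie", "David"),
                 Age = c(25, 30, 35, 40),
                 City = c("Paris", "London", "Berlin", "New York"))

# Every comma-separated condition must be TRUE for a row to be kept
not_berlin_over_25 <- df %>%
  filter(Age > 25, City != "Berlin") %>%
  select(-Age)   # drop the Age column, keep the rest
print(not_berlin_over_25)
```

Here the result should contain Bob and David with only the Name and City columns.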

Extraction Using slice() for Row Ranges

The slice() function from dplyr selects rows by position, which makes it convenient for extracting a range of rows.

Example: Extraction of Row Ranges 

# Extract rows 2 to 4
subset_slice <- df %>%
  slice(2:4)
print(subset_slice)
# Output:
#      Name Age     City
# 1     Bob  30   London
# 2 Charlie  35   Berlin
# 3   David  40 New York
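
Besides explicit ranges, dplyr offers helpers for the first or last rows, and slice() accepts negative positions to drop rows. This sketch assumes dplyr 1.0.0 or later (for `slice_head()` and `slice_tail()`) and rebuilds `df`; the variable names are illustrative.

```r
library(dplyr)

# Recreate the example data frame
df <- data.frame(Name = c("Alice", "Bob", "Charlie", "David"),
                 Age = c(25, 30, 35, 40),
                 City = c("Paris", "London", "Berlin", "New York"))

# First two rows
first_two <- df %>% slice_head(n = 2)

# Last row
last_row <- df %>% slice_tail(n = 1)

# Everything except the first row
without_first <- df %>% slice(-1)

print(first_two)
print(last_row)
print(without_first)
```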

Extraction with which() for Logical Indices

The which() function can be used to get indices corresponding to a logical condition.

Example: Extraction with which() 

# Get indices of rows where Age is greater than 30
indices <- which(df$Age > 30)
# Use indices to extract sub-data frames
subset_which <- df[indices, ]
print(subset_which)
# Output:
#      Name Age     City
# 3 Charlie  35   Berlin
# 4   David  40 New York
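
One practical difference between which() and plain logical indexing shows up with missing values. The sketch below uses a small hypothetical data frame, `ages`, that contains an `NA`.

```r
# Hypothetical data with a missing Age
ages <- data.frame(Name = c("Alice", "Bob", "Charlie"),
                   Age = c(25, NA, 35))

# Logical indexing: the NA in the condition produces a row filled with NA
by_logical <- ages[ages$Age > 30, ]
print(nrow(by_logical))  # 2

# which() returns only the positions where the condition is TRUE
by_which <- ages[which(ages$Age > 30), ]
print(nrow(by_which))    # 1
```

With complete data both approaches return the same rows, so which() is mostly useful when the column being tested may contain `NA`.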

Extraction Using Negative Indices

Negative indices allow you to exclude specific rows or columns during extraction.

Example: Excluding Rows or Columns 

# Exclude row 2
subset_exclude_row <- df[-2, ]
print(subset_exclude_row)
# Output (for rows):
#      Name Age     City
# 1   Alice  25    Paris
# 3 Charlie  35   Berlin
# 4   David  40 New York
# Exclude the "City" column (column 3)
subset_exclude_col <- df[, -3]
print(subset_exclude_col)
# Output (for columns):
#      Name Age
# 1   Alice  25
# 2     Bob  30
# 3 Charlie  35
# 4   David  40
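
Negative indices work with positions, not names, so excluding a column by name needs a slightly different approach. A minimal sketch using `setdiff()`, with `df` rebuilt and illustrative variable names:

```r
# Recreate the example data frame
df <- data.frame(Name = c("Alice", "Bob", "Charlie", "David"),
                 Age = c(25, 30, 35, 40),
                 City = c("Paris", "London", "Berlin", "New York"))

# Keep every column whose name is not "City"
keep <- setdiff(names(df), "City")
without_city <- df[, keep]
print(without_city)

# Several rows can be excluded at once with a negative vector
without_first_two <- df[-c(1, 2), ]
print(without_first_two)
```

Excluding by name is more robust than `df[, -3]` when the column order may change.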
