Data Structure: Data Frame

Create a data frame named employees with the columns Name (character), Department (character), Age (numeric), and Salary (numeric). Include data for at least five employees.

Solution:

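# Illustrative sample values; any five or more employees will do
employees <- data.frame(
  Name = c("Alice", "Bob", "Carol", "David", "Eva"),
  Department = c("Finance", "IT", "HR", "Finance", "IT"),
  Age = c(28, 34, 45, 38, 29),
  Salary = c(42000, 55000, 72000, 65000, 51000)
)
employees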
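
The exercise asks for specific column types, so it can help to check them after building the data frame. The sketch below uses a separate, hypothetical data frame (employees_demo, with made-up values) so it runs on its own; stringsAsFactors = FALSE is set explicitly because R versions before 4.0 convert character columns to factors by default.

```r
# Hypothetical sample values, for illustration only
employees_demo <- data.frame(
  Name = c("Ana", "Ben", "Chloe", "Dev", "Eli"),
  Department = c("Finance", "HR", "IT", "Finance", "IT"),
  Age = c(28, 35, 42, 31, 39),
  Salary = c(48000, 52000, 75000, 61000, 58000),
  stringsAsFactors = FALSE
)

# Class of each column: character, character, numeric, numeric
col_types <- sapply(employees_demo, class)
col_types

# Compact overview of dimensions, column types, and first values
str(employees_demo)
```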

Filter the employees data frame to include only employees who are older than 30.

Solution:

Don’t use dplyr
older_than_30 <- employees[employees$Age > 30, ]
older_than_30
Use dplyr
library(dplyr)  # load once; the dplyr solutions below rely on it
older_than_30 <- employees |>
  filter(Age > 30)
older_than_30
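
One difference between the two approaches shows up when Age contains missing values: logical indexing with [ keeps a row of NAs for every NA in the condition, while dplyr's filter() drops those rows. A minimal base R sketch of the behaviour, using a hypothetical data frame (ages_demo) with made-up values:

```r
# Hypothetical data with one missing age
ages_demo <- data.frame(
  Name = c("Ana", "Ben", "Chloe"),
  Age = c(28, NA, 42),
  stringsAsFactors = FALSE
)

# The condition is c(FALSE, NA, TRUE); the NA yields an all-NA row
with_na <- ages_demo[ages_demo$Age > 30, ]
nrow(with_na)

# which() keeps only the positions that are TRUE, so the NA row disappears
no_na <- ages_demo[which(ages_demo$Age > 30), ]
no_na
```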

Add a new column to the employees data frame named Experience, where values are “Senior” if Age is above 35 and “Junior” otherwise.

Hint: use the ifelse() function.

Solution:

Don’t use dplyr
employees$Experience <- ifelse(employees$Age > 35, "Senior", "Junior")
employees
Use dplyr
employees <- employees |> 
             mutate(Experience = ifelse(Age > 35, 
                                  "Senior", "Junior"))
employees
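
Because the condition is "above 35", an employee aged exactly 35 is labelled "Junior". The sketch below runs ifelse() on a hypothetical vector of ages to show how it evaluates the test element by element:

```r
# Hypothetical ages, including the boundary value 35
ages <- c(28, 35, 36, 42)

# ifelse() is vectorized: one result per element of the test
experience <- ifelse(ages > 35, "Senior", "Junior")
experience
# "Junior" "Junior" "Senior" "Senior"
```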

Update the Salary column in the employees data frame to increase each employee’s salary by 10%.

Solution:

Don’t use dplyr
employees$Salary <- employees$Salary * 1.10
employees
Use dplyr
employees <- employees |> 
             mutate(Salary = Salary * 1.10)
employees
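
The two solutions above are alternatives. Each one overwrites Salary, so running both (or one of them twice) compounds the raise to 21% instead of 10%. A small sketch with hypothetical salaries:

```r
# Hypothetical starting salaries
salary <- c(50000, 60000)

once <- salary * 1.10   # the intended 10% raise
twice <- once * 1.10    # what happens if the update runs a second time

# all.equal() compares with a tolerance, which avoids floating-point noise
all.equal(twice, salary * 1.21)
```

Keeping the original values in a separate object before updating makes it easy to start over.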

Calculate the mean and median Salary of employees using basic R functions.

Solution:

Don’t use dplyr
mean_salary <- mean(employees$Salary)
mean_salary

median_salary <- median(employees$Salary)
median_salary
Use dplyr
mean_salary <- employees |>
  summarise(mean_salary = mean(Salary))
mean_salary

median_salary <- employees |>
  summarise(median_salary = median(Salary))
median_salary
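
Both statistics can also be collected in one named vector. This is a base R sketch on hypothetical salaries; na.rm = TRUE is an added assumption that missing values should be ignored rather than propagate NA into the result.

```r
# Hypothetical salaries, for illustration only
salaries <- c(48000, 52000, 75000, 61000, 58000)

salary_stats <- c(
  mean = mean(salaries, na.rm = TRUE),
  median = median(salaries, na.rm = TRUE)
)
salary_stats
# mean 58800, median 58000
```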

Remove employees who have a Salary less than 50,000 from the employees data frame.

Solution:

Don’t use dplyr
employees <- employees[employees$Salary >= 50000, ]
employees
Use dplyr
employees <- employees |> 
             filter(Salary >= 50000)
employees

Rename the Department column in the employees data frame to Dept.

Solution:

Don’t use dplyr
names(employees)[names(employees) == "Department"] <- "Dept"
employees
Use dplyr
employees <- employees |> 
              rename(Dept = Department)
employees

Select only the Name and Salary columns of employees who belong to the “Finance” department.

Solution:

Don’t use dplyr
finance_employees <- employees[employees$Dept == "Finance", c("Name", "Salary")]
finance_employees
Use dplyr
finance_employees <- employees |> 
                     filter(Dept == "Finance") |> 
                     select(Name, Salary)
finance_employees
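
Base R's subset() offers a third way to combine the row filter and the column selection in one call. The sketch below uses a hypothetical data frame (staff_demo) with made-up values:

```r
# Hypothetical data, for illustration only
staff_demo <- data.frame(
  Name = c("Ana", "Ben", "Chloe"),
  Dept = c("Finance", "IT", "Finance"),
  Salary = c(52000, 75000, 48000),
  stringsAsFactors = FALSE
)

# Second argument filters rows; select picks the columns to keep
finance_demo <- subset(staff_demo, Dept == "Finance", select = c(Name, Salary))
finance_demo
```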

Sort the employees data frame by Salary in ascending order.

Solution:

Don’t use dplyr
employees_sorted <- employees[order(employees$Salary), ]
employees_sorted
Use dplyr
employees_sorted <- employees |> 
                    arrange(Salary)
employees_sorted
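
Sorting in base R relies on order(), which returns the row positions in sorted order. As a sketch on a hypothetical data frame (pay_demo), the same call with decreasing = TRUE gives the descending variant; in dplyr the equivalent is arrange(desc(Salary)).

```r
# Hypothetical data, for illustration only
pay_demo <- data.frame(
  Name = c("Ana", "Ben", "Chloe"),
  Salary = c(52000, 75000, 48000),
  stringsAsFactors = FALSE
)

# order() returns row indices; indexing with them reorders the rows
ascending <- pay_demo[order(pay_demo$Salary), ]
descending <- pay_demo[order(pay_demo$Salary, decreasing = TRUE), ]

ascending$Name   # "Chloe" "Ana" "Ben"
descending$Name  # "Ben" "Ana" "Chloe"
```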

Identify the employee with the highest Salary and display their Name and Department (the Dept column after the rename above).

Solution:

Don’t use dplyr

max_salary <- max(employees$Salary)
top_employee <- employees[employees$Salary == max_salary, c("Name", "Dept")]

top_employee
Use dplyr
top_employee <- employees |> 
                filter(Salary == max(Salary)) |>
                select(Name, Dept)
print(top_employee)
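
Comparing against max(Salary) returns every employee who shares the top salary. A common alternative, which.max(), returns only the first match. The sketch below shows the difference on a hypothetical data frame (pay_demo) with a deliberate tie:

```r
# Hypothetical data with two employees tied for the highest salary
pay_demo <- data.frame(
  Name = c("Ana", "Ben", "Chloe"),
  Dept = c("Finance", "IT", "IT"),
  Salary = c(75000, 75000, 48000),
  stringsAsFactors = FALSE
)

# which.max() gives the position of the first maximum only
first_top <- pay_demo[which.max(pay_demo$Salary), c("Name", "Dept")]

# Comparing with max() keeps all tied rows
all_top <- pay_demo[pay_demo$Salary == max(pay_demo$Salary), c("Name", "Dept")]

nrow(first_top)  # 1
nrow(all_top)    # 2
```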