Warning Midterm Exam (Part 2 of 2) on Jan 9, 2025, 3:30–5:30 PM. at RB3208 Total Score: 15% (5% Multiple Choice, 10% Writing R Code). Topics: - Data Structure: Data Frame - Data Wrangling - Data Structure: XTS Project 1: 10% Create your CV using Quarto and upload the HTML and related files to CMU’s personal web hosting. 1. Submit your HTML file to your CMU personal web hosting. 2. Zip your project folder. The zip file should include the .qmd file, any image files, and CSS files (if used). I will render your files to compare them with your URL. Send your URL and the zip file to my email: somsak.c@icdi.cmu.ac.th before midnight on January 5, 2025. Content Requirements: Your CV must include: - Your picture - Your birthday - Your education - Your hobbies - Links to at least 2 social media platforms - Your future plans Criteria: 1. The HTML file and .qmd file must match. 2. Use a theme. 3. Use CSS. 4. Include a table. 5. Include a tabset 6. Include a video. 5. Use icons from Font Awesome (e.g., Facebook, YouTube, LINE, WeChat, etc.). 8. Modify the layout creatively.
Additional Note: Some of your score will be determined by votes from your peers.
Data Wrangling
Before beginning the exercises, create a sample data frame named employees for use in the exercises.
For employees with more than 10 years of experience, create a new column called Seniority with values “Senior” if Years_Experience is greater than 10, otherwise “Junior”. Select only the Name, Years_Experience, and Seniority columns.
Calculate the total Salary for each department and then create a new column Total_Budget that is 1.1 times this total (representing a 10% budget increase).
Add a new column Age_Group that categorizes employees as “Young” if Age is less than 30, and “Experienced” if 30 or older. Then filter for employees in the “Experienced” age group.
Add a new column Adjusted_Salary that is 95% of the current Salary for employees in “HR” and “Finance”. Then calculate the average Adjusted_Salary for each department.
Create a column Experience_Ratio that is Years_Experience divided by Age. Filter for employees with a Experience_Ratio greater than 0.3, then arrange by Experience_Ratio in descending order.
For each department, calculate the average Salary, and then add a column Above_Avg_Salary that marks TRUE if an employee’s Salary is above their department’s average, and FALSE otherwise.
Filter for employees in the “IT” department, add a column called Tenure_Level that labels them as “Experienced” if they have 10 or more years of experience, otherwise “New”. Select only the Name, Tenure_Level, and Salary.
Filter for employees in “HR” or “Marketing”, group by Department, calculate the average Salary and Years_Experience in each department, and arrange by average Salary in ascending order.
Perform a left join on employees2 and salaries based on EmployeeID. Show all columns, and observe which employees have missing salary information.
Solution:
employees2 |>left_join(salaries, by ="EmployeeID")
22. Right Join:
Perform a right join on employees2 and salaries based on EmployeeID. Check which employees in the salaries data frame do not have corresponding information in employees.
Solution:
employees2 |>right_join(salaries, by ="EmployeeID")
23. Inner Join:
Use an inner join to combine employees2 and salaries by EmployeeID. Display only the records where both employee information and salary information are available.
Solution:
employees2 |>inner_join(salaries, by ="EmployeeID")
24. Full Join:
Use a full join to combine employees2 and salaries based on EmployeeID. Identify any employees or salary records without a match in the other table.
Solution:
employees2 |>full_join(salaries, by ="EmployeeID")
25. Join with Filtering:
Perform a left join on employees2 and salaries based on EmployeeID, then filter to show only employees with salaries above 60,000.
Solution:
employees2 |>left_join(salaries, by ="EmployeeID") |>filter(Salary >60000)
26. Multiple Joins:
First, perform a left join between employees2 and salaries based on EmployeeID. Then, join the resulting data with departments to match each employee with their department manager based on the Department column.
Solution:
employees2 |>left_join(salaries, by ="EmployeeID") |>left_join(departments, by =c("Department"="DepartmentName"))
27. Summarize After Join:
Perform an inner join between employees2 and salaries based on EmployeeID. Then, calculate the average salary for each department.
Solution:
employees2 |>inner_join(salaries, by ="EmployeeID") |>group_by(Department) |>summarise(avg_salary =mean(Salary, na.rm =TRUE))
28. Self-Join:
Use left_join to join the employees2 data frame to itself on the Department column to compare employees within the same department.
Solution:
employees2 |>left_join(employees, by ="Department", suffix =c("_1", "_2")) |>filter(EmployeeID_1 != EmployeeID_2)
29. Join with Different Key Columns:
Use full_join to combine employees2 and departments by matching the Department column in employees2 with the DepartmentName column in departments.
Solution:
employees2 |>full_join(departments, by =c("Department"="DepartmentName"))
30. Identify Missing Data with Outer Join:
Perform a full join on employees2 and salaries based on EmployeeID. Filter the result to show only the records where either Name or Salary is NA to identify employees without salary data or vice versa.
Solution:
employees2 |>full_join(salaries, by ="EmployeeID") |>filter(is.na(Name) |is.na(Salary))