International College of Digital Innovation, CMU
November 29, 2024
In R, objects and functions are two fundamental concepts, but they serve different purposes and have different characteristics.
Object
An object in R is a data structure that stores values or data. Objects can be of various types, such as vectors, lists, matrices, data frames, or even more complex structures like models.
Purpose: Objects are used to hold data that you can manipulate, analyze, or pass as inputs to functions.
Function
A function in R is a set of instructions or code designed to perform a specific task. Functions take inputs (arguments), execute a series of commands, and often return a result.
Purpose: Functions are used to perform operations, calculations, or transformations on data (objects).
In R, characters are a basic data type used to represent text.
They are typically stored as character vectors.
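A brief sketch of how character values behave. The variable names and strings are illustrative and are not taken from the original notes.

```r
# A single string is a character vector of length 1
greeting <- "Hello, R!"

# Several strings combined into one character vector
fruits <- c("apple", "banana", "cherry")

class(greeting)   # "character"
length(fruits)    # 3
nchar(fruits)     # 5 6 6: number of characters in each string
```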
In R, numeric data types are used to represent numbers. This includes both integers and floating-point (decimal) numbers.
In R, integers are a specific type of numeric data that represent whole numbers.
We can define integers with the L suffix:
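The following sketch contrasts a plain number with one written with the L suffix. The names x and y are placeholders chosen for illustration.

```r
x <- 5      # stored as a double by default
y <- 5L     # the L suffix makes it an integer

typeof(x)   # "double"
typeof(y)   # "integer"
class(x)    # "numeric"
class(y)    # "integer"
x == y      # TRUE: the values are equal even though the storage types differ
```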
In R, logical data types are used to represent boolean values, which can be either TRUE or FALSE (abbreviated T or F).
Logical values are essential for controlling the flow of programs through conditional statements and loops, and they are also useful for indexing and subsetting data.
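As a small illustration of these uses, the sketch below applies logical values to a condition and to subsetting. The data (scores, is_raining) and the pass mark of 60 are made up for the example.

```r
is_raining <- TRUE
scores <- c(45, 72, 88, 59)

passed <- scores >= 60   # FALSE TRUE TRUE FALSE
scores[passed]           # 72 88: logical indexing keeps the TRUE positions
sum(passed)              # 2: TRUE counts as 1, FALSE as 0

# A logical value controlling a conditional statement
if (is_raining) {
  message("Take an umbrella")
}
```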
Operators in R are used to perform various operations on variables and values.
They can be categorized into several types:
Arithmetic Operators
Comparison Operators
Logical Operators
Assignment Operators
etc.
In R, arithmetic operators are used to perform basic mathematical operations on numeric values.
These operators are fundamental to performing calculations and are applied element-wise when used with vectors.
Basic arithmetic operations:
Addition (+)
Subtraction (-)
Multiplication (*)
Division (/)
Exponentiation (^)
Modulus (%%)
Integer Division (%/%)
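A minimal sketch of each operator in the list above, using two arbitrary values (17 and 5) picked for illustration.

```r
a <- 17
b <- 5

a + b     # 22
a - b     # 12
a * b     # 85
a / b     # 3.4
a ^ 2     # 289
a %% b    # 2  (remainder of 17 / 5)
a %/% b   # 3  (whole-number part of 17 / 5)

# Applied to a vector, each operator works element by element
c(1, 2, 3) * 2   # 2 4 6
```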
In R, comparison operators are used to compare two values or variables. The result of a comparison operation is a logical value (TRUE or FALSE).
Greater than (>)
Less than (<)
Equal to (==)
Greater than or equal to (>=)
Less than or equal to (<=)
Not equal to (!=)
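The six comparison operators can be tried on a pair of single values. The numbers below are arbitrary and only serve as an illustration.

```r
x <- 7
y <- 10

x > y    # FALSE
x < y    # TRUE
x == y   # FALSE
x >= 7   # TRUE
x <= 6   # FALSE
x != y   # TRUE
```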
Logical operators are used to perform logical operations, often in the context of conditional statements or when working with Boolean values (TRUE or FALSE).
Logical operators
&: Element-wise logical AND
|: Element-wise logical OR
!: Negation (unary operator for negating a logical value)
These operators are essential for making decisions and controlling the flow of your code.
Logical operator AND (&)
Example
Logical operator OR (|)
Example
Logical operator NOT (!)
Example
Example
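One possible set of examples for the three logical operators, using two hypothetical logical vectors p and q that together cover every TRUE/FALSE combination.

```r
p <- c(TRUE, TRUE, FALSE, FALSE)
q <- c(TRUE, FALSE, TRUE, FALSE)

p & q    # TRUE FALSE FALSE FALSE
p | q    # TRUE TRUE TRUE FALSE
!p       # FALSE FALSE TRUE TRUE

# Combining comparisons with a logical operator
age <- 25
age >= 18 & age < 65   # TRUE
```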
In R, assignment operators are used to assign values to variables. They are fundamental for storing data, defining variables, and setting up computations.
<-: The most commonly used assignment operator in R. It assigns the value on the right to the variable on the left.
=: Also works for assignment, but by convention it is reserved for specifying arguments in function calls rather than for assigning variables.
Example
or
Remark
When we assign any data structure to an object name, R does not display the value on your screen.
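A short sketch of both assignment operators and of how to display a value after assigning it. The names x, y, z, and w are placeholders.

```r
x <- 10        # assignment with <-
y = 20         # assignment with = also works at the top level
z <- x + y     # an assignment prints nothing

z              # typing the name prints the value: 30
print(z)       # print() does the same explicitly
(w <- z * 2)   # wrapping the assignment in parentheses assigns and prints: 60
```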
In R, reserved words (or keywords) are special words that have a specific meaning within the language.
These words cannot be used as identifiers (such as variable names, function names, etc.) because they are part of the language syntax.
if, else
repeat
while
function
for
in
next
break
TRUE
FALSE
NULL
NA (Not Available, or missing value) and its variants NA_integer_, NA_real_, NA_complex_, NA_character_
Inf (represents infinity)
NaN (Not a Number)
... (ellipsis, used to pass additional arguments to functions)
Two related names are not reserved words, although they often appear next to them: return is an ordinary built-in function, and ~ (tilde, used in model formulas) is an operator.
Reserved Words in Context:
if, else, for, while, repeat, break, and next are used to control the flow of execution in an R program.
TRUE, FALSE, and NULL are used to represent logical values and the absence of any value.
Inf and NaN represent mathematical concepts (infinity and an undefined result, respectively).
NA is used for missing data, which is very common in data analysis.
The function keyword is used to define new functions in R.
Importance of Reserved Words:
Syntax Rules: Reserved words form the core syntax of R, and their correct usage is essential for writing valid R code.
Naming Restrictions: Since reserved words have specific functions within the language, you cannot use them as variable or function names, which helps avoid confusion and errors in the code.
Understanding reserved words in R helps in writing clear, error-free code and avoids conflicts in naming conventions.
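To see the naming restriction in practice, the sketch below asks R's parser whether a line of code is syntactically valid. The helper is_valid() is a hypothetical function written for this illustration; it is not part of R.

```r
# Returns TRUE if the code parses, FALSE if the parser rejects it
is_valid <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

is_valid("if <- 5")         # FALSE: if is reserved and cannot be assigned to
is_valid("my_value <- 5")   # TRUE: an ordinary name is fine
```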
In R, the c() function is one of the most fundamental functions.
It is used to combine or concatenate elements to create a vector.
The c() function can take multiple arguments and combine them into a single vector.
A vector of numbers
A vector of characters
A vector of logicals
A vector of mixed element types that include text: the result is a character vector
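The four cases listed above might look like this. The vector names and values are invented for the example.

```r
numbers <- c(1, 2, 3.5)           # numeric vector
words   <- c("a", "b", "c")       # character vector
flags   <- c(TRUE, FALSE, TRUE)   # logical vector
mixed   <- c(1, "a", TRUE)        # mixed types are coerced to character

class(numbers)   # "numeric"
class(words)     # "character"
class(flags)     # "logical"
mixed            # "1" "a" "TRUE"
```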
In R, you can create sequences of numbers using various functions.
The most common ways to generate sequences are by using
the : operator.
the seq() function.
the rep() function.
1. Using the : Operator
The : operator generates a simple sequence of integers.
These methods allow you to generate sequences easily and are fundamental for data manipulation and iteration in R.
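A minimal sketch of all three approaches. Only the : operator is described in detail above, so the seq() and rep() calls show typical usage with arguments chosen for illustration.

```r
1:5                                # 1 2 3 4 5
5:1                                # 5 4 3 2 1 (counts down)
seq(from = 0, to = 1, by = 0.25)   # 0.00 0.25 0.50 0.75 1.00
seq(1, 10, length.out = 4)         # 1 4 7 10
rep(c("a", "b"), times = 2)        # "a" "b" "a" "b"
rep(c("a", "b"), each = 2)         # "a" "a" "b" "b"
```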
Example
Arithmetic Operation examples
Important
In R, e is a way to express numbers in scientific notation. Specifically:
1e2 means \(1 \times 10^2\), which equals 100.
The e in 1e2 stands for “exponent,” so 1e2 is equivalent to 1 * (10 ^ 2).
Here are a few more examples of using scientific notation in R:
2e3 is equal to \(2 \times 10^3\), which equals 2000.
5e-2 is equal to \(5 \times 10^{-2}\), which equals 0.05.
3.14e1 is equal to \(3.14 \times 10^1\), which equals 31.4.
This notation is particularly useful for dealing with very large or very small numbers in calculations.
Note: You cannot use e alone, without a number before it and an exponent after it, as shown in the example below.
Example
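The values quoted above can be checked directly at the console. The last lines are left as comments because running them stops with an error in a session where no variable named e2 exists.

```r
1e2               # 100
2e3               # 2000
5e-2              # 0.05
3.14e1            # 31.4
1e2 == 1 * 10^2   # TRUE

# Without a leading number, R reads e2 as a variable name, not as a number:
# e2              # Error: object 'e2' not found
```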
Greater than
Less than
Equal to
Greater than or equal to
Less than or equal to
Not equal to
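Applied to a vector, each comparison is carried out element by element and returns a logical vector of the same length. The vector v below is a made-up example.

```r
v <- c(3, 8, 5, 8)

v > 5     # FALSE TRUE FALSE TRUE
v < 5     # TRUE FALSE FALSE FALSE
v == 8    # FALSE TRUE FALSE TRUE
v >= 5    # FALSE TRUE TRUE TRUE
v <= 5    # TRUE FALSE TRUE FALSE
v != 8    # TRUE FALSE TRUE FALSE
```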
Merge/Access/Replace value in vector
Merge
Warning
Merge vector A and vector B
Merge vector B and vector A
Note: order matters, so merging A and B does not give the same vector as merging B and A.
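A sketch of merging with c(), assuming two small vectors A and B. Their contents are not given in the notes, so the values here are invented.

```r
A <- c(1, 2, 3)     # illustrative values
B <- c(10, 20)      # illustrative values

AB <- c(A, B)       # 1 2 3 10 20
BA <- c(B, A)       # 10 20 1 2 3

identical(AB, BA)   # FALSE: same elements, different order
```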
Access
From D, show the values at positions 1, 2, 3, 4, and 5.
Or
From D, show the values at the even positions.
From D, show every value except the one at position 5.
From D, show the values at positions 9 and 1, in that order.
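The access tasks above could be written as follows. The contents of D are not shown in the notes, so this sketch assumes a ten-element vector with made-up values.

```r
D <- seq(10, 100, by = 10)     # assumed contents: 10 20 30 ... 100

D[1:5]                         # 10 20 30 40 50
D[c(1, 2, 3, 4, 5)]            # same result, positions listed one by one
D[seq(2, length(D), by = 2)]   # 20 40 60 80 100 (even positions)
D[-5]                          # every value except the one at position 5
D[c(9, 1)]                     # 90 10
```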
Replace
From D, change the value in position 1 to 21.
From D, change every value from position 1 through position 5 to 25.
From D, change the values at position 1 and position 10 to 30 and 35, respectively.
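The three replacements can be sketched like this, again assuming a ten-element D with invented values and applying the steps one after another.

```r
D <- seq(10, 100, by = 10)   # assumed contents: 10 20 30 ... 100

D[1] <- 21                   # position 1 becomes 21
D[1:5] <- 25                 # positions 1 to 5 all become 25
D[c(1, 10)] <- c(30, 35)     # position 1 becomes 30, position 10 becomes 35

D                            # 30 25 25 25 25 60 70 80 90 35
```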
typeof() and class() functions
typeof(): This function tells you the internal storage mode or type of the object, which is how R internally represents the data. It focuses on the low-level storage type.
class(): This function returns the class or high-level type of an object, which often corresponds to how the object is treated by R’s methods.
Common types returned by typeof() include:
“logical”
“integer”
“double”
“complex”
“character”
“list”
“NULL”
Some common classes are:
“numeric”
“factor”
“data.frame”
“matrix”
“lm” (linear model)
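The difference between the two functions shows up most clearly on objects whose class differs from their storage type. The objects below are small illustrative examples.

```r
x  <- 3.14
n  <- 5L
f  <- factor(c("low", "high", "low"))
df <- data.frame(id = 1:2)

typeof(x);  class(x)    # "double",  "numeric"
typeof(n);  class(n)    # "integer", "integer"
typeof(f);  class(f)    # "integer", "factor"
typeof(df); class(df)   # "list",    "data.frame"
```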
Is the object a character/numeric/logical/integer?
In R, is.xxxx() functions are a family of functions used to check if an object is of a particular type or class. These functions return TRUE if the object matches the specified type or class, and FALSE otherwise.
is.character(): Checks if an object is of type character.
is.numeric(): Checks if an object is of type numeric (either integer or double).
is.logical(): Checks if an object is of type logical (TRUE or FALSE).
is.integer(): Checks if an object is of type integer.
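A few quick checks with these functions, using literal values chosen for illustration.

```r
is.character("hello")   # TRUE
is.numeric(3.14)        # TRUE
is.numeric(5L)          # TRUE: integers count as numeric
is.integer(5)           # FALSE: 5 without the L suffix is stored as a double
is.integer(5L)          # TRUE
is.logical(TRUE)        # TRUE
is.numeric("5")         # FALSE: a quoted number is a character string
```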
Assign NULL to an object
We can discard the data held by an object by assigning NULL to it. The name stays in the environment but now refers to NULL, so the data it used to hold can be freed from memory.
check
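One way to check the effect of assigning NULL, using a throwaway object named temp (a placeholder name).

```r
temp <- c(1, 2, 3)
temp <- NULL

is.null(temp)    # TRUE: the original data are gone
length(temp)     # 0
exists("temp")   # TRUE: the name itself is still defined
```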
Use the rm() function
The rm() function is used to remove objects from the environment.
check
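A minimal check that rm() removes the name itself. The object scratch is a placeholder created only for this example.

```r
scratch <- 1:10
exists("scratch")   # TRUE

rm(scratch)
exists("scratch")   # FALSE: the object is no longer in the environment

# rm(list = ls()) would remove every object in the current environment
```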
Note: you can use <anyname>.RData
The sample() function in R is used to generate a random sample of elements from a specified set of data, with or without replacement.
Parameters:
x: A vector of elements from which to choose.
size: The number of items to choose.
replace: Logical; if TRUE, sampling is done with replacement (elements can be selected more than once). Default is FALSE.
prob: A vector of probability weights for obtaining the elements of the vector being sampled.
Examples
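A sketch that exercises each parameter described above. The seed value, sample sizes, and probabilities are arbitrary choices, and no specific draws are shown because they depend on the random number generator.

```r
set.seed(123)   # fix the seed so the draws are repeatable (value is arbitrary)

# Without replacement: 5 distinct values from 1 to 10
draw_unique <- sample(x = 1:10, size = 5)

# With replacement: values may repeat
draw_repeat <- sample(x = 1:3, size = 10, replace = TRUE)

# Weighted sampling: "H" is drawn with probability 0.7, "T" with 0.3
coin <- sample(c("H", "T"), size = 20, replace = TRUE, prob = c(0.7, 0.3))

anyDuplicated(draw_unique)   # 0: no value appears twice
```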
The paste() and paste0() functions in R are used to concatenate strings or other objects into a single string. They differ in how they handle separators: paste() inserts a space between the pieces by default (set with the sep argument), while paste0() joins them with no separator.
Examples
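A few calls that show the separator behaviour. The strings are placeholders chosen for illustration.

```r
paste("Hello", "world")                   # "Hello world" (default sep = " ")
paste("Hello", "world", sep = "-")        # "Hello-world"
paste0("Hello", "world")                  # "Helloworld" (no separator)
paste0("item", 1:3)                       # "item1" "item2" "item3"
paste(c("a", "b", "c"), collapse = "+")   # "a+b+c" (collapsed into one string)
```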
The length() function in R is used to determine the number of elements in an object.
It returns the count of elements present in vectors, lists, arrays, or other objects in R.
For instance, if you have a vector containing numbers or strings, length() will provide the count of elements present within that vector.
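For example, with a few illustrative objects:

```r
length(c(10, 20, 30))        # 3
length(c("a", "b"))          # 2
length(list(1, "x", TRUE))   # 3
length(NULL)                 # 0
length("hello")              # 1: one string, not five characters
nchar("hello")               # 5: use nchar() to count characters
```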
[…] my_vector.
[…] my_vector, access the third element in the vector.
[…] my_vector.
[…] my_vector.
[…] my_vector2 with the values 1, 2, 3, 4, and 5. Add my_vector and my_vector2 element-wise.
[…] my_vector are greater than 25.
[…] my_vector to only include elements that are greater than […]
[…] my_vector with 99.
The 5 exercises focusing on the seq(), rep(), paste()/paste0(), and sample() functions in R:
1. Use the seq() function to create a sequence of numbers from 5 to 50 with a step of 5. Assign the result to a variable named my_seq.
2. Use the rep() function to create a vector that repeats the elements 1, 2, and 3, each five times.
3. Use the paste() function to combine the string “Day” with the numbers from 1 to 7. The result should be c("Day 1", "Day 2", ..., "Day 7").
4. Use the sample() function to generate a random sample of 5 unique numbers from the sequence you created in Exercise 1 (my_seq). Assign the result to my_sample.
5. Create a vector of IDs with the paste0() function. The IDs should be in the format “ID1”, “ID2”, …, “ID10”. Use paste0() and the seq() function to create this vector.