In programming there are certain standard concepts one needs to know irrespective of the domain in which one is working. You can think of these concepts as the alphabet of a human language. We need a good understanding of the basics before applying our coding skills to data analysis. This chapter introduces some of the fundamental concepts in R, which we’ll be using in the rest of this book.
A package in R is a collection of functions (more about these later) and sample datasets. When we install R, a specific set of packages gets installed. These include basic functions for input, output, file handling, etc. In fact, the package with these basic functions is called base. Another frequently used package that is part of the default installation is utils. By default, packages are installed in the library folder within the R installation path. To get a list of installed packages, run library().
We can always install additional packages using install.packages("<package name>"). Once a package is installed, load it into the R environment using library(<package name>).
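The following is a minimal sketch of how one might inspect packages in a session. It only uses the utils package, which ships with R, so nothing is downloaded; the variable names attached and has_utils are chosen here for illustration.

```r
# List the packages (and other environments) attached in the current session.
attached <- search()
print(attached)

# Check whether a package is installed without attaching it.
# "utils" is part of the default installation, so this should be TRUE.
has_utils <- requireNamespace("utils", quietly = TRUE)
print(has_utils)

# Attach an installed package so its functions can be called directly.
library(utils)
```

The same pattern applies to any additional package once install.packages has been run for it.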
The print function, as expected, prints its argument on the console. It prints a single object per call. Note that print is one of the ‘default’ functions in R, i.e. when a variable or an object is evaluated at the console on its own, without being passed to any function, print is called with that object as its argument. This prints the value of the variable or object (as is) on the standard output (screen). When printing text, the logical keyword argument quote (default TRUE) indicates whether to print with or without quotes.
print("hi")
[1] "hi"
"hello"
[1] "hello"
print("Hello World!", quote = FALSE)
[1] Hello World!
To print multiple items, cat (concatenate and print) should be used. Note that print adds a newline after printing while cat doesn’t. So, to print on the next line with cat, explicit use of the newline character ("\n") is required. There are a couple of other differences between the default behaviour of print and cat: unlike print, the cat function doesn’t prefix the output with the element index (the [1] seen above, which is not a line number), and it prints character values without quotes.
x <- 5
print("Hello World")
[1] "Hello World"
print(x)
[1] 5
cat ("Hello", "World", "\n")
Hello World
cat (x)
5
The sep keyword argument for the cat function specifies the separator to use when printing different elements. The default separator is a blank space character.
nums <- c(1:10)
cat(nums, "\n")
1 2 3 4 5 6 7 8 9 10
cat(nums, sep = ",")
1,2,3,4,5,6,7,8,9,10
A variable can be assigned a value using an assignment operator. R offers five different operators for this task: <-, <<-, ->, ->>, and =. The code below shows the syntax for assigning values to five variables (x1 to x5) using the different assignment operators. <- is the preferred assignment operator.
x1 <- 5 # using <-
x2 <<- 6 # using <<-
7 -> x3 # using ->
8 ->> x4 # using ->>
x5 = 9 # using =
cat(x1, x2, x3, x4, x5)
5 6 7 8 9
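At the top level the five operators behave alike, so the difference between <- and <<- only shows up inside a function. The sketch below illustrates this; the function names bump_local and bump_global are hypothetical and introduced only for this example.

```r
counter <- 0

# <- inside a function creates a local variable; the outer one is untouched.
bump_local <- function() {
  counter <- counter + 1
  counter
}

# <<- assigns to the existing variable in the enclosing environment.
bump_global <- function() {
  counter <<- counter + 1
  counter
}

local_result <- bump_local()
after_local <- counter    # still 0: only the local copy changed
global_result <- bump_global()
after_global <- counter   # now 1: the outer variable was modified

cat(local_result, after_local, global_result, after_global, "\n")
```

This is one reason <- is usually preferred: it never modifies variables outside the current scope.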
There are certain rules when it comes to naming a variable in R.
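According to the R documentation, a syntactically valid name consists of letters, digits, periods and underscores, starts with a letter or with a period that is not followed by a digit, and is not a reserved word such as if. One way to explore these rules is sketched below: base R's make.names returns its input unchanged only when the input is already a valid name. The helper is_valid_name is a hypothetical name introduced for illustration.

```r
# Hypothetical helper: TRUE when the candidate is already a valid R name.
is_valid_name <- function(candidate) {
  make.names(candidate) == candidate
}

candidates <- c("x1", ".x", "my_var", "2x", "my var", "if", ".2way")
valid <- is_valid_name(candidates)

# Show each candidate next to the result of the check.
print(data.frame(candidates, valid))
```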
Each variable has some characteristics associated with it, such as its name, value, data type, and memory location. We have discussed the name and value above. Now let’s talk about the data type. For a variable, its data type indicates the kind of data that the variable stores. A variable can store a number (as shown in the code above) or it can hold some text. These different types of data have specific names in programming languages. For example, when we say x = 5, R implicitly assigns this variable the numeric data type. To check the data type of a variable, the class function can be used. Note that the typeof, mode, and storage.mode functions also return the data type of a variable and are helpful when working with advanced data types such as matrices and arrays.
x = 5
y = "hello"
class(x)
[1] "numeric"
class(y)
[1] "character"
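As a rough illustration of how class, typeof, mode, and storage.mode can differ, the sketch below collects all four for a few objects. The helper describe and the variables n and m are hypothetical names used only here.

```r
# Hypothetical helper: gather the four type descriptions of an object.
describe <- function(obj) {
  c(class = class(obj)[1],
    typeof = typeof(obj),
    mode = mode(obj),
    storage.mode = storage.mode(obj))
}

x <- 5                       # a plain number is stored as a double
n <- 5L                      # the L suffix creates an integer
m <- matrix(1:6, nrow = 2)   # an integer matrix

print(describe(x))
print(describe(n))
print(describe(m))
```

For the matrix, class reports the structure while typeof reports how the individual values are stored.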
Quiz
What would be the output of the following code?
x = "y"
print(class(x))
x = "y"
print(class(x)) # character
Symbol | Meaning | Example |
---|---|---|
+ | Addition | 2 + 2 = 4 |
- | Subtraction | 5 - 2 = 3 |
* | Multiplication | 2 * 5 = 10 |
/ | Division | 8 / 2 = 4 |
%% | Modulus | 8 %% 4 = 0 |
%/% | Floor division | 5 %/% 2 = 2 |
** | Exponent | 2 ** 3 = 8 |
^ | Exponent | 2 ^ 3 = 8 |
print(8%%4) # Remainder after division
[1] 0
print(5%/%2) # Quotient after division
[1] 2
print(2**3) # Raise to power
[1] 8
print(2^3) # Raise to power
[1] 8
Symbol | Meaning | Example |
---|---|---|
> | Greater than | 3 > 2 is TRUE |
< | Less than | 3 < 2 is FALSE |
== | Equal to | 2 == 3 is FALSE |
!= | Not equal to | 2 != 3 is TRUE |
>= | Greater than or equal to | 4 >= 2 is TRUE |
<= | Less than or equal to | 5 <= 2 is FALSE |
Symbol | Meaning |
---|---|
& (AND) | TRUE if both operands are TRUE |
\| (OR) | TRUE if either of the operands is TRUE |
! (NOT) | TRUE if the operand is FALSE |
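A short sketch of how the relational and logical operators combine is given below. The variable names are arbitrary and chosen for illustration.

```r
a <- 3
b <- 2

is_greater <- a > b                 # TRUE
is_equal <- a == b                  # FALSE

both <- is_greater & is_equal       # AND: FALSE, since one operand is FALSE
either <- is_greater | is_equal     # OR: TRUE, since one operand is TRUE
negated <- !is_equal                # NOT: TRUE

cat(is_greater, is_equal, both, either, negated, "\n")
```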
R has some special operators that are uncommon in other programming languages, e.g. the pipe operator %>% (provided by the magrittr package and used to forward a value to the next function) and %in% (used to perform matching). These operators are used for specialized tasks. We’ll discuss their uses in the relevant context, once we have covered the basics.
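As a small preview, the sketch below uses %in%, which is available in base R without loading any package. The vectors basket and known_fruits are made-up example data.

```r
basket <- c("apple", "kiwi")
known_fruits <- c("apple", "banana", "cherry")

# For each element of basket, test whether it occurs in known_fruits.
found <- basket %in% known_fruits
print(found)
```

The result has one logical value per element on the left-hand side.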
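The operators :: and ::: are used to access objects in a package namespace; they take the package name on the left and the object name on the right. :: reaches exported objects, while ::: also reaches internal ones. Similarly, the $ and @ operators are used to access named components of some data types: $ for elements of lists and data frames, and @ for slots of S4 objects.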
base::print
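The sketch below shows :: and $ in use, assuming only packages that ship with R. The objects person and scores are made-up example data.

```r
# Call head from the utils package by its fully qualified name.
first_three <- utils::head(letters, 3)
print(first_three)

# Use $ to access a named element of a list.
person <- list(name = "Ada", age = 36)
print(person$name)

# Use $ to access a column of a data frame.
scores <- data.frame(id = 1:3, mark = c(70, 85, 90))
print(scores$mark)
```

Qualifying a function with :: is useful when two attached packages export functions with the same name.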
The readline function is used to get input from the user interactively. The prompt keyword argument can be used to display a message to the user.
name <- readline(prompt = "Enter your name: ")
print(name)
By default, input read from the console has class character. If you need numeric input from the user, you must explicitly coerce the input to the numeric type using as.numeric (see below for details).
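A minimal sketch of this conversion follows. Because readline needs an interactive session, the input is simulated here with a fixed string; age_text is a hypothetical stand-in for whatever the user would type.

```r
# Stand-in for: age_text <- readline(prompt = "Enter your age: ")
age_text <- "42"
print(class(age_text))      # the raw input is text

# Coerce the text to a number before doing arithmetic with it.
age <- as.numeric(age_text)
next_year <- age + 1
cat("Next year you will be", next_year, "\n")
```

Without the call to as.numeric, the addition would fail because age_text is not a number.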
is.<data type>
To check whether an object is of a particular data type, we can use the is.<type> function, which returns a Boolean value. For example, to check whether an object is numeric, use is.numeric; similarly, to check whether an object is of class character, use is.character.
x <- 4
y <- "4"
is.numeric(x)
[1] TRUE
is.numeric(y)
[1] FALSE
is.logical(y)
[1] FALSE
as.<data type>
There are occasions when we need to change the data type of an object to another data type. This can be achieved using the as.<type> function. The conversion of data types in R is called coercion. For instance, to convert a numeric value to character, the as.character function is used. Note that not every data type can be converted to every other data type. There are certain rules that govern the process of coercion.
x <- 4
cat(x, class(x), "\n")
4 numeric
y <- as.character(x)
cat(y, class(y))
4 character
The code below does not produce a number: a string containing letters cannot be coerced into a number, so as.numeric returns NA and issues a warning (NAs introduced by coercion) rather than throwing an error.
x <- "hello"
as.numeric(x)
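In current versions of R, as.numeric applied to a non-numeric string yields NA together with a warning. One possible way to detect such failed conversions is sketched below; the vector inputs is made-up example data.

```r
inputs <- c("12", "3.5", "hello")

# Suppress the coercion warning and look for NA results instead.
converted <- suppressWarnings(as.numeric(inputs))
failed <- is.na(converted)

print(converted)
print(inputs[failed])   # the entries that could not be converted
```

Note that this check cannot tell a failed conversion apart from an input that was already NA.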
Similarly, a vector can be coerced into a data frame using as.data.frame.
a <- c(1:5)
b <- as.data.frame(a)
print(a)
[1] 1 2 3 4 5
print(b)
a
1 1
2 2
3 3
4 4
5 5
The use of as.<type> to convert a data type is called explicit coercion. Certain functions in R perform coercion implicitly when required. We’ll look at such functions in the subsequent chapters.
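As a brief preview of implicit coercion, the sketch below shows two common cases: combining values of different types with c, and using logical values in arithmetic. The variable names are arbitrary.

```r
# c() converts all elements to the most general type present (here: character).
mixed <- c(1, "a", TRUE)
print(mixed)
print(class(mixed))

# Combining an integer with a double yields doubles.
combined <- c(1L, 2.5)
print(typeof(combined))

# In arithmetic, TRUE is treated as 1 and FALSE as 0.
total <- TRUE + TRUE + FALSE
print(total)
```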
Quiz
What will be the output of the following code?
y <- as.character(4)
print(is.logical(is.numeric(y)))
y <- as.character(4)
print(is.logical(is.numeric(y)))
# TRUE