Introduction to R

Step into the R world

Published on

September 13, 2025

1 Preface

1.1 What is R


1.2 What Is RStudio

  • RStudio is an integrated development environment (IDE) for R and Python.
  • It includes a console, syntax-highlighting editor that supports direct code execution, and tools for plotting, history, debugging, and workspace management.
  • RStudio is available in open source and commercial editions and runs on the desktop (Windows, Mac, and Linux).
  • RStudio website, https://posit.co/products/open-source/rstudio/

1.3 Install R and RStudio

1.4 Usage of R and RStudio

  • Edit multi-line code in the Source pane.
  • Run the code in the Source pane line by line, from top to bottom.
  • Check variables in the Environment pane.
  • Check printed output in the Console pane.

1.5 How to Learn R

2 Value and Type

2.1 Double (real, or decimal)

x <- 3
is.integer(x)
## [1] FALSE
x <- 3.1415926
x
## [1] 3.141593

2.2 Integer

x <- 3L
is.integer(x)
## [1] TRUE
x
## [1] 3
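
The difference between doubles and integers can be checked with typeof(). The sketch below also shows why doubles should not be compared with == after arithmetic; the variable names are illustrative only.

```r
# typeof() reports how a value is stored
type_plain  <- typeof(3)        # "double": a bare number is a double
type_suffix <- typeof(3L)       # "integer": the L suffix requests an integer
type_div    <- typeof(3L / 2L)  # "double": division always returns a double

# Doubles are binary approximations, so exact comparison can surprise
sum_01_02   <- 0.1 + 0.2
exact_equal <- sum_01_02 == 0.3                   # FALSE
near_equal  <- isTRUE(all.equal(sum_01_02, 0.3))  # TRUE, compares with a tolerance
```

all.equal() is usually the safer way to compare computed doubles.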

2.3 Character

(x <- "Educational psychology")
## [1] "Educational psychology"

2.4 Factor

# c() combines values into a vector
x <- c("grade1", "grade2", "grade3")
x
## [1] "grade1" "grade2" "grade3"
x <- factor(x)
x
## [1] grade1 grade2 grade3
## Levels: grade1 grade2 grade3
levels(x)
## [1] "grade1" "grade2" "grade3"
labels(x)
## [1] "1" "2" "3"
as.numeric(x)
## [1] 1 2 3
x <- ordered(x)
x
## [1] grade1 grade2 grade3
## Levels: grade1 < grade2 < grade3
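
By default the levels are sorted alphabetically. When the intended order differs, it can be stated explicitly through the levels argument. A minimal sketch with made-up data:

```r
# The data arrive in arbitrary order
grades <- c("grade3", "grade1", "grade2", "grade1")

# State the level order explicitly and mark the factor as ordered
grades_ord <- factor(grades,
                     levels = c("grade1", "grade2", "grade3"),
                     ordered = TRUE)

grade_codes <- as.integer(grades_ord)   # 3 1 2 1: position of each value in levels
below_g3    <- grades_ord < "grade3"    # FALSE TRUE TRUE TRUE: order-aware comparison
grade_count <- table(grades_ord)        # counts per level: 2, 1, 1
```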

2.5 Date

x <- strptime("2024-11-14", format = "%Y-%m-%d")
x
## [1] "2024-11-14 CST"
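
strptime() returns a date-time object (class POSIXlt), which is why a time zone is printed. When only the calendar date matters, as.Date() may be simpler. A short sketch:

```r
d <- as.Date("2024-11-14")        # ISO format (YYYY-MM-DD) is parsed by default
d_class <- class(d)               # "Date"
d_year  <- format(d, "%Y")        # "2024"

# Dates support arithmetic in units of days
next_week    <- d + 7                                  # 2024-11-21
days_elapsed <- as.numeric(d - as.Date("2024-01-01"))  # 318

# For comparison: the class returned by strptime()
lt_class <- class(strptime("2024-11-14", format = "%Y-%m-%d"))  # "POSIXlt" "POSIXt"
```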

2.6 Logical

x <- (1 == 2); y <- (1 < 2)
x; y
## [1] FALSE
## [1] TRUE

2.7 Complex

x <- 1 + 2i
x
## [1] 1+2i

2.8 Special values

# empty value
NULL
# missing value
NA
# not a number (an undefined numeric result, e.g., 0/0)
NaN
# infinite
Inf
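
Each special value has a matching test function. The sketch below shows how they behave; the scores vector is made up for illustration.

```r
null_len   <- length(NULL)     # 0: NULL contains nothing
null_check <- is.null(NULL)    # TRUE
na_check   <- is.na(NA)        # TRUE
nan_check  <- is.nan(0 / 0)    # TRUE: 0/0 is undefined
nan_is_na  <- is.na(NaN)       # TRUE: NaN also counts as missing
na_is_nan  <- is.nan(NA)       # FALSE: NA is not NaN
pos_inf    <- 1 / 0            # Inf
inf_finite <- is.finite(-Inf)  # FALSE

# A missing value propagates through most computations unless removed
scores     <- c(4, NA, 2)
mean_raw   <- mean(scores)                # NA
mean_clean <- mean(scores, na.rm = TRUE)  # 3
```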

3 Symbols, signs, and operators

3.1 # Comment

# addition
# 1 + 1
1 + 1
## [1] 2

3.2 : Generate regular sequences (also denotes interactions in model formulas)

-5:5
##  [1] -5 -4 -3 -2 -1  0  1  2  3  4  5

3.3 ; Separate multiple commands on one line

1 + 1 ; 2 * 3
## [1] 2
## [1] 6

3.4 <- Assign

x <- 2 * 3
x
## [1] 6

3.5 $, [], [[]] Subset, extract

See below.

3.6 - Negative numbers

x <- 2 * (-3)
x
## [1] -6

3.7 Logical and comparison operators

TRUE & FALSE
## [1] FALSE
TRUE | FALSE
## [1] TRUE
!FALSE
## [1] TRUE

1 == 2
## [1] FALSE
1 != 2
## [1] TRUE
1 > 2
## [1] FALSE
1 < 2
## [1] TRUE
1 >= 2
## [1] FALSE
1 <= 2
## [1] TRUE

De Morgan’s laws for logical operations:

!(x & y) == (!x | !y)
!(x | y) == (!x & !y)
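
Both laws can be checked over every combination of TRUE and FALSE. In R, ! has lower precedence than ==, so the sketch below wraps each negated side in parentheses to make the intended comparison explicit.

```r
# All four combinations of two logical values
x <- c(TRUE, TRUE, FALSE, FALSE)
y <- c(TRUE, FALSE, TRUE, FALSE)

# Compare both sides of each law element by element
law_and <- (!(x & y)) == (!x | !y)
law_or  <- (!(x | y)) == (!x & !y)

all(law_and)  # TRUE
all(law_or)   # TRUE
```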

3.8 Arithmetic Operators

1 + 2; 1 - 2; 2 * 3; 2/3
## [1] 3
## [1] -1
## [1] 6
## [1] 0.6666667
(2024 - 1949)/30
## [1] 2.5
2^3
## [1] 8
5 %% 2
## [1] 1
5 %/% 2
## [1] 2
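
Integer division (%/%) and the remainder (%%) fit together: the quotient times the divisor plus the remainder gives back the original number. A small sketch with arbitrary numbers:

```r
x <- 17
y <- 5
quotient  <- x %/% y                        # 3
remainder <- x %% y                         # 2
rebuilt   <- quotient * y + remainder       # 17

# %/% rounds down (toward minus infinity), so negative numbers behave like this
neg_quotient  <- (-5) %/% 2                 # -3
neg_remainder <- (-5) %% 2                  # 1

# A common use of %%: testing for even numbers
is_even <- (1:6) %% 2 == 0                  # FALSE TRUE FALSE TRUE FALSE TRUE
```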

4 Variable and Vector

  • In R, even a single value stored in a variable is a vector of length 1.
  • c() combines its arguments to form a vector.
x <- 2
is.vector(x)
## [1] TRUE
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
x
##  [1]  1  2  3  4  5  6  7  8  9 10
x <- c("莫", "听", "穿", "林", "打", "叶", "声")
x
## [1] "莫" "听" "穿" "林" "打" "叶" "声"
x <- paste0(x, collapse = "")
x
## [1] "莫听穿林打叶声"

4.1 Subset a vector

x <- c("何", "妨", "吟", "啸", "且", "徐", "行")
x[3]
## [1] "吟"
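
Square brackets accept more than a single position. The sketch below uses made-up vectors to show several positions, negative positions, and logical conditions.

```r
x <- c("a", "b", "c", "d", "e")

first_two <- x[c(1, 2)]      # "a" "b": several positions at once
middle    <- x[2:4]          # "b" "c" "d": a range of positions
no_first  <- x[-1]           # "b" "c" "d" "e": a negative index drops elements
last_one  <- x[length(x)]    # "e": the last element

# A logical condition keeps the elements for which it is TRUE
scores      <- c(3, 1, 4, 2)
high_scores <- scores[scores > 2]   # 3 4
```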

5 List

id <- 1:10 + 100; gender <- rep(0:1, 5); depression <- sample.int(4, 10, TRUE)
(dat_list <- list(id = id, gender, depression))
## $id
##  [1] 101 102 103 104 105 106 107 108 109 110
## 
## [[2]]
##  [1] 0 1 0 1 0 1 0 1 0 1
## 
## [[3]]
##  [1] 1 4 3 2 4 3 1 3 4 2

5.1 Subset the list

# return a vector
dat_list$id
##  [1] 101 102 103 104 105 106 107 108 109 110
# return a list
dat_list[2]
## [[1]]
##  [1] 0 1 0 1 0 1 0 1 0 1
# return a vector
dat_list[[2]]
##  [1] 0 1 0 1 0 1 0 1 0 1
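
The difference between [ ] and [[ ]] shows up in the class of the result. A minimal sketch with a small list built only for this illustration:

```r
small_list <- list(id = c(101, 102, 103), c(0, 1, 0))

class_single <- class(small_list["id"])     # "list": [ ] keeps the list wrapper
class_double <- class(small_list[["id"]])   # "numeric": [[ ]] extracts the element
class_dollar <- class(small_list$id)        # "numeric": $ works like [[ ]] with a name

# Unnamed elements have an empty name and can be reached by position only
element_names <- names(small_list)          # "id" ""

# Naming the second element makes $ available for it as well
names(small_list)[2] <- "gender"
gender_values <- small_list$gender          # 0 1 0
```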

6 Data frame

dat_data.frame <- data.frame(id, gender, depression)
head(dat_data.frame)
##    id gender depression
## 1 101      0          1
## 2 102      1          4
## 3 103      0          3
## 4 104      1          2
## 5 105      0          4
## 6 106      1          3
str(dat_data.frame)
## 'data.frame':    10 obs. of  3 variables:
##  $ id        : num  101 102 103 104 105 106 107 108 109 110
##  $ gender    : int  0 1 0 1 0 1 0 1 0 1
##  $ depression: int  1 4 3 2 4 3 1 3 4 2

View the whole data set:

View(dat_data.frame)

6.1 rownames and colnames

rownames(dat_data.frame)
##  [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10"
colnames(dat_data.frame)
## [1] "id"         "gender"     "depression"
(colnames(dat_data.frame) <- c("id", "x", "y"))
## [1] "id" "x"  "y"
colnames(dat_data.frame)[2] <- "gender"
colnames(dat_data.frame)[3] <- "depression"
colnames(dat_data.frame)
## [1] "id"         "gender"     "depression"

6.2 Subset the data.frame

# return a vector
dat_data.frame$gender
##  [1] 0 1 0 1 0 1 0 1 0 1
# return a data.frame
dat_data.frame[2]
##    gender
## 1       0
## 2       1
## 3       0
## 4       1
## 5       0
## 6       1
## 7       0
## 8       1
## 9       0
## 10      1
# return a vector
dat_data.frame[,2]
##  [1] 0 1 0 1 0 1 0 1 0 1
# return a data.frame
dat_data.frame[2,]
##    id gender depression
## 2 102      1          4
dat_data.frame[2,2]
## [1] 1
dat_data.frame[1:3, c(1,3)]
##    id depression
## 1 101          1
## 2 102          4
## 3 103          3
dat_data.frame[1, "depression"]
## [1] 1
dat_data.frame["1", "depression"]
## [1] 1
dat_data.frame[c(2,4), "depression"]
## [1] 4 2
dat_data.frame[c("2","4"), "depression"]
## [1] 4 2
dat_data.frame[dat_data.frame$gender == 0,]
##    id gender depression
## 1 101      0          1
## 3 103      0          3
## 5 105      0          4
## 7 107      0          1
## 9 109      0          4
dat_data.frame[dat_data.frame$depression > 2,]
##    id gender depression
## 2 102      1          4
## 3 103      0          3
## 5 105      0          4
## 6 106      1          3
## 8 108      1          3
## 9 109      0          4
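
Conditions can be combined with & and |, and drop = FALSE keeps a single column as a data frame. The sketch below uses a small data frame with made-up values; subset() from base R is shown as an alternative.

```r
dat <- data.frame(id         = 101:106,
                  gender     = c(0, 1, 0, 1, 0, 1),
                  depression = c(1, 4, 3, 2, 4, 3))

# Rows where both conditions hold
g0_high <- dat[dat$gender == 0 & dat$depression > 2, ]
g0_high$id                      # 103 105

# subset() evaluates the condition inside the data frame
either <- subset(dat, gender == 1 | depression == 1,
                 select = c(id, depression))
either$id                       # 101 102 104 106

# drop = FALSE prevents a single column from collapsing to a vector
one_col <- dat[, 2, drop = FALSE]
is.data.frame(one_col)          # TRUE
```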

7 Matrix

dat_byrow <- c(101, 1, 2,  102, 0, 5,  103, 1, 4,  104, 1, 1,  105, 0, 3,
               106, 0, 5,  107, 0, 2,  108, 1, 4,  109, 0, 2,  110, 1, 2)
mat <- matrix(dat_byrow, byrow = TRUE, nrow = 10,
              dimnames = list(0:9, c("id", "gender", "depression")))
mat
##    id gender depression
## 0 101      1          2
## 1 102      0          5
## 2 103      1          4
## 3 104      1          1
## 4 105      0          3
## 5 106      0          5
## 6 107      0          2
## 7 108      1          4
## 8 109      0          2
## 9 110      1          2
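
A matrix is subset with [row, column], by position or by name. Because the row names above start at "0", the row named "1" is the second row. A minimal sketch with a three-row matrix built for this illustration:

```r
small_mat <- matrix(c(101, 1, 2,
                      102, 0, 5,
                      103, 1, 4),
                    byrow = TRUE, nrow = 3,
                    dimnames = list(as.character(0:2),
                                    c("id", "gender", "depression")))

mat_dim    <- dim(small_mat)                 # 3 3: rows, columns
by_pos     <- small_mat[2, "depression"]     # 5: second row by position
by_name    <- small_mat["1", "depression"]   # 5: row named "1" is the second row
id_column  <- small_mat[, "id"]              # 101 102 103, named "0" "1" "2"
col_means  <- colMeans(small_mat)            # mean of every column
store_type <- typeof(small_mat)              # "double": all cells share one type
```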

8 Function

8.1 What is a function

  • A function is a reusable piece of code designed to perform a specific task.
  • A function is like a cooker: you put ingredients into it, it processes them, and it finally outputs delicious food.

8.2 Use a function

Read the function's documentation to learn how to use it, e.g., run ?seq or help("seq").

8.3 Use a function: An example

seq(from = 1, 
    to = 1, 
    by = ((to - from)/(length.out - 1)),
    length.out = NULL, 
    along.with = NULL, 
    ...)
  • function name: seq.
  • arguments: from, to, by, length.out, along.with, ....
  • default values: from = 1, to = 1, length.out = NULL, along.with = NULL.

8.4 Use a function: An example

  1. Input: Input your data (numbers, text, vectors, data frames, etc.) through arguments (parameters).
  2. Processing: It performs a series of operations using the input arguments.
  3. Output: It returns a result (output). This could be a single value, a vector, a list, a plot, or even nothing (though functions usually do something useful!).
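
The three steps can be seen in a small user-defined function. The sketch below is an illustration only; rescale01 is a hypothetical helper, not a function from base R or from any package mentioned here.

```r
# Hypothetical helper: rescale a numeric vector to the range [0, 1]
rescale01 <- function(x, na.rm = TRUE) {   # 1. Input: x and an optional na.rm
  rng <- range(x, na.rm = na.rm)           # 2. Processing: find min and max
  (x - rng[1]) / (rng[2] - rng[1])         # 3. Output: the last value is returned
}

rescaled <- rescale01(c(1, 4, 3, 2))
rescaled   # 0, 1, 2/3, 1/3
```

The argument na.rm has a default value, so it can be omitted when calling the function.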

8.5 Two ways of passing arguments

  1. Through argument names
seq(from = 1, to = 7, by = 0.1)
##  [1] 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8
## [20] 2.9 3.0 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4.0 4.1 4.2 4.3 4.4 4.5 4.6 4.7
## [39] 4.8 4.9 5.0 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9 6.0 6.1 6.2 6.3 6.4 6.5 6.6
## [58] 6.7 6.8 6.9 7.0
  2. Through argument position
seq(1, 7, 0.1)
##  [1] 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8
## [20] 2.9 3.0 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4.0 4.1 4.2 4.3 4.4 4.5 4.6 4.7
## [39] 4.8 4.9 5.0 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9 6.0 6.1 6.2 6.3 6.4 6.5 6.6
## [58] 6.7 6.8 6.9 7.0

8.6 Use default arguments

  1. You can omit the default arguments.
seq()
## [1] 1
  2. Pass other arguments through argument names.
seq(to = 7)
## [1] 1 2 3 4 5 6 7
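
Positional and named arguments can also be mixed, and named arguments may appear in any order. A brief sketch using seq():

```r
# from and to by position, length.out by name: four evenly spaced values
four_values <- seq(1, 7, length.out = 4)   # 1 3 5 7

# The same sequence requested through the step size
by_two <- seq(1, 7, by = 2)                # 1 3 5 7

# Named arguments can be given in any order
countdown <- seq(to = 1, from = 7)         # 7 6 5 4 3 2 1
```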

8.7 Common functions

8.7.1 Type Checking Functions

is.numeric()
is.integer()
is.double()
is.character()
is.logical()
is.factor()
is.ordered()
is.matrix()
is.data.frame()
is.list()
is.vector()
is.na()
is.null()

8.7.2 Type Conversion Functions

as.numeric()
as.integer()
as.double()
as.character()
as.logical()
as.factor()
as.ordered()
as.matrix()
as.data.frame()
as.list()
as.vector()
as.null()
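
A few conversions behave in ways worth knowing. The sketch below uses made-up values; the factor case is a common pitfall, because as.numeric() returns the internal level codes rather than the labels.

```r
num_from_chr <- as.numeric("3.14")      # 3.14
int_from_dbl <- as.integer(3.9)         # 3: the fraction is dropped, not rounded
chr_from_int <- as.character(1:3)       # "1" "2" "3"
lgl_from_num <- as.logical(c(0, 1, 2))  # FALSE TRUE TRUE: zero is FALSE, others TRUE

# Pitfall: converting a factor of number-like labels
f <- factor(c("10", "20", "10"))
codes  <- as.numeric(f)                 # 1 2 1: level codes, not the labels
values <- as.numeric(as.character(f))   # 10 20 10: convert via character first
```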

9 Package

9.1 What Are R Packages

Basically, an R package is a collection of functions.

9.2 Install Packages

  • Released version (from CRAN)

    install.packages("devtools")
  • Development version (from GitHub)

    • The 1st way of calling a function from a package: package::function()
    devtools::install_github(repo = "qyaozh/Keng",
                             dependencies = TRUE,
                             build_vignettes = TRUE)
    • The 2nd way of calling a function from a package: attach the package with library() first
    library(devtools)
    install_github(repo = "qyaozh/Keng",
                   dependencies = TRUE,
                   build_vignettes = TRUE)
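
Installing a package again on every run is unnecessary. One possible pattern is to install only when the package is missing. In the sketch below, install_if_missing is a hypothetical helper written for this illustration; requireNamespace() and install.packages() are base R functions.

```r
# Hypothetical helper: install a CRAN package only if it is not yet available
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report (invisibly) whether the package can now be loaded
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so nothing is downloaded here
stats_ready <- install_if_missing("stats")

# Call a function with the package::function() form, without library()
sd_value <- stats::sd(c(1, 2, 3))   # 1
```

The same call with "devtools" would install it from CRAN on first use only.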

10 Working directory

  • The working directory is the folder where R reads and writes files by default.
    • getwd() gets the current working directory, setwd() sets the working directory.
  • R and Windows write file paths differently: R uses “/” (or the escaped “\\”), whereas Windows uses “\”.
  • You can use RStudio’s Files pane to:
    • Go To Working Directory
    • Set As Working Directory

10.1 Advice on the working directory

  • Use getwd() or RStudio to get and go to R’s working directory.
  • Create a new folder (directory) named with your name and school ID under this directory (e.g., qingyaozhang2024321).
  • Go into the qingyaozhang2024321 folder.
  • Use setwd() or RStudio to set the new folder qingyaozhang2024321 as your working directory.
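
The same steps can be done in code. The sketch below is an illustration under one assumption: it creates the folder under tempdir() rather than under the current working directory, so that running it leaves no clutter behind.

```r
# Remember the current working directory
old_wd <- getwd()

# Build the path with file.path(), which inserts "/" between the parts
# (tempdir() is used here only to keep the example harmless)
new_dir <- file.path(tempdir(), "qingyaozhang2024321")

# Create the folder and make it the working directory
dir.create(new_dir, showWarnings = FALSE)
setwd(new_dir)
current_folder <- basename(getwd())   # "qingyaozhang2024321"

# Go back to where we started
setwd(old_wd)
```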

11 Import and Export Data

11.1 Use base R

  • Export the data using write.csv().
library(Keng)
data("depress")
write.csv(
  depress,
  # the .csv file extension is required
  file = "dat.csv", 
  row.names = FALSE
)
  • Import the .csv data using read.csv()
    • Step 1, Enter the data into Excel or SPSS.
    • Step 2, Save the data as the .csv file.
    • Step 3, Import the data into R using read.csv().

Read a .csv file from R’s working directory:

dat <- read.csv("dat.csv")
head(dat)

Read a .csv file from a directory other than R’s working directory, e.g.,

dat <- read.csv("C:/Users/Yao/Documents/qingyaozhang2024321/dat.csv")
head(dat)
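
Export and import can be checked together with a round trip. The sketch below does not need the Keng package: it uses a small made-up data frame and writes to tempdir(), both of which are assumptions made for this illustration.

```r
# A small data frame standing in for real data
dat_out <- data.frame(id         = 101:103,
                      gender     = c(0L, 1L, 0L),
                      depression = c(1L, 4L, 3L))

# Write to a temporary folder; row.names = FALSE avoids an extra index column
csv_path <- file.path(tempdir(), "dat.csv")
write.csv(dat_out, file = csv_path, row.names = FALSE)

# Read the file back and compare with the original
dat_in <- read.csv(csv_path)
round_trip_ok <- isTRUE(all.equal(dat_out, dat_in))   # TRUE
```

Without row.names = FALSE, the row names would be written as a first column and read back as an extra variable.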

11.2 Other file formats and packages

  • .xlsx file, readxl package, read_excel(), read_xls(), read_xlsx()
  • .sav file, haven package, read_spss(), read_sav(), write_sav()
  • .dta file, haven package, read_stata(), read_dta(), write_dta()