Data Analytics (ECMP 5005B)
Esam Mahdi
School of Mathematics and Statistics
Master of Engineering - Engineering Practice
Carleton University
Wednesday, September 6, 2023
By the end of this chapter, you should be able to do the following:
Source: Robert I. Kabacoff. R in Action: Data analysis and graphics with R and Tidyverse. 2nd ed., Manning, 2022.
Do not trust all of these packages!
One problem we often face when we load and attach several packages in R is that different packages may export functions with the same name, so that one masks the other. For example, the function lag() exists in both the stats and dplyr packages, and it performs a different task in each. Thus, you need to be careful when calling lag() while dplyr is attached; writing stats::lag() or dplyr::lag() makes the choice explicit.
set.seed(1) #set reproducible results
x <- rnorm(5) #generate 5 observations from the standard normal distribution N(0,1)
stats::lag(x, 2) #shift the time base back by 2; the values themselves are unchanged
[1] -0.6264538 0.1836433 -0.8356286 1.5952808 0.3295078
attr(,"tsp")
[1] -1 3 1
dplyr::lag(x, 2) #shift the values forward by 2 positions and pad with NA
[1] NA NA -0.6264538 0.1836433 -0.8356286
Let’s start with the following R code
# create two numeric vectors, each with 8 observations
wt <- c(60,70,63,55,48,49,58,58)
age = c(20,17,23,24,19,19,16,26) #note "<-" symbol can be replaced by "="
# get a random sample (without replacement) of 8 observations
set.seed(5) #set seed to reproduce the same random sample
z=sample(150:190, size = 8)
# replicate the string "Male" 3 times to get a character vector
Male=rep("Male", times = 3)
# replicate the string "Female" 5 times to get a character vector
Female=rep("Female", times = 5)
# combine the two categorical variables into one nominal variable
s = c(Male, Female)
# create an ordinal categorical variable
income=c("Low","High","Low","Low","Middle","Middle","Middle","High")
# store categorical values as factors (integer codes with labels)
sex=factor(s)
income=factor(income,ordered=TRUE,levels=c("Low","Middle","High"))
# create a data frame and name the variables
mydata=data.frame(id=1:8,weight=wt,age=age,z=z,sex=sex,Sex=s,income)
'data.frame': 8 obs. of 7 variables:
$ id : int 1 2 3 4 5 6 7 8
$ weight: num 60 70 63 55 48 49 58 58
$ age : num 20 17 23 24 19 19 16 26
$ z : int 151 164 160 170 179 156 168 152
$ sex : Factor w/ 2 levels "Female","Male": 2 2 2 1 1 1 1 1
$ Sex : chr "Male" "Male" "Male" "Female" ...
$ income: Ord.factor w/ 3 levels "Low"<"Middle"<..: 1 3 1 1 2 2 2 3
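mat1 <- matrix(wt, nrow = 5, ncol = 2) #create a 5x2 matrix; wt has only 8 values, so R recycles them to fill 10 cells and issues a warning
mat2 <- matrix(age, nrow = 2, ncol = 5) #create a 2x5 matrix; again the 8 values are recycled with a warning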
mat3 = cbind(wt,age,sex,income) #combine vectors by columns. Exercise: Type the code: rbind(wt,age,sex,income) and explain the outcome!
mylist <- list(wt,age,sex,income) #create a list of 4 vectors
id weight age z sex
Min. :1.00 Min. :48.00 Min. :16.00 Min. :151.0 Female:5
1st Qu.:2.75 1st Qu.:53.50 1st Qu.:18.50 1st Qu.:155.0 Male :3
Median :4.50 Median :58.00 Median :19.50 Median :162.0
Mean :4.50 Mean :57.62 Mean :20.50 Mean :162.5
3rd Qu.:6.25 3rd Qu.:60.75 3rd Qu.:23.25 3rd Qu.:168.5
Max. :8.00 Max. :70.00 Max. :26.00 Max. :179.0
Sex income
Length:8 Low :3
Class :character Middle:3
Mode :character High :2
id weight age z sex Sex income status0
1 1 60 20 151 Male Male Low grade1
2 2 70 17 164 Male Male High grade2
3 3 63 23 160 Male Male Low grade3
4 4 55 24 170 Female Female Low grade4
5 5 48 19 179 Female Female Middle grade5
6 6 49 19 156 Female Female Middle grade6
7 7 58 16 168 Female Female Middle grade7
8 8 58 26 152 Female Female High grade8
id weight age z sex Sex income status0 status1
1 1 60 20 151 Male Male Low grade1 grade 1
2 2 70 17 164 Male Male High grade2 grade 2
3 3 63 23 160 Male Male Low grade3 grade 3
4 4 55 24 170 Female Female Low grade4 grade 4
5 5 48 19 179 Female Female Middle grade5 grade 5
6 6 49 19 156 Female Female Middle grade6 grade 6
7 7 58 16 168 Female Female Middle grade7 grade 7
8 8 58 26 152 Female Female High grade8 grade 8
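The code that created the status0 and status1 columns is not shown above. One plausible reconstruction (an assumption, not necessarily the code used for these slides) relies on paste0() and paste(), which differ only in the separator they insert:

```r
# Hypothetical reconstruction: the original code for these columns is not shown.
id <- 1:8
status0 <- paste0("grade", id)  # no separator: "grade1", "grade2", ...
status1 <- paste("grade", id)   # default separator is a space: "grade 1", ...
head(data.frame(id, status0, status1), 3)
```

In a data frame, such vectors could then be attached with mydata$status0 <- status0.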
# Recoding variables: recode age 20 as a missing value
mydata$age[mydata$age == 20] <- NA
mydata[1:4,] #display the first 4 rows
id weight age z sex Sex income status0 status1
1 1 60 NA 151 Male Male Low grade1 grade 1
2 2 70 17 164 Male Male High grade2 grade 2
3 3 63 23 160 Male Male Low grade3 grade 3
4 4 55 24 170 Female Female Low grade4 grade 4
id weight age sex Sex income status0
1 1 60 NA Male Male Low grade1
2 2 70 17 Male Male High grade2
3 3 63 23 Male Male Low grade3
4 4 55 24 Female Female Low grade4
5 5 48 19 Female Female Middle grade5
6 6 49 19 Female Female Middle grade6
7 7 58 16 Female Female Middle grade7
8 8 58 26 Female Female High grade8
par(mfrow = c(2, 2)) #create a 2 x 2 plotting matrix
plot(wt,age); plot(mydata$weight, mydata$age) #type ?plot to get help about the function plot()
plot(wt,age, xlab = "Weight", ylab = "Age", col = "red")
plot(density(rnorm(500)),col="blue") #plot the estimated density of 500 random draws from a Gaussian distribution
After setting up the R environment with RStudio, you can import data stored in different formats.
> read.table
function (file, header = FALSE, sep = "", quote = "\"'", dec = ".", numerals = c("allow.loss", "warn.loss", "no.loss"), row.names, col.names, as.is = !stringsAsFactors, tryLogical = TRUE, na.strings = "NA", colClasses = NA, nrows = -1, skip = 0, check.names = TRUE, fill = !blank.lines.skip, strip.white = FALSE, blank.lines.skip = TRUE, comment.char = "#", allowEscapes = FALSE, flush = FALSE, stringsAsFactors = FALSE, fileEncoding = "", encoding = "unknown", text, skipNul = FALSE)
> read.csv
function (file, header = TRUE, sep = ",", quote = "\"", dec = ".", fill = TRUE, comment.char = "", ...)
> read.csv2
function (file, header = TRUE, sep = ";", quote = "\"", dec = ",", fill = TRUE, comment.char = "", ...)
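The defaults listed above differ mainly in header, sep, and dec. As a minimal sketch, the following writes the same toy table (illustrative values and temporary files, not the course data) in two conventions and reads it back with each function:

```r
# Write the same small table in two conventions to temporary files.
csv_file  <- tempfile(fileext = ".csv")
csv2_file <- tempfile(fileext = ".csv")
writeLines(c("Date,Price", "1/22/2020,8664.5", "1/23/2020,8404.1"), csv_file)
writeLines(c("Date;Price", "1/22/2020;8664,5", "1/23/2020;8404,1"), csv2_file)

a <- read.csv(csv_file)    # defaults: header = TRUE, sep = ",", dec = "."
b <- read.csv2(csv2_file)  # defaults: header = TRUE, sep = ";", dec = ","
d <- read.table(csv_file, header = TRUE, sep = ",")  # read.table needs both set explicitly

all.equal(a, b)  # the three imports hold the same data
all.equal(a, d)
```

read.csv2() is intended for files written in locales where the comma is the decimal mark and the semicolon separates fields.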
Bitcoin daily price (in US dollars) from January 22, 2020 to September 1, 2021 (during COVID-19).
# change the working directory to a different location on your computer
dat1 <- read.table("data/BTC.txt",header = TRUE, fill=TRUE)
dat2 <- read.csv("data/BTC.csv", header = TRUE)
str(dat1)
'data.frame': 589 obs. of 5 variables:
$ Date : chr "1/22/2020" "1/23/2020" "1/24/2020" "1/25/2020" ...
$ Price: num 8664 8404 8447 8354 8622 ...
$ Open : num 8734 8669 8404 8447 8351 ...
$ High : num 8800 8669 8522 8447 8622 ...
$ Low : num 8581 8297 8248 8280 8304 ...
Use the skim() function from the skimr package to get useful summary statistics.
Exercise:
The readr package provides functions to read rectangular data with extension .csv, .txt or .tsv.
library(readr)
# Read from a path that specifies the location of a data file on your computer
name_data <- read_csv("file_data - Sheet1.csv") # import data from a comma-delimited file
# Read from a remote path (e.g., the mtcars data set from the readr GitHub repository)
name_data <- read_csv("https://github.com/tidyverse/readr/raw/main/inst/extdata/mtcars.csv")
name_data <- read_tsv("file_data.txt") # import data from a tab-delimited text file
name_data <- read_tsv("file_data.tsv") # import data from a tab-delimited .tsv file (read_tsv() has no sheet argument)
The readxl package can import tabular data from Excel workbooks. Both xls and xlsx formats are supported.
The haven package can import data with .sav, .dta, and .sas7bdat extensions.
#first, make sure that you have the package "haven" installed on your computer;
#if it is not installed, you need to install it
install.packages("haven")
library(haven) # Load the package
name_data <- read_sav("file_data.sav") # import data from SPSS
name_data <- read_dta("file_data.dta") # import data from Stata
name_data <- read_sas("file_data.sas7bdat") # import data from SAS
Import the BTC data set using the functions read_tsv() and read_csv() from the package readr.
library(readr)
dat3 <- read_tsv("data/BTC.txt")
dat4 <- read_csv("data/BTC.csv")
head(dat3) #same results using head(dat4)
# A tibble: 6 × 5
Date Price Open High Low
<chr> <dbl> <dbl> <dbl> <dbl>
1 1/22/2020 8664. 8734. 8800. 8581.
2 1/23/2020 8404. 8669. 8669. 8297.
3 1/24/2020 8447. 8404. 8522. 8248.
4 1/25/2020 8354. 8447. 8447. 8280
5 1/26/2020 8622. 8351. 8622 8304.
6 1/27/2020 8912 8622 9002. 8585.
Note that head() prints differently from before because the result is a tibble. Tibbles are data frames, slightly tweaked to work better in the tidyverse, which we will discuss later!
The Excel workbook BTC2 has two sheets named BTC and BTC2. The BTC2 data are stored in the cell range G7:K38. The first cell, A1, provides a quick description of the data. The data have some missing values.
# A tibble: 6 × 5
Date Price Open High Low
<dttm> <dbl> <dbl> <dbl> <dbl>
1 2021-01-01 00:00:00 29346 28933 29498 28932
2 2021-01-02 00:00:00 32185 29346 33168 29192
3 2021-01-03 00:00:00 32971 32183 34253 32110
4 2021-01-04 00:00:00 NA NA NA NA
5 2021-01-05 00:00:00 33996 32020 33996 30979.
6 2021-01-06 00:00:00 36755 33986 36755 33901
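The import call that produced the output above is not shown. As a hedged sketch of how a named sheet and a cell range can be read with readxl, the following uses a workbook bundled with the package (a stand-in, not the course file):

```r
library(readxl)

# Stand-in workbook shipped with readxl; it is NOT the course file BTC2.
path <- readxl_example("datasets.xlsx")
excel_sheets(path)  # list the sheet names in the workbook

# Read one named sheet and only a rectangular cell range;
# the first row of the range is used as the header.
cars_part <- read_excel(path, sheet = "mtcars", range = "A1:D6")
cars_part
```

For the course workbook, the analogous call would presumably pass sheet = "BTC2" and range = "G7:K38"; the file path is not given in the slides, so it is left out here. Empty cells are imported as NA by default.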
Many popular packages, such as readr, tidyr, dplyr, and purrr, save data frames as tibbles. When you work with tibbles, be aware of the following properties:
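The list of properties is not reproduced here. As a small sketch, two well-known differences between tibbles and base data frames (chosen for illustration; they may not be the exact points listed on the original slide) are subsetting behaviour and partial name matching:

```r
library(tibble)

df  <- data.frame(x = 1:3, y = c("a", "b", "c"))
tbl <- as_tibble(df)

# Single-bracket column subsetting: a data frame drops to a vector,
# while a tibble stays a tibble.
class(df[, "x"])
class(tbl[, "x"])

# Partial matching with $: allowed for data frames, not for tibbles.
df2  <- data.frame(price = c(10, 20))
tbl2 <- as_tibble(df2)
df2$pri   # partial match returns the price column
tbl2$pri  # NULL, with a warning about the unknown column
```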
Function | Use | Syntax |
---|---|---|
mutate() | Transform or recode variables | dataframe <- mutate(dataframe, new_variable = expression) |
select() | Select variables/columns | dataframe <- select(dataframe, select_variables) |
filter() | Select observations/rows | dataframe <- filter(dataframe, expression) |
rename() | Rename variables/columns | dataframe <- rename(dataframe, new_variable_name = old_variable_name) |
recode() | Recode variable values | variable <- recode(variable, old_value = new_value) |
arrange() | Order rows by variable values | dataframe <- arrange(dataframe, sort_variables) |
group_by() | Group by one or more variables | dataframe <- group_by(dataframe, variables_to_group_by) |
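The verbs in the table can be chained with the pipe. The following is a minimal sketch on a toy data frame (illustrative values and column names, not the course data):

```r
library(dplyr)

# Toy data (illustrative values only).
people <- data.frame(
  id     = 1:6,
  weight = c(60, 70, 63, 55, 48, 49),
  sex    = c("M", "M", "M", "F", "F", "F")
)

result <- people %>%
  mutate(weight_lb = weight * 2.2046) %>%                  # add a derived column
  rename(weight_kg = weight) %>%                           # new_name = old_name
  mutate(sex = recode(sex, M = "Male", F = "Female")) %>%  # old_value = new_value
  filter(weight_kg > 48) %>%                               # keep rows meeting a condition
  select(id, sex, weight_kg, weight_lb) %>%                # keep and order columns
  arrange(desc(weight_kg))                                 # sort rows, largest first

result

# group_by() followed by summarise() gives one row per group.
result %>%
  group_by(sex) %>%
  summarise(mean_kg = mean(weight_kg), n = n())
```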
library(dplyr)
dat5 <- mutate(dat5,BeforeClose=dplyr::lag(Price), returns=log(Price)-log(BeforeClose)) #use the BTC2 dataset
dat5[1:2,]
# A tibble: 2 × 7
Date Price Open High Low BeforeClose returns
<dttm> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2021-01-01 00:00:00 29346 28933 29498 28932 NA NA
2 2021-01-02 00:00:00 32185 29346 33168 29192 29346 0.0923
dat5 <- dat5 %>%
mutate(Date = lubridate::as_date(Date), #Date is already a date-time (<dttm>), so convert it with as_date(); mdy() is for character dates such as "1/22/2020" and would return NA here
BeforeClose=dplyr::lag(Price),
returns=log(Price)-log(BeforeClose))
dat5[1:2,] #note that the first return is missing (NA). To remove rows with NA, append: %>% tidyr::drop_na()
# A tibble: 2 × 7
Date Price Open High Low BeforeClose returns
<date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2021-01-01 29346 28933 29498 28932 NA NA
2 2021-01-02 32185 29346 33168 29192 29346 0.0923
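The returns column is the log return, r_t = log(P_t) - log(P_{t-1}). As a quick check in base R, the sketch below recomputes it from the first three prices shown in the BTC2 output earlier:

```r
# First three prices from the BTC2 output shown earlier.
price <- c(29346, 32185, 32971)

# Log return: r_t = log(P_t) - log(P_{t-1}); the first value has no predecessor.
before_close <- c(NA, head(price, -1))  # base-R equivalent of dplyr::lag(price)
returns <- log(price) - log(before_close)
round(returns, 4)

# diff(log(price)) gives the same values without the leading NA.
all.equal(returns[-1], diff(log(price)))
```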
id weight age sex income status0
1 1 60 NA Male Low grade1
2 2 70 17 Male High grade2
3 3 63 23 Male Low grade3
4 4 55 24 Female Low grade4
5 5 48 19 Female Middle grade5
6 6 49 19 Female Middle grade6
7 7 58 16 Female Middle grade7
8 8 58 26 Female High grade8
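# Select rows that are (female and aged 19) or that have weight greater than 59;
# & is evaluated before |, so the parentheses below only make the grouping explicit
mydata %>% filter((sex == "Female" & age == 19) | weight > 59)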
id weight age height sex Sex income status0 status1
1 1 60 NA 151 Male Male Low grade1 grade 1
2 2 70 17 164 Male Male High grade2 grade 2
3 3 63 23 160 Male Male Low grade3 grade 3
4 5 48 19 179 Female Female Middle grade5 grade 5
5 6 49 19 156 Female Female Middle grade6 grade 6
Note that the first age is missing (“NA”). This value is associated with low income. Thus, the average age for those who have low income is missing (“NA”).
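The situation described in this note can be reproduced with a small sketch. The toy rows below copy the income and age values of the first five observations; the summarise() call is an assumption about how the group means were computed:

```r
library(dplyr)

# Toy rows mirroring the situation above: one Low-income age is missing.
toy <- data.frame(
  income = c("Low", "High", "Low", "Low", "Middle"),
  age    = c(NA, 17, 23, 24, 19)
)

# mean() propagates missing values, so the Low group averages to NA.
by_income <- toy %>%
  group_by(income) %>%
  summarise(mean_age = mean(age))
by_income
```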
Exercise: How do you solve this issue?
mutate_data <- mydata %>%
select(id, age, height, weight,gender,income,status0) %>%
mutate(height_foot = 0.033 * height) %>%
rename(status = status0) %>%
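filter(income == c("Low","Middle")) %>% # caution: == recycles c("Low","Middle") element by element; income %in% c("Low","Middle") keeps every Low or Middle row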
arrange(age, income) # "income" is an ordinal variable
mutate_data
id age height weight gender income status height_foot
1 6 19 156 49 Female Middle grade6 5.148
2 3 23 160 63 Male Low grade3 5.280
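Comparing a column with a vector using == is a common pitfall, because the shorter vector is recycled position by position. A minimal base-R sketch using the eight income values from mydata shows how this differs from %in%:

```r
income <- c("Low", "High", "Low", "Low", "Middle", "Middle", "Middle", "High")

# == compares element by element, recycling c("Low", "Middle") along the vector:
# position 1 vs "Low", position 2 vs "Middle", position 3 vs "Low", and so on.
income == c("Low", "Middle")
sum(income == c("Low", "Middle"))    # 3 matches

# %in% tests whether each element belongs to the set.
income %in% c("Low", "Middle")
sum(income %in% c("Low", "Middle"))  # 6 matches
```

With ==, a row is kept only when its value happens to line up with the recycled position, so some Low and Middle rows are silently dropped.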
Exercise:
Probability functions in R are named [dpqr] followed by the abbreviated name of the distribution (for example, dnorm(), pnorm(), qnorm(), and rnorm() for the normal distribution), where each letter of [dpqr] refers to the aspect of the distribution returned:
d = Density
p = Distribution function
q = Quantile function
r = Random generation
Distribution | Syntax | Distribution | Syntax | Distribution | Syntax |
---|---|---|---|---|---|
Beta | beta() | Binomial | binom() | Cauchy | cauchy() |
Chi-squared | chisq() | Exponential | exp() | F | f() |
Gamma | gamma() | Geometric | geom() | Hypergeometric | hyper() |
Lognormal | lnorm() | Logistic | logis() | Multinomial | multinom() |
Negative binomial | nbinom() | Normal | norm() | Poisson | pois() |
Wilcoxon signed rank | signrank() | T | t() | Uniform | unif() |
Weibull | weibull() | Wilcoxon rank sum | wilcox() | | |
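As a small sketch of the naming scheme, the four prefixes can be applied to the binomial distribution; the parameters (10 trials, success probability 0.5) are chosen purely for illustration:

```r
# Binomial(size = 10, prob = 0.5): number of heads in 10 fair coin tosses.
dbinom(3, size = 10, prob = 0.5)    # P(X = 3)  -> 0.1171875
pbinom(3, size = 10, prob = 0.5)    # P(X <= 3) -> 0.171875
qbinom(0.5, size = 10, prob = 0.5)  # smallest x with P(X <= x) >= 0.5 -> 5

set.seed(1)                                # make the random draws reproducible
draws <- rbinom(5, size = 10, prob = 0.5)  # five simulated counts
draws
```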
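For the area under the standard normal curve to the left of z = 2.1, use pnorm(2.1) to get 0.9821356.
For the 95th percentile of a normal distribution with mean 100 and standard deviation 20, use qnorm(0.95, mean = 100, sd = 20) to get 132.8971.
To simulate 300 observations from a normal distribution with mean 80 and standard deviation 10, use rnorm(300, mean = 80, sd = 10) to get the simulated series.
The Posit Cheatsheets website suggests some favorite data science packages to use!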
> install.packages("tidyverse")
> library(tidyverse)
── Attaching core tidyverse packages ─────────────────────────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.2 ✔ readr 2.1.4
✔ forcats 1.0.0 ✔ stringr 1.5.0
✔ ggplot2 3.4.2 ✔ tibble 3.2.1
✔ lubridate 1.9.2 ✔ tidyr 1.3.0
✔ purrr 1.0.1
── Conflicts ───────────────────────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package to force all conflicts to become errors
Warning messages:
1: package ‘tidyverse’ was built under R version 4.3.1
2: package ‘readr’ was built under R version 4.3.1
Function | Layers | Options |
---|---|---|
geom_point() | Scatterplot | color, alpha, shape, size |
geom_line() | Line graph | color, alpha, linetype, size |
geom_jitter() | Jittered points | color, size, alpha, shape |
geom_bar() | Bar chart | color, fill, alpha |
geom_boxplot() | Box plot | color, fill, alpha, notch, width |
geom_histogram() | Histogram | color, fill, alpha, linetype, binwidth |
geom_smooth() | Fitted line | method, formula, color, fill, linetype, size |
geom_density() | Density plot | color, fill, alpha, linetype |
geom_hline() | Horizontal lines | color, alpha, linetype, size |
geom_vline() | Vertical lines | color, alpha, linetype, size |
geom_rug() | Rug plot | color, side |
geom_violin() | Violin plot | color, fill, alpha, linetype |
geom_text() | Text annotations | see the help for this function |
Source: https://nbisweden.github.io/RaukR-2019/ggplot/presentation/ggplot_presentation.html#1.
See also https://clauswilke.com/dataviz/directory-of-visualizations.html
Let’s use our first graph to answer the following questions about the mpg data frame available from the package ggplot2:
# A tibble: 6 × 11
manufacturer model displ year cyl trans drv cty hwy fl class
<chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
1 audi a4 1.8 1999 4 auto(l5) f 18 29 p compa…
2 audi a4 1.8 1999 4 manual(m5) f 21 29 p compa…
3 audi a4 2 2008 4 manual(m6) f 20 31 p compa…
4 audi a4 2 2008 4 auto(av) f 21 30 p compa…
5 audi a4 2.8 1999 6 auto(l5) f 16 26 p compa…
6 audi a4 2.8 1999 6 manual(m5) f 18 26 p compa…
The first argument is the dataset that you need to use in the plot. The result of this code is an empty graph (default theme used by ggplot2 is theme_gray()).
Now you can add one or more layers to ggplot(). The function geom_point() adds a layer of points (scatterplot) to your plot.
The plot shows a negative relationship between the car engine size (in liters) and the car’s fuel efficiency on the highway (in miles per gallon). The bigger the size of the engine, the less efficient it is in consuming fuel.
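The plotting code for these two steps is not shown. A minimal sketch, assuming the standard ggplot2 idiom and the displ and hwy columns of mpg, is:

```r
library(ggplot2)

# Step 1: ggplot() alone creates an empty canvas tied to the data set.
p_empty <- ggplot(data = mpg)

# Step 2: add a layer of points; aes() maps displ to x and hwy to y.
p_points <- p_empty +
  geom_point(mapping = aes(x = displ, y = hwy))
p_points
```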
The aes() (stands for aesthetics) function is used to map variables to the visual characteristics of a plot.
Mapping class to the size aesthetic displays clearer information about the outliers in these data.
The plot is still hard to read. We can fine-tune the appearance of the graph using themes.
Here, I will use the theme theme_bw() (for black and white).
Exercise: Why are the points not blue?
Note: Instead of using the character name of the color “blue”, you can use the “#0000FF” hex code of this color.
Exercise: The points are blue now! Why?
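The code behind these two exercises is not shown. They most likely contrast a pair of calls like the following (an assumption), which differ only in whether color is placed inside or outside aes():

```r
library(ggplot2)

# Inside aes(): "blue" is treated as a data value (a one-level variable),
# so ggplot2 assigns it a colour from the default discrete scale.
p_mapped <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy, color = "blue"))

# Outside aes(): color is set as a fixed property of the layer.
p_set <- ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy), color = "blue")

unique(layer_data(p_mapped)$colour)  # the colour chosen by the scale
unique(layer_data(p_set)$colour)     # the colour set by hand
```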
Scale functions (which start with scale_) allow you to modify the default scaling provided by ggplot2.
Function | Description |
---|---|
scale_x_continuous() | Scales the x-axis for quantitative variables. Options include breaks for specifying tick marks, labels for specifying tick mark labels, and limits to control the range of the values displayed |
scale_y_continuous() | Same as above for y-axis |
scale_x_discrete() | Same as above for x-axis representing categorical variable |
scale_y_discrete() | Same as above for y-axis representing categorical variable |
scale_color_manual() | Specifies the colors (with option values) used to represent the levels of a categorical variable |
facet_wrap() and facet_grid() are used to partition a plot into a matrix of panels (side-by-side graphs), particularly useful for categorical variables.
Syntax | Description |
---|---|
facet_wrap(~var, nrow = r) | Partition plots for each level of variable (var) arranged into r rows |
facet_wrap(~var, ncol = c) | Partition plots for each level of variable (var) arranged into c columns |
facet_grid(row_var~col_var) | Partition plots for each combination of the row variable (row_var) and the column variable (col_var) |
facet_grid(rows = vars(row_var)) | Partition plots for each level of the row variable (row_var), arranged as a single column |
facet_grid(cols = vars(col_var)) | Partition plots for each level of the column variable (col_var), arranged as a single row |
Note: The default argument scales = "fixed" keeps the x and y scales the same across all panels; scales = "free_x" lets the x scale vary while the y scale is fixed; scales = "free_y" lets the y scale vary while the x scale is fixed; and scales = "free" lets both the x and y scales vary across panels.
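As a short sketch combining a scale function with faceting, the following uses the mpg data; the break points and limits are illustrative choices, not values from the slides:

```r
library(ggplot2)

p_facet <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  scale_x_continuous(breaks = 2:7, limits = c(1.5, 7)) +  # tick marks and axis range
  facet_wrap(~drv, nrow = 1, scales = "free_y") +         # one panel per drive train
  theme_bw()
p_facet
```

Because drv has three levels, the plot is split into three panels in one row, each with its own y scale.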
+ theme(legend.position = "right") # the default
+ theme(legend.position = "left")
+ theme(legend.position = "top")
+ theme(legend.position = "bottom")
suv <- mpg %>% filter(class == "suv")
p <- ggplot(suv, aes(displ, hwy, color = drv)) +
geom_point(size = 4) + theme_bw()
p + labs(title = "Fuel economy data",
subtitle = "Suv cars",
x = "Engine displacement, in litres",
y = "Highway miles per gallon",
color = "Type of drive train") +
scale_color_manual(labels = c("4wd", "Rear wheel drive"),
values = c("blue", "red")) +
theme(legend.position="bottom",
legend.key.size = unit(1.4, "cm"),
legend.key.height=unit(0.5, "cm"),
legend.key = element_rect(fill = "gray90", color = "red"),
text=element_text(family="serif"))
You can even create your own theme. The source code of the following theme, theme_bluewhite(), is available at https://www.datanovia.com/en/blog/ggplot-themes-gallery/
A box plot helps visualize whether the distribution of a data set is symmetric or skewed, and it flags unusual observations (outliers). The graph displays the five-number summary (minimum, first quartile, median, third quartile, and maximum).
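As a minimal box plot sketch, the following uses the hwy and class columns of mpg (chosen here for illustration) and shows the five-number summary next to the corresponding plot:

```r
library(ggplot2)

# Five-number summary of highway mileage:
# minimum, lower hinge, median, upper hinge, maximum.
five_num <- fivenum(mpg$hwy)
five_num

# One box per vehicle class; points beyond the whiskers are drawn individually.
p_box <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot(fill = "gray80") +
  theme_bw()
p_box
```

Note that geom_boxplot() by default extends the whiskers at most 1.5 times the interquartile range from the box, so the whisker ends coincide with the minimum and maximum only when there are no outliers.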
# Load "tidyquant" and "tidyverse" packages
library(tidyquant)
library(tidyverse)
# Get daily stock prices of Apple from the web in a tibble format
Apple <- tq_get("AAPL",from="2010-01-04",
to="2018-12-31",get="stock.prices")
# mutate returns series named as "ret"
Ap <- Apple %>%
mutate(Date = ymd(date),
Beforeclose = dplyr::lag(close),
ret = log(close) - log(Beforeclose)) %>%
drop_na(ret) #remove "NA"
# Plot log-returns series
P1 <- ggplot(Ap)+
geom_line(aes(x=Date,y=ret),color="gray30")+
labs(y="Log Returns", x="") +
scale_x_date(date_labels="%Y %b",
date_breaks="12 months") +
theme_bw()
# Plot histogram
P2 <- ggplot(Ap)+
geom_histogram(aes(ret),binwidth=0.004,
col="gray30",fill="gray80")+
annotate("text",x=c(-0.1,-0.1),y=c(70,60),
label=c("Skewness:-0.1738",
"Ex.kurtosis:3.5783"),
color=c("gray30","gray30"))+
labs(y="", x="Log Returns") +
theme_bw()
# Load "gridExtra" package
library(gridExtra)
# Place the two plots on one page
grid.arrange(P2, P1, nrow=1,
top="Apple, Inc. stock price from
January 04, 2010 to December 31, 2018")
?mtcars
?volcano
To plot the surface, use the following command:
library(plotly)
plot_ly(z=~volcano) %>%
add_surface()
library(ggiraph)
p <- ggplot(iris,aes(x=Sepal.Length,
y=Petal.Length, colour=Species))+
geom_point_interactive(aes(tooltip=
paste0("<b>Petal Length:</b>",
Petal.Length,"\n<b>Sepal Length:</b>",
Sepal.Length,"\n<b>Species:</b>",
Species)),size=1)+
theme_bw()
tooltip_css <- "background-color:#f8f9f9;
padding:10px;
border-style:solid;
border-width:2px;
border-color:#125687;
border-radius:5px;"
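# ggiraph() is deprecated in recent ggiraph releases; girafe() with options is the current interface
girafe(ggobj=p,
width_svg=4,height_svg=4,
options=list(
opts_hover(css="cursor:pointer;stroke:black;fill-opacity:0.3"),
opts_zoom(max=5),
opts_tooltip(css=tooltip_css,opacity=0.9),
opts_sizing(width=1)))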
The R package highcharter is a wrapper around the JavaScript library Highcharts.
library(highcharter)
p <- iris %>%
hchart("scatter",
hcaes(x="Sepal.Length",
y="Sepal.Width",group="Species")) %>%
hc_xAxis(title=list(text="Sepal Length"),
crosshair=TRUE) %>%
hc_yAxis(title=list(text="Sepal Width"),
crosshair=TRUE) %>%
hc_chart(zoomType="xy",inverted=FALSE) %>%
hc_legend(verticalAlign="top",align="right") %>%
hc_size(height=500,width=500)
htmltools::tagList(list(p))
Consider the gapminder data set on life expectancy, GDP per capita, and population by country.
library(gganimate)
library(gapminder)
p <- ggplot(gapminder,
aes(x=gdpPercap,
y=lifeExp,
size=pop,
color=country)) +
geom_point(show.legend=F,
alpha=0.7) +
scale_color_viridis_d() +
scale_size(range=c(2, 12)) +
scale_x_log10()+
theme_bw() +
labs(x="GDP per capita",
y="Life expectancy")
p +
transition_time(year) +
labs(title="Year: {frame_time}")
Consider the same gapminder data set as in the previous slide. Here, we use facet_wrap() to compare by continent.
p <- ggplot(gapminder,
aes(x=gdpPercap,
y=lifeExp,
size=pop,
color=country)) +
geom_point(show.legend=F,
alpha=0.7) +
scale_color_viridis_d() +
scale_size(range=c(2, 12)) +
scale_x_log10()+
theme_bw() +
labs(x="GDP per capita",
y="Life expectancy") +
facet_wrap(~continent)
p +
transition_time(year) +
labs(title="Year: {frame_time}")
The package networkD3 allows the use of interactive network graphs from the D3.js JavaScript library.
The package leaflet provides R bindings for the JavaScript mapping library Leaflet.
The R package crosstalk allows crosstalk-enabled plotting libraries to be linked. Through the shared key variable, data points can be manipulated simultaneously on two independent plots.
invisible(lapply(c("crosstalk","htmltools","leaflet","plotly"), library, character.only = TRUE))
shared_quakes <- SharedData$new(quakes[sample(nrow(quakes), 100),])
lf <- leaflet(shared_quakes,height=300) %>%
addTiles(urlTemplate='http://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png') %>% addMarkers()
py <- plot_ly(shared_quakes,x=~depth,y=~mag,size=~stations,height=300) %>% add_markers()
div(div(lf,style="float:left;width:45%"),div(py,style="float:right;width:45%"))
R Markdown is a powerful tool to write up a good-looking report by combining R code chunks, analysis, and reporting into the same document.
This document is prepared by R Markdown.
© Esam Mahdi (2023)