Analysis in R: Intuitive understanding of trends! Creating Violin Plots with ggplot2

Many chart types can show access trends, including bar charts, line charts, and, once you are used to it, the handy box-and-whisker plot. These charts are useful for analysis, but they take some getting used to before trends can be read from them. This article shows how to create a violin plot in ggplot2, which makes access trends easy to see and understand.

Preparation for Analysis

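*If ggplot2 is not installed, run install.packages("ggplot2") in R first. The script also loads XLConnect (which requires Java) and tcltk (bundled with R), so install XLConnect the same way if needed.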

Running the Commands

The Excel data is expected to have three columns. From left to right: date, hour of day, and number of sessions (VISITS). The hour of day takes values from 00 to 23.
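For reference, a worksheet in this layout would load as a data frame like the one sketched here. The column names DATE, HOUR and VISITS and all of the values are made up for illustration. The script refers to columns by position, so the actual names in your file do not matter.

```r
# Hypothetical sample in the expected layout: one row per date and hour.
# Column names and values are illustrative assumptions, not real data.
SampleData <- data.frame(
  DATE   = rep(c("2014-05-01", "2014-05-02"), each = 4),
  HOUR   = rep(c(0, 9, 12, 21), times = 2),
  VISITS = c(3, 0, 12, 7, 1, 5, 20, 2),
  stringsAsFactors = FALSE
)

# Inspect the structure: three columns, in the order date, hour, sessions.
str(SampleData)
```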

Run the following command.

###Loading the library#####
library("XLConnect")
library("tcltk")
library("ggplot2")
########

###Data loading#####
sheetSelect <- 1 #Enter the sheet number to be read
selectABook <- paste(as.character(tkgetOpenFile(title = "Select xlsx file",filetypes = '{"xlsx file" {".xlsx"}}',initialfile = "*.xlsx")), sep = "", collapse =" ")
MasterAnaData <- loadWorkbook(selectABook)
AnaData <- readWorksheet(MasterAnaData, sheet = sheetSelect)

UniqueDate <- unique(AnaData[, 1]) #Unique dates in the first column
ViolinVisitsData <- NULL #Holds the expanded hours for one row
ViolinPlotData <- NULL #Accumulates the data to plot

###Data Processing#####
for (n in seq_along(UniqueDate)){
  #Extract the rows for one date
  SubAnaData <- subset(AnaData, AnaData[, 1] == UniqueDate[n])

  #A date does not always have all 24 hourly rows, so loop over the rows that exist.
  for (i in seq_len(nrow(SubAnaData))){

    hourVisits <- SubAnaData[i, 3] #Sessions (VISITS) in this hour

    #Skip hours with 0 sessions; otherwise repeat the hour once per session.
    if (!isTRUE(all.equal(hourVisits, 0))){
      ViolinVisitsData <- rep(as.numeric(as.character(SubAnaData[i, 2])), hourVisits)
      #Store the date as text so that it stays readable as an axis label.
      ViolinPlotData <- rbind(ViolinPlotData, cbind(as.character(UniqueDate[n]), as.numeric(ViolinVisitsData)))
    }
  }
}

#cbind() above yields a character matrix, so convert the hour column back to numeric.
ViolinPlotData <- as.data.frame(ViolinPlotData)
ViolinPlotData[, 2] <- as.numeric(as.character(ViolinPlotData[, 2]))

#Prepare the plot. The fill color is set with the fill aesthetic.
p <- ggplot(ViolinPlotData, aes(factor(ViolinPlotData[, 1]), ViolinPlotData[, 2], fill = factor(ViolinPlotData[, 1])))

#Violin plot settings
p <- p + geom_violin(scale = "count")
#coord_flip() draws the hour along the horizontal axis. To put the hour on the vertical axis instead, change the final coord_flip(...) to coord_cartesian(...).
p <- p +
  labs(x = " ", y = " ") +
  #scale_y_continuous(breaks = 0:23) +
  theme(axis.text.x = element_text(colour="black", size = 13),
        plot.background = element_rect(fill = NA, colour = NA), #kinari (ecru) "#fbfaf5"
        panel.background = element_rect(linetype = "solid", colour = "black", fill = NA), #kinunezu (silk gray) "#dddcd6"
        panel.grid.major = element_line(color = NA),
        panel.grid.minor = element_line(color = NA),
        axis.title.x = element_text(size = 13),
        axis.title.y = element_text(size = 13,angle = 90),
        axis.text.y = element_text(colour="black", size = 11)) +
  #Only one coordinate system applies per plot, so the hour range is set here as a two-value ylim.
  coord_flip(ylim = c(-0.5, 24.5))

print(p)
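
The processing step repeats each hour once per session, so that the violin is widest where sessions are concentrated. As a rough sketch of the same idea without nested loops, rep() can expand all rows in one call. The sample values and the column names DATE, HOUR and VISITS are assumptions for illustration and are not part of the original script.

```r
# Hypothetical input in the same layout as the Excel sheet (values are made up).
AnaData <- data.frame(
  DATE   = rep(c("2014-05-01", "2014-05-02"), each = 4),
  HOUR   = rep(c(0, 9, 12, 21), times = 2),
  VISITS = c(3, 0, 12, 7, 1, 5, 20, 2),
  stringsAsFactors = FALSE
)

# Repeat each row index VISITS times; rows with 0 sessions drop out automatically.
idx <- rep(seq_len(nrow(AnaData)), times = AnaData$VISITS)
ViolinPlotData <- data.frame(
  DATE = factor(AnaData$DATE[idx]),
  HOUR = AnaData$HOUR[idx]
)

# Plot only if ggplot2 is installed, referring to columns by name in aes().
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(ViolinPlotData, aes(x = DATE, y = HOUR, fill = DATE)) +
    geom_violin(scale = "count") +
    coord_flip(ylim = c(-0.5, 24.5))
  print(p)
}
```

This assumes VISITS contains only non-negative whole numbers with no missing values. rep() would stop with an error otherwise, so real data may need cleaning first.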

Output Examples

This is part of the session (VISITS) data for the day the site was picked up on Hatena Bookmark and the following day. Although exact numbers cannot be read from the figure, the trend is clear at a glance. With the relevant session counts and an interpretation added, I think it works well as discussion material.

Violin plot
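
Since the figure itself does not show counts, one way to supply the session numbers mentioned above is a small per-date summary placed beside the plot. Here is a minimal sketch using base R's aggregate(), again with made-up values and assumed column names DATE, HOUR and VISITS.

```r
# Hypothetical input in the same layout as the Excel sheet (values are made up).
AnaData <- data.frame(
  DATE   = rep(c("2014-05-01", "2014-05-02"), each = 4),
  HOUR   = rep(c(0, 9, 12, 21), times = 2),
  VISITS = c(3, 0, 12, 7, 1, 5, 20, 2),
  stringsAsFactors = FALSE
)

# Total sessions per date.
DailyTotals <- aggregate(VISITS ~ DATE, data = AnaData, FUN = sum)

# Busiest hour per date: keep the row with the most sessions in each date group.
PeakHour <- do.call(rbind, lapply(split(AnaData, AnaData$DATE), function(d) {
  d[which.max(d$VISITS), c("DATE", "HOUR", "VISITS")]
}))

print(DailyTotals)
print(PeakHour)
```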

I hope this makes your analysis a little easier!
