Analysis in R: Commands related to data frames

R is increasingly used for web analytics, but I do not think it is widespread yet. This article introduces data frame commands that are useful if you are considering R for web analytics. Once you are comfortable with data frame operations, more advanced tasks become much easier.

The following data is used as the example throughout.

###Creating Data Examples#####
Day <- c("5月07日", "5月08日", "5月09日", "5月10日", "5月11日",
         "5月12日", "5月13日", "5月14日", "5月15日")
Sales <- c(5, 2, NA, 4, NA, 5, 6, 7, 8)
Cost <- c(2000, 400, 0, 800, 0, 2000, 2200, 2400, 2600)
########

###Conversion of data into data frames#####
AnaDate <- data.frame(Day, Sales, Cost)
########

AnaDate
      Day Sales Cost
1 5月07日     5 2000
2 5月08日     2  400
3 5月09日    NA    0
4 5月10日     4  800
5 5月11日    NA    0
6 5月12日     5 2000
7 5月13日     6 2200
8 5月14日     7 2400
9 5月15日     8 2600

Introducing functions for working with data frames

Check data contents: summary

summary(AnaDate)

      Day        Sales            Cost
 5月07日:1   Min.   :2.000   Min.   :   0
 5月08日:1   1st Qu.:4.500   1st Qu.: 400
 5月09日:1   Median :5.000   Median :2000
 5月10日:1   Mean   :5.286   Mean   :1378
 5月11日:1   3rd Qu.:6.500   3rd Qu.:2200
 5月12日:1   Max.   :8.000   Max.   :2600
 (Other):3   NA's   :2

Count the missing values in the data: is.na to test each value for NA, and sum to add up the results

sum(is.na(AnaDate))

 [1] 2

#It indicates that there are two missing values in the data.
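
The total alone does not say where the missing values are. As a minimal sketch, assuming the same example data as above, colSums and which can narrow this down. The variable names na_per_column and na_rows are only for illustration.

```r
# Recreate the example data frame used in this article
Day <- c("5月07日", "5月08日", "5月09日", "5月10日", "5月11日",
         "5月12日", "5月13日", "5月14日", "5月15日")
Sales <- c(5, 2, NA, 4, NA, 5, 6, 7, 8)
Cost <- c(2000, 400, 0, 800, 0, 2000, 2200, 2400, 2600)
AnaDate <- data.frame(Day, Sales, Cost)

# is.na() on a data frame returns a logical matrix;
# colSums() counts the TRUE values in each column
na_per_column <- colSums(is.na(AnaDate))
print(na_per_column)

# which() returns the row numbers where Sales is missing
na_rows <- which(is.na(AnaDate$Sales))
print(na_rows)
```

Here only the Sales column has missing values, and they sit in rows 3 and 5.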

Refer to a specific row or column: bracket operator [i, j]

*i is the row and j is the column.
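
A few typical uses of the bracket operator may help here. This is only a sketch on the example data; the names row2, sales_col, one_cell and cost_col are made up for illustration.

```r
# Recreate the example data frame used in this article
Day <- c("5月07日", "5月08日", "5月09日", "5月10日", "5月11日",
         "5月12日", "5月13日", "5月14日", "5月15日")
Sales <- c(5, 2, NA, 4, NA, 5, 6, 7, 8)
Cost <- c(2000, 400, 0, 800, 0, 2000, 2200, 2400, 2600)
AnaDate <- data.frame(Day, Sales, Cost)

# Leave j empty to take the whole second row (result is a 1-row data frame)
row2 <- AnaDate[2, ]

# Leave i empty to take the whole second column (result is a vector)
sales_col <- AnaDate[, 2]

# Give both i and j to take a single cell: row 2, column 3
one_cell <- AnaDate[2, 3]

# A column can also be selected by name instead of position
cost_col <- AnaDate[, "Cost"]

print(row2)
print(one_cell)
```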

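Extract only the rows without missing values: complete.cases returns TRUE for each row that has no NA, and passing the result as i keeps just those rows.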

AnaDate[complete.cases(AnaDate), ] 

      Day Sales Cost
1 5月07日     5 2000
2 5月08日     2  400
4 5月10日     4  800
6 5月12日     5 2000
7 5月13日     6 2200
8 5月14日     7 2400
9 5月15日     8 2600

The data is displayed with rows 3 and 5, which contain missing values, removed.
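
To see why this works, it can help to look at the logical vector that complete.cases returns. The following is a small sketch on the example data; keep, cleaned and dropped are illustrative names.

```r
# Recreate the example data frame used in this article
Day <- c("5月07日", "5月08日", "5月09日", "5月10日", "5月11日",
         "5月12日", "5月13日", "5月14日", "5月15日")
Sales <- c(5, 2, NA, 4, NA, 5, 6, 7, 8)
Cost <- c(2000, 400, 0, 800, 0, 2000, 2200, 2400, 2600)
AnaDate <- data.frame(Day, Sales, Cost)

# One TRUE/FALSE per row: TRUE when the row has no NA in any column
keep <- complete.cases(AnaDate)
print(keep)

# Rows where keep is TRUE are retained
cleaned <- AnaDate[keep, ]

# Negating the vector shows the rows that were removed
dropped <- AnaDate[!keep, ]
print(dropped)
```

Looking at the dropped rows first is a cheap check before discarding data.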

Combine data frames: rbind to stack rows (vertical) and cbind to join columns side by side (horizontal)

rbind(AnaDate, AnaDate)

       Day Sales Cost
1  5月07日     5 2000
2  5月08日     2  400
3  5月09日    NA    0
4  5月10日     4  800
5  5月11日    NA    0
6  5月12日     5 2000
7  5月13日     6 2200
8  5月14日     7 2400
9  5月15日     8 2600
10 5月07日     5 2000
11 5月08日     2  400
12 5月09日    NA    0
13 5月10日     4  800
14 5月11日    NA    0
15 5月12日     5 2000
16 5月13日     6 2200
17 5月14日     7 2400
18 5月15日     8 2600

cbind(AnaDate, AnaDate) 

      Day Sales Cost     Day Sales Cost
1 5月07日     5 2000 5月07日     5 2000
2 5月08日     2  400 5月08日     2  400
3 5月09日    NA    0 5月09日    NA    0
4 5月10日     4  800 5月10日     4  800
5 5月11日    NA    0 5月11日    NA    0
6 5月12日     5 2000 5月12日     5 2000
7 5月13日     6 2200 5月13日     6 2200
8 5月14日     7 2400 5月14日     7 2400
9 5月15日     8 2600 5月15日     8 2600
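
Combining a data frame with itself is rarely what you need in practice. As a sketch of a more typical use, the following appends one more day with rbind and adds a derived column with cbind. The extra row (NewDay) and the CostPerSale column are hypothetical values made up for this illustration, not part of the original data.

```r
# Recreate the example data frame used in this article
Day <- c("5月07日", "5月08日", "5月09日", "5月10日", "5月11日",
         "5月12日", "5月13日", "5月14日", "5月15日")
Sales <- c(5, 2, NA, 4, NA, 5, 6, 7, 8)
Cost <- c(2000, 400, 0, 800, 0, 2000, 2200, 2400, 2600)
AnaDate <- data.frame(Day, Sales, Cost)

# rbind: append a new row; the column names must match the existing ones
# (this row is hypothetical data for illustration)
NewDay <- data.frame(Day = "5月16日", Sales = 9, Cost = 2800)
Extended <- rbind(AnaDate, NewDay)

# cbind: add a new column; it must have one value per existing row
# (rows with NA in Sales produce NA here)
CostPerSale <- AnaDate$Cost / AnaDate$Sales
WithRatio <- cbind(AnaDate, CostPerSale)

print(Extended)
print(WithRatio)
```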

Apply a function to every row or column: apply

The second argument selects the direction: 1 processes each row and 2 processes each column. Replacing mean with another function changes the calculation. The examples compute the average.

#Processing for each row (second argument is 1)
apply(AnaDate[, 2:ncol(AnaDate)], 1, mean, na.rm=TRUE)
[1] 1002.5  201.0    0.0  402.0    0.0 1002.5 1103.0 1203.5 1304.0

#Processing for each column (second argument is 2)
apply(AnaDate[, 2:ncol(AnaDate)], 2, mean, na.rm=TRUE)
      Sales        Cost
   5.285714 1377.777778 
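
As a sketch of swapping in other functions, the following computes column totals and maximums with the same apply call, and compares the column means with the built-in colMeans. The names Numeric, col_sums, col_max and col_means are only for illustration.

```r
# Recreate the example data frame used in this article
Day <- c("5月07日", "5月08日", "5月09日", "5月10日", "5月11日",
         "5月12日", "5月13日", "5月14日", "5月15日")
Sales <- c(5, 2, NA, 4, NA, 5, 6, 7, 8)
Cost <- c(2000, 400, 0, 800, 0, 2000, 2200, 2400, 2600)
AnaDate <- data.frame(Day, Sales, Cost)

# Keep only the numeric columns (Sales and Cost)
Numeric <- AnaDate[, 2:ncol(AnaDate)]

# Same call as before, with sum and max in place of mean
col_sums <- apply(Numeric, 2, sum, na.rm = TRUE)
col_max <- apply(Numeric, 2, max, na.rm = TRUE)

# colMeans() is a shortcut for the column-wise mean
col_means <- colMeans(Numeric, na.rm = TRUE)

print(col_sums)
print(col_max)
print(col_means)
```

For plain sums and means, colSums, colMeans, rowSums and rowMeans are usually shorter to write than apply. apply is the more general tool when you need some other function.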

I hope this makes your analysis a little easier!
