matrix, df -> transactions, and exploring transactions data


Chelsey 2021. 9. 16. 16:16


matrix type -> transactions type

mat <- matrix(c(1, 1, 1, 0, 0,
                1, 1, 1, 1, 0,
                0, 0, 1, 1, 0,
                0, 1, 0, 1, 1,
                0, 0, 0, 1, 0), nrow = 5, byrow = TRUE)

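nrow = 5 creates a matrix with 5 rows. (Without this option, the result has one row per data value and a single column.)

Without the byrow option, the data is filled in column by column. To fill row by row, set byrow=TRUE.

(TRUE can also be written as T.)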
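To see the effect of nrow and byrow side by side, here is a minimal sketch in base R. The vector `vals` and the three matrix names are made up for illustration and are not part of the original example.

```r
# Illustrative values only
vals <- 1:6

m_default <- matrix(vals)                      # no nrow: 6 rows, 1 column
m_col <- matrix(vals, nrow = 2)                # filled column by column (default)
m_row <- matrix(vals, nrow = 2, byrow = TRUE)  # filled row by row

dim(m_default)
m_col
m_row
```

In `m_col` the first row is 1, 3, 5, while in `m_row` the first row is 1, 2, 3.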

Setting row and column names

rownames()

# paste("row",1:5)  # row 1 , row 2 , row 3 , row 4 , row 5 (with a space)
rownames(mat) <- paste0("row",1:5) # paste0 joins without a space

colnames()

colnames(mat) <- letters[1:5] # letters : built-in vector of the lowercase alphabet

row : can represent one person's data (one transaction).

col : the kinds of data (the items).

str(mat)
# - attr(*, "dimnames")=List of 2 : the stored row and column names
class(mat) # "matrix" "array"
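The difference between paste and paste0, and the way the names end up in dimnames, can be checked with a small sketch like the one below. The 2x3 matrix `toy` is a made-up stand-in for mat.

```r
# Hypothetical small matrix, used only to show the naming functions
toy <- matrix(0, nrow = 2, ncol = 3)

paste("row", 1:2)            # "row 1" "row 2" (default separator is a space)
paste("row", 1:2, sep = "")  # "row1" "row2", same result as paste0

rownames(toy) <- paste0("row", 1:2)
colnames(toy) <- letters[1:3]

dimnames(toy)    # list of row names and column names
toy["row2", "c"] # names can be used for indexing
```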

Converting matrix data -> transactions data

library(arules) # provides the "transactions" class
mat.trans <- as(mat, "transactions")
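One way to check the conversion is to compare the transactions object against the original 0/1 matrix. The sketch below assumes the arules package is installed and rebuilds a smaller matrix (`toy_mat`, a made-up name) so that it runs on its own.

```r
library(arules)

# Hypothetical 3x3 incidence matrix: 1 means the item is in the transaction
toy_mat <- matrix(c(1, 1, 0,
                    0, 1, 1,
                    1, 0, 0), nrow = 3, byrow = TRUE)
rownames(toy_mat) <- paste0("row", 1:3)
colnames(toy_mat) <- letters[1:3]

toy_trans <- as(toy_mat, "transactions")

inspect(toy_trans)    # one line per transaction with its items
length(toy_trans)     # number of transactions
itemLabels(toy_trans) # item names taken from the column names
size(toy_trans)       # items per transaction, equal to the row sums
```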

dataframe data -> transactions data

If you get a warning, convert the data from numeric to logical.

df <- as.data.frame(mat)
df.trans <- as(df, "transactions")

# The first conversion gives a warning because the columns are numeric.
# Convert the numeric columns of the data frame to logical (TRUE/FALSE),
# then convert again.
df <- as.data.frame(sapply(df, as.logical))
df.trans <- as(df, "transactions")
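A side effect worth knowing: sapply returns a matrix, so the row names of the data frame are replaced by 1, 2, 3, ... after the round trip. If the row names matter, a possible alternative is to convert the columns in place with lapply. This is only a sketch with a made-up data frame (`toy_df`), assuming arules is installed.

```r
library(arules)

# Hypothetical 0/1 data frame with named rows
toy_df <- data.frame(a = c(1, 1, 0), b = c(1, 0, 1),
                     row.names = paste0("row", 1:3))

# Approach from the post: row names are lost
df_sapply <- as.data.frame(sapply(toy_df, as.logical))
rownames(df_sapply)

# Alternative: assign into toy_df[] so the data frame (and its row names) is kept
df_keep <- toy_df
df_keep[] <- lapply(df_keep, as.logical)
rownames(df_keep)

keep_trans <- as(df_keep, "transactions")
inspect(keep_trans)
```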

list data -> transactions data

mylist <- list(row1=c("a","b","c"),
               row2=c("a","d"),
               row3=c("b","e"),
               row4=c("a","d","e"),
               row5=c("b","c","d"))
# Each element of the list (row1, ...) is a vector of items.

mylist.trans <- as(mylist, "transactions")
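To confirm what the list conversion produced, the transactions can be inspected and converted back to a list. A minimal sketch, assuming arules is installed; `baskets` is a shortened, made-up version of mylist.

```r
library(arules)

# Hypothetical baskets: one character vector of items per transaction
baskets <- list(row1 = c("a", "b", "c"),
                row2 = c("a", "d"),
                row3 = c("b", "e"))

basket_trans <- as(baskets, "transactions")

inspect(basket_trans)    # items of each transaction
itemLabels(basket_trans) # all distinct items across the list
size(basket_trans)       # number of items in each transaction

back <- as(basket_trans, "list") # back to a list of item vectors
```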

Taking apart a transactions file

data <- read.transactions("data.csv", sep=",")

9835 rows, 169 columns

9835 transactions, 169 distinct items
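The relation between the file and the row and column counts is easier to see on a tiny file. The sketch below writes a made-up three-line basket file to a temporary path and reads it back, assuming arules is installed; the file contents are hypothetical and unrelated to data.csv.

```r
library(arules)

# Hypothetical basket file: one transaction per line, items separated by commas
path <- tempfile(fileext = ".csv")
writeLines(c("milk,bread",
             "bread,butter,jam",
             "milk"), path)

small <- read.transactions(path, format = "basket", sep = ",")

dim(small)     # 3 transactions (rows) x 4 distinct items (columns)
inspect(small)

unlink(path)   # remove the temporary file
```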

summary(data)

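density of 0.02609146: density is the proportion of non-zero cells in the matrix.
Most cells are 0, so the density comes out low.

size 1 : out of the 9835 transactions, 2159 bought only one item.
size : the sum of one row, i.e. the number of items in that transaction.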
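Both numbers from summary() can be reproduced by hand, which may help when reading the output. A minimal sketch on made-up data, assuming arules is installed:

```r
library(arules)

# Hypothetical transactions: 3 rows, 4 distinct items, 6 filled cells
toy <- as(list(c("a", "b", "c"), c("a", "d"), c("b")), "transactions")

sizes <- size(toy) # items per transaction (row sums)

# density = filled cells / all cells of the transactions x items matrix
toy_density <- sum(sizes) / prod(dim(toy))
toy_density

table(sizes)       # how many transactions have each size
summary(toy)       # reports the same density and size distribution
```

Here 6 of the 12 cells are filled, so the density is 0.5.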

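# functions for looking at the data
inspect(data)
inspect(data[1:10])

itemFrequency(data) # proportion of transactions containing each item
itemFrequency(data, type='absolute') # number of transactions containing each item

itemFrequency(data[,1:3]) # restrict to the items of interest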
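The relative and absolute frequencies differ only by a factor of the number of transactions. A small sketch on made-up data, assuming arules is installed:

```r
library(arules)

# Hypothetical transactions: "a" appears in 3 of 4, "b" in 2, "c" in 1
toy <- as(list(c("a", "b"), c("a", "c"), c("a"), c("b")), "transactions")

rel <- itemFrequency(toy)                   # share of transactions per item
cnt <- itemFrequency(toy, type = "absolute") # count of transactions per item

sort(rel, decreasing = TRUE) # most frequent items first
rel * length(toy)            # equals cnt

names(rel[rel >= 0.5])       # items with support of at least 0.5
```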

Visualization

itemFrequencyPlot(data)
itemFrequencyPlot(data, type='absolute')

itemFrequencyPlot(data, topN=20, type='absolute') # plot the 20 best-selling items
itemFrequencyPlot(data, support=0.1) # plot the items with support of at least 0.1
