3-4

E-commerce Data Analysis

가는중 2023. 5. 21. 16:14

## Environment : Anaconda-navigator
## Programming Language : Python 3
## import pandas as pd
## import seaborn as sns
## import numpy as np
## import matplotlib.pyplot as plt

## from datetime import timedelta

## from datetime import datetime, date
## from scipy.sparse import csr_matrix
## from math import sqrt
## from tqdm import tqdm_notebook as tqdm
## from sklearn.metrics.pairwise import cosine_similarity
## from sklearn.model_selection import train_test_split
## from sklearn.metrics import mean_squared_error
## import plotly.express as px

1. Loading and preprocessing the data
2. Data analysis
3. Recommender system

This post continues with the data from the previous post, 3-3.

https://datapractice0815.tistory.com/248

Based on the analysis in 2-7, we create a new category feature.

# Second-level categories singled out by the 2-7 analysis
# (the values are the Korean labels stored in the data, so they stay as-is).
target_categories = ['상의', '하의', '가방', '신발', '주방용품', '언더웨어', '원피스/점프슈트']

# Binary flag: 1 if the row's category2_name is in the list, otherwise 0.
sample_df['cat2_feature'] = np.where(
    sample_df['category2_name'].isin(target_categories), 1, 0
)
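
To make the flag easier to check, here is a minimal sketch on a toy frame. `toy_df` is a hypothetical stand-in for `sample_df`, and '가전' is a made-up value that is not in the list. It uses `Series.isin`, which gives the same result as chaining the `==` comparisons with `|`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy stand-in for sample_df; the real frame comes from the earlier posts.
toy_df = pd.DataFrame({"category2_name": ["상의", "가전", "신발", None]})

# Same category list as in the post.
target_categories = ["상의", "하의", "가방", "신발", "주방용품", "언더웨어", "원피스/점프슈트"]

# 1 when the category is in the list, 0 otherwise (missing values also become 0).
toy_df["cat2_feature"] = np.where(
    toy_df["category2_name"].isin(target_categories), 1, 0
)

print(toy_df)
```

Rows with a missing `category2_name` get 0 here, just as they would with the chained `==` comparisons.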

epoch_loss = fit(X_sparse, y_data.values, config)

plt.plot(epoch_loss)
plt.title('Loss per epoch')
plt.show()
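
The `fit` call above takes a sparse matrix, and this section does not show how `X_sparse` is built. As a rough sketch only, one way a binary column such as `cat2_feature` could be appended to a one-hot sparse matrix is with `scipy.sparse.hstack`. The `user_id` and `item_id` columns and all the toy values below are assumptions for illustration, not the post's actual data or pipeline.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix, hstack

# Hypothetical toy data; column names and values are assumptions for illustration.
toy_df = pd.DataFrame({
    "user_id": [1, 1, 2, 3],
    "item_id": [10, 11, 10, 12],
    "cat2_feature": [1, 0, 1, 0],
})

# One-hot encode the user and item ids (cast to str so they are treated as categories).
one_hot = pd.get_dummies(toy_df[["user_id", "item_id"]].astype(str))
X_base = csr_matrix(one_hot.to_numpy(dtype=np.float64))

# Append the binary feature as one extra sparse column.
extra_col = csr_matrix(toy_df[["cat2_feature"]].to_numpy(dtype=np.float64))
X_with_feature = hstack([X_base, extra_col], format="csr")

print(X_with_feature.shape)
```

Keeping the result in CSR format means it can be passed on in the same form as the original sparse matrix.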

Next, let's add other features and observe how the FM's performance changes.