1. Employee Contacts
DataFrame columns: Month-Key, Date-Key, Employee ID, Contact channel, Department
Example row: 2019-09-01, 2019-09-25, 'E12345', Email, Ship
1. Find how many times each employee makes contact in each month.
2. Find the latest contact date in each month for each employee.
3. Create a new column 'status': "hi" when an employee makes more than 10 contacts in a month, otherwise "lo".
4. If you build a classification model to predict whether an employee will make contact in the next month, how will you split the data into train, test, and validation sets?
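Ans: split by time with an expanding training window, so the model is never trained on data later than its test month: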
iteration1: train: data from 2020-01 to 2020-05; test: data of 2020-06
iteration2: train: data from 2020-01 to 2020-06; test: data of 2020-07
iteration3: train: data from 2020-01 to 2020-07; test: data of 2020-08
iteration4: train: data from 2020-01 to 2020-08; test: data of 2020-09
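Questions 1 to 3 and an expanding-window split of this kind can be sketched in pandas as below. This is a minimal sketch, not a reference answer: the sample rows, the underscore-style column names (month_key, date_key, employee_id), the output column names, and the helper expanding_window_splits with its min_train parameter are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data: E12345 has 12 contacts in 2019-09 and 2 in 2019-10;
# E67890 has a single contact in 2019-09.
df = pd.DataFrame(
    {
        "month_key": ["2019-09-01"] * 12 + ["2019-10-01"] * 2 + ["2019-09-01"],
        "date_key": [f"2019-09-{d:02d}" for d in range(1, 13)]
        + ["2019-10-03", "2019-10-20", "2019-09-25"],
        "employee_id": ["E12345"] * 14 + ["E67890"],
        "contact_channel": ["Email"] * 15,
        "department": ["Ship"] * 15,
    }
)
df["month_key"] = pd.to_datetime(df["month_key"])
df["date_key"] = pd.to_datetime(df["date_key"])

# Q1 and Q2: one row per (employee, month) with the contact count
# and the latest contact date in that month.
summary = df.groupby(["employee_id", "month_key"], as_index=False).agg(
    n_contacts=("date_key", "count"),
    latest_contact=("date_key", "max"),
)

# Q3: "hi" when an employee has more than 10 contacts in the month, else "lo".
summary["status"] = summary["n_contacts"].gt(10).map({True: "hi", False: "lo"})


def expanding_window_splits(months, min_train=5):
    """Yield (train_months, test_month) pairs.

    The training window always starts at the first month and grows by one
    month per iteration; the test month is the month right after it.
    """
    months = sorted(months)
    for i in range(min_train, len(months)):
        yield months[:i], months[i]


# Q4: months 2020-01 .. 2020-09 as sortable "YYYY-MM" strings.
months = [f"2020-{m:02d}" for m in range(1, 10)]
splits = list(expanding_window_splits(months))

for n, (train, test) in enumerate(splits, start=1):
    print(f"iteration{n}: train {train[0]} to {train[-1]}; test {test}")
```

Splitting by month rather than at random keeps future contacts out of the training set, which is the usual concern when the target is next month's behaviour.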
2. Evaluate Model
How will you evaluate the model?
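For a binary classifier such as the "will contact next month" model, one common starting point is to compute accuracy, precision, recall, and F1 from the confusion-matrix counts. The sketch below is only one possible answer; the function name classification_metrics and the label vectors are hypothetical, and labels are assumed to be 0 or 1.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels: 1 = employee contacted in the next month.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

When the positive class is rare, precision and recall are usually more informative than accuracy alone.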
3. Coefficient of Regression
What is the difference between the coefficients of logistic regression and linear regression?
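As a reminder of the standard textbook interpretation (not an answer taken from the source), consider a single predictor x, with p = P(y = 1 | x) in the logistic case. The symbols \beta_0 and \beta_1 are chosen here for illustration.

```latex
\begin{align*}
\text{Linear:}\quad & \mathbb{E}[y \mid x] = \beta_0 + \beta_1 x,
  && \beta_1 = \text{change in } \mathbb{E}[y] \text{ per unit increase in } x \\
\text{Logistic:}\quad & \log\frac{p}{1-p} = \beta_0 + \beta_1 x,
  && e^{\beta_1} = \text{odds ratio per unit increase in } x
\end{align*}
```

So a linear coefficient acts additively on the outcome itself, while a logistic coefficient acts additively on the log-odds and therefore multiplicatively on the odds.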
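4. Dimensionality Reduction
What is dimensionality reduction? What is the difference between PCA and NMF? Which methods are typically used for visualization? What is the difference between t-SNE and UMAP?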
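One way to make the PCA versus NMF contrast concrete is to run both on the same non-negative matrix. The NumPy sketch below makes several assumptions: the data is random, PCA is computed via SVD of the centered data, and NMF uses Lee-Seung multiplicative updates for the squared-error loss, which is only one of several NMF algorithms. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((50, 6))  # hypothetical non-negative data: 50 samples, 6 features
k = 2                    # target dimensionality

# PCA: center the data, then take the top-k right singular vectors.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pca_components = Vt[:k]              # orthonormal rows, signs unconstrained
pca_scores = Xc @ pca_components.T   # 2-D coordinates, e.g. for a scatter plot

# NMF: factor X ~= W @ H with W >= 0 and H >= 0 (no centering).
W = rng.random((X.shape[0], k))
H = rng.random((k, X.shape[1]))
eps = 1e-10  # avoids division by zero in the updates
err_start = np.linalg.norm(X - W @ H)
for _ in range(200):
    H *= (W.T @ X) / (W.T @ W @ H + eps)
    W *= (X @ H.T) / (W @ H @ H.T + eps)
err_end = np.linalg.norm(X - W @ H)

print("PCA components orthonormal:",
      np.allclose(pca_components @ pca_components.T, np.eye(k)))
print("PCA has negative loadings:", bool((pca_components < 0).any()))
print("NMF factors non-negative:", bool((W >= 0).all() and (H >= 0).all()))
```

PCA yields orthogonal directions with mixed-sign loadings, whereas NMF yields non-negative, additive parts that are not orthogonal in general and requires non-negative input.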
5. Loss
What is the objective function?