1. Billing Status Edge Cases and Sorting
A Billing Status problem comes in three parts. In the first, you work through edge cases, such as whether the data must be sorted by time and whether the monetary columns might change; this part can take a while. In the second, you debug an output issue that appears after an overwrite, which means reading through a long, complex log. The interviewer may give hints if something from the first part, such as sorting by time, was not implemented correctly. In the third, you have about five minutes to discuss your approach.
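The sort-by-time edge case can be illustrated with a minimal sketch. The actual data schema is not given in the question, so the field names here (`account_id`, `timestamp`, `status`) and the sample records are hypothetical; the only point is that the code sorts explicitly rather than trusting the input order.

```python
from datetime import datetime


def latest_status(events):
    """Return the most recent status per account, ordering events by time first.

    Assumption: each event is a dict with hypothetical keys
    'account_id', 'timestamp' (ISO 8601 string) and 'status'.
    """
    # Sort explicitly instead of trusting input order. sorted() is stable,
    # so events with equal timestamps keep their original relative order.
    ordered = sorted(events, key=lambda e: datetime.fromisoformat(e["timestamp"]))

    latest = {}
    for event in ordered:
        # Later events overwrite earlier ones for the same account.
        latest[event["account_id"]] = event["status"]
    return latest


# Made-up sample data: note that the first two rows arrive out of order.
events = [
    {"account_id": "a1", "timestamp": "2024-03-02T10:00:00", "status": "paid"},
    {"account_id": "a1", "timestamp": "2024-03-01T09:00:00", "status": "pending"},
    {"account_id": "b2", "timestamp": "2024-03-01T12:00:00", "status": "overdue"},
]
result = latest_status(events)
```

Without the sort, the out-of-order rows for `a1` would leave the account as `pending`, which is the kind of mistake that tends to surface later as a wrong output after an overwrite.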
2. ML Case Study: Ad Click Modeling
How would you design a model for Reddit's advertisement click prediction? Note that the focus is on modeling rather than system architecture.
3. Classification Task Based on User Engagement Data
You are given a dataset for a classification task. The dataset includes the following columns: time spent watching category A, time spent watching category B, time spent watching category C, the category presented (A, B, or C), and a binary indicator (1 if the user clicked the presented category, else 0). The dataset has several thousand rows, is balanced, and contains no missing values. Your task is to predict whether a user will click on the presented category based on the time spent watching each category. Discuss your approach, the model you would choose, and how you would evaluate its performance.
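One possible starting point, before choosing a model, is an interaction feature that ties the presented category to the user's watch history: the share of total watch time spent on the category that was shown. The sketch below computes that feature and scores a one-rule baseline with accuracy, which is a reasonable metric here because the dataset is balanced. The column names (`time_a`, `time_b`, `time_c`, `presented`, `clicked`), the threshold of 0.4, and the four sample rows are all assumptions for illustration, not part of the original question.

```python
def presented_share(row):
    """Fraction of total watch time spent on the category that was presented."""
    total = row["time_a"] + row["time_b"] + row["time_c"]
    if total == 0:
        return 0.0  # no watch history at all: treat as no affinity
    return row["time_" + row["presented"].lower()] / total


def predict(row, threshold=0.4):
    """Baseline rule: predict a click when the presented share is high enough."""
    return 1 if presented_share(row) >= threshold else 0


def accuracy(rows, threshold=0.4):
    """Share of rows where the baseline prediction matches the label."""
    correct = sum(predict(r, threshold) == r["clicked"] for r in rows)
    return correct / len(rows)


# Made-up rows in the assumed schema.
rows = [
    {"time_a": 30.0, "time_b": 5.0, "time_c": 5.0, "presented": "A", "clicked": 1},
    {"time_a": 2.0, "time_b": 20.0, "time_c": 18.0, "presented": "A", "clicked": 0},
    {"time_a": 10.0, "time_b": 10.0, "time_c": 40.0, "presented": "C", "clicked": 1},
    {"time_a": 25.0, "time_b": 15.0, "time_c": 0.0, "presented": "C", "clicked": 0},
]
baseline_accuracy = accuracy(rows)
```

A baseline like this gives a number that any trained model (for example logistic regression or gradient-boosted trees on the raw times plus this feature) should beat on a held-out split; if it does not, the extra model complexity is hard to justify.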
4. Binary Prediction Model Creation and Deployment
Given a small dataset with a few thousand entries and four columns, including one categorical column, describe how you would perform basic data processing and use a package to create a binary prediction model. Additionally, explain how you would deploy and monitor the model in a production environment.
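For the monitoring half of the question, one commonly used drift check is the Population Stability Index (PSI), which compares the distribution of model scores seen in production against the distribution seen at training time. The question does not name a monitoring method, so PSI is a swapped-in example here; the bin count, the `eps` floor, and the sample scores are assumptions.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between two samples of scores.

    Assumption: scores are probabilities in [0, 1], binned into
    equal-width buckets.
    """
    def bin_fractions(scores):
        counts = [0] * bins
        for s in scores:
            idx = min(int(s * bins), bins - 1)  # clamp s == 1.0 into the last bin
            counts[idx] += 1
        # eps keeps empty bins from causing log(0) or division by zero.
        return [max(c / len(scores), eps) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    # Each term is non-negative, so PSI is 0 only when the binned
    # distributions match.
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Made-up scores: a uniform training sample and a shifted production sample.
training_scores = [i / 100 for i in range(100)]
production_scores = [min(s + 0.3, 1.0) for s in training_scores]

no_drift = psi(training_scores, training_scores)
drift = psi(training_scores, production_scores)
```

In a deployment, a check like this would typically run on a schedule and raise an alert when the value crosses a threshold chosen by the team; the same function can be applied to individual input features as well as to the output scores.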
5. Recommendation Strategy for Ads
Describe how you would recommend ads to users.