<iframe src="https://www.googletagmanager.com/ns.html?id=GTM-KVGHS6G" height="0" width="0" style="display:none;visibility:hidden"></iframe>

Reddit数据相关面试真题

职位分类
全部
数据相关
计算机科学
人工智能
产品经理
BQ
面试题
全部(26)
SQL(3)
Coding(7)
ML basics(9)
Stats(1)
Product Case(4)
高频题(0)
Other(1)
全部(26)
SQL(3)
Coding(7)
ML basics(9)
Stats(1)
Product Case(4)
高频题(0)
Other(1)
1.Billing Status Edge Cases and Sorting
2.ML Case Study: Ad Click Modeling
3.Classification Task Based on User Engagement Data
4.Binary Prediction Model Creation and Deployment
5.Recommendation Strategy for Ads
6.Calculate Ads Revenue Using SQL
7.Implement a Random Generator Based on Probabilities
8.Loss Function Definition and Model Evaluation
9.Reducing Candidate Space in Recommendation Systems
10.Handling Multiple Categories in a Feature
11.Normalizing Numeric Features
12.Encoding Categorical Features
13.Implement functionality for JSON data analysis
14.SQL and Case Study Questions
15.Calculate Event Counts
16.Favorite App Product Case
17.SQL Top 5 of Each Group
18.Python Coding Challenge
19.Machine Learning Project Experience
20.Aggregate JSON Objects by Dimensions
21.Statistical Test Selection
22.Opportunity Sizing and Experiment Design
23.Metric Investigation and Business Interpretation
24.Sorting Based on Timestamps
25.Binary Classification Model Training and Testing
26.Machine Learning System Design for an Advertising System
1. Billing Status Edge Cases and Sorting
In a Billing Status problem, you are asked to consider various edge cases, such as whether the data needs to be sorted by time and if monetary columns might change. The first task is to address these factors, which may take some time. The second task involves dealing with an output issue after an overwrite, where you need to check a long and complex log. The interviewer may provide hints if there are points not implemented correctly from the first task, such as sorting by time. The third task is to discuss your approach, given a limited time frame of about five minutes.
2. ML Case Study: Ad Click Modeling
How would you design a model for Reddit's advertisement click prediction? Note that the focus is on modeling rather than system architecture.
3. Classification Task Based on User Engagement Data
You are given a dataset for a classification task. The dataset includes the following columns: time spent watching category A, time spent watching category B, time spent watching category C, a category presented (A, B, or C), and a binary indicator (1 if the user clicked the presented category, else 0). The dataset consists of several thousand rows, is balanced, and contains no missing values. Your task is to perform classification to predict whether a user will click on the presented category based on the time spent watching the different categories. Discuss your approach, the model you would choose, and how you would evaluate its performance.
4. Binary Prediction Model Creation and Deployment
Given a small dataset with a few thousand entries and four columns, including one categorical column, describe how you would perform basic data processing and use a package to create a binary prediction model. Additionally, explain how you would deploy and monitor the model in a production environment.
5. Recommendation Strategy for Ads
Describe how you would recommend ads to users.