1. Where is the mask applied in a transformer model?
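One way to make this question concrete is a minimal sketch of scaled dot-product attention, where the mask is applied to the attention scores (the scaled query-key dot products) before the softmax, so that masked positions receive zero weight. The function names `softmax` and `causal_attention_weights` are hypothetical, and the sketch assumes a causal (look-ahead) mask in a single head with no batching; padding masks are applied at the same point.

```python
import math

def softmax(row):
    # Numerically stable softmax; exp(-inf) evaluates to 0.0.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention_weights(q, k):
    """q, k: lists of seq_len vectors of dimension d.
    Returns a seq_len x seq_len matrix of attention weights."""
    d = len(q[0])
    weights = []
    for i in range(len(q)):
        scores = []
        for j in range(len(k)):
            s = sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            # The mask is applied here, to the score, before the softmax:
            # position i may not attend to any later position j > i.
            if j > i:
                s = float("-inf")
            scores.append(s)
        weights.append(softmax(scores))
    return weights

q = k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w = causal_attention_weights(q, k)
print(w[0])  # the first token can only attend to itself: [1.0, 0.0, 0.0]
```

Setting the masked scores to negative infinity, rather than zeroing the weights after the softmax, keeps each row of weights summing to 1.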
2. In reinforcement learning, define on-policy and off-policy. Compare TRPO and PPO, discussing their differences and advantages.
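As a starting point for the comparison: TRPO constrains each policy update with a KL-divergence trust region, while PPO in its common clipped variant replaces that constraint with a clipped surrogate objective that can be optimized with ordinary first-order methods. The sketch below shows the per-sample clipped term only; the function name `ppo_clip_objective` and the default `eps=0.2` are illustrative assumptions, not a full training loop.

```python
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Per-sample PPO clipped surrogate.

    ratio: pi_new(a|s) / pi_old(a|s), the probability ratio
    advantage: estimated advantage of the sampled action
    eps: clipping range (assumed value, tuned in practice)
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    # Taking the minimum gives a pessimistic bound, so the objective gives
    # no extra reward for pushing the ratio beyond [1 - eps, 1 + eps].
    return min(ratio * advantage, clipped_ratio * advantage)

# Positive advantage: the gain is capped once the ratio exceeds 1 + eps.
gain_capped = ppo_clip_objective(1.5, 1.0)
# Negative advantage with a large ratio: the penalty is not clipped away.
penalty_kept = ppo_clip_objective(1.5, -1.0)
print(gain_capped, penalty_kept)
```

The objective is maximized on average over samples collected with the old policy, which is why both methods are usually classed as on-policy.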
3. Debugging an End-to-End ML Pipeline
Discuss how you would debug variance issues in an end-to-end machine learning pipeline that spans model training, inference, and evaluation. Explain how distributed training is implemented, and identify which components of the training process consume the most GPU memory (input data, weights, gradients, optimizer state, etc.). Finally, design a testing approach to prove that an autonomous vehicle drives better than an average human.
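For the GPU memory part, a rough back-of-the-envelope estimate can anchor the discussion. The sketch below assumes plain fp32 training with an Adam-style optimizer that keeps two extra values per parameter; the function name `estimate_training_memory` and its defaults are hypothetical. It deliberately excludes input batches and activations, which depend on batch size, sequence length, and architecture and can dominate in practice.

```python
def estimate_training_memory(num_params, bytes_per_value=4, optimizer_slots=2):
    """Rough per-replica memory in bytes for weights, gradients, and
    optimizer state. Activations and input data are NOT included.

    bytes_per_value: 4 assumes fp32 everywhere (an assumption)
    optimizer_slots: 2 assumes Adam-style first and second moments
    """
    weights = num_params * bytes_per_value
    gradients = num_params * bytes_per_value
    optimizer_state = num_params * bytes_per_value * optimizer_slots
    return {
        "weights": weights,
        "gradients": gradients,
        "optimizer_state": optimizer_state,
        "total": weights + gradients + optimizer_state,
    }

estimate = estimate_training_memory(1_000_000_000)
for name, size in estimate.items():
    print(f"{name}: {size / 1e9:.1f} GB")
```

Under these assumptions the optimizer state is as large as the weights and gradients combined, which is one reason sharding optimizer state across devices is a common distributed-training technique.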