(1) Machine Learning
Sample questions:
- Explain the working of a Random Forest Machine Learning Algorithm.
- Describe K-Means Clustering.
- How do you parallelize machine learning algorithms?
- How is logistic regression done?
- How do you build a random forest model?
- How can you avoid overfitting your model?
- How do you find RMSE and MSE in a linear regression model?
- After studying the behavior of a population, you have identified four specific individual types that are valuable to your study. You would like to find all users who are most similar to each individual type. Which algorithm is most appropriate for this study?
- What is the goal of A/B Testing?
- Which is your favorite machine learning algorithm and why?
- Have you ever created an original algorithm? How did you go about doing that and for what purpose?
(2) Statistics
Sample questions:
- What is the law of large numbers?
- What are confounding variables?
- What is selection bias?
- What are the types of biases that can occur during sampling?
- What is survivorship bias?
- What is the difference between a point estimate and a confidence interval?
- How can outliers be treated?
(3) SQL
Sample questions:
- Write a basic SQL query that lists all orders with customer information.
- You are given a dataset on cancer detection. You have built a classification model and achieved an accuracy of 96 percent. Why shouldn't you be happy with your model performance? What can you do about it?
- We want to predict the probability of death from heart disease based on three risk factors: age, gender, and blood cholesterol level. What is the most appropriate algorithm for this case?
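Reference Answers and Explanations for Common Interview Questions
We have selected a few classic Intel interview questions to work through. The answers are for reference only.
Question: How do you build a random forest model?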
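A random forest is made up of a number of decision trees. If you split the data into different subsets (packages) and build a decision tree on each subset, the random forest brings all of those trees together.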
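Steps to build a random forest model:
1. Randomly select k features from the m total features, where k < m.
2. Among the k selected features, compute the node D using the best split point.
3. Split the node into child nodes, again using the best split point.
4. Repeat steps 2 and 3 until the leaf nodes are finalized.
5. Repeat steps 1 through 4 n times to create n random trees, which together form the random forest.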
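The five steps can be sketched in plain Python as follows. This is a minimal illustration, not a production implementation, and it rests on several assumptions that the answer above does not specify: Gini impurity is used to score the "best split point", each tree is trained on a bootstrap sample of the rows, trees are limited to a hypothetical `max_depth`, and the forest predicts by majority vote. All function names and the toy data are made up for the example. Note also that widely used libraries typically redraw the k features at every split rather than once per tree.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Step 2: return the (feature, threshold) with the lowest weighted Gini,
    or None if no split improves on the unsplit node."""
    best, best_score = None, gini(labels)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, features, depth=0, max_depth=5):
    """Steps 2 to 4: split recursively until a node becomes a leaf."""
    split = best_split(rows, labels, features) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in left], [y for _, y in left], features, depth + 1, max_depth),
        build_tree([r for r, _ in right], [y for _, y in right], features, depth + 1, max_depth),
    )

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are 4-tuples."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def build_forest(rows, labels, n_trees=25, k=1, seed=0):
    """Step 5: repeat steps 1 to 4 n_trees times."""
    rng = random.Random(seed)
    m = len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample (assumption)
        features = rng.sample(range(m), k)              # step 1: k of the m features
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], features))
    return forest

def predict_forest(forest, row):
    """Combine the trees by majority vote (assumption)."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]

# Toy data: two well-separated classes, two numeric features each.
rows = [(1.0, 1.0), (1.5, 0.8), (2.0, 1.2), (8.0, 9.0), (8.5, 9.5), (9.0, 8.8)]
labels = [0, 0, 0, 1, 1, 1]

forest = build_forest(rows, labels, n_trees=25, k=1, seed=0)
print(predict_forest(forest, (0.5, 0.5)))
print(predict_forest(forest, (10.0, 10.0)))
```

In an interview it is usually enough to describe the steps and then mention that in practice you would reach for a library implementation rather than hand-rolling the trees.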