All About Machine Learning: Machine Learning Interview Questions & Answers

Machine Learning Interview Questions & Answers - 2

Question: What is decision tree pruning, and why do we do it?
Ans: Pruning is a technique in machine learning and search algorithms that reduces the size of a decision tree by removing sections (subtrees) that contribute little to classifying instances. Pruning lowers the complexity of the final classifier and reduces overfitting, which usually improves predictive accuracy on unseen data.
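To make the idea concrete, here is a minimal sketch of one pruning strategy, reduced-error pruning, in plain Python. The answer above does not name a specific method, so the choice of method, the dictionary-based tree layout, and the tiny hand-built tree and validation set are all assumptions made for illustration. A subtree is collapsed into a leaf whenever that does not increase the error on held-out validation rows.

```python
# A node is either a leaf {"label": ...} or an internal node
# {"feature": i, "threshold": t, "left": ..., "right": ..., "majority": ...},
# where "majority" is the most common training label that reached the node.
# This layout is hypothetical and chosen only for the illustration.

def predict(node, x):
    """Walk from the given node down to a leaf and return its label."""
    while "label" not in node:
        go_left = x[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]

def count_leaves(node):
    if "label" in node:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])

def accuracy(node, rows):
    return sum(predict(node, x) == y for x, y in rows) / len(rows)

def prune(node, rows):
    """Reduced-error pruning, bottom-up, using the validation rows that reach node."""
    if "label" in node:
        return node
    f, t = node["feature"], node["threshold"]
    left_rows = [(x, y) for x, y in rows if x[f] <= t]
    right_rows = [(x, y) for x, y in rows if x[f] > t]
    node = dict(node,
                left=prune(node["left"], left_rows),
                right=prune(node["right"], right_rows))
    subtree_errors = sum(predict(node, x) != y for x, y in rows)
    leaf_errors = sum(node["majority"] != y for _, y in rows)
    # Collapse the subtree if a single leaf does at least as well.
    if leaf_errors <= subtree_errors:
        return {"label": node["majority"]}
    return node

# Hand-built tree whose right branch contains an overfitted split on feature 1.
tree = {
    "feature": 0, "threshold": 5, "majority": 0,
    "left": {"label": 0},
    "right": {
        "feature": 1, "threshold": 2, "majority": 1,
        "left": {"label": 1},
        "right": {"label": 0},
    },
}

# Held-out validation rows: (features, label).
validation = [
    ([1, 0], 0), ([2, 3], 0), ([3, 1], 0),
    ([6, 1], 1), ([7, 3], 1), ([8, 4], 1), ([9, 1], 1),
]

pruned = prune(tree, validation)
print(count_leaves(tree), accuracy(tree, validation))
print(count_leaves(pruned), accuracy(pruned, validation))
```

In this toy case the pruned tree has fewer leaves and a higher validation accuracy, which is the effect described above. Libraries typically offer other variants, such as cost-complexity pruning.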

Question: If we have 100 GB of data, how would you build a model on your own machine?
Ans: Refer to this link.

Question: What is Central Limit Theorem?
Ans: The central limit theorem states that if you have a population with mean μ and finite standard deviation σ and take sufficiently large random samples of size n from it (with replacement), then the distribution of the sample means will be approximately normal, with mean μ and standard deviation σ/√n, regardless of the shape of the population distribution.
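A quick simulation can make this tangible. The sketch below is only an illustration under assumed settings: the exponential population (mean 1, standard deviation 1), the sample size of 50, and the 5000 repetitions are arbitrary choices, not part of the theorem. It draws many samples from a strongly skewed population and checks that the sample means cluster around μ with a spread close to σ/√n.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n independent draws from an exponential population (mu = 1, sigma = 1)."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])

n = 50                      # size of each sample
means = [sample_mean(n) for _ in range(5000)]

mean_of_means = statistics.fmean(means)   # should be close to mu = 1
sd_of_means = statistics.stdev(means)     # should be close to sigma / sqrt(n)
expected_sd = 1.0 / n ** 0.5

# For a normal distribution, roughly 68% of values fall within one standard deviation.
within_one = sum(abs(m - 1.0) <= expected_sd for m in means) / len(means)

print(mean_of_means, sd_of_means, expected_sd, within_one)
```

Even though single draws from this population are far from normal, the share of sample means within one σ/√n of μ lands near the roughly 68% expected for a normal distribution.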

Question: How does parallel processing work in XGBoost? (Remember that it is boosting, so each tree depends on the trees built before it.)
Ans: XGBoost does not build multiple trees in parallel, because it needs the predictions after each tree to update the gradients before the next tree can be built. Instead, it parallelizes the work within a single tree, using OpenMP to evaluate candidate splits across features concurrently. To observe this, build a very large dataset and train for a single boosting round (for example, num_boost_round=1 in the Python xgb.train API). You will see all of your cores being used on that one tree. This is a large part of why it is so fast.
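The idea of parallelizing inside one tree can be sketched as follows. This is not XGBoost's code: it is a simplified, hypothetical illustration in which the split search for a single node is farmed out per feature to a thread pool, and the gain formula is a stripped-down version with unit Hessians and no regularization. Python threads will not give a real speedup here because of the interpreter lock, while XGBoost does this kind of work in C++ with OpenMP; the sketch only shows the structure.

```python
from concurrent.futures import ThreadPoolExecutor

def best_split_for_feature(rows, grads, feature):
    """Scan one feature and return (gain, feature, threshold) for its best split."""
    order = sorted(range(len(rows)), key=lambda i: rows[i][feature])
    n = len(rows)
    total = sum(grads)
    best = (0.0, feature, None)
    left_sum = 0.0
    for pos in range(n - 1):
        i, j = order[pos], order[pos + 1]
        left_sum += grads[i]
        if rows[i][feature] == rows[j][feature]:
            continue  # cannot split between identical values
        n_left = pos + 1
        right_sum = total - left_sum
        # Simplified gain: G_L^2/n_L + G_R^2/n_R - G^2/n
        gain = (left_sum ** 2 / n_left
                + right_sum ** 2 / (n - n_left)
                - total ** 2 / n)
        if gain > best[0]:
            threshold = (rows[i][feature] + rows[j][feature]) / 2
            best = (gain, feature, threshold)
    return best

def best_split(rows, grads, workers=4):
    """Evaluate every feature concurrently, then keep the best candidate overall."""
    n_features = len(rows[0])
    with ThreadPoolExecutor(max_workers=workers) as pool:
        candidates = list(pool.map(
            lambda f: best_split_for_feature(rows, grads, f),
            range(n_features)))
    return max(candidates, key=lambda c: c[0])

# Toy data: feature 0 separates negative from positive gradients cleanly.
rows = [[1, 10], [2, 3], [3, 8], [4, 1]]
grads = [-1, -1, 1, 1]

best = best_split(rows, grads)
print(best)
```

Each feature's scan is independent of the others, which is why this step parallelizes well even though the trees themselves must be built one after another.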

Question: Why is LightGBM faster than XGBoost?
Ans: LightGBM (Light Gradient Boosting Machine) is based on decision tree algorithms. It grows trees leaf-wise, always splitting the leaf with the largest loss reduction, whereas many other boosting implementations grow trees depth-wise (level-wise). For the same number of leaves, leaf-wise growth can reduce more loss than level-wise growth, which often gives better accuracy, although it can overfit on small datasets. Its speed, which gives it the name "Light", comes mainly from histogram-based split finding, together with Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB), which reduce the number of rows and features scanned.
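The difference between leaf-wise and level-wise growth can be sketched with a toy example. Everything here is hypothetical: the candidate splits and their gains are made-up numbers stored as nested tuples, and neither function is LightGBM's implementation. Both strategies get the same budget of splits, and the total loss reduction is compared.

```python
import heapq
from itertools import count

# Hypothetical candidate splits as (gain, left_child, right_child).
# None means the child cannot be split any further.
candidates = (
    10.0,
    (8.0, (6.0, None, None), (1.0, None, None)),
    (2.0, (0.5, None, None), (0.4, None, None)),
)

def leaf_wise(root, max_splits):
    """Always split the current leaf with the largest gain (best-first)."""
    tie = count()  # tie-breaker so the heap never compares node tuples
    heap = [(-root[0], next(tie), root)]
    total = 0.0
    for _step in range(max_splits):
        if not heap:
            break
        _neg_gain, _order, (gain, left, right) = heapq.heappop(heap)
        total += gain
        for child in (left, right):
            if child is not None:
                heapq.heappush(heap, (-child[0], next(tie), child))
    return total

def level_wise(root, max_splits):
    """Split every leaf on the current depth before moving one level deeper."""
    level, total, used = [root], 0.0, 0
    while level and used < max_splits:
        next_level = []
        for gain, left, right in level:
            if used == max_splits:
                break
            total += gain
            used += 1
            next_level.extend(c for c in (left, right) if c is not None)
        level = next_level
    return total

leaf_total = leaf_wise(candidates, 3)
level_total = level_wise(candidates, 3)
print(leaf_total, level_total)
```

With three splits allowed, the leaf-wise strategy spends its budget on the high-gain branch, while the level-wise strategy has to use one split on the low-gain sibling, so it ends up with a smaller total loss reduction.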

Question 19: What are the different hyperparameters in all of the above algorithms?

Question 20: How do you find the best hyperparameters?

Question 21: What is the bias-variance tradeoff?

Question 22: What are overfitting, underfitting, and best fit?

Question 23: How do you identify whether a model is fitting well, overfitting, or underfitting?

Question 24: What is the difference between objective and evaluation functions?

Tuesday, 7 January 2020
