Data Scientist Interview Questions - Infosys

By Siva

➡️Gen AI Questions:

1. How do you productionize an LLM? What aspects will you monitor and maintain after deploying it?
2. What is the ‘all-MiniLM-L6-v2’ model? What are the differences between all-MiniLM-L6-v2 and all-MiniLM-L6-v1 models?
3. What are the parameter differences between models like GPT4All, Llama-2, and Llama-3?
4. What is the core idea behind small language models?
5. GPT-4o is sometimes described as a smaller, more compact version of GPT-4. What changes were made in GPT-4o to make it compact?
6. What is an effective strategy to ingest and process data for LLMs?
(Discuss different chunking strategies and their pros/cons.)
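
For the chunking question, it can help to have one concrete baseline in mind. The sketch below shows fixed-size chunking with overlap, splitting on whitespace. The function name `chunk_words` and the default sizes are illustrative assumptions, not a recommendation from the interview; real pipelines often chunk by tokens, sentences, or document structure instead.

```python
def chunk_words(text, chunk_size=200, overlap=50):
    """Split text into word-based chunks of up to chunk_size words.

    Consecutive chunks share `overlap` words so that context spanning a
    boundary is not lost. Names and defaults here are hypothetical.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text,
        # otherwise the last chunk would be repeated as a shorter tail.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
demo_chunks = chunk_words(sample, chunk_size=4, overlap=1)
```

Fixed-size chunks are simple and predictable in size, but they can cut sentences in half; the overlap trades extra storage for better recall at chunk boundaries.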

➡️Python-Related Questions:

1. What is multithreading in Python?
2. How do you load heavy files in Python? How can you optimize the loading time?
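
One common answer to the heavy-file question is to stream the file in batches instead of reading it into memory at once. The sketch below assumes a CSV file with a header row and uses only the standard library; the helper name `iter_row_batches` and the batch size are hypothetical.

```python
import csv
import io


def iter_row_batches(file_obj, batch_size=1000):
    """Yield lists of up to batch_size rows (as dicts) from an open CSV file.

    Only one batch is held in memory at a time, so peak memory depends on
    batch_size rather than on the size of the file.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    reader = csv.DictReader(file_obj)
    batch = []
    for row in reader:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


# In-memory stand-in for a large file; with a real file this would be
# `with open(path, newline="") as f: ...`
fake_file = io.StringIO("id,value\n1,a\n2,b\n3,c\n")
batches = list(iter_row_batches(fake_file, batch_size=2))
```

Other options worth mentioning in an answer include columnar formats, reading only the needed columns, and specifying data types up front; which one helps most depends on the file format and the access pattern.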

➡️Machine Learning Questions:

1. In which scenarios would you use logistic regression over XGBoost?
2. Why is XGBoost such a popular ML model? Can you describe its features (e.g., construction of trees, outlier treatment, missing value handling in XGBoost)?
3. What is tree pruning in XGBoost?

➡️Statistics Questions:

1. What is ANOVA?
2. What are the assumptions of ANOVA?
3. How will you test the individual assumptions, and what will you do if any of the assumptions fail? Would you still proceed with ANOVA in this case?
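
For the first ANOVA question, it can be useful to show what the test actually computes. The sketch below calculates the one-way ANOVA F-statistic by hand, as the ratio of between-group to within-group mean squares. The function name `one_way_anova_f` and the sample data are made up for illustration, and the code does not check any of the assumptions the later questions ask about.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F-statistic for a list of numeric groups."""
    k = len(groups)                    # number of groups
    n = sum(len(g) for g in groups)    # total number of observations
    if k < 2 or n <= k:
        raise ValueError("need at least 2 groups and more observations than groups")

    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    if ss_within == 0:
        raise ValueError("within-group variance is zero; F is undefined")

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

A large F suggests the group means differ by more than within-group noise would explain; turning F into a p-value requires the F distribution with (k - 1, n - k) degrees of freedom, which is where a statistics library would normally come in.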
