Data Engineer Interview Questions: Google

By Siva

1/ What is the difference between HDFS and traditional file systems?

2/ Explain the concept of MapReduce and how it works.

3/ What is YARN in Hadoop, and how does it improve resource management?

4/ How does Hadoop ensure fault tolerance with HDFS?

5/ What are the differences between Hive and HBase? When would you use each?

6/ How would you handle small files in Hadoop?

7/ Explain the role of block size in HDFS and its impact on performance.

8/ How do you optimize a slow MapReduce job?

9/ What is the difference between DirectQuery and Import mode in Power BI? When would you use each?

10/ Explain DAX. What are some commonly used DAX functions?

11/ How do you optimize Power BI reports for large datasets?

12/ What is Row-Level Security (RLS), and how do you implement it in Power BI?

13/ How do you create relationships between tables in Power BI?

14/ What are the steps to refresh and schedule data updates in Power BI?

15/ How would you troubleshoot performance issues in a Power BI report?

16/ What is the Power Query Editor, and how do you use it to transform data?

17/ What are the key steps of the ETL process, and why is each step important?

18/ How do you optimize ETL pipelines for performance and scalability?

19/ Explain incremental data loading vs. full data load. When would you choose each?

20/ How do you handle data quality issues during ETL?

21/ What tools have you used for ETL, and why did you choose them?

22/ How do you handle schema changes in source systems?

23/ What is data partitioning, and how does it improve ETL performance?

24/ Describe how you would debug and resolve a failed ETL job.

25/ How would you design a scalable data pipeline to process 1 TB of data daily using AWS tools?
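
As a starting point for question 19, the contrast between a full load and an incremental load can be sketched in a few lines of Python. This is a minimal illustration, not a production pattern: the in-memory lists and dicts stand in for real source and target tables, and the `id` and `updated_at` fields and the function names are assumptions made up for the example.

```python
from datetime import datetime


def full_load(source_rows):
    """Rebuild the target from scratch: every source row is copied."""
    return {row["id"]: row for row in source_rows}


def incremental_load(target, source_rows, watermark):
    """Upsert only rows changed after the watermark; return the new watermark."""
    new_watermark = watermark
    for row in source_rows:
        if row["updated_at"] > watermark:
            target[row["id"]] = row  # insert a new key or overwrite an old row
            new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark


# Hypothetical source table with a last-modified timestamp per row
source = [
    {"id": 1, "value": "a", "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "value": "b", "updated_at": datetime(2024, 1, 2)},
]

# Initial run: full load, then remember the highest timestamp seen
target = full_load(source)
watermark = max(row["updated_at"] for row in source)

# Later, row 1 is modified and row 3 is added in the source
source[0] = {"id": 1, "value": "a2", "updated_at": datetime(2024, 1, 3)}
source.append({"id": 3, "value": "c", "updated_at": datetime(2024, 1, 4)})

# Subsequent run: only the two changed rows are written to the target
watermark = incremental_load(target, source, watermark)
```

A timestamp watermark like this one does not detect rows deleted from the source, which is one reason a periodic full load (or a change data capture feed) is often kept alongside incremental runs.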
