Database Engineering & Data Systems Topics
Database design patterns, optimization, scaling strategies, storage technologies, data warehousing, and operational database management. Covers database selection criteria, query optimization, replication strategies, distributed databases, backup and recovery, and performance tuning at the database layer. Distinct from Systems Architecture (which addresses service-level distribution) and Data Science (which addresses analytical approaches).
Set Operations and Complex Aggregations
Understanding the UNION, UNION ALL, EXCEPT, and INTERSECT operations and their performance implications. Also covers complex GROUP BY queries, HAVING clauses, and multi-level aggregations.
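The difference between these set operators is easiest to see on a tiny data set. The following is a minimal sketch using Python's built-in sqlite3 module; the tables `a` and `b` and their contents are made up for illustration, and other engines may differ in details (for example, some use MINUS instead of EXCEPT).

```python
import sqlite3

# Hypothetical tables a and b, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (2), (2), (3);
    INSERT INTO b VALUES (2), (3), (4);
""")

def first_column(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# UNION ALL concatenates both inputs and keeps duplicates.
union_all = first_column("SELECT x FROM a UNION ALL SELECT x FROM b ORDER BY x")
# UNION removes duplicates, which costs an extra sort or hash step.
union = first_column("SELECT x FROM a UNION SELECT x FROM b ORDER BY x")
# INTERSECT keeps only values present in both inputs.
intersect = first_column("SELECT x FROM a INTERSECT SELECT x FROM b ORDER BY x")
# EXCEPT keeps values from the first input that are absent from the second.
except_ = first_column("SELECT x FROM a EXCEPT SELECT x FROM b ORDER BY x")

# HAVING filters groups after aggregation; here it finds duplicated values in a.
duplicates = conn.execute(
    "SELECT x, COUNT(*) FROM a GROUP BY x HAVING COUNT(*) > 1"
).fetchall()

print(union_all, union, intersect, except_, duplicates)
```

When duplicates are impossible or acceptable, UNION ALL is usually the cheaper choice because it skips the deduplication step.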
SQL Scenarios
Advanced SQL query design and optimization scenarios, including complex joins, subqueries, window functions, common table expressions (CTEs), set operations, indexing strategies, explain plans, and performance considerations across relational databases.
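As one concrete scenario, a window function can rank rows within a group while also attaching a per-group total, without collapsing rows the way GROUP BY does. This is a small sketch under stated assumptions: the `sales` table and its values are hypothetical, and the bundled SQLite must be version 3.25 or newer for window function support.

```python
import sqlite3

# Hypothetical sales table, invented for this illustration.
# Assumes the bundled SQLite is 3.25+ (window function support).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('east', 10), ('east', 30), ('west', 20), ('west', 5);
""")

ranked = conn.execute("""
    SELECT region,
           amount,
           RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk,
           SUM(amount) OVER (PARTITION BY region) AS region_total
    FROM sales
    ORDER BY region, rnk
""").fetchall()

for row in ranked:
    print(row)
```

Each output row keeps its own amount alongside its rank and the region total, which is the main thing a plain aggregate query cannot express in one pass.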
Cloud Data Warehouse Design and Optimization
Covers design and optimization of analytical systems and data warehouses on cloud platforms. Topics include schema design patterns for analytics such as star schema and snowflake schema, purposeful denormalization for query performance, column-oriented storage characteristics, distribution and sort key selection, partitioning and clustering strategies, incremental loading patterns, handling slowly changing dimensions, time series data modeling, cost and performance trade-offs in cloud-managed warehouses, and platform-specific features that affect query performance and storage layout. Candidates should be able to discuss end-to-end design considerations for large-scale analytic workloads and trade-offs between latency, cost, and maintainability.
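To make the slowly changing dimension item concrete, the sketch below shows one common approach, usually called Type 2: instead of overwriting an attribute, the current row is closed and a new versioned row is inserted. Everything here is an assumption for illustration (the `dim_customer` table, its columns, the `apply_change` helper, and the use of SQLite via Python's sqlite3 module); a cloud warehouse would typically do this with a MERGE statement or a batch load.

```python
import sqlite3

# Hypothetical Type 2 dimension table; names and columns are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dim_customer (
        customer_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        customer_id  INTEGER,                            -- natural/business key
        city         TEXT,
        valid_from   TEXT,
        valid_to     TEXT,                               -- NULL while current
        is_current   INTEGER
    )
""")

def apply_change(customer_id, city, as_of):
    """Record a new city for a customer, preserving history (SCD Type 2)."""
    current = conn.execute(
        "SELECT city FROM dim_customer WHERE customer_id = ? AND is_current = 1",
        (customer_id,),
    ).fetchone()
    if current is not None and current[0] == city:
        return  # attribute unchanged, so no new version is needed
    # Close the existing current row, if any.
    conn.execute(
        "UPDATE dim_customer SET valid_to = ?, is_current = 0 "
        "WHERE customer_id = ? AND is_current = 1",
        (as_of, customer_id),
    )
    # Insert the new current version.
    conn.execute(
        "INSERT INTO dim_customer (customer_id, city, valid_from, valid_to, is_current) "
        "VALUES (?, ?, ?, NULL, 1)",
        (customer_id, city, as_of),
    )

apply_change(7, "Austin", "2024-01-01")
apply_change(7, "Austin", "2024-02-01")  # no-op: nothing changed
apply_change(7, "Denver", "2024-03-01")

history = conn.execute(
    "SELECT city, valid_from, valid_to, is_current FROM dim_customer "
    "WHERE customer_id = 7 ORDER BY valid_from"
).fetchall()
print(history)
```

The trade-off is that history is queryable at any point in time, at the cost of a larger dimension table and joins that must filter on validity dates or the current flag.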
Large Scale Distributed Database Systems
Designing database systems handling petabyte-scale data, designing for global distribution across data centers, handling eventual consistency, managing data sovereignty and compliance requirements, and architectural decisions in complex multi-region setups.
Indexing Strategy and Selection
Covers index design principles and practical selection of indexes to accelerate queries while managing storage and write cost. Topics include index types such as B-tree, hash, bitmap, full-text, and functional indexes; single-column, composite, and covering indexes; clustered versus nonclustered index architectures; and partial or filtered indexes. Candidates should reason about index selectivity and cardinality and how statistics and histograms influence optimizer choices. Also assesses index maintenance overhead, fragmentation, and rebuild strategies, and the trade-off between faster reads and slower inserts, updates, and deletes. Practical skills include reading execution plans to identify missing or inefficient indexes, proposing index consolidation or covering index designs, testing and benchmarking index changes, and understanding interactions between indexing, partitioning, and denormalization.
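Reading a plan before and after adding an index can be sketched with SQLite's EXPLAIN QUERY PLAN, driven here from Python's sqlite3 module. The `orders` table, the index name, and the query are hypothetical, and the exact plan wording varies between SQLite versions and differs entirely on other engines, so the code only checks for stable keywords.

```python
import sqlite3

# Hypothetical orders table and index name, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "status TEXT, total REAL)"
)

def plan(sql):
    """Return the detail text of SQLite's query plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text

query = "SELECT status FROM orders WHERE customer_id = 42"

# Without a suitable index the planner has to scan the whole table.
plan_before = plan(query)

# A composite index on (customer_id, status) contains every column the query
# touches, so SQLite can answer it from the index alone (a covering index).
conn.execute("CREATE INDEX idx_orders_customer_status ON orders (customer_id, status)")
plan_after = plan(query)

print("before:", plan_before)
print("after: ", plan_after)
```

The same index would speed up reads for this query but adds work to every insert, update, and delete on `orders`, which is the read/write trade-off described above.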
Structured Query Language Join Operations
Comprehensive coverage of Structured Query Language join types and multi-table query patterns used to combine relational data and answer business questions. Topics include inner join, left join, right join, full outer join, cross join, self join, and anti-join patterns implemented with NOT EXISTS and NOT IN. Candidates should understand equi joins versus non-equi joins, joining on expressions and composite keys, and how join choice affects row counts and null semantics. Practical skills include translating business requirements into correct join logic, chaining joins across two or more tables, constructing multi-table aggregations, handling one-to-many relationships and duplicate rows, deduplication strategies, and managing orphan records and referential integrity issues. Additional areas covered are join conditions versus WHERE clause filtering, aliasing for readability, using functions such as COALESCE to manage null values, avoiding unintended Cartesian products, and basic performance considerations including join order, appropriate indexing, and interpreting query execution plans to diagnose slow joins. Interviewers may probe result correctness, edge cases such as null and composite key behavior, and the candidate's ability to validate outputs against expected business logic.
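The null semantics of anti-joins are a frequent source of wrong results, so a small example may help. The sketch below, using Python's sqlite3 module with hypothetical `customers` and `orders` tables, finds customers with no orders in three ways and shows how a single NULL in the subquery makes NOT IN return nothing.

```python
import sqlite3

# Hypothetical customers and orders tables, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- Order 11 is an orphan with a NULL customer_id.
    INSERT INTO orders VALUES (10, 1), (11, NULL);
""")

def names(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# NOT EXISTS: a NULL customer_id simply never matches, so the result is correct.
via_not_exists = names("""
    SELECT c.name FROM customers c
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
    ORDER BY c.name
""")

# LEFT JOIN ... IS NULL: the same anti-join expressed as an outer join.
via_left_join = names("""
    SELECT c.name FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    WHERE o.id IS NULL
    ORDER BY c.name
""")

# NOT IN: the NULL in the subquery makes the predicate evaluate to NULL
# (unknown) for every non-matching customer, so no rows are returned.
via_not_in = names("""
    SELECT name FROM customers
    WHERE id NOT IN (SELECT customer_id FROM orders)
    ORDER BY name
""")

print(via_not_exists, via_left_join, via_not_in)
```

Adding `WHERE customer_id IS NOT NULL` to the NOT IN subquery restores the expected result, but NOT EXISTS avoids the problem without that extra condition.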
Data Modeling for DoorDash Domain
Data modeling concepts tailored to the DoorDash domain, including conceptual and logical modeling, entity-relationship and dimensional modeling, schema design for transactional OLTP systems and analytical workloads, domain-driven design considerations for orders, restaurants, menus, drivers, deliveries, payments, and logs, data access patterns, and governance and schema evolution for a high-traffic on-demand delivery platform.
CTEs & Subqueries
Common Table Expressions (CTEs) and subqueries in SQL, including syntax, recursive CTEs, usage patterns, performance implications, and techniques for writing clear, efficient queries. Covers when to use CTEs versus subqueries, refactoring patterns, and potential pitfalls.
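A recursive CTE has two parts: an anchor query that seeds the result and a recursive member that repeatedly joins back to the rows produced so far. The following is a minimal sketch on a hypothetical `employees` table, run through Python's sqlite3 module; it walks a manager hierarchy and records each person's depth.

```python
import sqlite3

# Hypothetical employees table, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'Root', NULL),
        (2, 'Ana', 1),
        (3, 'Ben', 1),
        (4, 'Cal', 2);
""")

org_chart = conn.execute("""
    WITH RECURSIVE chain(id, name, depth) AS (
        -- Anchor member: start from employees with no manager.
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        -- Recursive member: add direct reports of rows already in chain.
        SELECT e.id, e.name, c.depth + 1
        FROM employees e
        JOIN chain c ON e.manager_id = c.id
    )
    SELECT name, depth FROM chain ORDER BY depth, name
""").fetchall()

print(org_chart)
```

A common pitfall is cyclic data (an employee who is indirectly their own manager), which makes the recursion run without end unless the query caps the depth or tracks visited rows.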
Common Table Expressions and Subqueries
Covers writing and structuring complex SQL queries using Common Table Expressions and subqueries, including when to prefer one approach over another for readability, maintainability, and performance. Candidates should be able to author WITH clauses to break multi-step logic into clear stages, implement recursive CTEs for hierarchical data, and use subqueries in SELECT, FROM, and WHERE clauses. This topic also includes understanding correlated versus non-correlated subqueries, how subqueries interact with joins and window functions, and practical guidance on choosing CTEs, subqueries, or joins based on clarity and execution characteristics. Interviewers may probe syntax, typical pitfalls, refactoring nested queries into CTEs, testing and validating each step of a CTE pipeline, and trade-offs that affect execution plans and index usage.
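Refactoring a correlated subquery into a CTE plus a join can be illustrated with a small sketch. The `orders` table and its values are hypothetical, and the queries run on SQLite through Python's sqlite3 module; both forms return orders whose amount exceeds that customer's average, and comparing their results is one simple way to validate the refactor.

```python
import sqlite3

# Hypothetical orders table, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, 'a', 10), (2, 'a', 30),
        (3, 'b', 5),  (4, 'b', 5), (5, 'b', 20);
""")

# Correlated subquery: the inner query refers to the outer row (o.customer),
# so it is logically re-evaluated for each outer row.
correlated = [row[0] for row in conn.execute("""
    SELECT o.id FROM orders o
    WHERE o.amount > (SELECT AVG(i.amount) FROM orders i
                      WHERE i.customer = o.customer)
    ORDER BY o.id
""")]

# CTE plus join: compute every customer's average once, then join to it.
with_cte = [row[0] for row in conn.execute("""
    WITH avg_by_customer AS (
        SELECT customer, AVG(amount) AS avg_amount
        FROM orders
        GROUP BY customer
    )
    SELECT o.id FROM orders o
    JOIN avg_by_customer a ON a.customer = o.customer
    WHERE o.amount > a.avg_amount
    ORDER BY o.id
""")]

print(correlated, with_cte)
```

Which form performs better depends on the engine and its optimizer, so the execution plan, not the syntax, should settle the choice; the CTE version has the practical advantage that `avg_by_customer` can be selected on its own to check the intermediate step.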