Learning Journal Week 4- CST363
As we reach the halfway point of the course, I've had the chance to learn several foundational concepts that build on each other, moving from writing queries to designing the databases themselves.
Five important things I have learned so far:
Advanced SQL Querying Techniques: Beyond basic
SELECTstatements, I learned how to refine query results. This includes usingDISTINCTto eliminate duplicate rowsand the LIKEpredicate with wildcards (%,_) for powerful pattern matching in strings. I also learned how to use built-in functions to manipulate data, like concatenating strings or rounding numbers directly within a query . The Power and Variety of JOINs: I now understand that there are multiple ways to join tables, from the modern
INNER JOINsyntax to the older comma-based syntax. A key takeaway was the practical use of a LEFT OUTER JOINto find information that is "missing", for example, to list all instructors and show a count of zero for those who advise no students, which an inner join would completely exclude. Database Design Using ER Models: I learned that a good database starts with a strong conceptual design. The process involves identifying entities, attributes, and the relationships between them (one-to-many, many-to-many)
. Using a tool like MySQL Workbench to create an Entity-Relationship (ER) diagram helps visualize this structure before writing any CREATE TABLEstatements. The Principle of Normalization: I learned that a "good" database design avoids data redundancy to prevent update, insert, and delete anomalies
. This is achieved through normalization, following rules like Third Normal Form (3NF) and Boyce-Codd Normal Form (BCNF), which are based on the concept of functional dependencies . Using Indexes to Optimize Performance: I learned that indexes, like B+ Trees, are crucial for speeding up data retrieval by avoiding full table scans, especially for range-based queries
. However, this comes at a cost, as indexes degrade the performance of INSERT,UPDATE, andDELETEstatements since the index itself must also be updated. The EXPLAINcommand is a powerful tool to see if the database is actually using an index for a given query.
Three Questions I Have:
The video on normalization mentioned that while normalization is good, the resulting
JOINoperations can sometimes negatively impact performance. How do professional database designers handle this trade-off? Are there specific situations where they intentionally "de-normalize" a database for performance reasons? When it comes to indexes, is it better to start with a minimal set of indexes (e.g., on primary and foreign keys) and add more as performance issues are discovered, or are there design principles that help predict the best columns to index from the very beginning?
The video on self-joins introduced recursive SQL for navigating hierarchical data, like an employee-manager structure
. Besides organizational charts, what are some other common, real-world examples where a recursive query is the most effective solution?
Comments
Post a Comment