CSUMB CS Online Journey: May 2024

Tuesday, May 28, 2024

CST363 - Week 5

CST 363 Introduction to Database

If indexes are supposed to speed up performance of query, what does the author mean by a slow index?

In the article Slow Indexes, Part I, author Markus Winand points out the effectiveness of rebuilding the index to increase performance. He explains that the use of a self-balanced B-tree prevents the deep growth of a large index. Most of what a rebuild could do is reduce the leaf nodes. Reducing 20% to 30% of leaf nodes does not reduce the depth of an index, which in return helps with only a 0%—2% reduction in a case like INDEX UNIQUE SCAN.

The index depth of millions of records in a B-tree is only four or five. Hence, a B-tree traversal is very efficient in finding a leaf node. According to the author, tree traverses are believed to be the culprit of slow indexes, but in reality, the leaf node chain leads to slow queries.

The speed of an index is dependent on the scan operation. To illustrate this, Winand explains the following three different combinations of the tree traversal only and tree traversal with leaf node chain:

INDEX UNIQUE SCAN: the lookup operation is limited to finding the unique value. The lookup stops when the unique value is found, making this index the fastest.
INDEX RANGE SCAN: the lookup is somewhat ambiguous as performance depends on the count of the matching entries. The smaller the index to read, the faster the performance.
TABLE ACCESS BY INDEX ROWID: The additional steps to access the table rows to find a match come at a high cost and are potentially slow if the number of matched entries is high.

The point is that the index's performance depends on how the database uses the index.

Sunday, May 19, 2024

CST363 - Week 4

CST 363 Introduction to Database

1- Five things I have learned in the course so far.
I have been learning about database systems for the past four weeks, focusing on relational model design and rules. The course highlights the advantages of using database systems compared to saving and retrieving data from file systems. Moreover, I have been practicing commands for Structured Query Language, or SQL, to create statements for inserting, deleting, retrieving, and updating data from the database. I also learned about the data structure and how relational data is organized in tables, columns, and rows. This week, I am studying the three phases of database design: analysis, logical design, and physical design. The Java labs are also helpful in understanding the Schema creation and how to represent data and improve performance by creating data indexes.

2- Three questions I still have about databases.
How is data integrity thoroughly tested in large databases?
How can data retrieval performance be improved? This seems to be a very common challenge with large databases in virtual environments.
In what ways is the data modeled in the NoSQL database, and how does it work with NoSQL management systems?

Tuesday, May 14, 2024

CST363 - Week 3

CST 363 Introduction to Database

The third normal form of data ensures data consistency across all tables. First, the data must be in a second normal form. The columns in this format must not have a transitive dependency among non-key attributes. For example, if we have three columns in one table called A, B, and C. If column A is a primary key and column B depends on column A , but column C depends on column B and not column A, this form is a violation of the third normal form. The solution is to create another table for the direct dependent attributes. In other words, "a non-key column depends on the key, the whole key, and nothing but the key, so help me Codd."

An SQL view is the restructure of table columns without changing the design. The view query is created by a statement called ‘Create View’ followed by a user-defined name of the view. The outcome of the query is another table that is different from the base table. SQL view is used in combination with SQL queries for different purposes. The advantages of SQL view are producing a table that masks sensitive data from views, generating faster results with optimal queries, and simplifying the writing of complex queries.

Unlike data tables, SQL view is usually not stored like a base table. Instead, the view query is executed with the statement.

Tuesday, May 7, 2024

CST363 - Week 2

CST 363 Introduction to Database

1- SQL has the flexibility to join tables on any column(s) using any predicate (=, >, < ). Most of the time the join will use equality between a primary and foreign key.

One example of SQL-JOIN other than keys are the non-equijoin, cross-join, and self-join. In many cases these types of joins are helpful in tracking objects quantities, benchmarking, threshold control, or obtaining statistical data.

Take a look at this scenario. Imagine a retail store management wants to understand the effectiveness of a marketing campaign that offers discounts during specific months. The SQL join returns any month where the sales profit is larger than regular full price profit (Finance.Profit)

SELECT Month, Discount, Finance.Profit, Sales.SalesProfit

FROM Finance

Left Join Sales

ON SalesProfit > Finance.Profit;

2-In my opinion, SQL is easy to learn and use. The simplicity of its syntax and language flexibility provide a powerful avenue for retrieving meaningful data. One aspect that I sometimes find challenging to translate from English to SQL is the “most of” or “least of” in a multi-JOIN query.

CSUMB CS Online Journey

Tuesday, May 28, 2024

CST363 - Week 5

Sunday, May 19, 2024

CST363 - Week 4

Tuesday, May 14, 2024

CST363 - Week 3

Tuesday, May 7, 2024

CST363 - Week 2

CST462S - Final Week

Search This Blog