Index-Organized Tables Wrap-Up-Database Tables-2

I frequently look at one stock at a time for some range of days (e.g., computing a moving average). If I were to use a heap-organized table, the probability of two rows for the stock ticker ORCL existing on the same database block is almost zero. This is because every night, I insert the records for the day for all of the stocks. This fills up at least one database block (actually, many of them). Therefore, every day I add a new ORCL record, but it is on a block different from every other ORCL record already in the table. If I query as follows:

Oracle would read the index and then perform table access by rowid to get the rest of the row data. Each of the 100 rows I retrieve would be on a different database block due to the way I load the table—each would probably be a physical I/O. Now consider that

I have this same data in an IOT. That same query only needs to read the relevant index blocks, and it already has all of the data. Not only is the table access removed, but all of the rows for ORCL in a given range of dates are physically stored near each other as well. Less logical I/O and less physical I/O are incurred.

Now you understand when you might want to use IOTs and how to use them. What you need to understand next is what the options are with these tables. What are the caveats? The options are very similar to the options for a heap-organized table. Once again, we’ll use DBMS_METADATA to show us the details. Let’s start with the three basic variations of the IOT:

I generally find people’s understanding of what a cluster is in Oracle to be inaccurate. Many people tend to confuse a cluster with a SQL Server or Sybase “clustered index.” They are not the same. A cluster is a way to store a group of tables that share some common column(s) in the same database blocks and to store related data together on the same block. A clustered index in SQL Server forces the rows to be stored in sorted order according to the index key, similar to an IOT as just described. With a cluster, a single block of data may contain data from many tables. Conceptually, you are storing the data “prejoined.” It can also be used with single tables where you are storing data together grouped by some column. For example, all of the employees in department 10 will be stored on the same block (or as few blocks as possible, if they all don’t fit). It is not storing the data sorted—that is the role of the IOT. It is storing the data clustered by some key, but in a heap. So, department 100 might be right next to department 1 and very far away (physically on disk) from departments 101 and 99.

Graphically, you might think of it as shown in Figure 10-8. On the left side of the image, we are using conventional tables. EMP will be stored in its segment. DEPT will be stored on its own. They may be in different files and different tablespaces, and they are definitely in separate extents. On the right side of the image, we see what would happen if we clustered these two tables together. The square boxes represent database blocks. We now have the value 10 factored out and stored once. Then, all of the data from all of the tables in the cluster for department 10 is stored in that block. If all of the data for department 10 does not fit on the block, then additional blocks will be chained to the original block to contain the overflow, in the same fashion as the overflow blocks for an IOT

This makes sense since there will be many tables in the cluster, and they will be on the same block. Having different PCTFREEs would not make sense. Therefore, a CREATE CLUSTER statement looks a lot like a CREATE TABLE statement with a small number of columns (just the cluster key columns):

Here, we have created an index cluster (the other type being a hash cluster, which we’ll look at in a coming section “Hash Clustered Tables”). The clustering column for this cluster will be the DEPTNO column.

The columns in the tables do not have to be called DEPTNO, but they must be NUMBER(2) to match this definition. We have, on the cluster definition, a SIZE 1024 option. This is used to tell Oracle that we expect about 1024 bytes of data to be associated with each cluster key value. Oracle will use that to compute the maximum number of cluster keys that could fit per block.

Given that we have an 8KB block size, Oracle will fit up to seven cluster keys (but maybe less if the data is larger than expected) per database block. For example, the data for departments 10, 20, 30, 40, 50, 60, and 70 would tend to go onto one block, and as soon as we insert department 80, a new block will be used.

This does not mean that the data is stored in a sorted manner; it just means that if we inserted the departments in that order, they would naturally tend to be put together. If we inserted the departments in the order 10, 80, 20, 30, 40, 50, 60, and then 70, the final department (70) would tend to be on the newly added block. As we’ll see next, both the size of the data and the order in which the data is inserted will affect the number of keys we can store per block.

The SIZE parameter therefore controls the maximum number of cluster keys per block. It is the single largest influence on the space utilization of our cluster. Set the size too high, and we’ll get very few keys per block and we’ll use more space than we need. Set the size too low, and we’ll get excessive chaining of data, which offsets the purpose of the cluster to store all of the data together on a single block. It is the most important parameter for a cluster.

Next, we need to index the cluster before we can put data in it. We could create tables in the cluster right now, but we’re going to create and populate the tables simultaneously, and we need a cluster index before we can have any data. The cluster index’s job is to take a cluster key value and return the block address of the block that contains that key. It is a primary key, in effect, where each cluster key value points to a single block in the cluster itself. So, when we ask for the data in department 10, Oracle will read the cluster key, determine the block address for that, and then read the data. The cluster key index is created as follows:

It can have all of the normal storage parameters of an index and can be stored in another tablespace. It is just a regular index, so it can be on multiple columns; it just happens to index into a cluster and can also include an entry for a completely null value (see Chapter 11 for the reason why this is interesting). Note that we do not specify a list of columns in this CREATE INDEX statement—that is derived from the CLUSTER definition itself. Now we are ready to create our tables in the cluster:

Here, the only difference from a normal table is that we used the CLUSTER keyword and told Oracle which column of the base table will map to the cluster key in the cluster itself. Remember, the cluster is the segment here; therefore, this table will never have segment attributes such as TABLESPACE, PCTFREE, and so on—they are attributes of the cluster segment, not the table we just created. We can now load them up with the initial set of data:

Note I used a SQL trick to generate data in this example. I wanted more than seven departments to demonstrate that Oracle will limit the number of department keys per block based on my SIZE parameter. Therefore, I needed more than the four department rows found in SCOTT.DEPT. I generated nine rows using the “connect by level” trick against DUAL and performed a Cartesian join of those nine rows with the four in DEPT resulting in 36 unique rows. I did a similar trick with EMP to fabricate data for these departments.

So, even though we loaded DEPT first—and the DEPT rows are very small (hundreds of them could fit on an 8k block normally)—we find that the maximum number of DEPT rows on a block in this table is only seven. That fits in with what we anticipated when we set the SIZE to 1024. We estimated that with an 8k block and 1024 bytes of data per cluster key for the combined EMP and DEPT records, we would see approximately seven unique cluster key values per block, and that is exactly what we are seeing here.

Next, let’s look at the EMP and DEPT tables together. We’ll look at the rowids of each and compare the block numbers after joining by DEPTNO. If the block numbers are the same, we’ll know that the EMP row and the DEPT row are stored on the same physical database block together; if they differ, we’ll know they are not. In this case, we observe that all of our data is perfectly stored. There are no cases where a record for the EMP table is stored on a block separate from its corresponding DEPT record:

That was exactly our goal—to get every row in the EMP table stored on the same block as the corresponding DEPT row. But what would have happened if we estimated incorrectly, what if 1024 was insufficient? What if some of our departments were close to 1024 and others exceeded that value?

Then, obviously, the data could not fit on the same block and we’d have to place some of the EMP records on a block separate from the DEPT record. We can see this easily by resetting our prior example (I’m starting with the tables as they were before the load, right after creating them). When I load this time, we’ll load every EMP record eight times, to multiply the number of employee records per department:

We can see there are 46 out of 882 EMP rows on a block separate and distinct from the block their corresponding DEPTNO is on in the DEPT table. Given that we undersized the cluster (the SIZE parameter was too small given our real-life data), we could re-create it with a cluster SIZE of 1200, and then we would discover the following:

We only stored six DEPTNO values per block now, leaving sufficient room for all of the EMP data to be stored on the same block with their corresponding DEPT records.

Here is a bit of puzzle to amaze and astound your friends with. Many people mistakenly believe a rowid uniquely identifies a row in a database, and that given a rowid you can tell what table the row came from. In fact, you cannot. You can and will get duplicate rowids from a cluster. For example, after executing the preceding code, you should find

Every rowid assigned to the rows in DEPT has been assigned to the rows in EMP as well. That is because it takes a table and row ID to uniquely identify a row. The rowid pseudo-column is unique only within a table.

I also find that many people believe the cluster object to be an esoteric object that no one really uses—everyone just uses normal tables. In fact, you use clusters every time you use Oracle. Much of the data dictionary is stored in various clusters, for example, running the following as SYS (in the root container):

As you can see, most of the object-related data is stored in a single cluster (the C_OBJ# cluster): 17 tables sharing the same block. It is mostly column-related information stored there, so all of the information about the set of columns of a table or index is stored physically on the same block. This makes sense, as when Oracle parses a query,it wants to have access to the data for all of the columns in the referenced table. If this data were spread all over the place, it would take a while to get it together. Here, it is on asingle block typically and readily available.

When would you use a cluster? It is easier perhaps to describe when not to use one:

•\ If you anticipate the tables in the cluster will be modified heavily: You must be aware that an index cluster will have certain negative performance side effects on DML performance, INSERT statements in particular. It takes more work to manage the data in a cluster. The data has to be put away carefully, so it takes longer to put the data away (to insert it).

•\ If you need to perform full scans of tables in clusters: Instead of just having to full scan the data in your table, you have to full scan the data for (possibly) many tables. There is more data to scan through, so full scans will take longer.

•\ If you need to partition the tables: Tables in a cluster cannot be partitioned, nor can the cluster be partitioned.

•\ If you believe you will frequently need to TRUNCATE and load the table: Tables in clusters cannot be truncated. That is obvious—since the cluster stores more than one table on a block, you must delete the rows in a cluster table.

So, if you have data that is mostly read (that does not mean “never written”; it is perfectly OK to modify cluster tables) and read via indexes, either the cluster key index or other indexes you put on the tables in the cluster, and join this information together frequently, a cluster would be appropriate. Look for tables that are logically related and always used together, like the people who designed the Oracle data dictionary when they clustered all column-related information together.

Leave a Reply Cancel reply