Monday 10 March 2014

Oracle 11G New Feature: Virtual Column


Oracle 11g introduced the concept of ‘Virtual Column’ within a table. Virtual Columns are similar to normal table’s columns but with the following differences:
  • They are defined by an expression. The result of evaluation of this expression becomes the value of the column.
  • The values of the virtual column are not stored in the database. Rather, it’s computed at run-time when you query the data.
  • You can’t update (in SET clause of update statement) the values of virtual column. These are read only values, that are computed dynamically and any attempt to modify them will result into oracle error.
The syntax for defining a virtual column is:
column_name [datatype] [GENERATED ALWAYS] AS [expression] [VIRTUAL]

where the parameters within [] are optional and can be omitted. If you don’t mention the datatype, Oracle will decide it based on the result of the expression.
Excepting the above points, a virtual column, exists just like any other column of a normal table and the following points apply to it:
  1. Virtual columns can be used in the WHERE clause of UPDATE and DELETE statement but they can’t be modified by DML.
  2. Statistics can be collected on them.
  3. They can be used as a partition key in virtual column based partitioning.
  4. Indexes can be created on them. As you might have guessed, oracle would create function based indexes as we create on normal tables.
  5. Constraints can be created on them.

Create table with Virtual Column

For creating a virtual column, use the syntax mentioned above. Consider the following example:
CREATE TABLE EMPLOYEE
(
    empl_id        NUMBER,
    empl_nm        VARCHAR2(50),
    monthly_sal    NUMBER(10,2),
    bonus          NUMBER(10,2),
    total_sal      NUMBER(10,2) GENERATED ALWAYS AS (monthly_sal*12 + bonus)
);
Here we have defined a virtual column “total_sal” whose value would be dynamically calculated using the expression provided after the “generated always as” clause. Please note that this declaration is different than using “default” clause for a normal column as you can’t refer column names with “default” clause.
Lets check the data dictionary view:
SELECT column_name, data_type, data_length, data_default, virtual_column
  FROM user_tab_cols
 WHERE table_name = 'EMPLOYEE';
 
COLUMN_NAME | DATA_TYPE | DATA_LENGTH | DATA_DEFAULT             | VIRTUAL_COLUMN
EMPL_ID     | NUMBER    | 22          | null                     | NO           
EMPL_NM     | VARCHAR2  | 50          | null                     | NO           
MONTHLY_SAL | NUMBER    | 22          | null                     | NO           
BONUS       | NUMBER    | 22          | null                     | NO           
TOTAL_SAL   | NUMBER    | 22          | "MONTHLY_SAL"*12+"BONUS" | YES            
The value “YES” for the column “virtual_column” tells us that this is a virtual column. Another optional keyword “VIRTUAL” can also be added to make it syntactically complete.
DROP TABLE EMPLOYEE PURGE;
 
CREATE OR REPLACE FUNCTION get_empl_total_sal ( p_monthly_sal   NUMBER,
                                                p_bonus         NUMBER)
   RETURN NUMBER
DETERMINISTIC
IS
BEGIN
   RETURN p_monthly_sal * 12 + p_bonus;
END;
 
CREATE TABLE EMPLOYEE
(empl_id     NUMBER,
 empl_nm     VARCHAR2(50),
 monthly_sal NUMBER(10,2),
 bonus       NUMBER(10,2),
 total_sal   NUMBER(10,2) AS (get_empl_total_sal(monthly_sal, bonus)) VIRTUAL
);
We have included the “VIRTUAL” clause in the table definition. Please note that instead of using an expression, I have used a deterministic function. A deterministic function, when passed certain inputs, will always return the exact same output. “DETERMINISTIC” keyword is needed in order to mark a function as a candidate to be used in a function based index.
You can also create indexes on the virtual columns. Here is an example:
CREATE INDEX idx_total_sal ON employee(total_sal);
 
SELECT index_name, index_type
  FROM user_indexes
 WHERE table_name = 'EMPLOYEE';
 
INDEX_NAME     INDEX_TYPE                
IDX_TOTAL_SAL  FUNCTION-BASED NORMAL
Note that even this function is used as part of table definition, you can still drop it. But this in turn will make the table inaccessible.
DROP FUNCTION get_empl_total_sal;
 
SELECT * FROM employee;
*
Error at line 0
ORA-00904: "schema"."GET_EMPL_TOTAL_SAL": invalid identifier
You can alter the table with virtual column as you would modify a table with normal columns. Lets add the same column using the ALTER command:
DROP TABLE EMPLOYEE PURGE;
 
CREATE TABLE EMPLOYEE
(empl_id     NUMBER,
 empl_nm     VARCHAR2(50),
 monthly_sal NUMBER(10,2),
 bonus       NUMBER(10,2)
);
 
ALTER TABLE EMPLOYEE ADD (total_sal AS (monthly_sal * 12 + bonus));
Note that the datatype of the new column is not declared. It will be assigned a datatype based on the result of the expression (in this case, it would be NUMBER). Now let’s insert some data in the table:
INSERT INTO employee (empl_id, empl_nm, monthly_sal, bonus)
   WITH DATA AS
        (SELECT 100 empl_id, 'AAA' empl_nm, 20000 monthly_sal, 3000 bonus
           FROM DUAL
         UNION
         SELECT 200, 'BBB', 12000, 2000
           FROM DUAL
         UNION
         SELECT 300, 'CCC', 32100, 1000
           FROM DUAL
         UNION
         SELECT 400, 'DDD', 24300, 5000
           FROM DUAL
         UNION
         SELECT 500, 'EEE', 12300, 8000
           FROM DUAL)
   SELECT *
     FROM DATA;
 
SELECT * FROM employee;
 
EMPL_ID | EMPL_NM | MONTHLY_SAL | BONUS | TOTAL_SAL
100     | AAA     | 20000       | 3000  | 243000
200     | BBB     | 12000       | 2000  | 146000
300     | CCC     | 32100       | 1000  | 386200
400     | DDD     | 24300       | 5000  | 296600
500     | EEE     | 12300       | 8000  | 155600
Here we have populated the table columns except the virtual column with some values. Upon selecting the data, we get the value for “total_sal”. Remember that this data is not actually stored in the database but evaluated dynamically. Lets try updating this value of this virtual column:
UPDATE employee
   SET total_sal = 2000;
 
ORA-54017: UPDATE operation disallowed on virtual columns
As mentioned before, the statistics can also be gathered for the virtual columns.
EXEC DBMS_STATS.GATHER_TABLE_STATS(user, 'EMPLOYEE');
 
SELECT column_name, num_distinct,
       display_raw (low_value, data_type)  low_value,
       display_raw (high_value, data_type) high_value
  FROM dba_tab_cols
 WHERE table_name = 'EMPLOYEE';
 
COLUMN_NAME | NUM_DISTINCT | LOW_VALUE | HIGH_VALUE
TOTAL_SAL   | 5            | 146000    | 386200
BONUS       | 5            | 1000      | 8000
MONTHLY_SAL | 5            | 12000     | 32100
EMPL_NM     | 5            | AAA       | EEE
EMPL_ID     | 5            | 100       | 500

Limitations on Virtual Columns

**The query above uses a wonderful function “display_raw” by “Greg Rahn” to display the high/low values. Please check the references at the last to see it’s definition.
  1. A virtual column can only be of scalar datatype or XMLDATATYE. It can’t be a user defined type, LOB or RAW.
  2. All columns mentioned as part of the virtual column expression should belong to the same table.
  3. No DMLs are allowed on the virtual columns.
  4. The virtual column expression can’t reference any other virtual column.
  5. Virtual columns can only be created on ordinary tables. They can’t be created on index-organized, external, object, cluster or temporary tables.
  6. If a deterministic function is used as virtual column expression, that virtual column can’t be used as a partitioning key for virtual column-based partitioning.

Virtual Column-Based Partitioning

Prior releases of Oracle only allowed a table to be partitioned based on a physical column. Oracle 11g, with the addition of virtual columns, now allows a partition key based on an expression, using one or more existing columns of the table. A virtual column can now be used as a partitioning key. Lets partition our table based on the virtual column “total_sal”:
DROP TABLE EMPLOYEE PURGE;
 
CREATE TABLE employee
(empl_id     NUMBER,
 empl_nm     VARCHAR2(50),
 monthly_sal NUMBER(10,2),
 bonus       NUMBER(10,2),
 total_sal   NUMBER(10,2) AS (monthly_sal*12 + bonus)
)
PARTITION BY RANGE (total_sal)
    (PARTITION sal_200000 VALUES LESS THAN (200000),
     PARTITION sal_400000 VALUES LESS THAN (400000),
     PARTITION sal_600000 VALUES LESS THAN (600000),
     PARTITION sal_800000 VALUES LESS THAN (800000),
     PARTITION sal_default VALUES LESS THAN (MAXVALUE));
 
INSERT INTO employee (empl_id, empl_nm, monthly_sal, bonus)
   WITH DATA AS
        (SELECT 100 empl_id, 'AAA' empl_nm, 20000 monthly_sal, 3000 bonus
           FROM DUAL
         UNION
         SELECT 200, 'BBB', 12000, 2000
           FROM DUAL
         UNION
         SELECT 300, 'CCC', 32100, 1000
           FROM DUAL
         UNION
         SELECT 400, 'DDD', 24300, 5000
           FROM DUAL
         UNION
         SELECT 500, 'EEE', 12300, 8000
           FROM DUAL)
   SELECT *
     FROM DATA;
 
EXEC DBMS_STATS.GATHER_TABLE_STATS(user,'EMPLOYEE',granularity => 'PARTITION');
 
SELECT   table_name, partition_name, num_rows
    FROM user_tab_partitions
   WHERE table_name = 'EMPLOYEE'
ORDER BY partition_name;
 
TABLE_NAME | PARTITION_NAME | NUM_ROWS
EMPLOYEE   | SAL_200000     | 2
EMPLOYEE   | SAL_400000     | 3
EMPLOYEE   | SAL_600000     | 0
EMPLOYEE   | SAL_800000     | 0
EMPLOYEE   | SAL_DEFAULT    | 0
So far, everything looks fine, lets now try to update monthly salary of one employee and in turn the value of total_sal.
UPDATE employee
   SET monthly_sal = 30000
 WHERE empl_id = 500;
 
ORA-14402: updating partition key column would cause a partition change
What happened? The reason is simple, updating the “monthly_sal” would result into change in “total_sal” of the employee and thus a partition change is required. This can be handled by enabling the row movement in the current definition of the table.
ALTER TABLE employee ENABLE ROW MOVEMENT;
 
UPDATE employee
   SET monthly_sal = 80000
 WHERE empl_id = 500;
 
1 row updated.
The update works fine. As mentioned before, a deterministic function can’t be used as virtual column expression which is to be used as a partitioning key. It has to be an expression defined on the columns of the table as done in the previous example. The following syntax will result in oracle error:
CREATE TABLE employee_new
(empl_id     NUMBER,
 empl_nm     VARCHAR2(50),
 monthly_sal NUMBER(10,2),
 bonus       NUMBER(10,2),
 total_sal   NUMBER(10,2) AS (get_empl_total_sal(monthly_sal, bonus))
)
PARTITION BY RANGE (total_sal)
    (PARTITION sal_200000 VALUES LESS THAN (200000),
     PARTITION sal_400000 VALUES LESS THAN (400000),
     PARTITION sal_600000 VALUES LESS THAN (600000),
     PARTITION sal_800000 VALUES LESS THAN (800000),
     PARTITION sal_default VALUES LESS THAN (MAXVALUE));
 
ORA-54021: Cannot use PL/SQL expressions in partitioning or subpartitioning columns

No comments: