Monday, 30 September 2019

Oracle SQL/PLSQL - String Aggregation Techniques

String Aggregation Techniques

On occasion it is necessary to aggregate data from a number of rows into a single row, giving a list of data associated with a specific value. Using the SCOTT.EMP table as an example, we might want to retrieve a list of employees for each department. Below is a list of the base data and the type of output we would like to return from an aggregate query.

Base Data:

    DEPTNO ENAME
---------- ----------
        20 SMITH
        30 ALLEN
        30 WARD
        20 JONES
        30 MARTIN
        30 BLAKE
        10 CLARK
        20 SCOTT
        10 KING
        30 TURNER
        20 ADAMS
        30 JAMES
        20 FORD
        10 MILLER

Desired Output:

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,FORD,ADAMS,SCOTT,JONES
        30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD

LISTAGG Analytic Function in 11g Release 2

The LISTAGG analytic function was introduced in Oracle 11g Release 2, making it very easy to aggregate strings. The nice thing about this function is that it also allows us to order the elements in the concatenated list. If you are using 11g Release 2 or later, you should use this function for string aggregation.

COLUMN employees FORMAT A50

SELECT deptno, LISTAGG(ename, ',') WITHIN GROUP (ORDER BY ename) AS employees
FROM   emp
GROUP BY deptno;

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

3 rows selected.
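
One caveat worth sketching: LISTAGG returns a VARCHAR2, so a very long list can fail with ORA-01489 (result of string concatenation is too long). Assuming Oracle 12c Release 2 or later, the ON OVERFLOW TRUNCATE clause is one way to avoid the error. The '...' truncation indicator below is just an illustrative choice.

```sql
-- Truncate the list instead of raising ORA-01489 when it gets too long.
-- WITH COUNT appends the number of omitted values after the indicator.
SELECT deptno,
       LISTAGG(ename, ',' ON OVERFLOW TRUNCATE '...' WITH COUNT)
         WITHIN GROUP (ORDER BY ename) AS employees
FROM   emp
GROUP BY deptno;
```

With the small EMP table no truncation takes place, so the output should match the query above.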

WM_CONCAT Built-in Function (Not Supported)

If you are not running 11g Release 2 or above, but are running a version of the database where the WM_CONCAT function is present, then it is a zero effort solution as it performs the aggregation for you. It is actually an example of a user defined aggregate function described below, but Oracle have done all the work for you.

COLUMN employees FORMAT A50

SELECT deptno, wm_concat(ename) AS employees
FROM   emp
GROUP BY deptno;

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,FORD,ADAMS,SCOTT,JONES
        30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD

3 rows selected.

WM_CONCAT is an undocumented function and as such is not supported by Oracle for user applications (MOS Note ID 1336219.1). If this concerns you, use a User-Defined Aggregate Function described below.

Also, WM_CONCAT has been removed from 12c onward, so it is not an option on those versions.

User-Defined Aggregate Function

The WM_CONCAT function described above is an example of a user-defined aggregate function that Oracle have already created for you. If you don't want to use WM_CONCAT, you can create your own user-defined aggregate function as described at asktom.oracle.com. Thanks to Kim Berg Hansen for some corrections in comments.

CREATE OR REPLACE TYPE t_string_agg AS OBJECT
(
  g_string  VARCHAR2(32767),

  STATIC FUNCTION ODCIAggregateInitialize(sctx  IN OUT  t_string_agg)
    RETURN NUMBER,

  MEMBER FUNCTION ODCIAggregateIterate(self   IN OUT  t_string_agg,
                                       value  IN      VARCHAR2 )
     RETURN NUMBER,

  MEMBER FUNCTION ODCIAggregateTerminate(self         IN   t_string_agg,
                                         returnValue  OUT  VARCHAR2,
                                         flags        IN   NUMBER)
    RETURN NUMBER,

  MEMBER FUNCTION ODCIAggregateMerge(self  IN OUT  t_string_agg,
                                     ctx2  IN      t_string_agg)
    RETURN NUMBER
);
/
SHOW ERRORS


CREATE OR REPLACE TYPE BODY t_string_agg IS
  STATIC FUNCTION ODCIAggregateInitialize(sctx  IN OUT  t_string_agg)
    RETURN NUMBER IS
  BEGIN
    sctx := t_string_agg(NULL);
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateIterate(self   IN OUT  t_string_agg,
                                       value  IN      VARCHAR2 )
    RETURN NUMBER IS
  BEGIN
    SELF.g_string := self.g_string || ',' || value;
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateTerminate(self         IN   t_string_agg,
                                         returnValue  OUT  VARCHAR2,
                                         flags        IN   NUMBER)
    RETURN NUMBER IS
  BEGIN
    returnValue := SUBSTR(SELF.g_string, 2);
    RETURN ODCIConst.Success;
  END;

  MEMBER FUNCTION ODCIAggregateMerge(self  IN OUT  t_string_agg,
                                     ctx2  IN      t_string_agg)
    RETURN NUMBER IS
  BEGIN
    SELF.g_string := SELF.g_string || ctx2.g_string;
    RETURN ODCIConst.Success;
  END;
END;
/
SHOW ERRORS


CREATE OR REPLACE FUNCTION string_agg (p_input VARCHAR2)
RETURN VARCHAR2
PARALLEL_ENABLE AGGREGATE USING t_string_agg;
/
SHOW ERRORS

The aggregate function is implemented using a type and type body, and is used within a query.

COLUMN employees FORMAT A50

SELECT /*+ PARALLEL(2) */ deptno, string_agg(ename) AS employees
FROM   emp
GROUP BY deptno;

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,FORD,ADAMS,SCOTT,JONES
        30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD

3 rows selected.

Specific Function

One approach is to write a specific function to solve the problem. The get_employees function listed below returns a list of employees for the specified department.

CREATE OR REPLACE FUNCTION get_employees (p_deptno  IN  emp.deptno%TYPE)
  RETURN VARCHAR2
IS
  l_text  VARCHAR2(32767) := NULL;
BEGIN
  FOR cur_rec IN (SELECT ename FROM emp WHERE deptno = p_deptno) LOOP
    l_text := l_text || ',' || cur_rec.ename;
  END LOOP;
  RETURN LTRIM(l_text, ',');
END;
/
SHOW ERRORS

The function can then be incorporated into a query as follows.

COLUMN employees FORMAT A50

SELECT deptno,
       get_employees(deptno) AS employees
FROM   emp
GROUP BY deptno;

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

3 rows selected.

To reduce the number of calls to the function, and thereby improve performance, we might want to filter the rows in advance.

COLUMN employees FORMAT A50

SELECT e.deptno,
       get_employees(e.deptno) AS employees
FROM   (SELECT DISTINCT deptno
        FROM   emp) e;

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
        
3 rows selected.

Generic Function using Ref Cursor

An alternative approach is to write a function to concatenate values passed using a ref cursor. This is essentially the same as the previous example, except that the cursor is passed in making it generic, as shown below.

CREATE OR REPLACE FUNCTION concatenate_list (p_cursor IN  SYS_REFCURSOR)
  RETURN  VARCHAR2
IS
  l_return  VARCHAR2(32767); 
  l_temp    VARCHAR2(32767);
BEGIN
  LOOP
    FETCH p_cursor
    INTO  l_temp;
    EXIT WHEN p_cursor%NOTFOUND;
    l_return := l_return || ',' || l_temp;
  END LOOP;
  RETURN LTRIM(l_return, ',');
END;
/
SHOW ERRORS

The CURSOR expression is used to allow a query to be passed to the function as a ref cursor, as shown below.

COLUMN employees FORMAT A50

SELECT e1.deptno,
       concatenate_list(CURSOR(SELECT e2.ename FROM emp e2 WHERE e2.deptno = e1.deptno)) employees
FROM   emp e1
GROUP BY e1.deptno;

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

3 rows selected.

Once again, the total number of function calls can be reduced by filtering the distinct values, rather than calling the function for each row.

COLUMN employees FORMAT A50

SELECT deptno,
       concatenate_list(CURSOR(SELECT e2.ename FROM emp e2 WHERE e2.deptno = e1.deptno)) employees
FROM   (SELECT DISTINCT deptno
        FROM emp) e1;

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

3 rows selected.

ROW_NUMBER() and SYS_CONNECT_BY_PATH functions in Oracle 9i

An example on williamrobertson.net uses the ROW_NUMBER() and SYS_CONNECT_BY_PATH functions to achieve the same result without the use of PL/SQL or additional type definitions.

SELECT deptno,
       LTRIM(MAX(SYS_CONNECT_BY_PATH(ename,','))
       KEEP (DENSE_RANK LAST ORDER BY curr),',') AS employees
FROM   (SELECT deptno,
               ename,
               ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY ename) AS curr,
               ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY ename) -1 AS prev
        FROM   emp)
GROUP BY deptno
CONNECT BY prev = PRIOR curr AND deptno = PRIOR deptno
START WITH curr = 1;

    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 ADAMS,FORD,JONES,SCOTT,SMITH
        30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

3 rows selected.

COLLECT function in Oracle 10g

An example on oracle-developer.net uses the COLLECT function in Oracle 10g to get the same result. This method requires a table type and a function to convert the contents of the table type to a string. I've altered his method slightly to bring it in line with this article.

CREATE OR REPLACE TYPE t_varchar2_tab AS TABLE OF VARCHAR2(4000);
/

CREATE OR REPLACE FUNCTION tab_to_string (p_varchar2_tab  IN  t_varchar2_tab,
                                          p_delimiter     IN  VARCHAR2 DEFAULT ',') RETURN VARCHAR2 IS
  l_string     VARCHAR2(32767);
BEGIN
  FOR i IN p_varchar2_tab.FIRST .. p_varchar2_tab.LAST LOOP
    IF i != p_varchar2_tab.FIRST THEN
      l_string := l_string || p_delimiter;
    END IF;
    l_string := l_string || p_varchar2_tab(i);
  END LOOP;
  RETURN l_string;
END tab_to_string;
/

The query below shows the COLLECT function in action.

COLUMN employees FORMAT A50

SELECT deptno,
       tab_to_string(CAST(COLLECT(ename) AS t_varchar2_tab)) AS employees
FROM   emp
GROUP BY deptno;
       
    DEPTNO EMPLOYEES
---------- --------------------------------------------------
        10 CLARK,KING,MILLER
        20 SMITH,JONES,SCOTT,ADAMS,FORD
        30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
        
3 rows selected.
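
The output above comes back in no guaranteed order within each department. As a minimal sketch, assuming Oracle 11g or later (where COLLECT accepts an ORDER BY clause), the same query can sort the names before they are converted to a string. It reuses the t_varchar2_tab type and tab_to_string function defined above.

```sql
-- Sort the collected names inside COLLECT before converting them to a string.
SELECT deptno,
       tab_to_string(CAST(COLLECT(ename ORDER BY ename) AS t_varchar2_tab)) AS employees
FROM   emp
GROUP BY deptno;
```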


Wednesday, 18 September 2019

Oracle SQL: Optimizer Hint

It is often easy to forget this, but in many ways it is after we hit the execute button that the really exciting stuff starts with our code. A number of engines silently spring into action, including the optimizer. The optimizer analyses your SQL statement and decides the most efficient way to execute it based on the objects involved in the statement and the conditions you’re subjecting them to. Your database automatically gathers stats about your objects – stuff like the number of rows, number of distinct values, number of nulls, data distribution – and the optimizer uses this information in its decision-making. (You can study the explain plan to see what decisions the optimizer has taken.) The optimizer arrives at its conclusions, often in barely a whisper of time.
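
For readers who have not looked at an explain plan before, here is a minimal sketch of one common way to do it, using the SCOTT.EMP table as an assumed example. It relies on the default plan table being available in your schema.

```sql
-- Ask the optimizer for its plan without actually running the query.
EXPLAIN PLAN FOR
SELECT ename
FROM   emp
WHERE  job = 'SALESMAN';

-- Display the plan that was just stored in the plan table.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```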

And when the SQL statement is executed, you sit back and you feel like a genius.

And that’s it, right? THE END.

Optimizer Hints

Well, not necessarily. The optimizer is the hero of our story; let me introduce the potential villains, Optimizer Hints. An optimizer hint is a code snippet within an SQL statement controlling the decisions of the optimizer. Hints give us the opportunity, in cases where we have superior knowledge about the database, to influence the optimizer. In fact, the very name is a misnomer – they are not hints; they are commands that override the optimizer (as long as the hint is valid and the _OPTIMIZER_IGNORE_HINTS initialization parameter is not TRUE).

Hints are injected into DML statements within the bounds of a comment. The syntax is as follows:

{DELETE|INSERT|SELECT|UPDATE} /*+ hint [text] [hint [text]] */

Also valid is the less fashionable

{DELETE|INSERT|SELECT|UPDATE} --+ hint [text] [hint [text]]

The + tells Oracle that this isn’t an ordinary comment, that it is in fact a hint. No spaces are allowed between the comment delimiter and the plus sign.

Here’s an example instructing that a full table scan should be carried out on the emp table:


SELECT /*+ FULL(emp) */ ename
FROM   emp
WHERE  job = 'SALESMAN';
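
One detail is easy to miss: if the table is given an alias in the query, the hint has to name the alias rather than the table. A hint that names the table instead is ignored without any error. A minimal sketch, assuming the same emp table:

```sql
-- The hint references the alias e, because emp is aliased in the FROM clause.
SELECT /*+ FULL(e) */ e.ename
FROM   emp e
WHERE  e.job = 'SALESMAN';
```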

That’s pretty straightforward. So why am I painting optimizer hints as the baddies here?

The rules

Don’t.
If you must use hints, assume you’ve used them incorrectly. So don’t.
With every Oracle patch or upgrade, assume every hint is going to go wrong. So don’t.
With every DDL, assume every hint applied to that object is going to go wrong. So don’t.

The reason to be wary of hinting is that by embedding hints in your SQL, you are overriding the optimizer and saying that you know more than it does – not just now, but every time in the future that your SQL will be run, irrespective of any other changes that may happen to your database. The likely consequence of this is that your SQL will possibly run sub-optimally now and almost certainly in the future.

Hints in Detail

Well, not necessarily. Cos telling you about these enchanting things called hints and then telling you to immediately forget everything you’ve just heard would be like telling you there’s a tree of knowledge in the middle of the garden but that you must never, ever eat its apple. And we know how that story ends.

So, more details. There are many dozens of different hints (have a look in the v$sql_hint view), however close to half of them are undocumented. So The First Rule of Hinting must really be borne in mind if you decide to use them. Here are a select few.

FIRST_ROWS(n): This hint instructs the optimizer to select a plan that returns the first n rows most efficiently.


SELECT /*+ FIRST_ROWS(10) */ empno, ename
FROM   emp
WHERE  deptno = 10;

You may also want to read up about FIRST_ROWS_1, FIRST_ROWS_10 and FIRST_ROWS_100. Of interest, also, is ALL_ROWS, which directs the optimizer to choose the plan that returns the complete resultset at the minimum cost.

NO_INDEX(<table_name> <index_name>): Instructs the optimizer to specifically not use the named index in determining a plan.


SELECT /*+ NO_INDEX(emp emp_ix) */ empno, ename
FROM   emp, dept
WHERE  emp.deptno = dept.deptno;

See also: the INDEX hint. You may also want to investigate INDEX_COMBINE, INDEX_JOIN, INDEX_ASC and INDEX_FFS.

LEADING(table_name): This hint tells Oracle to use the parameterised table as the first in the join order. The optimizer will consequently select a join chain that starts with this table.


SELECT /*+ LEADING (dept) */ empno, ename
FROM   emp, dept
WHERE  emp.deptno = dept.deptno;

Related to the LEADING hint is the ORDERED hint. This hint instructs Oracle to join tables in the exact order in which they are listed in the FROM clause.
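
A minimal sketch of the ORDERED hint, assuming the same emp and dept tables. Here dept is listed first in the FROM clause, so it is joined first.

```sql
-- Join order follows the FROM clause: dept first, then emp.
SELECT /*+ ORDERED */ e.empno, e.ename, d.dname
FROM   dept d, emp e
WHERE  e.deptno = d.deptno;
```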

CACHE(table): This hint tells Oracle to add the blocks retrieved for the table to the head of the most recently used list. This might be useful with regularly-used lookup tables, for example.


SELECT /*+ CACHE (d) */ deptno, dname
FROM   dept d;

Oracle caches small tables by default, making this hint redundant in many cases. Also often redundant is the NOCACHE hint, since this places blocks at the tail of the LRU list, which is also Oracle’s default behaviour with the majority of blocks.

CARDINALITY(table n): This hint instructs Oracle to use n as the estimated number of rows in the table, rather than rely on its own stats. You may need to use this hint with a global temporary table, for instance.


SELECT /*+ CARDINALITY (gtt, 1000) */ gtt.gtt_id, dname
FROM   dept d, global_temp_tab gtt
WHERE  d.deptno = gtt.deptno;

REWRITE: This hint instructs Oracle to rewrite the query using a materialized view, irrespective of cost.

PARALLEL (table n): This hint tells the optimizer to use n concurrent servers for a parallel operation.
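
A minimal sketch of the table-level form, assuming the emp table and an arbitrary degree of 4. Whether parallel execution is actually used depends on the edition and configuration of your database.

```sql
-- Request a degree of parallelism of 4 for the scan of emp (alias e).
SELECT /*+ PARALLEL(e 4) */ e.deptno, COUNT(*) AS employee_count
FROM   emp e
GROUP BY e.deptno;
```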

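APPEND: This hint instructs the optimizer to carry out a direct-path insert. This may make INSERT … SELECT statements faster because inserted data is simply appended above the table’s high-water mark, bypassing the buffer cache, rather than fitted into free space in existing blocks. Referential constraints are not ignored: if the table has enabled foreign keys or triggers, Oracle silently falls back to a conventional insert.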
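
A minimal sketch of a direct-path insert. The emp_history table is hypothetical; it is assumed to have the same columns as emp.

```sql
-- emp_history is a hypothetical table with the same structure as emp.
INSERT /*+ APPEND */ INTO emp_history
SELECT *
FROM   emp;

-- After a direct-path insert the session has to commit (or roll back)
-- before it can query or modify the table again.
COMMIT;
```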

RULE: This hint basically turns off the optimizer. This hint has been deprecated and should not be used. Never ever.

Conclusion

Well, not necessarily the end. I was holidaying in Serbia recently and, when you visit a foreign country, it is always advisable to arm yourself with a few handy local words: Good day (dobar dan), thank you (hvala) – and, of course, beer (pivo). Take the examples above as your first handy words as you discover optimizer hints. There is more to learn, much more (indeed Jonathan Lewis has written dozens of articles on the subject, as have others). Hopefully, you are now ready to tackle those articles.

How to use hints in Oracle SQL for performance

With hints one can influence the optimizer. The usage of hints (with the exception of the RULE hint) causes Oracle to use the cost-based optimizer.

The following syntax is used for hints:

select /*+ HINT */ name
from emp
where id =1;

Where HINT is replaced by the hint text.
When the syntax of the hint text is incorrect, the hint text is ignored and will not be used.
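
To illustrate how quietly this happens, here is a sketch with a deliberately misspelled hint, assuming the SCOTT.EMP table. The statement runs without any error or warning.

```sql
-- FUL is not a valid hint name, so Oracle treats it as an ordinary comment
-- and the optimizer picks the access path on its own.
SELECT /*+ FUL(e) */ e.ename
FROM   emp e
WHERE  e.empno = 7839;
```

Checking the execution plan is the only reliable way to confirm that a hint was actually applied.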

Hints for Optimization Approaches and Goals
ALL_ROWS	The ALL_ROWS hint explicitly chooses the cost-based approach to optimize a statement block with a goal of best throughput (that is, minimum total resource consumption).
FIRST_ROWS	The FIRST_ROWS hint explicitly chooses the cost-based approach to optimize a statement block with a goal of best response time (minimum resource usage to return the first row). In newer Oracle versions you should give a parameter with this hint: FIRST_ROWS(n) means that the optimizer will determine an execution plan to give a fast response for returning the first n rows.
CHOOSE	The CHOOSE hint causes the optimizer to choose between the rule-based approach and the cost-based approach for a SQL statement based on the presence of statistics for the tables accessed by the statement
RULE	The RULE hint explicitly chooses rule-based optimization for a statement block. This hint also causes the optimizer to ignore any other hints specified for the statement block. The RULE hint does not work any more in Oracle 10g.
Hints for Access Paths
FULL	The FULL hint explicitly chooses a full table scan for the specified table. The syntax of the FULL hint is FULL(table) where table specifies the alias of the table (or table name if alias does not exist) on which the full table scan is to be performed.
ROWID	The ROWID hint explicitly chooses a table scan by ROWID for the specified table. The syntax of the ROWID hint is ROWID(table) where table specifies the name or alias of the table on which the table access by ROWID is to be performed. (This hint was deprecated in Oracle 10g.)
CLUSTER	The CLUSTER hint explicitly chooses a cluster scan to access the specified table. The syntax of the CLUSTER hint is CLUSTER(table) where table specifies the name or alias of the table to be accessed by a cluster scan.
HASH	The HASH hint explicitly chooses a hash scan to access the specified table. The syntax of the HASH hint is HASH(table) where table specifies the name or alias of the table to be accessed by a hash scan.
HASH_AJ	The HASH_AJ hint transforms a NOT IN subquery into a hash anti-join to access the specified table. The syntax of the HASH_AJ hint is HASH_AJ(table) where table specifies the name or alias of the table to be accessed.(deprecated in Oracle 10g)
INDEX	The INDEX hint explicitly chooses an index scan for the specified table. The syntax of the INDEX hint is INDEX(table index) where table specifies the name or alias of the table associated with the index to be scanned and index specifies an index on which an index scan is to be performed. This hint may optionally specify more than one index.
NO_INDEX	The NO_INDEX hint explicitly disallows a set of indexes for the specified table. The syntax of the NO_INDEX hint is NO_INDEX(table index)
INDEX_ASC	The INDEX_ASC hint explicitly chooses an index scan for the specified table. If the statement uses an index range scan, Oracle scans the index entries in ascending order of their indexed values.
INDEX_COMBINE	If no indexes are given as arguments for the INDEX_COMBINE hint, the optimizer will use on the table whatever Boolean combination of bitmap indexes has the best cost estimate. If certain indexes are given as arguments, the optimizer will try to use some Boolean combination of those particular bitmap indexes. The syntax of INDEX_COMBINE is INDEX_COMBINE(table index).
INDEX_JOIN	Explicitly instructs the optimizer to use an index join as an access path. For the hint to have a positive effect, a sufficiently small number of indexes must exist that contain all the columns required to resolve the query.
INDEX_DESC	The INDEX_DESC hint explicitly chooses an index scan for the specified table. If the statement uses an index range scan, Oracle scans the index entries in descending order of their indexed values.
INDEX_FFS	This hint causes a fast full index scan to be performed rather than a full table scan.
NO_INDEX_FFS	Do not use fast full index scan (from Oracle 10g)
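INDEX_SS	The INDEX_SS hint explicitly chooses an index skip scan for the specified table. (from Oracle 10g)
INDEX_SS_ASC	The INDEX_SS_ASC hint explicitly chooses an index skip scan for the specified table. If the statement uses an index range scan, Oracle scans the index entries in ascending order of their indexed values. (from Oracle 10g)
INDEX_SS_DESC	The INDEX_SS_DESC hint explicitly chooses an index skip scan for the specified table. If the statement uses an index range scan, Oracle scans the index entries in descending order of their indexed values. (from Oracle 10g)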
NO_INDEX_SS	The NO_INDEX_SS hint causes the optimizer to exclude a skip scan of the specified indexes on the specified table. (from Oracle 10g)
Hints for Query Transformations
NO_QUERY_TRANSFORMATION	Prevents the optimizer performing query transformations. (from Oracle 10g)
USE_CONCAT	The USE_CONCAT hint forces combined OR conditions in the WHERE clause of a query to be transformed into a compound query using the UNION ALL set operator. Normally, this transformation occurs only if the cost of the query using the concatenations is cheaper than the cost without them.
NO_EXPAND	The NO_EXPAND hint prevents the optimizer from considering OR-expansion for queries having OR conditions or IN-lists in the WHERE clause. Usually, the optimizer considers using OR expansion and uses this method if it decides that the cost is lower than not using it.
REWRITE	The REWRITE hint forces the optimizer to rewrite a query in terms of materialized views, when possible, without cost consideration. Use the REWRITE hint with or without a view list. If you use REWRITE with a view list and the list contains an eligible materialized view, then Oracle uses that view regardless of its cost.
NOREWRITE / NO_REWRITE	In Oracle 10g renamed to NO_REWRITE. The NOREWRITE/NO_REWRITE hint disables query rewrite for the query block, overriding the setting of the parameter QUERY_REWRITE_ENABLED.
MERGE	The MERGE hint lets you merge views in a query.
NO_MERGE	The NO_MERGE hint causes Oracle not to merge merge-able views. This hint is most often used to reduce the number of possible permutations for a query and make optimization faster.
FACT	The FACT hint indicates that the table should be considered as a fact table. This is used in the context of the star transformation.
NO_FACT	The NO_FACT hint is used in the context of the star transformation to indicate to the transformation that the hinted table should not be considered as a fact table.
STAR_TRANSFORMATION	The STAR_TRANSFORMATION hint makes the optimizer use the best plan in which the transformation has been used. Without the hint, the optimizer could make a query optimization decision to use the best plan generated without the transformation, instead of the best plan for the transformed query.
NO_STAR_TRANSFORMATION	Do not use star transformation (from Oracle 10g)
UNNEST	The UNNEST hint specifies subquery unnesting.
NO_UNNEST	Use of the NO_UNNEST hint turns off unnesting for specific subquery blocks.
Hints for Join Orders
LEADING	Give this hint to indicate the leading table in a join. In its original form it names only one table; from Oracle 10g it also accepts several tables, which are then joined first in the order given. If you want the join order to follow the FROM clause, you can use the ORDERED hint. Syntax: LEADING(table)
ORDERED	The ORDERED hint causes Oracle to join tables in the order in which they appear in the FROM clause. If you omit the ORDERED hint from a SQL statement performing a join, the optimizer chooses the order in which to join the tables. You may want to use the ORDERED hint to specify a join order if you know something about the number of rows selected from each table that the optimizer does not. Such information would allow you to choose an inner and outer table better than the optimizer could.
Hints for Join Operations
USE_NL	The USE_NL hint causes Oracle to join each specified table to another row source with a nested loops join using the specified table as the inner table. The syntax of the USE_NL hint is USE_NL(table table) where table is the name or alias of a table to be used as the inner table of a nested loops join.
NO_USE_NL	Do not use nested loop (from Oracle 10g)
USE_NL_WITH_INDEX	Specifies a nested loops join. (from Oracle 10g)
USE_MERGE	The USE_MERGE hint causes Oracle to join each specified table with another row source with a sort-merge join. The syntax of the USE_MERGE hint is USE_MERGE(table table) where table is a table to be joined to the row source resulting from joining the previous tables in the join order using a sort-merge join.
NO_USE_MERGE	Do not use merge (from Oracle 10g)
USE_HASH	The USE_HASH hint causes Oracle to join each specified table with another row source with a hash join. The syntax of the USE_HASH hint is USE_HASH(table table) where table is a table to be joined to the row source resulting from joining the previous tables in the join order using a hash join.
NO_USE_HASH	Do not use hash (from Oracle 10g)
Hints for Parallel Execution
PARALLEL	The PARALLEL hint allows you to specify the desired number of concurrent query servers that can be used for the query. The syntax is PARALLEL(table number number). The PARALLEL hint must use the table alias if an alias is specified in the query. The PARALLEL hint can then take two values separated by commas after the table name. The first value specifies the degree of parallelism for the given table, the second value specifies how the table is to be split among the instances of a parallel server. Specifying DEFAULT or no value signifies the query coordinator should examine the settings of the initialization parameters (described in a later section) to determine the default degree of parallelism.
NOPARALLEL / NO_PARALLEL	The NOPARALLEL hint allows you to disable parallel scanning of a table, even if the table was created with a PARALLEL clause. In Oracle 10g this hint was renamed to NO_PARALLEL.
PQ_DISTRIBUTE	The PQ_DISTRIBUTE hint improves the performance of parallel join operations. Do this by specifying how rows of joined tables should be distributed among producer and consumer query servers. Using this hint overrides decisions the optimizer would normally make.
NO_PARALLEL_INDEX	The NO_PARALLEL_INDEX hint overrides a PARALLEL attribute setting on an index to avoid a parallel index scan operation.
Additional Hints
APPEND	When the APPEND hint is used with the INSERT statement, data is appended to the table. Existing free space in the block is not used. If a table or an index is specified with nologging, this hint applied with an insert statement produces a direct path insert which reduces generation of redo.
NOAPPEND	Overrides the append mode.
CACHE	The CACHE hint specifies that the blocks retrieved for the table in the hint are placed at the most recently used end of the LRU list in the buffer cache when a full table scan is performed. This option is useful for small lookup tables. The CACHE hint overrides the table’s default caching specification.
NOCACHE	The NOCACHE hint specifies that the blocks retrieved for this table are placed at the least recently used end of the LRU list in the buffer cache when a full table scan is performed. This is the normal behavior of blocks in the buffer cache.
PUSH_PRED	The PUSH_PRED hint forces pushing of a join predicate into the view.
NO_PUSH_PRED	The NO_PUSH_PRED hint prevents pushing of a join predicate into the view.
PUSH_SUBQ	The PUSH_SUBQ hint causes non-merged subqueries to be evaluated at the earliest possible place in the execution plan.
NO_PUSH_SUBQ	The NO_PUSH_SUBQ hint causes non-merged subqueries to be evaluated as the last step in the execution plan.
QB_NAME	Specifies a name for a query block. (from Oracle 10g)
CURSOR_SHARING_EXACT	Oracle can replace literals in SQL statements with bind variables, if it is safe to do so. This is controlled with the CURSOR_SHARING startup parameter. The CURSOR_SHARING_EXACT hint causes this behavior to be switched off. In other words, Oracle executes the SQL statement without any attempt to replace literals by bind variables.
DRIVING_SITE	The DRIVING_SITE hint forces query execution to be done for the table at a different site than that selected by Oracle
DYNAMIC_SAMPLING	The DYNAMIC_SAMPLING hint lets you control dynamic sampling to improve server performance by determining more accurate predicate selectivity and statistics for tables and indexes. You can set the value of DYNAMIC_SAMPLING to a value from 0 to 10. The higher the level, the more effort the compiler puts into dynamic sampling and the more broadly it is applied. Sampling defaults to cursor level unless you specify a table.
SPREAD_MIN_ANALYSIS	This hint omits some of the compile time optimizations of the rules, mainly detailed dependency graph analysis, on spreadsheets. Some optimizations such as creating filters to selectively populate spreadsheet access structures and limited rule pruning are still used. (from Oracle 10g)
Hints with unknown status
MERGE_AJ	The MERGE_AJ hint transforms a NOT IN subquery into a merge anti-join to access the specified table. The syntax of the MERGE_AJ hint is MERGE_AJ(table) where table specifies the name or alias of the table to be accessed.(deprecated in Oracle 10g)
AND_EQUAL	The AND_EQUAL hint explicitly chooses an execution plan that uses an access path that merges the scans on several single-column indexes. The syntax of the AND_EQUAL hint is AND_EQUAL(table index index) where table specifies the name or alias of the table associated with the indexes to be merged, and index specifies an index on which an index scan is to be performed. You must specify at least two indexes. You cannot specify more than five. (deprecated in Oracle 10g)
STAR	The STAR hint forces the large table to be joined last using a nested loops join on the index. The optimizer will consider different permutations of the small tables. (deprecated in Oracle 10g)
BITMAP	Usage: BITMAP(table_name index_name) Uses a bitmap index to access the table. (deprecated ?)
HASH_SJ	Use a hash semi-join to evaluate a correlated EXISTS sub-query. Use this hint in the sub-query, not in the main query. (deprecated in Oracle 10g)
NL_SJ	Use a nested loops semi-join to evaluate a correlated EXISTS sub-query. (deprecated in Oracle 10g)
NL_AJ	Use a nested loops anti-join to evaluate a NOT IN sub-query. (deprecated in Oracle 10g)
ORDERED_PREDICATES	(deprecated in Oracle 10g)
EXPAND_GSET_TO_UNION	(deprecated in Oracle 10g)
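
Join-order and join-operation hints from the tables above are often combined. A minimal sketch, assuming the SCOTT emp and dept tables: LEADING fixes the starting table and USE_NL asks for emp to be the inner table of a nested loops join.

```sql
-- Start the join with dept (alias d), then join emp (alias e) to it
-- with a nested loops join, using emp as the inner table.
SELECT /*+ LEADING(d) USE_NL(e) */ e.empno, e.ename, d.dname
FROM   dept d, emp e
WHERE  e.deptno = d.deptno;
```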

Oracle SQL: Current Date and Time

Getting the Current Date and Time

In any language, it’s important to know how to get the current date and time. How to do that is often one of the first questions to come up, especially in applications that involve dates in any way, as most applications do.

Up through Oracle8i Database, you had one choice for getting the date and time in PL/SQL: you used the SYSDATE function, and that was it. Beginning with Oracle9i Database, you have all the functions in the table below at your disposal, and you need to understand how they work and what your choices are.

Table. Comparison of functions that return current date and time

Function	Time zone	Datatype returned
CURRENT_DATE	Session	DATE
CURRENT_TIMESTAMP	Session	TIMESTAMP WITH TIME ZONE
LOCALTIMESTAMP	Session	TIMESTAMP
SYSDATE	Database server	DATE
SYSTIMESTAMP	Database server	TIMESTAMP WITH TIME ZONE

The Oracle CURRENT_DATE function returns the current date and time in the session time zone, as a value of datatype DATE in the Gregorian calendar. The format in which the date is displayed depends on the NLS_DATE_FORMAT parameter. Its default is derived from NLS_TERRITORY; for the AMERICA territory it is DD-MON-RR, which shows a 2-digit day, a three-character month abbreviation, and a 2-digit year.

Example Usage:

CURRENT_DATE

SYSDATE

SELECT TO_CHAR(CURRENT_DATE, 'DD-MON-YYYY HH24:MI:SS') FROM dual;

SELECT TO_CHAR(SYSDATE, 'DD-MON-YYYY HH24:MI:SS') FROM dual;

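CURRENT_DATE gives the current date and time adjusted to the session time zone, but a DATE value carries no time zone information. To see the time zone itself, use CURRENT_TIMESTAMP (session time zone) or SYSTIMESTAMP (database server time zone).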

To see the current system date and time with fractional seconds and the time zone, use the following statement:

SELECT SYSTIMESTAMP FROM dual;
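The five functions from the comparison table can be put side by side to make the session versus server distinction visible. The sketch below assumes you are allowed to change the session time zone; the offset '+05:30' is an arbitrary value picked for illustration, and the column aliases are made up.

```sql
-- Move the session to an arbitrary offset so it can differ from the server.
ALTER SESSION SET TIME_ZONE = '+05:30';

SELECT TO_CHAR(SYSDATE,      'DD-MON-YYYY HH24:MI:SS') AS server_date,   -- server clock, no zone
       TO_CHAR(CURRENT_DATE, 'DD-MON-YYYY HH24:MI:SS') AS session_date,  -- session clock, no zone
       SYSTIMESTAMP      AS server_ts_tz,   -- TIMESTAMP WITH TIME ZONE (server)
       CURRENT_TIMESTAMP AS session_ts_tz,  -- TIMESTAMP WITH TIME ZONE (session)
       LOCALTIMESTAMP    AS session_ts      -- TIMESTAMP, session time, no zone
FROM   dual;
```

If the server runs in a different time zone than the session, the two DATE columns should differ by the offset between them, while the two WITH TIME ZONE columns describe the same instant with different zone suffixes.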

For example, to see the current date in the format "Wednesday, 18th September, 2019", use the following syntax:

SELECT TO_CHAR(SYSDATE, 'Day, ddth Month, yyyy') "Today" FROM dual;

This will return the following (assuming today is 9/18/2019):

TODAY
-------------------------------
Wednesday, 18th September, 2019
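The example above looks tidy only because "Wednesday" and "September" happen to be the longest English day and month names: the Day and Month format elements are blank-padded to the longest name. A small sketch of the FM (fill mode) modifier, which suppresses that padding, is shown below. It uses a fixed date literal instead of SYSDATE so the result is repeatable, and assumes the session date language is English.

```sql
-- 30 September 2019 was a Monday, a day name shorter than nine characters.
SELECT TO_CHAR(DATE '2019-09-30', 'Day, ddth Month, yyyy')   AS padded,   -- blanks after "Monday"
       TO_CHAR(DATE '2019-09-30', 'fmDay, ddth Month, yyyy') AS trimmed   -- Monday, 30th September, 2019
FROM   dual;
```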

Oracle / PLSQL: SYS_CONTEXT Function

Description

The Oracle/PLSQL SYS_CONTEXT function can be used to retrieve information about the Oracle environment.

Syntax

The syntax for the SYS_CONTEXT function in Oracle/PLSQL is:

SYS_CONTEXT( namespace, parameter [, length] )

Parameters or Arguments

namespace: An Oracle namespace that has already been created. If the namespace of 'USERENV' is used, attributes describing the current Oracle session can be returned.
parameter: A valid attribute that has been set using the DBMS_SESSION.set_context procedure.
length: Optional. It is the length of the return value in bytes. If this parameter is omitted or if an invalid entry is provided, the sys_context function will default to 256 bytes.

Returns

The SYS_CONTEXT function returns a string value.

Note

The valid parameters for the namespace called 'USERENV' are as follows: (Note that not all parameters are valid in all versions of Oracle)

Parameter	Explanation	Oracle 9i	Oracle 10g	Oracle 11g
ACTION	Returns the position in the module	No	Yes	Yes
AUDITED_CURSORID	Returns the cursor ID of the SQL that triggered the audit	Yes	Yes	Yes
AUTHENTICATED_IDENTITY	Returns the identity used in authentication	No	Yes	Yes
AUTHENTICATION_DATA	Authentication data	Yes	Yes	Yes
AUTHENTICATION_METHOD	Returns the method of authentication	No	Yes	Yes
AUTHENTICATION_TYPE	Describes how the user was authenticated. It can be one of the following values: Database, OS, Network, or Proxy	Yes	No	No
BG_JOB_ID	If the session was established by an Oracle background process, this parameter will return the Job ID. Otherwise, it will return NULL.	Yes	Yes	Yes
CLIENT_IDENTIFIER	Returns the client identifier (global context)	Yes	Yes	Yes
CLIENT_INFO	User session information	Yes	Yes	Yes
CURRENT_BIND	Bind variables for fine-grained auditing	No	Yes	Yes
CURRENT_SCHEMA	Returns the name of the currently active default schema	Yes	Yes	Yes
CURRENT_SCHEMAID	Returns the identifier of the currently active default schema	Yes	Yes	Yes
CURRENT_SQL	Returns the SQL that triggered the audit event	Yes	Yes	Yes
CURRENT_SQL_LENGTH	Returns the length of the current SQL statement that triggered the audit event	No	Yes	Yes
CURRENT_USER	Name of the current user	Yes	No	No
CURRENT_USERID	Userid of the current user	Yes	No	No
DB_DOMAIN	Domain of the database from the DB_DOMAIN initialization parameter	Yes	Yes	Yes
DB_NAME	Name of the database from the DB_NAME initialization parameter	Yes	Yes	Yes
DB_UNIQUE_NAME	Name of the database from the DB_UNIQUE_NAME initialization parameter	No	Yes	Yes
ENTRYID	Available auditing entry identifier	Yes	Yes	Yes
ENTERPRISE_IDENTITY	Returns the user's enterprise-wide identity	No	Yes	Yes
EXTERNAL_NAME	External name of the database user	Yes	No	No
FG_JOB_ID	If the session was established by a client foreground process, this parameter will return the Job ID. Otherwise, it will return NULL.	Yes	Yes	Yes
GLOBAL_CONTEXT_MEMORY	The amount of memory used in the System Global Area by the globally accessed context	Yes	Yes	Yes
GLOBAL_UID	The global user ID from Oracle Internet Directory for enterprise security logins. Returns NULL for all other logins.	No	No	Yes
HOST	Name of the host machine from which the client has connected	Yes	Yes	Yes
IDENTIFICATION_TYPE	Returns the way the user's schema was created	No	Yes	Yes
INSTANCE	The identifier number of the current instance	Yes	Yes	Yes
INSTANCE_NAME	The name of the current instance	No	Yes	Yes
IP_ADDRESS	IP address of the machine from which the client has connected	Yes	Yes	Yes
ISDBA	Returns TRUE if the user has DBA privileges. Otherwise, it will return FALSE.	Yes	Yes	Yes
LANG	The ISO abbreviation for the language	Yes	Yes	Yes
LANGUAGE	The language, territory, and character set of the session, in the following format: language_territory.characterset	Yes	Yes	Yes
MODULE	Returns the application name set through the DBMS_APPLICATION_INFO package or OCI	No	Yes	Yes
NETWORK_PROTOCOL	Network protocol used	Yes	Yes	Yes
NLS_CALENDAR	The calendar of the current session	Yes	Yes	Yes
NLS_CURRENCY	The currency of the current session	Yes	Yes	Yes
NLS_DATE_FORMAT	The date format for the current session	Yes	Yes	Yes
NLS_DATE_LANGUAGE	The language used for dates	Yes	Yes	Yes
NLS_SORT	BINARY or the linguistic sort basis	Yes	Yes	Yes
NLS_TERRITORY	The territory of the current session	Yes	Yes	Yes
OS_USER	The OS username for the user logged in	Yes	Yes	Yes
POLICY_INVOKER	The invoker of row-level security policy functions	No	Yes	Yes
PROXY_ENTERPRISE_IDENTITY	The Oracle Internet Directory DN when the proxy user is an enterprise user	No	Yes	Yes
PROXY_GLOBAL_UID	The global user ID from Oracle Internet Directory for enterprise user security proxy users. Returns NULL for all other proxy users.	No	Yes	Yes
PROXY_USER	The name of the user who opened the current session on behalf of SESSION_USER	Yes	Yes	Yes
PROXY_USERID	The identifier of the user who opened the current session on behalf of SESSION_USER	Yes	Yes	Yes
SERVER_HOST	The host name of the machine where the instance is running	No	Yes	Yes
SERVICE_NAME	The name of the service that the session is connected to	No	Yes	Yes
SESSION_USER	The database user name of the user logged in	Yes	Yes	Yes
SESSION_USERID	The database identifier of the user logged in	Yes	Yes	Yes
SESSIONID	The identifier of the auditing session	Yes	Yes	Yes
SID	Session number	No	Yes	Yes
STATEMENTID	The auditing statement identifier	No	Yes	Yes
TERMINAL	The OS identifier of the current session	Yes	Yes	Yes

Applies To

The SYS_CONTEXT function can be used in the following versions of Oracle/PLSQL:

Oracle 12c, Oracle 11g, Oracle 10g, Oracle 9i, Oracle 8i

Example

Let's look at some Oracle SYS_CONTEXT function examples and explore how to use the SYS_CONTEXT function in Oracle/PLSQL.
For example:

SYS_CONTEXT('USERENV', 'NLS_DATE_FORMAT')
Result: 'RR-MM-DD'

SYS_CONTEXT('USERENV', 'NLS_SORT')
Result: 'BINARY'
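In practice SYS_CONTEXT is usually called from a query or a PL/SQL expression. The sketch below reads a few of the USERENV parameters listed in the table above in a single statement; the column aliases are made up for illustration and the values returned depend entirely on your own session.

```sql
-- Who am I, which schema am I resolving names in, and where did I connect from?
SELECT SYS_CONTEXT('USERENV', 'SESSION_USER')   AS sess_user,
       SYS_CONTEXT('USERENV', 'CURRENT_SCHEMA') AS cur_schema,
       SYS_CONTEXT('USERENV', 'DB_NAME')        AS database_name,
       SYS_CONTEXT('USERENV', 'HOST')           AS client_host,
       SYS_CONTEXT('USERENV', 'OS_USER')        AS os_user
FROM   dual;
```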

Oracle: CHAR / VARCHAR2

Difference between CHAR and VARCHAR2 In Oracle

Oracle Database offers several character data types. Among them, CHAR and VARCHAR2 are used far more widely in PL/SQL programming than types such as VARCHAR or CLOB, so it is important to understand the difference between the two. It is also one of the most frequently asked Oracle interview questions.

Difference between CHAR and VARCHAR2

VARCHAR2	CHAR
A VARCHAR2 variable in PL/SQL has a maximum length of 32767 bytes (a table column is limited to 4000 bytes unless extended data types are enabled)	A CHAR table column has a maximum length of 2000 bytes
Values are stored as entered, without padding for the unused length	Values shorter than the declared length are padded on the right with spaces
VARCHAR2 is used mainly for variable length character strings	CHAR is generally used for fixed length character strings, for example, a PIN code
It is mandatory to specify the length of a VARCHAR2	The length of a CHAR defaults to 1 if it is not specified

Difference between CHAR and VARCHAR2 with example

Let us declare two variables holding the same string but with different data types. For example:

SET SERVEROUTPUT ON
DECLARE
  lv_char      CHAR(8) := 'Hello';      -- blank-padded to 8 characters
  lv_varchar2  VARCHAR2(8) := 'Hello';  -- stored as entered, 5 characters
  len_char     NUMBER(2);
  len_varchar2 NUMBER(2);
BEGIN
  len_char     := LENGTH(lv_char);
  len_varchar2 := LENGTH(lv_varchar2);
  DBMS_OUTPUT.PUT_LINE('Length of CHAR -> ' || len_char);
  DBMS_OUTPUT.PUT_LINE('Length of VARCHAR2 -> ' || len_varchar2);
END;
/

Output

Length of CHAR -> 8
Length of VARCHAR2 -> 5
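The padding matters for more than LENGTH. As a hedged sketch, the block below compares the same two kinds of variable: when either side of a comparison is a VARCHAR2, Oracle uses nonpadded comparison semantics, so the trailing blanks in the CHAR value count and the first test is expected to fail until the padding is trimmed. The messages printed are made up for this illustration.

```sql
SET SERVEROUTPUT ON
DECLARE
  lv_char     CHAR(8)     := 'Hello';   -- holds 'Hello   '
  lv_varchar2 VARCHAR2(8) := 'Hello';   -- holds 'Hello'
BEGIN
  -- Trailing blanks are significant because one operand is a VARCHAR2.
  IF lv_char = lv_varchar2 THEN
    DBMS_OUTPUT.PUT_LINE('Equal as stored');
  ELSE
    DBMS_OUTPUT.PUT_LINE('Not equal as stored');
  END IF;

  -- Removing the padding makes the two values comparable.
  IF RTRIM(lv_char) = lv_varchar2 THEN
    DBMS_OUTPUT.PUT_LINE('Equal after RTRIM');
  END IF;
END;
/
```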

Which data type is better?

It is a long-standing myth among some programmers that CHAR performs better than VARCHAR2. There is no performance penalty for using VARCHAR2 for fixed length character strings.

I personally recommend using VARCHAR2 instead of CHAR, because it avoids wasting disk space on padding if your assumption that the values are fixed length turns out to be wrong.

Hope you like this article and find it useful. We highly appreciate comments and feedback.

Oracle Forms: KEY-NEXT-ITEM / POST-TEXT-ITEM

Difference Between KEY-NEXT-ITEM and POST-TEXT-ITEM Triggers in Oracle Forms

You have probably come across the tricky interview question about the difference between the KEY-NEXT-ITEM and POST-TEXT-ITEM triggers in Oracle Forms. The question is also common in Oracle Apps Technical interviews. If both triggers appear to do the same job, why does Oracle provide them separately? Let us look at the differences between them in this article.

Difference Between KEY-NEXT-ITEM and POST-TEXT-ITEM Triggers

KEY-NEXT-ITEM	POST-TEXT-ITEM
The key-next-item trigger fires only when pressing TAB on the keyboard.	Post-Text-Item fires once you leave the current item to the next item by a keyboard or a mouse.
It fires before the cursor is focused on the next item.	It fires after the cursor is focused on the next item.
Without a key event, this trigger will not be fired.	This trigger fires without a key event.

Please feel free to comment if you find mistakes or errors in any of the topics. Your suggestions help us improve the article.

Oracle PL/SQL: Raise_Application_Error

What is raise_application_error?

raise_application_error is a built-in procedure that lets the developer associate a custom error message with an error number in the range Oracle reserves for applications. It helps you return a meaningful error to your application instead of an unhandled generic exception, so the developer's message is reported just like an Oracle error message.
When raise_application_error raises an exception and nothing handles it, the changes made by the failing statement are rolled back and an error message with an error number (ORA-) is returned to the user. This built-in procedure can raise an exception but cannot handle it.
raise_application_error is defined in the DBMS_STANDARD package. You don’t have to reference the package when calling it.

Syntax of raise_application_error

raise_application_error(error_number, error_message [, {TRUE | FALSE}]);

error_number: A negative integer in the range -20000 to -20999
error_message: User-defined error message, a character string up to 2048 bytes long
TRUE/FALSE: Optional parameter that tells the procedure how to treat the error stack.
If TRUE, the error is added to the stack of previous errors; if FALSE (the default), the error replaces all previous errors.

Example

SQL> create or replace procedure calc_absense (v_absence IN number) as

begin
  if v_absence > 10 then  -- reject values above the allowed limit
    raise_application_error(-20001, 'Employee absence cannot be more than 10');
  end if;

end;
/

Procedure created.

SQL> declare
  v_abs number := &1;
begin
  calc_absense (v_abs);
end;
/

Enter value for 1: 20
 old 2: v_abs number := &1;
 new 2: v_abs number := 20;
 declare
 *
 ERROR at line 1:
 ORA-20001: Employee absence cannot be more than 10
 ORA-06512: at "PUBS.calc_absense ", line 5
 ORA-06512: at line 3

Note: Error numbers outside the range -20000 to -20999 are reserved by Oracle for its own standard error messages.
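Since raise_application_error can only raise, the caller is responsible for handling the error. One common pattern, sketched below, is to bind a named exception to the application error number with PRAGMA EXCEPTION_INIT so it can be caught by name. The exception name e_absence_too_high is made up for this illustration, and the error is raised inline so the block does not depend on any stored procedure.

```sql
SET SERVEROUTPUT ON
DECLARE
  e_absence_too_high EXCEPTION;                       -- hypothetical name
  PRAGMA EXCEPTION_INIT(e_absence_too_high, -20001);  -- bind it to ORA-20001
BEGIN
  RAISE_APPLICATION_ERROR(-20001, 'Employee absence cannot be more than 10');
EXCEPTION
  WHEN e_absence_too_high THEN
    -- SQLERRM returns the message together with its ORA- number.
    DBMS_OUTPUT.PUT_LINE('Caught: ' || SQLERRM);
END;
/
```

With server output enabled this should print "Caught: ORA-20001: Employee absence cannot be more than 10" instead of ending the block with an unhandled error.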

If you liked the above post, please leave your comments below.

Analytic functions by Example

This article explains analytic functions and their various options through a series of simple yet concept-building examples. It is intended for SQL coders who might not be using analytic functions because they are unfamiliar with the cryptic syntax or uncertain about how the functions operate. Often I see people reinvent the features provided by analytic functions with plain joins and sub-queries. The article assumes the reader is familiar with basic Oracle SQL, sub-queries, joins and group functions, and builds the concept of analytic functions on that familiarity.

It is true that whatever an analytic function does can also be done in native SQL with joins and sub-queries. But the same routine written with an analytic function is usually faster, or at least as fast, compared with native SQL. And that is without counting the time spent coding, testing, debugging and tuning the native SQL.
The general syntax of an analytic function is:

Function(arg1,..., argn) OVER ( [PARTITION BY <...>] [ORDER BY <....>] [<window_clause>] )
<window_clause> is like "ROWS <?>" or "RANGE <?>"
All the keywords will be dealt with in detail as we walk through the examples. The script for creating the schema (SCOTT) on which the example queries of this article are run can be found in ORACLE_HOME/sqlplus/demo/demobld.sql of any standard Oracle installation.

How are analytic functions different from group or aggregate functions?

SELECT deptno,
COUNT(*) DEPT_COUNT
FROM emp
WHERE deptno IN (20, 30)
GROUP BY deptno;

DEPTNO                 DEPT_COUNT             
---------------------- ---------------------- 
20                     5                      
30                     6                      

2 rows selected

Query-1

Consider the Query-1 and its result. Query-1 returns departments and their employee count. Most importantly it groups the records into departments in accordance with the GROUP BY clause. As such any non-"group by" column is not allowed in the select clause.

SELECT empno, deptno, 
COUNT(*) OVER (PARTITION BY 
deptno) DEPT_COUNT
FROM emp
WHERE deptno IN (20, 30);

     EMPNO     DEPTNO DEPT_COUNT
---------- ---------- ----------
      7369         20          5
      7566         20          5
      7788         20          5
      7902         20          5
      7876         20          5
      7499         30          6
      7900         30          6
      7844         30          6
      7698         30          6
      7654         30          6
      7521         30          6

11 rows selected.

Query-2

Now consider the analytic function query (Query-2) and its result. Note the repeating values of DEPT_COUNT column.
This brings out the main difference between aggregate and analytic functions. Though analytic functions give aggregate result they do not group the result set. They return the group value multiple times with each record. As such any other non-"group by" column or expression can be present in the select clause, for example, the column EMPNO in Query-2.
Analytic functions are computed after all joins, WHERE clause, GROUP BY and HAVING are computed on the query. The main ORDER BY clause of the query operates after the analytic functions. So analytic functions can only appear in the select list and in the main ORDER BY clause of the query.
In absence of any PARTITION or <window_clause> inside the OVER( ) portion, the function acts on entire record set returned by the where clause. Note the results of Query-3 and compare it with the result of aggregate function query Query-4.

SELECT empno, deptno, 
COUNT(*) OVER ( ) CNT
FROM emp
WHERE deptno IN (10, 20)
ORDER BY 2, 1;

     EMPNO     DEPTNO        CNT
---------- ---------- ----------
      7782         10          8
      7839         10          8
      7934         10          8
      7369         20          8
      7566         20          8
      7788         20          8
      7876         20          8
      7902         20          8

Query-3

SELECT COUNT(*) FROM emp
WHERE deptno IN (10, 20);

  COUNT(*)
----------
         8

Query-4
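Because analytic functions are evaluated after the WHERE clause, they cannot be used to filter rows in the same query block. A common workaround, sketched here against the same EMP table, is to compute the analytic value in an inline view and filter on it in the outer query. The alias dept_avg_sal is made up for this example.

```sql
-- Employees who earn more than the average salary of their own department.
SELECT empno, deptno, sal, dept_avg_sal
FROM  (SELECT empno, deptno, sal,
              AVG(sal) OVER (PARTITION BY deptno) AS dept_avg_sal
       FROM   emp)
WHERE  sal > dept_avg_sal          -- the analytic result is now an ordinary column
ORDER  BY deptno, sal DESC;
```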

How to break the result set in groups or partitions?

It might be obvious from the previous example that the clause PARTITION BY is used to break the result set into groups. PARTITION BY can take any non-analytic SQL expression.
Some functions support the <window_clause> inside the partition to further limit the records they act on. In the absence of any <window_clause> analytic functions are computed on all the records of the partition clause.
The functions SUM, COUNT, AVG, MIN, MAX are the common analytic functions the result of which does not depend on the order of the records.
Functions like LEAD, LAG, RANK, DENSE_RANK, ROW_NUMBER, FIRST, FIRST_VALUE, LAST and LAST_VALUE depend on the order of the records. In the next example we will see how to specify that.

How to specify the order of the records in the partition?

The answer is simple: by the "ORDER BY" clause inside the OVER( ) clause. This is different from the ORDER BY clause of the main query, which comes after WHERE. In this section we introduce each of the very useful functions LEAD, LAG, RANK, DENSE_RANK, ROW_NUMBER, FIRST, FIRST_VALUE, LAST and LAST_VALUE and show how each depends on the order of the records.
The general syntax of specifying the ORDER BY clause in analytic function is:
ORDER BY <sql_expr> [ASC or DESC] NULLS [FIRST or LAST]
The syntax is self-explanatory.

ROW_NUMBER, RANK and DENSE_RANK

All three of these functions assign integer values to the rows depending on their order, which is why they are grouped together.
ROW_NUMBER( ) gives a running serial number to a partition of records. It is very useful in reporting, especially in places where different partitions have their own serial numbers. In Query-5, the function ROW_NUMBER( ) is used to give separate sets of running serial to employees of departments 10 and 20 based on their HIREDATE.

SELECT empno, deptno, hiredate,
ROW_NUMBER( ) OVER (PARTITION BY
deptno ORDER BY hiredate
NULLS LAST) SRLNO
FROM emp
WHERE deptno IN (10, 20)
ORDER BY deptno, SRLNO;

EMPNO  DEPTNO HIREDATE       SRLNO
------ ------- --------- ----------
  7782      10 09-JUN-81          1
  7839      10 17-NOV-81          2
  7934      10 23-JAN-82          3
  7369      20 17-DEC-80          1
  7566      20 02-APR-81          2
  7902      20 03-DEC-81          3
  7788      20 09-DEC-82          4
  7876      20 12-JAN-83          5

8 rows selected.

Query-5 (ROW_NUMBER example)

RANK and DENSE_RANK both rank the records based on some column value or expression. When two records tie at position N, RANK gives both of them position N, skips position N+1 and gives position N+2 to the next record. DENSE_RANK also gives both records position N but does not skip position N+1.
Query-6 shows the usage of both RANK and DENSE_RANK. For DEPTNO 20 there are two contenders for the first position (EMPNO 7788 and 7902). Both RANK and DENSE_RANK declare them joint toppers. RANK skips the next value, 2, and gives the next employee (EMPNO 7566) position 3. For DENSE_RANK there are no such gaps.

SELECT empno, deptno, sal,
RANK() OVER (PARTITION BY deptno
ORDER BY sal DESC NULLS LAST) RANK,
DENSE_RANK() OVER (PARTITION BY
deptno ORDER BY sal DESC NULLS
LAST) DENSE_RANK
FROM emp
WHERE deptno IN (10, 20)
ORDER BY 2, RANK;

EMPNO  DEPTNO   SAL  RANK DENSE_RANK
------ ------- ----- ----- ----------
  7839      10  5000     1          1
  7782      10  2450     2          2
  7934      10  1300     3          3
  7788      20  3000     1          1
  7902      20  3000     1          1
  7566      20  2975     3          2
  7876      20  1100     4          3
  7369      20   800     5          4

8 rows selected.

Query-6 (RANK and DENSE_RANK example)
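A typical use of these ranking functions is a "top N per group" report. As a minimal sketch on the same EMP data, the query below keeps the two highest salary levels in each department; the alias sal_rank is made up. Swapping DENSE_RANK for ROW_NUMBER would instead keep exactly two rows per department, breaking ties arbitrarily.

```sql
-- Employees on the two highest salary levels of each department.
SELECT deptno, empno, sal, sal_rank
FROM  (SELECT deptno, empno, sal,
              DENSE_RANK() OVER (PARTITION BY deptno
                                 ORDER BY sal DESC NULLS LAST) AS sal_rank
       FROM   emp)
WHERE  sal_rank <= 2               -- ties share a rank, so more than 2 rows are possible
ORDER  BY deptno, sal_rank, empno;
```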

LEAD and LAG

LEAD has the ability to compute an expression on the next rows (rows which are going to come after the current row) and return the value to the current row. The general syntax of LEAD is shown below:
LEAD (<sql_expr>, <offset>, <default>) OVER (<analytic_clause>)
<sql_expr> is the expression to compute from the leading row.
<offset> is the index of the leading row relative to the current row.
<offset> is a positive integer with default 1.
<default> is the value to return if the <offset> points to a row outside the partition range.
The syntax of LAG is similar except that the offset for LAG goes into the previous rows.
Query-7 and its result show simple usage of LAG and LEAD function.

SELECT deptno, empno, sal,
LEAD(sal, 1, 0) OVER (PARTITION BY deptno ORDER BY sal DESC NULLS LAST) NEXT_LOWER_SAL,
LAG(sal, 1, 0) OVER (PARTITION BY deptno ORDER BY sal DESC NULLS LAST) PREV_HIGHER_SAL
FROM emp
WHERE deptno IN (10, 20)
ORDER BY deptno, sal DESC;

 DEPTNO  EMPNO   SAL NEXT_LOWER_SAL PREV_HIGHER_SAL
------- ------ ----- -------------- ---------------
     10   7839  5000           2450               0
     10   7782  2450           1300            5000
     10   7934  1300              0            2450
     20   7788  3000           3000               0
     20   7902  3000           2975            3000
     20   7566  2975           1100            3000
     20   7876  1100            800            2975
     20   7369   800              0            1100

8 rows selected.

Query-7 (LEAD and LAG)
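LAG and LEAD are most often used to compare a row with its neighbour. The sketch below, again on EMP, computes how far each salary is below the next higher one in the department. It deliberately omits the <offset> and <default> arguments, so the offset is 1 and the top earner of each department gets NULL rather than 0. The alias gap_to_next_higher is made up.

```sql
SELECT deptno, empno, sal,
       -- previous row in descending salary order = next higher salary
       LAG(sal) OVER (PARTITION BY deptno
                      ORDER BY sal DESC NULLS LAST) - sal AS gap_to_next_higher
FROM   emp
WHERE  deptno IN (10, 20)
ORDER  BY deptno, sal DESC;
```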

FIRST_VALUE and LAST_VALUE functions

The general syntax is:
FIRST_VALUE(<sql_expr>) OVER (<analytic_clause>)
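The FIRST_VALUE analytic function picks the first record from the partition after doing the ORDER BY. The <sql_expr> is computed on the columns of this first record and the result is returned. The LAST_VALUE function is used in a similar context except that it acts on the last record of the window. Note that when ORDER BY is present and no window is specified, the window ends at the current row, so LAST_VALUE sees the last record of the partition only if the window is extended to UNBOUNDED FOLLOWING.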

-- How many days after the first hire of each department were the next
-- employees hired?

SELECT empno, deptno, hiredate - FIRST_VALUE(hiredate)
OVER (PARTITION BY deptno ORDER BY hiredate) DAY_GAP
FROM emp
WHERE deptno IN (20, 30)
ORDER BY deptno, DAY_GAP;

     EMPNO     DEPTNO    DAY_GAP
---------- ---------- ----------
      7369         20          0
      7566         20        106
      7902         20        351
      7788         20        722
      7876         20        756
      7499         30          0
      7521         30          2
      7698         30         70
      7844         30        200
      7654         30        220
      7900         30        286

11 rows selected.

Query-8 (FIRST_VALUE)
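LAST_VALUE is easy to misuse because of the default window. The sketch below contrasts the two forms on the same EMP data: with only an ORDER BY, the window stops at the current row (and its peers), so the first column simply follows the current hire date; with an explicit window reaching UNBOUNDED FOLLOWING, the second column returns the latest hire date of the whole department. Both aliases are made up.

```sql
SELECT empno, deptno, hiredate,
       -- default window: up to the current row, so this tracks the current hiredate
       LAST_VALUE(hiredate) OVER (PARTITION BY deptno
                                  ORDER BY hiredate) AS last_default_window,
       -- explicit window over the whole partition: true last hire of the department
       LAST_VALUE(hiredate) OVER (PARTITION BY deptno
                                  ORDER BY hiredate
                                  ROWS BETWEEN UNBOUNDED PRECEDING
                                           AND UNBOUNDED FOLLOWING) AS last_in_dept
FROM   emp
WHERE  deptno IN (20, 30)
ORDER  BY deptno, hiredate;
```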

FIRST and LAST function

The FIRST function (or more properly the KEEP FIRST function) is used in a very special situation. Suppose we rank a group of records and find several records in the first rank. Now we want to apply an aggregate function to the records of the first rank only. KEEP FIRST enables that.
The general syntax is:
Function( ) KEEP (DENSE_RANK FIRST ORDER BY <expr>) OVER (<partitioning_clause>)
Please note that FIRST and LAST are the only functions that deviate from the general syntax of analytic functions. They do not have the ORDER BY inside the OVER clause, and they do not support any <window_clause>. The ranking done in FIRST and LAST is always DENSE_RANK. The query below shows the usage of the FIRST function. The LAST function is used in a similar context to perform computations on the last ranked records.

-- How does each employee's salary compare with the average salary of the
-- first-year hires of their department?

SELECT empno, deptno, TO_CHAR(hiredate,'YYYY') HIRE_YR, sal,
TRUNC(
AVG(sal) KEEP (DENSE_RANK FIRST
ORDER BY TO_CHAR(hiredate,'YYYY') )
OVER (PARTITION BY deptno)
     ) AVG_SAL_YR1_HIRE
FROM emp
WHERE deptno IN (20, 10)
ORDER BY deptno, empno, HIRE_YR;

     EMPNO     DEPTNO HIRE        SAL AVG_SAL_YR1_HIRE
---------- ---------- ---- ---------- ----------------
      7782         10 1981       2450             3725
      7839         10 1981       5000             3725
      7934         10 1982       1300             3725
      7369         20 1980        800              800
      7566         20 1981       2975              800
      7788         20 1982       3000              800
      7876         20 1983       1100              800
      7902         20 1981       3000              800

8 rows selected.

Query-9 (KEEP FIRST)
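KEEP FIRST and KEEP LAST can also be used as plain aggregates with GROUP BY, without the OVER clause. As a hedged sketch on EMP, the query below returns one row per department with the salary of the earliest and the latest hire; MAX only decides which salary to show if several employees share that hire date. The aliases are made up.

```sql
SELECT deptno,
       MIN(hiredate)                                      AS first_hire,
       MAX(sal) KEEP (DENSE_RANK FIRST ORDER BY hiredate) AS sal_of_first_hire,
       MAX(hiredate)                                      AS last_hire,
       MAX(sal) KEEP (DENSE_RANK LAST  ORDER BY hiredate) AS sal_of_last_hire
FROM   emp
GROUP  BY deptno
ORDER  BY deptno;
```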

How to specify the Window clause (ROW type or RANGE type windows)?

Some analytic functions (AVG, COUNT, FIRST_VALUE, LAST_VALUE, MAX, MIN and SUM among the ones we discussed) can take a window clause to further sub-partition the result and apply the analytic function. An important feature of the windowing clause is that it is dynamic in nature.
The general syntax of the <window_clause> is

[ROWS or RANGE] BETWEEN <start_expr> AND <end_expr>
<start_expr> can be any one of the following

UNBOUNDED PRECEDING
CURRENT ROW
<sql_expr> PRECEDING or FOLLOWING.

<end_expr> can be any one of the following

UNBOUNDED FOLLOWING or
CURRENT ROW or
<sql_expr> PRECEDING or FOLLOWING.

For ROW type windows the definition is in terms of row numbers before or after the current row. So for ROW type windows <sql_expr> must evaluate to a positive integer.
For RANGE type windows the definition is in terms of values before or after the ORDER BY value of the current row. We will take this up in detail later.
ROWS and RANGE cannot appear together in one OVER clause. The window is defined relative to the current row, but it may or may not include the current row: both the start point and the end point can fall before or after the current row. The only restriction is that the start point cannot come after the end point. If the window clause is omitted altogether, the window is the whole partition when there is no ORDER BY inside OVER( ), and RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW when there is one.
If the end point is the current row, the window can be written in terms of the start point only:
[ROWS or RANGE] [<start_expr> PRECEDING or UNBOUNDED PRECEDING]
[ROWS or RANGE] CURRENT ROW is also allowed but is of little use. With ROWS the function then acts only on the current row and behaves as a single-row function.

ROW Type Windows

For analytic functions with ROW type windows, the general syntax is:
Function( ) OVER (PARTITION BY <expr1> ORDER BY <expr2,..> ROWS BETWEEN <start_expr> AND <end_expr>)
or
Function( ) OVER (PARTITION BY <expr1> ORDER BY <expr2,..> ROWS [<start_expr> PRECEDING or UNBOUNDED PRECEDING])
For ROW type windows the windowing clause is in terms of record numbers.
Query-10 has no apparent real-life meaning (except for the column FROM_PU_TO_C), but it illustrates the various windowing clauses with a COUNT(*) function. The count simply shows the number of rows inside the window definition. Note the build-up of the count in each column for the YEAR 1981.
The column FROM_P3_TO_F1 shows an example where the start point of the window is before the current row and the end point is after the current row. This is a 5 row window; it shows values less than 5 at the beginning and the end.

-- The query below has no apparent real-life meaning (except for the
-- column FROM_PU_TO_C) but illustrates the various windowing
-- clauses with a COUNT(*) function.

SELECT empno, deptno, TO_CHAR(hiredate, 'YYYY') YEAR,
COUNT(*) OVER (PARTITION BY TO_CHAR(hiredate, 'YYYY')
ORDER BY hiredate ROWS BETWEEN 3 PRECEDING AND 1 FOLLOWING) FROM_P3_TO_F1,
COUNT(*) OVER (PARTITION BY TO_CHAR(hiredate, 'YYYY')
ORDER BY hiredate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM_PU_TO_C,
COUNT(*) OVER (PARTITION BY TO_CHAR(hiredate, 'YYYY')
ORDER BY hiredate ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING) FROM_P2_TO_P1,
COUNT(*) OVER (PARTITION BY TO_CHAR(hiredate, 'YYYY')
ORDER BY hiredate ROWS BETWEEN 1 FOLLOWING AND 3 FOLLOWING) FROM_F1_TO_F3
FROM emp
ORDER BY hiredate;

 EMPNO  DEPTNO YEAR FROM_P3_TO_F1 FROM_PU_TO_C FROM_P2_TO_P1 FROM_F1_TO_F3
------ ------- ---- ------------- ------------ ------------- -------------
  7369      20 1980             1            1             0             0
  7499      30 1981             2            1             0             3
  7521      30 1981             3            2             1             3
  7566      20 1981             4            3             2             3
  7698      30 1981             5            4             3             3
  7782      10 1981             5            5             3             3
  7844      30 1981             5            6             3             3
  7654      30 1981             5            7             3             3
  7839      10 1981             5            8             3             2
  7900      30 1981             5            9             3             1
  7902      20 1981             4           10             3             0
  7934      10 1982             2            1             0             1
  7788      20 1982             2            2             1             0
  7876      20 1983             1            1             0             0

14 rows selected.

Query-10 (ROW type windowing example)

The column FROM_PU_TO_C shows an example where the start point of the window is before the current row and the end point of the window is the current row. This is the only column with some real-world significance. It can be thought of as the yearly employee build-up of the organization as each employee is hired.
The column FROM_P2_TO_P1 shows an example where start point of the window is before the current row and end point of the window is before the current row. This is a 3 row window and the count remains constant after it has got 3 previous rows.
The column FROM_F1_TO_F3 shows an example where start point of the window is after the current row and end point of the window is after the current row. This is a reverse of the previous column. Note how the count declines during the end.
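The same ROWS windows become more obviously useful with SUM and AVG. The following sketch on EMP computes a running salary total and a three-row moving average in hire order. EMPNO is added to the ORDER BY only to make the row order deterministic when two employees share a hire date, and the aliases are made up.

```sql
SELECT empno, hiredate, sal,
       -- short form: the end point defaults to the current row
       SUM(sal) OVER (ORDER BY hiredate, empno
                      ROWS UNBOUNDED PRECEDING) AS running_total,
       -- current row plus the two rows before it
       ROUND(AVG(sal) OVER (ORDER BY hiredate, empno
                            ROWS BETWEEN 2 PRECEDING AND CURRENT ROW), 2) AS moving_avg_3
FROM   emp
ORDER  BY hiredate, empno;
```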

RANGE Windows

For RANGE windows the general syntax is the same as that of ROW:
Function( ) OVER (PARTITION BY <expr1> ORDER BY <expr2> RANGE BETWEEN <start_expr> AND <end_expr>)
or
Function( ) OVER (PARTITION BY <expr1> ORDER BY <expr2> RANGE [<start_expr> PRECEDING or UNBOUNDED PRECEDING])
For <start_expr> or <end_expr> we can use UNBOUNDED PRECEDING, CURRENT ROW or <sql_expr> PRECEDING or FOLLOWING. However, for RANGE type windows <sql_expr> must evaluate to a value compatible with the ORDER BY expression <expr2>.
<sql_expr> is a logical offset. It must be a constant or expression that evaluates to a positive numeric value or an interval literal. Only one ORDER BY expression is allowed.
If <sql_expr> evaluates to a numeric value, then the ORDER BY expr must be a NUMBER or DATE datatype. If <sql_expr> evaluates to an interval value, then the ORDER BY expr must be a DATE datatype.
Note the example (Query-11) below which uses RANGE windowing. The important thing here is that the size of the window in terms of the number of records can vary.

-- For each employee in departments 20 and 30, count the colleagues in the
-- same department earning at most half of that employee's salary (CNT_LT_HALF)
-- and those earning at least one and a half times that salary (CNT_MT_HALF).

SELECT deptno, empno, sal,
COUNT(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
BETWEEN UNBOUNDED PRECEDING AND (sal/2) PRECEDING) CNT_LT_HALF,
COUNT(*) OVER (PARTITION BY deptno ORDER BY sal RANGE
BETWEEN (sal/2) FOLLOWING AND UNBOUNDED FOLLOWING) CNT_MT_HALF
FROM emp
WHERE deptno IN (20, 30)
ORDER BY deptno, sal;

 DEPTNO  EMPNO   SAL CNT_LT_HALF CNT_MT_HALF
------- ------ ----- ----------- -----------
     20   7369   800           0           3
     20   7876  1100           0           3
     20   7566  2975           2           0
     20   7788  3000           2           0
     20   7902  3000           2           0
     30   7900   950           0           3
     30   7521  1250           0           1
     30   7654  1250           0           1
     30   7844  1500           0           1
     30   7499  1600           0           1
     30   7698  2850           3           0

11 rows selected.

Query-11 (RANGE type windowing example)
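To make the contrast with ROW windows concrete, the sketch below (not part of the original query set; the aliases are illustrative) applies the same bounds once as a ROWS window and once as a RANGE window. It assumes the standard SCOTT.EMP data shown in the Query-11 output.

```sql
-- Sketch only: same bounds, physical (ROWS) versus logical (RANGE) window.
-- The aliases cnt_rows and cnt_range are illustrative.
SELECT empno, sal,
       -- ROWS counts physical rows: the result grows by one per row.
       COUNT(*) OVER (ORDER BY sal
                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)  AS cnt_rows,
       -- RANGE works on the value of sal: rows with an equal salary are
       -- peers and fall inside each other's window.
       COUNT(*) OVER (ORDER BY sal
                      RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cnt_range
FROM   emp
WHERE  deptno = 20
ORDER BY sal, empno;
```

With the data above, the two employees earning 3000 both get a cnt_range of 5, whereas cnt_rows gives 4 to one of them and 5 to the other. Which of the two tied rows receives which value is not deterministic unless the ORDER BY inside the window is made unique.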

Order of computation and performance tips

Defining the PARTITION BY and ORDER BY clauses on indexed columns (with the index columns ordered to match the PARTITION BY clause first and then the ORDER BY clause of the analytic function) will provide optimum performance. For Query-5, for example, a composite index on the (deptno, hiredate) columns will prove effective.
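It is advisable to always use the cost-based optimizer (CBO) for queries using analytic functions. The tables and indexes should be analyzed and, on releases that still offer a choice of optimizer, the optimizer mode should be CHOOSE. From Oracle 10g onwards the CBO is the only supported optimizer and the CHOOSE mode is no longer supported, so keeping the statistics current is what matters.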
Even in the absence of indexes, analytic functions provide acceptable performance, but the database must sort the rows to evaluate the PARTITION BY and ORDER BY clauses. If the query contains multiple analytic functions, avoid partitioning and sorting on two different sets of columns when neither of them is indexed.
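The tips above might be applied along the following lines. This is a hedged sketch rather than a tuning recipe: the index name emp_deptno_hiredate_ix and the column aliases are hypothetical, the query only has the same general shape as one partitioned by deptno and ordered by hiredate, and whether the optimizer actually uses the index depends on the data and the release.

```sql
-- Sketch only: assumes the standard SCOTT.EMP table owned by the current user.

-- Composite index whose column order matches PARTITION BY, then ORDER BY.
-- The index name is hypothetical.
CREATE INDEX emp_deptno_hiredate_ix ON emp (deptno, hiredate);

-- Gather optimizer statistics for the table and its indexes.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER,
                                tabname => 'EMP',
                                cascade => TRUE);
END;
/

-- Both analytic functions use the same PARTITION BY and ORDER BY columns,
-- so they do not ask for two different sort orders.
SELECT deptno, empno, hiredate,
       ROW_NUMBER() OVER (PARTITION BY deptno ORDER BY hiredate) AS hire_seq,
       COUNT(*)     OVER (PARTITION BY deptno ORDER BY hiredate
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS hired_so_far
FROM   emp;
```

Checking the execution plan before and after creating the index is the reliable way to see whether the sort work has really changed.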

Conclusion

The aim of this article is not to push the reader into forcing analytic functions into every complex SQL statement. It is meant for the SQL coder who has been avoiding analytic functions until now, even in complex analytic queries, and painstakingly reinventing the same features with plain SQL and joins. Its job is done if such a person finds analytic functions clear, understandable and usable after going through the article, and starts using them.
