Buddha Tree: A detailed introduction of SQL Query

A major feature of relational databases, however, is to combine (join) tuples stored in different tables in order to display more meaningful and complete information. In SQL the select statement is used for this kind of queries joining relations:

select [distinct] [.], . . . , [.]
from [], . . . , []
[where ]

The specification of table aliases in the from clause is necessary to refer to columns that have
the same name in different tables. For example, the column DEPTNO occurs in both EMP and
DEPT. If we want to refer to either of these columns in the where or select clause, a table
alias has to be specified and put in the front of the column name. Instead of a table alias also
the complete relation name can be put in front of the column such as DEPT.DEPTNO, but this
sometimes can lead to rather lengthy query formulations.

Joining Relations – Comparisons in the where clause are used to combine rows from the tables listed in the from clause.

Example: In the table EMP only the numbers of the departments are stored, not their name. For each salesman, we now want to retrieve the name as well as the number and the name of the department where he is working:

select ENAME, E.DEPTNO, DNAME from EMP E, DEPT D where E.DEPTNO = D.DEPTNO
and JOB = ’SALESMAN’;

Explanation: E and D are table aliases for EMP and DEPT, respectively. The computation of the
query result occurs in the following manner (without optimization):

1. Each row from the table EMP is combined with each row from the table DEPT (this operation
is called Cartesian product). If EMP contains m rows and DEPT contains n rows, we thus get n * m rows.

2. From these rows those that have the same department number are selected (where E.DEPTNO = D.DEPTNO).

3. From this result finally all rows are selected for which the condition JOB = ’SALESMAN’ holds.
In this example the joining condition for the two tables is based on the equality operator “=”.
The columns compared by this operator are called join columns and the join operation is called
an equijoin.

Any number of tables can be combined in a select statement.
Example: For each project, retrieve its name, the name of its manager, and the name of the department where the manager is working:

select ENAME, DNAME, PNAME from EMP E, DEPT D, PROJECT P where E.EMPNO = P.MGR
and D.DEPTNO = E.DEPTNO;

It is even possible to join a table with itself: Example: List the names of all employees together with the name of their manager:

select E1.ENAME, E2.ENAME from EMP E1, EMP E2 where E1.MGR = E2.EMPNO;

Explanation: The join columns are MGR for the table E1 and EMPNO for the table E2.
The equijoin comparison is E1.MGR = E2.EMPNO.

Subqueries – Up to now we have only concentrated on simple comparison conditions in a where clause, i.e., we have compared a column with a constant or we have compared two columns. As we have already seen for the insert statement, queries can be used for assignments to columns. A query result can also be used in a condition of a where clause. In such a case the query is called a
subquery and the complete select statement is called a nested query. A respective condition in the where clause then can have one of the following forms:

1. Set-valued subqueries
[not] in ()
[any|all] ()
An can either be a column or a computed value.

2. Test for (non)existence
[not] exists ()
In a where clause conditions using subqueries can be combined arbitrarily by using the logical
connectives and and or.

Example: List the name and salary of employees of the department 20 who are leading
a project that started before December 31, 1990:

select ENAME, SAL from EMP where EMPNO in (select PMGR from PROJECT where PSTART < ’31-DEC-90’) and DEPTNO =20;

Explanation: The subquery retrieves the set of those employees who manage a project that
started before December 31, 1990. If the employee working in department 20 is contained in
this set (in operator), this tuple belongs to the query result set.
Example: List all employees who are working in a department located in BOSTON:

select * from EMP where DEPTNO in (select DEPTNO from DEPT where LOC = ’BOSTON’);

The subquery retrieves only one value (the number of the department located in Boston). Thus
it is possible to use “=” instead of in. As long as the result of a subquery is not known in
advance, i.e., whether it is a single value or a set, it is advisable to use the in operator.

A subquery may use again a subquery in its where clause. Thus conditions can be nested
arbitrarily. An important class of subqueries are those that refer to its surrounding (sub)query
and the tables listed in the from clause, respectively. Such type of queries is called correlated
subqueries.
Example: List all those employees who are working in the same department as their manager
(note that components in [ ] are optional:

select * from EMP E1 where DEPTNO in (select DEPTNO from EMP [E] where [E.]EMPNO = E1.MGR);

Explanation: The subquery in this example is related to its surrounding query since it refers to
the column E1.MGR. A tuple is selected from the table EMP (E1) for the query result if the value
for the column DEPTNO occurs in the set of values select in the subquery. One can think of the
evaluation of this query as follows: For each tuple in the table E1, the subquery is evaluated
individually. If the condition where DEPTNO in . . . evaluates to true, this tuple is selected.
Note that an alias for the table EMP in the subquery is not necessary since columns without a
preceding alias listed there always refer to the innermost query and tables.
Conditions of the form [any|all] are used
to compare a given with each value selected by .

• For the clause any, the condition evaluates to true if there exists at least on row selected
by the subquery for which the comparison holds. If the subquery yields an empty result set, the condition is not satisfied.
• For the clause all, in contrast, the condition evaluates to true if for all rows selected by
the subquery the comparison holds. In this case the condition evaluates to true if the subquery does not yield any row or value.
Example: Retrieve all employees who are working in department 10 and who earn at least as much as any (i.e., at least one) employee working in department 30:

select * from EMP where SAL >= any (select SAL from EMP where DEPTNO = 30) and DEPTNO = 10;

Note: Also in this subquery no aliases are necessary since the columns refer to the innermost
from clause.
Example: List all employees who are not working in department 30 and who earn more than
all employees working in department 30:

select * from EMP where SAL > all (select SAL from EMP where DEPTNO = 30) and DEPTNO 30;

For all and any, the following equivalences hold:
in , = any
not in , all or != all

Often a query result depends on whether certain rows do (not) exist in (other) tables. Such
type of queries is formulated using the exists operator.
Example: List all departments that have no employees:

select * from DEPT where not exists (select * from EMP where DEPTNO = DEPT.DEPTNO);

Explanation: For each tuple from the table DEPT, the condition is checked whether there exists
a tuple in the table EMP that has the same department number (DEPT.DEPTNO). In case no such
tuple exists, the condition is satisfied for the tuple under consideration and it is selected. If
there exists a corresponding tuple in the table EMP, the tuple is not selected.

Operations on Result Sets – Sometimes it is useful to combine query results from two or more queries into a single result. SQL supports three set operators which have the pattern:

The set operators are:

• union [all] returns a table consisting of all rows either appearing in the result of or in the result of . Duplicates are automatically eliminated unless the clause all is used.
• intersect returns all rows that appear in both results and .
• minus returns those rows that appear in the result of but not in the result of .

Example: Assume that we have a table EMP2 that has the same structure and columns as the table EMP:
• All employee numbers and names from both tables: select EMPNO, ENAME from EMP
union select EMPNO, ENAME from EMP2;
• Employees who are listed in both EMP and EMP2: select * from EMP intersect
select * from EMP2;
• Employees who are only listed in EMP: select * from EMP minus select * from EMP2;
Each operator requires that both tables have the same data types for the columns to which the
operator is applied.

Grouping – Often applications require grouping rows that have certain properties and then applying an aggregate function on one column for each group separately. For this, SQL provides the clause group by . This clause appears after the where clause and must refer to columns of tables listed in the from clause.
select from where group by [having ];

Those rows retrieved by the selected clause that have the same value(s) for
are grouped. Aggregations specified in the select clause are then applied to each group separately.
It is important that only those columns that appear in the clause
can be listed without an aggregate function in the select clause

Example: For each department, we want to retrieve the minimum and maximum salary.
select DEPTNO, min(SAL), max(SAL) from EMP group by DEPTNO;

Rows from the table EMP are grouped such that all rows in a group have the same department
number. The aggregate functions are then applied to each such group. We thus get the following
query result:

DEPTNO MIN(SAL) MAX(SAL)
10 1300 5000
20 800 3000
30 950 2850

Rows to form a group can be restricted in the where clause. For example, if we add the
condition where JOB = ’CLERK’, only respective rows build a group. The query then would
retrieve the minimum and maximum salary of all clerks for each department. Note that is not
allowed to specify any other column than DEPTNO without an aggregate function in the select
clause since this is the only column listed in the group by clause (is it also easy to see that
other columns would not make any sense).

Once groups have been formed, certain groups can be eliminated based on their properties,
e.g., if a group contains less than three rows. This type of condition is specified using the
having clause. As for the select clause also in a having clause only and
aggregations can be used.

Example: Retrieve the minimum and maximum salary of clerks for each department having
more than three clerks.
select DEPTNO, min(SAL), max(SAL) from EMP where JOB = ’CLERK’ group by DEPTNO
having count(*) > 3;

Note that it is even possible to specify a subquery in a having clause. In the above query, for
example, instead of the constant 3, a subquery can be specified.

A query containing a group by clause is processed in the following way:
1. Select all rows that satisfy the condition specified in the where clause.
2. From these rows form groups according to the group by clause.
3. Discard all groups that do not satisfy the condition in the having clause.
4. Apply aggregate functions to each group.
5. Retrieve values for the columns and aggregations listed in the select clause.

Comments on Tables – Accessing tables of other users, Provided that a user has the privilege to access tables of other users , she/he can refer to these tables in her/his queries. Let be a user in the Oracle system and a table of this user. This table can be accessed by other (privileged) users using the command

select * from .;

In case that one often refers to tables of other users, it is useful to use a synonym instead of
.. In Oracle-SQL a synonym can be created using the command create synonym for . ;

It is then possible to use simply in a from clause. Synonyms can also be created for
one’s own tables.

Adding Comments to Definitions

For applications that include numerous tables, it is useful to add comments on table definitions
or to add comments on columns. A comment on a table can be created using the command
comment on table is ’’;

A comment on a column can be created using the command comment on column . is ’’; Comments on tables and columns are stored in the data dictionary. They can be accessed using the data dictionary views USER TAB COMMENTS and USER COL COMMENTS

Modifying Table- and Column Definitions – It is possible to modify the structure of a table (the relation schema) even if rows have already been inserted into this table. A column can be added using the alter table command alter table add( [default ] []);

If more than only one column should be added at one time, respective add clauses need to be
separated by colons. A table constraint can be added to a table using

alter table add ();

Note that a column constraint is a table constraint, too. not null and primary key constraints
can only be added to a table if none of the specified columns contains a null value. Table
definitions can be modified in an analogous way. This is useful, e.g., when the size of strings
that can be stored needs to be increased. The syntax of the command for modifying a column
is – alter table modify( [] [default ] []);

Note: In earlier versions of Oracle it is not possible to delete single columns from a table definition. A workaround is to create a temporary table and to copy respective columns and rows into this new table. Furthermore, it is not possible to rename tables or columns. In the most recent version (9i), using the alter table command, it is possible to rename a table, columns, and constraints. In this version, there also exists a drop column clause as part of the alter table statement.

Deleting a Table – A table and its rows can be deleted by issuing the command drop table [cascade constraints];.

Buddha Tree

Pages

Tuesday 20 August 2013

A detailed introduction of SQL Query

No comments:

Post a Comment