Tuesday, August 8, 2017

SQL JOINS



Inner Join

With an inner join, column values from one row of a table are combined with column values from another row of another (or the same) table to form a single row of data. SQL examines both tables specified for the join to retrieve data from all the rows that meet the search condition for the join. There are two ways of specifying an inner join: using the JOIN syntax, and using the WHERE clause.
Suppose you want to retrieve the employee numbers, names, and project numbers for all employees that are responsible for a project. In other words, you want the EMPNO and LASTNAME columns from the CORPDATA.EMPLOYEE table and the PROJNO column from the CORPDATA.PROJECT table. Only employees with last names starting with 'S' or later should be considered. To find this information, you need to join the two tables.

Inner join using JOIN syntax

To use the inner join syntax, both of the tables you are joining are listed in the FROM clause, along with the join condition that applies to the tables. The join condition is specified after the ON keyword and determines how the two tables are to be compared to each other to produce the join result. The condition can be any comparison operator; it does not need to be the equal operator. Multiple join conditions can be specified in the ON clause separated by the AND keyword. Any additional conditions that do not relate to the actual join are specified in either the WHERE clause or as part of the actual join in the ON clause.

  SELECT EMPNO, LASTNAME, PROJNO
    FROM CORPDATA.EMPLOYEE INNER JOIN CORPDATA.PROJECT
          ON EMPNO = RESPEMP
    WHERE LASTNAME > 'S'
 
In this example, the join is done on the two tables using the EMPNO and RESPEMP columns from the tables. Since only employees that have last names starting with at least 'S' are to be returned, this additional condition is provided in the WHERE clause.
This query returns the following output:

EMPNO
LASTNAME
PROJNO
000250
SMITH
AD3112
000060
STERN
MA2110
000100
SPENSER
OP2010
000020
THOMPSON
PL2100

Inner join using the WHERE clause

Using the WHERE clause to perform this same join is written with both the join condition and the additional selection condition in the WHERE clause. The tables to be joined are listed in the FROM clause, separated by commas.
  SELECT EMPNO, LASTNAME, PROJNO
    FROM CORPDATA.EMPLOYEE, CORPDATA.PROJECT
    WHERE EMPNO = RESPEMP
    AND LASTNAME > 'S'
This query returns the same output as the previous example.

Left Outer Join

A left outer join will return all the rows that an inner join returns plus one row for each of the other rows in the first table that did not have a match in the second table.
Suppose you want to find all employees and the projects they are currently responsible for. You want to see those employees that are not currently in charge of a project as well. The following query will return a list of all employees whose names are greater than 'S', along with their assigned project numbers.
  SELECT EMPNO, LASTNAME, PROJNO
    FROM CORPDATA.EMPLOYEE LEFT OUTER JOIN CORPDATA.PROJECT
          ON EMPNO = RESPEMP
    WHERE LASTNAME > 'S'
The result of this query contains some employees that do not have a project number. They are listed in the query, but have the null value returned for their project number.

EMPNO
LASTNAME
PROJNO
000020
THOMPSON
PL2100
000060
STERN
MA2110
000100
SPENSER
OP2010
000170
YOSHIMURA
-
000180
SCOUTTEN
-
000190
WALKER
-
000250
SMITH
AD3112
000280
SCHNEIDER
-
000300
SMITH
-
000310
SETRIGHT
-
200170
YAMAMOTO
-
200280
SCHWARTZ
-
200310
SPRINGER
-
200330
WONG
-

Notes

Using the RRN scalar function to return the relative record number for a column in the table on the right in a left outer join or exception join will return a value of 0 for the unmatched rows.

Right Outer Join

A right outer join will return all the rows that an inner join returns plus one row for each of the other rows in the second table that did not have a match in the first table. It is the same as a left outer join with the tables specified in the opposite order.
The query that was used as the left outer join example could be rewritten as a right outer join as follows:

  SELECT EMPNO, LASTNAME, PROJNO
    FROM CORPDATA.PROJECT RIGHT OUTER JOIN CORPDATA.EMPLOYEE
          ON EMPNO = RESPEMP
    WHERE LASTNAME > 'S'
The results of this query are identical to the results from the left outer join query.

Using the UNION keyword to combine subselects

Using the UNION keyword, you can combine two or more subselects to form a fullselect. When SQL encounters the UNION keyword, it processes each subselect to form an interim result table, then it combines the interim result table of each subselect and deletes duplicate rows to form a combined result table. You use UNION to merge lists of values from two or more tables. You can use any of the clauses and techniques you have learned so far when coding select-statements.
You can use UNION to eliminate duplicates when merging lists of values obtained from several tables. For example, you can obtain a combined list of employee numbers that includes:
·     People in department D11
·     People whose assignments include projects MA2112, MA2113, and AD3111
The combined list is derived from two tables and contains no duplicates. To do this, specify:
SELECT EMPNO
   FROM CORPDATA.EMPLOYEE
   WHERE WORKDEPT = 'D11'
 UNION
 SELECT EMPNO
   FROM CORPDATA.EMPPROJACT
   WHERE PROJNO = 'MA2112' OR
      PROJNO = 'MA2113' OR
      PROJNO = 'AD3111'
 ORDER BY EMPNO
 
You can also use UNION in a common table expression, nested table expression, or when creating a view. See Creating a view with UNION for details.
To better understand the results from these SQL statements, imagine that SQL goes through the following process:
Step 1. SQL processes the first SELECT statement:
SELECT EMPNO
     FROM CORPDATA.EMPLOYEE
     WHERE WORKDEPT = 'D11'
Which results in an interim result table:
EMPNO from CORPDATA.EMPLOYEE
000060
000150
000160
000170
000180
000190
000200
000210
000220
200170
200220

Step 2. SQL processes the second SELECT statement:
SELECT EMPNO
     FROM CORPDATA.EMPPROJACT
     WHERE PROJNO='MA2112' OR
                 PROJNO= 'MA2113' OR
                 PROJNO= 'AD3111'
 
Which results in another interim result table:
EMPNO from CORPDATA.EMPPROJACT
000230
000230
000240
000230
000230
000240
000230
000150
000170
000190
000170
000190
000150
000160
000180
000170
000210
000210

Step 3. SQL combines the two interim result tables, removes duplicate rows, and orders the result:
SELECT EMPNO 
     FROM CORPDATA.EMPLOYEE
     WHERE WORKDEPT = 'D11'
UNION
SELECT EMPNO 
     FROM CORPDATA.EMPPROJACT
     WHERE PROJNO='MA2112' OR
                 PROJNO= 'MA2113' OR
                 PROJNO= 'AD3111'
ORDER BY EMPNO
 
Which results in a combined result table with values in ascending sequence:
EMPNO
000060
000150
000160
000170
000180
000190
000200
000210
000220
000230
000240
200170
200220

When you use UNION:
·     Any ORDER BY clause must appear after the last subselect that is part of the union. In this example, the results are sequenced on the basis of the first selected column, EMPNO. The ORDER BY clause specifies that the combined result table is to be in collated sequence. ORDER BY is not allowed in a view.
·     A name may be specified on the ORDER BY clause if the result columns are named. A result column is named if the corresponding columns in each of the unioned select-statements have the same name. An AS clause can be used to assign a name to columns in the select list.
·           SELECT A + B AS X ...
·           UNION
·           SELECT X ... ORDER BY X
If the result columns are unnamed, use a positive integer to order the result. The number refers to the position of the expression in the list of expressions you include in your subselects.
  SELECT A + B ...
  UNION
  SELECT X ... ORDER BY 1
To identify which subselect each row is from, you can include a constant at the end of the select list of each subselect in the union. When SQL returns your results, the last column contains the constant for the subselect that is the source of that row. For example, you can specify:
  SELECT A, B, 'A1' ... 
  UNION 
  SELECT X, Y, 'B2'...
When a row is returned, it includes a value (either A1 or B2) to indicate the table that is the source of the row's values. If the column names in the union are different, SQL uses the set of column names specified in the first subselect when interactive SQL displays or prints the results, or in the SQLDA resulting from processing an SQL DESCRIBE statement.


























                  




No comments:

Post a Comment