Micro Codes - "Coding Schools": 2008

Finding the Min/Max Value in a Column : SQL

Problem

You want to find the highest and lowest values in a given column. For example, you want to find the highest and lowest salaries for all employees, as well as the highest and lowest salaries for each department.

Solution

When searching for the lowest and highest salaries for all employees, simply use the functions MIN and MAX, respectively:

 
 1 select min(sal) as min_sal, max(sal) as max_sal
 2   from emp

    MIN_SAL     MAX_SAL
 ----------  ----------
        800        5000

When searching for the lowest and highest salaries for each department, use the functions MIN and MAX with the GROUP BY clause:

 1 select deptno, min(sal) as min_sal, max(sal) as max_sal
 2   from emp
 3  group by deptno

     DEPTNO     MIN_SAL     MAX_SAL
 ----------  ----------  ----------
         10        1300        5000
         20         800        3000
         30         950        2850

Discussion

When searching for the highest or lowest values, and in cases where the whole table is the group or window, simply apply the MIN or MAX function to the column you are interested in without using the GROUP BY clause.

Remember that the MIN and MAX functions ignore NULLs, and that you can have NULL groups as well as NULL values for columns in a group. The following are examples that ultimately lead to a query using GROUP BY that returns NULL values for two groups (DEPTNO 10 and 20):

 
 select deptno, comm
   from emp
  where deptno in (10,30)
  order by 1


      DEPTNO       COMM
  ---------- ----------
          10
          10
          10
          30        300
          30        500
          30
          30          0
          30       1300
          30

 
 select min(comm), max(comm)
   from emp

  MIN(COMM)   MAX(COMM)
 ----------  ----------
          0        1300

 
 select deptno, min(comm), max(comm)
   from emp
  group by deptno

      DEPTNO  MIN(COMM)   MAX(COMM)
  ---------- ----------  ----------
          10
          20
          30          0        1300

Remember, Even if nothing other than aggregate functions are listed in the SELECT clause, you can still group by other columns in the table; for example:

 select min(comm), max(comm)
   from emp
  group by deptno

  MIN(COMM)   MAX(COMM)
 ----------  ----------
          0        1300

Here you are still grouping by DEPTNO even though it is not in the SELECT clause. Including the column you are grouping by in the SELECT clause often improves readability, but is not mandatory. It is mandatory, however, that any column in the SELECT list of a GROUP BY query also be listed in the GROUP BY clause.

** If you want the Full Table detail. Refer the SQL Table in Label List. Or Click here to View the Table

Computing an Average : SQL Tutorial

Problem

You want to compute the average value in a column, either for all rows in a table or for some subset of rows. For example, you might want to find the average salary for all employees as well as the average salary for each department.

Solution

When computing the average of all employee salaries, simply apply the AVG function to the column containing those salaries. By excluding a WHERE clause, the average is computed against all non-NULL values:

 
 1 select avg(sal) as avg_sal
 2   from emp

    AVG_SAL
 ----------
 2073.21429

To compute the average salary for each department, use the GROUP BY clause to create a group corresponding to each department:

 
 1 select deptno, avg(sal) as avg_sal
 2   from emp
 3  group by deptno

     DEPTNO     AVG_SAL
 ----------  ----------
         10  2916.66667
         20        2175
         30  1566.66667

Discussion

When finding an average where the whole table is the group or window, simply apply the AVG function to the column you are interested in without using the GROUP BY clause. It is important to realize that the function AVG ignores NULLs. The effect of NULL values being ignored can be seen here:

 create table t2(sal integer)
 insert into t2 values (10)
 insert into t2 values (20)
 insert into t2 values (null)

 
 select avg(sal)    select distinct 30/2
   from t2            from t2

   AVG(SAL)               30/2
 ----------         ----------
         15                 15

 
 select avg(coalesce(sal,0))    select distinct 30/3
   from t2                        from t2
   
 AVG(COALESCE(SAL,0))                 30/3
 --------------------           ----------
                   10                   10

The COALESCE function will return the first non-NULL value found in the list of values that you pass. When NULL SAL values are converted to zero, the average changes. When invoking aggregate functions, always give thought to how you want NULLs handled.

The second part of the solution uses GROUP BY (line 3) to divide employee records into groups based on department affiliation. GROUP BY automatically causes aggregate functions such as AVG to execute and return a result for each group. In this example, AVG would execute once for each department-based group of employee records.

It is not necessary, by the way, to include GROUP BY columns in your select list. For example:

 
 select avg(sal)
   from emp
  group by deptno

   AVG(SAL)
 ----------
 2916.66667
       2175
 1566.66667

You are still grouping by DEPTNO even though it is not in the SELECT clause. Including the column you are grouping by in the SELECT clause often improves readability, but is not mandatory. It is mandatory, however, to avoid placing columns in your SELECT list that are not also in your GROUP BY clause.

** If you want the Full Table detail. Refer the SQL Table in Label List. Or Click here to View the Table

Parsing an IP Address in SQL Server

Problem

You want to parse an IP address's fields into columns. Consider the following IP address:

 111.22.3.4

You would like the result of your query to be:

 A     B    C      D
 ----- ----- ----- ---
 111   22    3     4

Solution

The solution depends on the built-in functions provided by your DBMS. Regardless of your DBMS, being able to locate periods and the numbers immediately surrounding them are the keys to the solution.

DB2

Use the recursive WITH clause to simulate an iteration through the IP address while using SUBSTR to easily parse it. A leading period is added to the IP address so that every set of numbers has a period in front of it and can be treated the same way.

  1 with x (pos,ip) as (
  2   values (1,'.92.111.0.222')
  3   union all
  4  select pos+1,ip from x where pos+1 <= 20
  5 )
  6 select max(case when rn=1 then e end) a,
  7        max(case when rn=2 then e end) b,
  8        max(case when rn=3 then e end) c,
  9        max(case when rn=4 then e end) d
 10   from (
 11 select pos,c,d,
 12        case when posstr(d,'.') > 0 then substr(d,1,posstr(d,'.')-1)
 13             else d
 14        end as e,
 15        row_number() over( order by pos desc) rn
 16   from (
 17 select pos, ip,right(ip,pos) as c, substr(right(ip,pos),2) as d
 18   from x
 19  where pos <= length(ip)
 20   and substr(right(ip,pos),1,1) = '.'
 21      ) x
 22      ) y

MySQL

The function SUBSTR_INDEX makes parsing an IP address an easy operation:

 1 select substring_index(substring_index(y.ip,'.',1),'.',-1) a,
 2        substring_index(substring_index(y.ip,'.',2),'.',-1) b,
 3        substring_index(substring_index(y.ip,'.',3),'.',-1) c,
 4        substring_index(substring_index(y.ip,'.',4),'.',-1) d
 5   from (select '92.111.0.2' as ip from t1) y

Oracle

Use the built-in function SUBSTR and INSTR to parse and navigate through the IP address:

 1 select ip,
 2        substr(ip, 1, instr(ip,'.')-1 ) a,
 3        substr(ip, instr(ip,'.')+1,
 4                    instr(ip,'.',1,2)-instr(ip,'.')-1 ) b,
 5        substr(ip, instr(ip,'.',1,2)+1,
 6                    instr(ip,'.',1,3)-instr(ip,'.',1,2)-1 ) c,
 7        substr(ip, instr(ip,'.',1,3)+1 ) d
 8   from (select '92.111.0.2' as ip from t1)

PostgreSQL

Use the built-in function SPLIT_PART to parse an IP address:

 1 select split_part(y.ip,'.',1) as a,
 2        split_part(y.ip,'.',2) as b,
 3        split_part(y.ip,'.',3) as c,
 4        split_part(y.ip,'.',4) as d
 5   from (select cast('92.111.0.2' as text) as ip from t1) as y

SQL Server

  1  with x (pos,ip) as (
  2    select 1 as pos,'.92.111.0.222' as ip from t1
  3    union all
  4   select pos+1,ip from x where pos+1 <= 20
  5  )
  6  select max(case when rn=1 then e end) a,
  7         max(case when rn=2 then e end) b,
  8         max(case when rn=3 then e end) c,
  9         max(case when rn=4 then e end) d
 10    from  (
 11  select pos,c,d,
 12         case when charindex('.',d) > 0
 13            then substring(d,1,charindex('.',d)-1)
 14            else d
 15        end as e,
 16        row_number() over(order by pos desc) rn
 17   from (
 18 select pos, ip,right(ip,pos) as c,
 19        substring(right(ip,pos),2,len(ip)) as d
 20   from x
 21  where pos <= len(ip)
 22    and substring(right(ip,pos),1,1) = '.'
 23       ) x
 24       ) y

Discussion

By using the built-in functions for your database, you can easily walk through parts of a string. The key is being able to locate each of the periods in the address. Then you can parse the numbers between each.

** If you want the Full Table detail. Refer the SQL Table in Label List. Or Click here to View the Table

Extracting the nth Delimited Substring : SQL

Problem

You want to extract a specified, delimited substring from a string. Consider the following view V, which generates source data for this problem:

 create view V as
 select 'mo,larry,curly' as name
   from t1
  union all
 select 'tina,gina,jaunita,regina,leena' as name
   from t1

Output from the view is as follows:

 
 select * from v

 NAME
 -------------------
 mo,larry,curly
 tina,gina,jaunita,regina,leena

You would like to extract the second name in each row, so the final result set would be:

   SUB
 -----
 larry
 gina

Solution

The key to solving this problem is to return each name as an individual row while preserving the order in which the name exists in the list. Exactly how you do these things depends on which DBMS you are using.

DB2

After walking the NAMEs returned by view V, use the function ROW_NUMBER to keep only the second name from each string:

  1 select substr(c,2,locate(',',c,2)-2)
  2   from (
  3 select pos, name, substr(name, pos) c,
  4        row_number() over( partition by name
  5                       order by length(substr(name,pos)) desc) rn
  6  from (
  7 select ',' ||csv.name|| ',' as name,
  8         cast(iter.pos as integer) as pos
  9   from V csv,
 10        (select row_number( ) over( ) pos from t100 ) iter
 11  where iter.pos <= length(csv.name)+2
 12        ) x
 13  where length(substr(name,pos)) > 1
 14   and substr(substr(name,pos),1,1) = ','
 15        ) y
 16  where rn = 2

MySQL

After walking the NAMEs returned by view V, use the position of the commas to return only the second name in each string:

  1 select name
  2   from (
  3 select iter.pos,
  4        substring_index(
  5        substring_index(src.name,',',iter.pos),',',-1) name
  6   from V src,
  7        (select id pos from t10) iter,
  8  where iter.pos <=
  9        length(src.name)-length(replace(src.name,',',''))
 10        ) x
 11  where pos = 2

Oracle

After walking the NAMEs returned by view V, retrieve the second name in each list by using SUBSTR and INSTR:

  1 select sub
  2   from (
  3 select iter.pos,
  4        src.name,
  5        substr( src.name,
  6         instr( src.name,',',1,iter.pos )+1,
  7         instr( src.name,',',1,iter.pos+1 ) -
  8         instr( src.name,',',1,iter.pos )-1) sub
  9   from (select ','||name||',' as name from V) src,
 10        (select rownum pos from emp) iter
 11  where iter.pos <>
 12        )
 13  where pos = 2

PostgreSQL

Use the function SPLIT_PART to help return each individual name as a row:

  1 select name
  2   from (
  3 select iter.pos, split_part(src.name,',',iter.pos) as name
  4   from (select id as pos from t10) iter,
  5        (select cast(name as text) as name from v) src
  7  where iter.pos <=
  8         length(src.name)-length(replace(src.name,',',''))+1
  9        ) x
 10  where pos = 2

SQL Server

After walking the NAMEs returned by view V, use the function ROW_NUMBER to keep only the second name from each string:

  1 select substring(c,2,charindex(',',c,2)-2)
  2   from (
  3 select pos, name, substring(name, pos, len(name)) as c,
  4        row_number() over(
  5         partition by name
  6         order by len(substring(name,pos,len(name))) desc) rn
  7   from (
  8 select ',' + csv.name + ',' as name,
  9        iter.pos
 10   from V csv,
 11        (select id as pos from t100 ) iter
 12  where iter.pos <= len(csv.name)+2
 13        ) x
 14  where len(substring(name,pos,len(name))) > 1
 15    and substring(substring(name,pos,len(name)),1,1) = ','
 16        ) y
 17  where rn = 2

Discussion

DB2 and SQL Server

The syntax is slightly different between these two DBMSs, but the technique is the same. I will use the solution for DB2 for the discussion. The strings are walked and the results are represented by inline view X:

 
 select ','||csv.name|| ',' as name,
         iter.pos
   from v csv,
         (select row_number() over( ) pos from t100 ) iter
  where iter.pos <= length(csv.name)+2

 EMPS                             POS
 ------------------------------- ----
 ,tina,gina,jaunita,regina,leena,    1
 ,tina,gina,jaunita,regina,leena,    2
 ,tina,gina,jaunita,regina,leena,    3
 …

The next step is to then step through each character in each string:

 
 select pos, name, substr(name, pos) c,
         row_number() over(partition by name
                         order by length(substr(name, pos)) desc) rn
   from (
 select ','||csv.name||',' as name,
        cast(iter.pos as integer) as pos
   from v csv,
       (select row_number() over() pos from t100 ) iter
  where iter.pos <= length(csv.name)+2
        ) x
  where length(substr(name,pos)) > 1

 POS EMPS            C                RN
 --- --------------- ---------------- --
   1 ,mo,larry,curly, ,mo,larry,curly, 1
   2 ,mo,larry,curly, mo,larry,curly,  2
   3 ,mo,larry,curly, o,larry,curly,   3
   4 ,mo,larry,curly, ,larry,curly,    4
   …

Now that different portions of the string are available to you, simply identify which rows to keep. The rows you are interested in are the ones that begin with a comma; the rest can be discarded:

 
 select pos, name, substr(name,pos) c,
        row_number() over(partition by name
                        order by length(substr(name, pos)) desc) rn
   from (
 select ','||csv.name||',' as name,
        cast(iter.pos as integer) as pos
   from v csv,
        (select row_number() over( ) pos from t100 ) iter 
  where iter.pos <= length(csv.name)+2
        ) x 
  where length(substr(name,pos)) > 1 
     and substr(substr(name,pos),1,1) = ','

 POS  EMPS                               C                                    RN
  ---  --------------                     ----------------                      --
    1  ,mo,larry,curly,                   ,mo,larry,curly,                       1
    4  ,mo,larry,curly,                   ,larry,curly,                          2
   10  ,mo,larry,curly,                   ,curly,                                3
    1  ,tina,gina,jaunita,regina,leena,   ,tina,gina,jaunita,regina,leena,       1
    6  ,tina,gina,jaunita,regina,leena,   ,gina,jaunita,regina,leena,            2
   11  ,tina,gina,jaunita,regina,leena,   ,jaunita,regina,leena,                 3
   19  ,tina,gina,jaunita,regina,leena,   ,regina,leena,                         4
   26  ,tina,gina,jaunita,regina,leena,   ,leena,                                5

This is an important step as it sets up how you will get the nth substring. Notice that many rows have been eliminated from this query because of the following condition in the WHERE clause:

 substr(substr(name,pos),1,1) = ','

You'll notice that ,larry,curly, was ranked 4, but now is ranked 2. Remember, the WHERE clause is evaluated before the SELECT, so the rows with leading commas are kept, then ROW_NUMBER performs its ranking. At this point it's easy to see that, to get the nth substring you want rows where RN equals n. The last step is to keep only the rows you are interested in (in this case where RN equals 2) and use SUBSTR to extract the name from that row. The name to keep is the first name in the row: larry from ,larry,curly, and gina from ,gina,jaunita,regina,leena,.

MySQL

The inline view X walks each string. You can determine how many values are in each string by counting the delimiters in the string:

 
 select iter.pos, src.name
   from (select id pos from t10) iter,
        V src
  where iter.pos <=
        length(src.name)-length(replace(src.name,',',''))

 +------+--------------------------------+
 | pos  | name                           |
 +------+--------------------------------+
 |    1 | mo,larry,curly                 |
 |    2 | mo,larry,curly                 |
 |    1 | tina,gina,jaunita,regina,leena |
 |    2 | tina,gina,jaunita,regina,leena |
 |    3 | tina,gina,jaunita,regina,leena |
 |    4 | tina,gina,jaunita,regina,leena |
 +------+--------------------------------+

In this case, there is one fewer row than values in each string because that's all that is needed. The function SUBSTRING_INDEX takes care of parsing the needed values:

 
 select iter.pos,src.name name1,
         substring_index(src.name,',',iter.pos) name2,
         substring_index(
          substring_index(src.name,',',iter.pos),',',-1) name3
    from (select id pos from t10) iter,
         V src
   where  iter.pos <=
         length(src.name)-length(replace(src.name,',',''))

+------+--------------------------------+--------------------------+---------+ 
| pos  | name1                        | name2                      | name3   | 
+------+--------------------------------+--------------------------+---------+
|    1 | mo,larry,curly                 | mo                       | mo      |
|    2 | mo,larry,curly                 | mo,larry                 | larry   |
|    1 | tina,gina,jaunita,regina,leena | tina                     | tina    |
|    2 | tina,gina,jaunita,regina,leena | tina,gina                | gina    |
|    3 | tina,gina,jaunita,regina,leena | tina,gina,jaunita        | jaunita |
|    4 | tina,gina,jaunita,regina,leena | tina,gina,jaunita,regina | regina  |
+------+--------------------------------+--------------------------+---------+

I've shown three name fields, so you can see how the nested SUBSTRING_INDEX calls work. The inner call returns all characters to the left of the nth occurrence of a comma. The outer call returns everything to the right of the first comma it finds (starting from the end of the string). The final step is to keep the value for NAME3 where POS equals n, in this case 2.

Oracle

The inline view walks each string. The number of times each string is returned is determined by how many values are in each string. The solution finds the number of values in each string by counting the number of delimiters in it. Because each string is enclosed in commas, the number of values in a string is the number of commas minus one. The strings are then UNIONed and joined to a table with a cardinality that is at least the number of values in the largest string. The functions SUBSTR and INSTR use the value of POS to parse each string:

 
 select iter.pos, src.name,
        substr( src.name,
         instr( src.name,',',1,iter.pos )+1,
         instr( src.name,',',1,iter.pos+1 )
         instr( src.name,',',1,iter.pos )-1) sub
   from (select ','||name||',' as name from v) src,
        (select rownum pos from emp) iter
  where iter.pos <>

 POS NAME                              SUB
 --- --------------------------------- -------------
   1 ,mo,larry,curly,                  mo
   1 , tina,gina,jaunita,regina,leena, tina
   2 ,mo,larry,curly,                  larry
   2 , tina,gina,jaunita,regina,leena, gina
   3 ,mo,larry,curly,                  curly
   3 , tina,gina,jaunita,regina,leena, jaunita
   4 , tina,gina,jaunita,regina,leena, regina 
   5 , tina,gina,jaunita,regina,leena, leena

The first call to INSTR within SUBSTR determines the start position of the substring to extract. The next call to INSTR within SUBSTR finds the position of the nth comma (same as the start position) as well the position of the nth + 1 comma. Subtracting the two values returns the length of the substring to extract. Because every value is parsed into its own row, simply specify WHERE POS = n to keep the nth substring (in this case, where POS = 2, so, the second substring in the list).

PostgreSQL

The inline view X walks each string. The number of rows returned is determined by how many values are in each string. To find the number of values in each string, find the number of delimiters in each string and add one. The function SPLIT_PART uses the values in POS to find the nth occurrence of the delimiter and parse the string into values:

 
 select iter.pos, src.name as name1,
         split_part(src.name,',',iter.pos) as name2
    from (select id as pos from t10) iter,
         (select cast(name as text) as name from v) src
   where iter.pos <=
         length(src.name)-length(replace(src.name,',',''))+1

  pos |             name1              |  name2
 -----+--------------------------------+---------
    1 | mo,larry,curly                 | mo
    2 | mo,larry,curly                 | larry
    3 | mo,larry,curly                 | curly
    1 | tina,gina,jaunita,regina,leena | tina
    2 | tina,gina,jaunita,regina,leena | gina
    3 | tina,gina,jaunita,regina,leena | jaunita
    4 | tina,gina,jaunita,regina,leena | regina
    5 | tina,gina,jaunita,regina,leena | leena

I've shown NAME twice so you can see how SPLIT_PART parses each string using POS. Once each string is parsed, the final step is the keep the rows where POS equals the nth substring you are interested in, in this case, 2.

** If you want the Full Table detail. Refer the SQL Table in Label List. Or Click here to View the Table

Identifying Strings That Can Be Treated as Numbers : SQL

Problem

You have a column that is defined to hold character data. Unfortunately, the rows contain mixed numeric and character data. Consider view V:

 create view V as
 select replace(mixed,' ','') as mixed
   from (
 select substr(ename,1,2)||
        cast(deptno as char(4))||
        substr(ename,3,2) as mixed
   from emp
  where deptno = 10
  union all
 select cast(empno as char(4)) as mixed
   from emp
  where deptno = 20
  union all
 select ename as mixed
   from emp
  where deptno = 30
        ) x
 select * from v

  MIXED
  --------------
  CL10AR
  KI10NG
  MI10LL
  7369
  7566
  7788
  7876
  7902
  ALLEN
  WARD
  MARTIN
  BLAKE
  TURNER
  JAMES

You want to return rows that are numbers only, or that contain at least one number. If the numbers are mixed with character data, you want to remove the characters and return only the numbers. For the sample data above you want the following result set:

Solution

The functions REPLACE and TRANSLATE are extremely useful for manipulating strings and individual characters. The key is to convert all numbers to a single character, which then makes it easy to isolate and identify any number by referring to a single character.

DB2

Use functions TRANSLATE, REPLACE, and POSSTR to isolate the numeric characters in each row. The calls to CAST are necessary in view V; otherwise, the view will fail to be created due to type conversion errors. You'll need the function REPLACE to remove extraneous white space due to casting to the fixed length CHAR:

  1 select mixed old,
  2        cast(
  3          case
  4          when
  5            replace(
  6           translate(mixed,'9999999999','0123456789'),'9','') = ''
  7          then
  8             mixed
  9          else replace(
 10            translate(mixed,
 11               repeat('#',length(mixed)),
 12             replace(
 13              translate(mixed,'9999999999','0123456789'),'9','')),
 14                      '#','')
 15          end as integer ) mixed
 16   from V
 17  where posstr(translate(mixed,'9999999999','0123456789'),'9') > 0

MySQL

The syntax for MySQL is slightly different and will define view V as:

 create view V as
 select concat(
          substr(ename,1,2),
          replace(cast(deptno as char(4)),' ',''),
          substr(ename,3,2)
        ) as mixed
   from emp
  where deptno = 10
  union all
 select replace(cast(empno as char(4)), ' ', '')
   from emp where deptno = 20
  union all
 select ename from emp where deptno = 30

Because MySQL does not support the TRANSLATE function, you must walk each row and evaluate it on a character-by-character basis.

  1 select cast(group_concat(c order by pos separator '') as unsigned)
  2        as MIXED1
  3   from (
  4 select v.mixed, iter.pos, substr(v.mixed,iter.pos,1) as c
  5   from V,
  6        ( select id pos from t10 ) iter
  7  where iter.pos <= length(v.mixed)
  8    and ascii(substr(v.mixed,iter.pos,1)) between 48 and 57
  9        ) y
 10  group by mixed
 11  order by 1

Oracle

Use functions TRANSLATE, REPLACE, and INSTR to isolate the numeric characters in each row. The calls to CAST are not necessary in view V. Use the function REPLACE to remove extraneous white space due to casting to the fixed length CHAR. If you decide you would like to keep the explicit type conversion calls in the view definition, it is suggested you cast to VARCHAR2:

  1 select to_number (
  2         case
  3         when
  4            replace(translate(mixed,'0123456789','9999999999'),'9')
  5           is not null
  6         then
  7              replace(
  8            translate(mixed,
  9              replace(
 10            translate(mixed,'0123456789','9999999999'),'9'),
 11                     rpad('#',length(mixed),'#')),'#')
 12         else
 13              mixed
 14         end
 15         ) mixed
 16   from V
 17  where instr(translate(mixed,'0123456789','9999999999'),'9') > 0

PostgreSQL

Use functions TRANSLATE, REPLACE, and STRPOS to isolate the numeric characters in each row. The calls to CAST are not necessary in view V. Use the function REPLACE ito remove extraneous white space due to casting to the fixed length CHAR. If you decide you would like to keep the explicit type conversion calls in the view definition, it is suggested you cast to VARCHAR:

  1 select cast(
  2        case
  3        when
  4         replace(translate(mixed,'0123456789','9999999999'),'9','')
  5         is not null
  6        then
  7           replace(
  8        translate(mixed,
  9           replace(
 10        translate(mixed,'0123456789','9999999999'),'9',''),
 11                 rpad('#',length(mixed),'#')),'#','')
 12        else
 13          mixed
 14        end as integer ) as mixed
 15    from V
 16  where strpos(translate(mixed,'0123456789','9999999999'),'9') > 0

SQL Server

The built-in function ISNUMERIC along with a wildcard search allows you to easily identify strings that contains numbers, but getting numeric characters out of a string is not particularly efficient because the TRANSLATE function is not supported.

Discussion

The TRANSLATE function is very useful here as it allows you to easily isolate and identify numbers and characters. The trick is to convert all numbers to a single character; this way, rather than searching for different numbers you only search for one character.

DB2, Oracle, and PostgreSQL

The syntax differs slightly among these DBMSs, but the technique is the same. I'll use the solution for PostgreSQL for the discussion.

The real work is done by functions TRANSLATE and REPLACE. To get the final result set requires several function calls, each listed below in one query:

 
 select mixed as orig,
 translate(mixed,'0123456789','9999999999') as mixed1,
 replace(translate(mixed,'0123456789','9999999999'),'9','') as mixed2,
  translate(mixed,
  replace(
  translate(mixed,'0123456789','9999999999'),'9',''),
           rpad('#',length(mixed),'#')) as mixed3,
  replace(
  translate(mixed,
  replace(
 translate(mixed,'0123456789','9999999999'),'9',''),
           rpad('#',length(mixed),'#')),'#','') as mixed4
   from V
  where strpos(translate(mixed,'0123456789','9999999999'),'9') > 0

   ORIG  | MIXED1 | MIXED2 | MIXED3 | MIXED4 | MIXED5 
 --------+--------+--------+--------+--------+--------
  CL10AR | CL99AR | CLAR   | ##10## | 10     |     10
  KI10NG | KI99NG | KING   | ##10## | 10     |     10
  MI10LL | MI99LL | MILL   | ##10## | 10     |     10
  7369   | 9999   |        | 7369   | 7369   |   7369
  7566   | 9999   |        | 7566   | 7566   |   7566
  7788   | 9999   |        | 7788   | 7788   |   7788
  7876   | 9999   |        | 7876   | 7876   |   7876
  7902   | 9999   |        | 7902   | 7902   |   7902

First, notice that any rows without at least one number are removed. How this is accomplished will become clear as you examine each of the columns in the above result set. The rows that are kept are the values in the ORIG column and are the rows that will eventually make up the result set. The first step to extracting the numbers is to use the function TRANSLATE to convert any number to a 9 (you can use any digit; 9 is arbitrary), this is represented by the values in MIXED1. Now that all numbers are 9's, they can be treating as a single unit. The next step is to remove all of the numbers by using the function REPLACE. Because all digits are now 9, REPLACE simply looks for any 9's and removes them. This is represented by the values in MIXED2. The next step, MIXED3, uses values that are returned by MIXED2. These values are then compared to the values in ORIG. If any characters from MIXED2 are found in ORIG, they are converted to the # character by TRANSLATE. The result set from MIXED3 shows that the letters, not the numbers, have now been singled out and converted to a single character. Now that all non-numeric characters are represented by #'s, they can be treated as a single unit. The next step, MIXED4, uses REPLACE to find and remove any # characters in each row; what's left are numbers only. The final step is to cast the numeric characters as numbers. Now that you've gone through the steps, you can see how the WHERE clause works. The results from MIXED1 are passed to STRPOS, and if a 9 is found (the position in the string where the first 9 is located) the result must be greater than 0. For rows that return a value greater than zero, it means there's at least one number in that row and it should be kept.

MySQL

The first step is to walk each string and evaluate each character and determine whether or not it's a number:

 
 select v.mixed, iter.pos, substr(v.mixed,iter.pos,1) as c
   from V,
       ( select id pos from t10 ) iter
  where iter.pos <= length(v.mixed)
  order by 1,2

 +--------+------+------+
 | mixed  | pos  | c    |
 +--------+------+------+
 | 7369   |    1 | 7    |
 | 7369   |    2 | 3    |
 | 7369   |    3 | 6    |
 | 7369   |    4 | 9    |
 …
 | ALLEN  |    1 | A    |
 | ALLEN  |    2 | L    |
 | ALLEN  |    3 | L    |
 | ALLEN  |    4 | E    |
 | ALLEN  |    5 | N    |
 …
 | CL10AR |    1 | C    |
 | CL10AR |    2 | L    |
 | CL10AR |    3 | 1    |
 | CL10AR |    4 | 0    |
 | CL10AR |    5 | A    |
 | CL10AR |    6 | R    |
 +--------+------+------+

Now that each character in each string can be evaluated individually, the next step is to keep only the rows that have a number in the C column:

 
 select v.mixed, iter.pos, substr(v.mixed,iter.pos,1) as c
    from V,
        ( select id pos from t10 ) iter
  where iter.pos <= length(v.mixed)
   and ascii(substr(v.mixed,iter.pos,1)) between 48 and 57
  order by 1,2

 +--------+------+------+
 | mixed  | pos  | c    |
 +--------+------+------+
 | 7369   |    1 | 7    |
 | 7369   |    2 | 3    |
 | 7369   |    3 | 6    |
 | 7369   |    4 | 9    |
 …
 | CL10AR |    3 | 1    |
 | CL10AR |    4 | 0    |
 …
 +--------+------+------+

At this point, all the rows in column C are numbers. The next step is to use GROUP_CONCAT to concatenate the numbers to form their respective whole number in MIXED. The final result is then cast as a number:

 
 select cast(group_concat(c order by pos separator '') as unsigned)
          as MIXED1
   from ( 
 select v.mixed, iter.pos, substr(v.mixed,iter.pos,1) as c 
   from V, 
       ( select id pos from t10 ) iter 
  where iter.pos <= length(v.mixed) 
   and ascii(substr(x.mixed,iter.pos,1)) between 48 and 57
        ) y
   group by mixed
   order by 1

 +--------+
 | MIXED1 |
 +--------+
 |    10  |
 |    10  |
 |    10  |
 |  7369  |
 |  7566  |
 |  7788  |
 |  7876  |
 |  7902  |
 +--------+

As a final note, keep in mind that any digits in each string will be concatenated to form one numeric value. For example, an input value of, say, '99Gennick87' will result in the value 9987 being returned. This is something to keep in mind, particularly when working with serialized data.

** If you want the Full Table detail. Refer the SQL Table in Label List. Or Click here to View the Table

Alphabetizing a String : SQL SERVER

Problem

You want alphabetize the individual characters within strings in your tables. Consider the following result set:

 ENAME
 ----------
 ADAMS
 ALLEN
 BLAKE
 CLARK
 FORD
 JAMES
 JONES
 KING
 MARTIN
 MILLER
 SCOTT
 SMITH
 TURNER
 WARD

You would like the result to be:

 OLD_NAME   NEW_NAME
 ---------- --------
 ADAMS      AADMS
 ALLEN      AELLN
 BLAKE      ABEKL
 CLARK      ACKLR
 FORD       DFOR
 JAMES      AEJMS
 JONES      EJNOS
 KING       GIKN
 MARTIN     AIMNRT
 MILLER     EILLMR
 SCOTT      COSTT
 SMITH      HIMST
 TURNER     ENRRTU
 WARD       ADRW

Solution

This problem is a perfect example of why it is crucial to understand your DBMS and what functionality is available to you. In situations where your DBMS does not provide built-in functions to facilitate this solution, you need to come up with something creative. Compare the MySQL solution with the rest.

DB2

To alphabetize rows of strings it is necessary to walk each string then order its characters:

  1 select ename,
  2        max(case when pos=1 then c else '' end)||
  3        max(case when pos=2 then c else '' end)||
  4        max(case when pos=3 then c else '' end)||
  5        max(case when pos=4 then c else '' end)||
  6        max(case when pos=5 then c else '' end)||
  7        max(case when pos=6 then c else '' end)
  8   from (
  9 select e.ename,
 10        cast(substr(e.ename,iter.pos,1) as varchar(100)) c,
 11        cast(row_number( )over(partition by e.ename
 12                               order by substr(e.ename,iter.pos,1))
 13            as integer) pos
 14   from emp e,
 15        (select cast(row_number( )over( ) as integer) pos
 16           from emp) iter
 17  where iter.pos <= length(e.ename)
 18        ) x
 19  group by ename

MySQL

The key here is the GROUP_CONCAT function, which allows you to not only concatenate the characters that make up each name but also order them:

 1 select ename, group_concat(c order by c separator '')
 2   from (
 3 select ename, substr(a.ename,iter.pos,1) c
 4   from emp a,
 5        ( select id pos from t10 ) iter
 6  where iter.pos <= length(a.ename)
 7        ) x
 8  group by ename

Oracle

The function SYS_CONNECT_BY_PATH allows you to iteratively build a list:

  1 select old_name, new_name
  2   from (
  3 select old_name, replace(sys_connect_by_path(c,' '),' ') new_name
  4   from (
  5 select e.ename old_name,
  6        row_number() over(partition by e.ename
  7                         order by substr(e.ename,iter.pos,1)) rn,
  8        substr(e.ename,iter.pos,1) c
  9   from emp e,
 10        ( select rownum pos from emp ) iter
 11  where iter.pos <= length(e.ename)
 12  order by 1
 13         ) x
 14  start with rn = 1
 15 connect by prior rn = rn-1 and prior old_name = old_name
 16         )
 17  where length(old_name) = length(new_name)

PostgreSQL

PostgreSQL does not offer any built-in functions to easily sort characters in a string, so it is necessary not only to walk through each string but also to know in advance the largest length of any one name. View V is used in this solution for readability:

  create or replace view V as
  select x.*
    from (
  select a.ename,
         substr(a.ename,iter.pos,1) as c
    from emp a,
         (select id as pos from t10) iter
   where iter.pos <= length(a.ename)
   order by 1,2
         ) x

The following select statement leverages the view:

  1 select ename,
  2        max(case when pos=1 then
  3                 case when cnt=1 then c
  4                      else rpad(c,cast(cnt as integer),c)
  5                 end
  6                 else ''
  7             end)||
  8        max(case when pos=2 then
  9                 case when cnt=1 then c
 10                      else rpad(c,cast(cnt as integer),c)
 11                 end
 12                 else ''
 13             end)||
 14        max(case when pos=3 then
 15                 case when cnt=1 then c
 16                      else rpad(c,cast(cnt as integer),c)
 17                 end
 18                 else ''
 19             end)||
 20        max(case when pos=4 then
 21                 case when cnt=1 then c
 22                      else rpad(c,cast(cnt as integer),c)
 23                 end
 24                 else ''
 25             end)||
 26        max(case when pos=5 then
 27                 case when cnt=1 then c
 28                      else rpad(c,cast(cnt as integer),c)
 29                 end
 30                 else ''
 31             end)||
 32        max(case when pos=6 then
 33                 case when cnt=1 then c
 34                      else rpad(c,cast(cnt as integer),c)
 35                 end
 36                 else ''
 37             end)
 38   from (
 39 select a.ename, a.c,
 40        (select count(*)
 41           from v b
 42          where a.ename=b.ename and a.c=b.c ) as cnt,
 43        (select count(*)+1
 44           from v b
 45          where a.ename=b.ename and b.c

SQL Server

To alphabetize rows of strings it is necessary to walk each string, and then order their characters:

  1 select ename,
  2           max(case when pos=1 then c else '' end)+
  3           max(case when pos=2 then c else '' end)+
  4           max(case when pos=3 then c else '' end)+
  5           max(case when pos=4 then c else '' end)+
  6           max(case when pos=5 then c else '' end)+
  7           max(case when pos=6 then c else '' end)
  8      from (
  9    select e.ename,
 10         substring(e.ename,iter.pos,1) as c,
 11         row_number() over (
 12          partition by e.ename
 13              order by substring(e.ename,iter.pos,1)) as pos
 14    from emp e,
 15         (select row_number()over(order by ename) as pos
 16            from emp) iter
 17   where iter.pos <= len(e.ename)  18          ) x  19   group by ename

Discussion

DB2 and SQL Server

The inline view X returns each character in each name as a row. The function SUBSTR or SUBSTRING extracts each character from each name, and the function ROW_NUMBER ranks each character alphabetically:

 ENAME  C  POS
 -----  -  ---
 ADAMS  A  1
 ADAMS  A  2
 ADAMS  D  3
 ADAMS  M  4
 ADAMS  S  5
 …

To return each letter of a string as a row, you must walk the string. This is accomplished with inline view ITER.

Now that the letters in each name have been alphabetized, the last step is to put those letters back together, into a string, in the order they are ranked. Each letter's position is evaluated by the CASE statements (lines 27). If a character is found at a particular position it is then concatenated to the result of the next evaluation (the following CASE statement). Because the aggregate function MAX is used as well, only one character per position POS is returned, so that only one row per name is returned. The CASE evaluation goes up to the number 6, which is the maximum number of characters in any name in table EMP.

MySQL

The inline view X (lines 36) returns each character in each name as a row. The function SUBSTR extracts each character from each name:

 ENAME  C
 -----  -
 ADAMS  A
 ADAMS  A
 ADAMS  D
 ADAMS  M
 ADAMS  S
 …

Inline view ITER is used to walk the string. From there, the rest of the work is done by the GROUP_CONCAT function. By specifying an order, the function not only concatenates each letter, it does so alphabetically.

Oracle

The real work is done by inline view X (lines 511), where the characters in each name are extracted and put into alphabetical order. This is accomplished by walking the string, then imposing order on those characters. The rest of the query merely glues the names back together.

The tearing apart of names can be seen by executing only inline view X:

 OLD_NAME          RN C
 ---------- --------- -
 ADAMS              1 A
 ADAMS              2 A
 ADAMS              3 D
 ADAMS              4 M
 ADAMS              5 S
 …

The next step is to take the alphabetized characters and rebuild each name. This is done with the function SYS_CONNECT_BY_PATH by appending each character to the ones before it:

 OLD_NAME   NEW_NAME
 ---------- ---------
 ADAMS      A
 ADAMS      AA
 ADAMS      AAD
 ADAMS      AADM
 ADAMS      AADMS
 …

The final step is to keep only the strings that have the same length as the names they were built from.

PostgreSQL

For readability, view V is used in this solution to walk the string. The function SUBSTR, in the view definition, extracts each character from each name so that the view returns:

 ENAME C
 ----- -
 ADAMS A
 ADAMS A
 ADAMS D
 ADAMS M
 ADAMS S
 …

The view also orders the results by ENAME and by each letter in each name. The inline view X (lines 1518) returns the names and characters from view V, the number of times each character occurs in each name, and its position (alphabetically):

 ename | c | cnt | pos
 ------+---+-----+-----
 ADAMS | A |   2 |   1
 ADAMS | A |   2 |   1
 ADAMS | D |   1 |   3
 ADAMS | M |   1 |   4
 ADAMS | S |   1 |   5

The extra columns CNT and POS, returned by the inline view X, are crucial to the solution. POS is used to rank each character and CNT is used to determine the number of times the character exists in each name. The final step is to evaluate the position of each character and rebuild the name. You'll notice that each case statement is actually two case statements. This is to determine whether or not a character occursmore than once in a name; if it does, then rather than return that character, what is returned is that character appended to itself CNT times. The aggregate function, MAX, is used to ensure there is only one row per name.** If you want the Full Table detail. Refer the SQL Table in Label List. Or Click here to View the Table

Converting Delimited Data into a Multi-Valued IN-List : SQL Server

Problem

You have delimited data that you want to pass to the IN-list iterator of a WHERE clause. Consider the following string:

 7654,7698,7782,7788

You would like to use the string in a WHERE clause but the following SQL fails because EMPNO is a numeric column:

 select ename,sal,deptno
   from emp
  where empno in ( '7654,7698,7782,7788' )

This SQL fails because, while EMPNO is a numeric column, the IN list is composed of a single string value. You want that string to be treated as a comma-delimited list of numeric values.

Solution

On the surface it may seem that SQL should do the work of treating a delimited string as a list of delimited values for you, but that is not the case. When a comma embedded within quotes is encountered, SQL can't possibly know that signals a multi-valued list. SQL must treat everything between the quotes as a single entity, as one string value. You must break the string up into individual EMPNOs. The key to this solution is to walk the string, but not into individual characters. You want to walk the string into valid EMPNO values.

DB2

By walking the string passed to the IN-list, you can you can easily convert it to rows. The functions ROW_NUMBER, LOCATE, and SUBSTR are particularly useful here:

  1 select empno,ename,sal,deptno
  2   from emp
  3  where empno in (
  4 select cast(substr(c,2,locate(',',c,2)-2) as integer) empno
  5   from (
  6 select substr(csv.emps,cast(iter.pos as integer)) as c
  7   from (select ','||'7654,7698,7782,7788'||',' emps
  8          from t1) csv,
  9        (select id as pos
 10           from t100 ) iter
 11  where iter.pos <= length(csv.emps)
 12        ) x
 13  where length(c) > 1
 14    and substr(c,1,1) = ','
 15        ) y

MySQL

By walking the string passed to the IN-list, you can easily convert it to rows:

  1 select empno, ename, sal, deptno 
  2   from emp
  3  where empno in
  4        (
  5 select substring_index(
  6        substring_index(list.vals,',',iter.pos),',',-1) empno
  6   from (select id pos from t10) as iter,
  7        (select '7654,7698,7782,7788' as vals
  8           from t1) list
  9  where iter.pos <=
 10        (length(list.vals)-length(replace(list.vals,',','')))+1
 11        ) x

Oracle

By walking the string passed to the IN-list, you can you can easily convert it to rows. The functions ROWNUM, SUBSTR, and INSTR are particularly useful here:

  1 select empno,ename,sal,deptno
  2   from emp
  3  where empno in (
  4        select to_number(
  5                   rtrim(
  6                 substr(emps,
  7                  instr(emps,',',1,iter.pos)+1,
  8                  instr(emps,',',1,iter.pos+1)
  9                  instr(emps,',',1,iter.pos)),',')) emps
 10          from (select ','||'7654,7698,7782,7788'||',' emps from t1) csv,
 11               (select rownum pos from emp) iter
 12         where iter.pos <= ((length(csv.emps)-  13                   length(replace(csv.emps,',')))/length(','))-1  14 )

Postgres

By walking the string passed to the IN-list, you can easily convert it to rows. The function SPLIT_PART makes it easy to parse the string into individual numbers:

  1 select ename,sal,deptno
  2   from emp
  3  where empno in (
  4 select cast(empno as integer) as empno
  5   from (
  6 select split_part(list.vals,',',iter.pos) as empno
  7   from (select id as pos from t10) iter,
  8        (select ','||'7654,7698,7782,7788'||',' as vals
  9           from t1) list
 10  where iter.pos <=  11        length(list.vals)-length(replace(list.vals,',',''))  12        ) z  13  where length(empno) > 0
 14        ) x

SQL Server

By walking the string passed to the IN-list, you can you can easily convert it to rows. The functions ROW_NUMBER, CHARINDEX, and SUBSTRING are particularly useful here:

  1 select empno,ename,sal,deptno
  2   from emp
  3  where empno in (select substring(c,2,charindex(',',c,2)-2) as empno
  4   from (
  5 select substring(csv.emps,iter.pos,len(csv.emps)) as c
  6   from (select ','+'7654,7698,7782,7788'+',' as emps
  7           from t1) csv,
  8        (select id as pos
  9          from t100) iter
 10  where iter.pos <= len(csv.emps)  11       ) x  12  where len(c) > 1
 13    and substring(c,1,1) = ','
 14       ) y

Discussion

The first and most important step in this solution is to walk the string. Once you've accomplished that, all that's left is to parse the string into individual, numeric values using your DBMS's provided functions.

DB2 and SQL Server

The inline view X (lines 611) walks the string. The idea in this solution is to "walk through" the string, so that each row has one less character than the one before it:

 ,7654,7698,7782,7788,
 7654,7698,7782,7788,
 654,7698,7782,7788,
 54,7698,7782,7788,
 4,7698,7782,7788,
 ,7698,7782,7788,
 7698,7782,7788,
 698,7782,7788,
 98,7782,7788,
 8,7782,7788,
 ,7782,7788,
 7782,7788,
 782,7788,
 82,7788,
 2,7788,
 ,7788,
 7788,
 788,
 88,
 8,
 ,

Notice that by enclosing the string in commas (the delimiter), there's no need to make special checks as to where the beginning or end of the string is.

The next step is to keep only the values you want to use in the IN-list. The values to keep are the ones with leading commas, with the exception of the last row with its lone comma. Use SUBSTR or SUBSTRING to identify which rows have a leading comma, then keep all characters found before the next comma in that row. Once that's done, cast the string to a number so it can be properly evaluated against the numeric column EMPNO (lines 414):

  EMPNO
 ------
   7654
   7698
   7782
   7788

The final step is to use the results in a subquery to return the desired rows.

MySQL

The inline view (lines 59) walks the string. The expression on line 10 determines how many values are in the string by finding the number of commas (the delimiter) and adding one. The function SUBSTRING_INDEX (line 6) returns all characters in the string before (to the left of ) the nth occurrence of a comma (the delimiter):

 +---------------------+
 | empno               |
 +---------------------+
 | 7654                |
 | 7654,7698           |
 | 7654,7698,7782      |
 | 7654,7698,7782,7788 |
 +---------------------+

Those rows are then passed to another call to SUBSTRING_INDEX (line 5); this time the nth occurrence of the delimited is 1, which causes all values to the right of the nth occurrence of the delimiter to be kept:

 +-------+
 | empno |
 +-------+
 | 7654  |
 | 7698  |
 | 7782  |
 | 7788  |
 +-------+

The final step is to plug the results into a subquery.

Oracle

The first step is to walk the string:

 
 select emps,pos
   from (select ','||'7654,7698,7782,7788'||',' emps
           from t1) csv,
        (select rownum pos from emp) iter
  where iter.pos <=   ((length(csv.emps)-length(replace(csv.emps,',')))/length(','))-1

 EMPS                         POS
 --------------------- ----------
 ,7654,7698,7782,7788,          1
 ,7654,7698,7782,7788,          2
 ,7654,7698,7782,7788,          3
 ,7654,7698,7782,7788,          4

The number of rows returned represents the number of values in your list. The values for POS are crucial to the query as they are needed to parse the string into individual values. The strings are parsed using SUBSTR and INSTR. POS is used to locate the nth occurrence of the delimiter in each string. By enclosing the strings in commas, no special checks are necessary to determine the beginning or end of a string. The values passed to SUBSTR, INSTR (lines 79) locate the nth and nth+1 occurrence of the delimiter. By subtracting the value returned for the current comma (the location in the string where the current comma is) from the value returned bythe next comma (the location in the string where the next comma is) you can extract each value from the string:

 
 select substr(emps,
        instr(emps,',',1,iter.pos)+1,
        instr(emps,',',1,iter.pos+1)
        instr(emps,',',1,iter.pos)) emps
   from (select ','||'7654,7698,7782,7788'||',' emps
           from t1) csv,
        (select rownum pos from emp) iter
  where iter.pos <=
   ((length(csv.emps)-length(replace(csv.emps,',')))/length(','))-1
  EMPS
  -----------
  7654,
  7698,
  7782,
  7788,

The final step is to remove the trailing comma from each value, cast it to a number, and plug it into a subquery.

PostgreSQL

The inline view Z (lines 69) walks the string. The number of rows returned is determined by how many values are in the string. To find the number of values in the string, subtract the size of the string without the delimiter from the size of the string with the delimiter (line 9). The function SPLIT_PART does the work of parsing the string. It looks for the value that comes before the nth occurrence of the delimiter:

 
 select list.vals,
        split_part(list.vals,',',iter.pos) as empno,
        iter.pos
   from (select id as pos from t10) iter,
        (select ','||'7654,7698,7782,7788'||',' as vals
           from t1) list
  where iter.pos <=         length(list.vals)-length(replace(list.vals,',',''))

      vals             | empno | pos
 ----------------------+-------+-----
 ,7654,7698,7782,7788, |       |   1
 ,7654,7698,7782,7788, | 7654  |   2
 ,7654,7698,7782,7788, | 7698  |   3
 ,7654,7698,7782,7788, | 7782  |   4
 ,7654,7698,7782,7788, | 7788  |   5

The final step is to cast the values (EMPNO) to a number and plug it into a subquery.

** If you want the Full Table detail. Refer the SQL Table in Label List. Or Click here to View the Table

SQL Server : Creating a Delimited List from Table Rows

Problem

You want to return table rows as values in a delimited list, perhaps delimited by commas, rather than in vertical columns as they normally appear. You want to convert a result set from this:

 DEPTNO EMPS
 ------ ----------
     10 CLARK
     10 KING
     10 MILLER
     20 SMITH
     20 ADAMS
     20 FORD
     20 SCOTT
     20 JONES
     30 ALLEN
     30 BLAKE
     30 MARTIN
     30 JAMES
     30 TURNER
     30 WARD

to this:

  DEPTNO EMPS
 ------- ------------------------------------
      10 CLARK,KING,MILLER
      20 SMITH,JONES,SCOTT,ADAMS,FORD
      30 ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

Solution

Each DBMS requires a different approach to this problem. The key is to take advantage of the built-in functions provided by your DBMS. Understanding what is available to you will allow you to exploit your DBMS's functionality and come up with creative solutions for a problem that is typically not solved in SQL.

DB2

Use recursive WITH to build the delimited list:

  1   with x (deptno, cnt, list, empno, len)
  2     as (
  3 select deptno, count(*) over (partition by deptno),
  4        cast(ename as varchar(100)), empno, 1
  5   from emp
  6  union all
  7  select x.deptno, x.cnt, x.list ||','|| e.ename, e.empno, x.len+1
  8   from emp e, x
  9  where e.deptno = x.deptno
 10    and e.empno > x. empno
 11        )
 12 select deptno,list
 13   from x
 14  where len = cnt

MySQL

Use the built-in function GROUP_CONCAT to build the delimited list:

 1 select deptno,
 2        group_concat(ename order by empno separator, ',') as emps
 3   from emp
 4  group by deptno

Oracle

Use the built-in function SYS_CONNECT_BY_PATH to build the delimited list:

  1 select deptno,
  2        ltrim(sys_connect_by_path(ename,','),',') emps
  3   from (
  4 select deptno,
  5        ename,
  6        row_number() over
  7                 (partition by deptno order by empno) rn,
  8        count(*) over
  9                 (partition by deptno) cnt
 10   from emp
 11        )
 12  where level = cnt
 13  start with rn = 1
 14 connect by prior deptno = deptno and prior rn = rn-1

PostgreSQL

PostgreSQL does not offer a standard built-in function for creating a delimited list, so it is necessary to know how many values will be in the list in advance. Once you know the size of the largest list, you can determine the number of values to append to create your list by using standard transposition and concatenation:

  1 select deptno,
  2        rtrim(
  3            max(case when pos=1 then emps else '' end)||
  4            max(case when pos=2 then emps else '' end)||
  5            max(case when pos=3 then emps else '' end)||
  6            max(case when pos=4 then emps else '' end)||
  7            max(case when pos=5 then emps else '' end)||
  8            max(case when pos=6 then emps else '' end),','
  9        ) as emps
 10   from (
 11 select a.deptno,
 12        a.ename||',' as emps,
 13        d.cnt,
 14        (select count(*) from emp b
 15           where a.deptno=b.deptno and b.empno <= a.empno) as pos
 16   from emp a,
 17        (select deptno, count(ename) as cnt
 18           from emp
 19           group by deptno) d
 20  where d.deptno=a.deptno
 21        ) x
 22  group by deptno
 23  order by 1

SQL Server

Use recursive WITH to build the delimited list:

  1  with x (deptno, cnt, list, empno, len)
  2     as (
  3 select deptno, count(*) over (partition by deptno),
  4        cast(ename as varchar(100)),
  5        empno,
  6        1
  7   from emp
  9  union all
  9 select x.deptno, x.cnt,
 10        cast(x.list + ',' + e.ename as varchar(100)),
 11        e.empno, x.len+1
 12   from emp e, x
 13 where e.deptno = x.deptno
 14   and e.empno > x. empno
 15            )
 16 select deptno,list
 17   from x
 18  where len = cnt
 19  order by 1

Discussion

Being able to create delimited lists in SQL is useful because it is a common requirement. Yet each DBMS offers a unique method for building such a list in SQL. There's very little commonality between the vendor-specific solutions; the techniques vary from using recursion, to hierarchal functions, to classic transposition, to aggregation.

DB2 and SQL Server

The solution for these two databases differ slightly in syntax (the concatenation operators are "||" for DB2 and "+" for SQL Server), but the technique is the same. The first query in the WITH clause (upper portion of the UNION ALL) returns the following information about each employee: the department, the number of employees in that department, the name, the ID, and a constant 1 (which at this point doesn't do anything). Recursion takes place in the second query (lower half of the UNION ALL) to build the list. To understand how the list is built, examine the following excerpts from the solution: first, the third SELECT-list item from the second query in the union:

 x.list ||','|| e.ename

and then the WHERE clause from that same query:

 where e.deptno = x.deptno
   and e.empno > x.empno

The solution works by first ensuring the employees are in the same department. Then, for every employee returned by the upper portion of the UNION ALL, append the name of the employees who have a greater EMPNO. By doing this, you ensure that no employee will have his own name appended. The expression

 x.len+1

increments LEN (which starts at 1) every time an employee has been evaluated. When the incremented value equals the number of employees in the department:

 where len = cnt

you know you have evaluated all the employees and have completed building the list. That is crucial to the query as it not only signals when the list is complete, but also stops the recursion from running longer than necessary.

MySQL

The function GROUP_CONCAT does all the work. It concatenates the values found in the column passed to it, in this case ENAME. It's an aggregate function, thus the need for GROUP BY in the query.

Oracle

The first step to understanding the Oracle query is to break it down. Running the inline view by itself (lines 410), you generate a result set that includes the following for each employee: her department, her name, a rank within her respective department that is derived by an ascending sort on EMPNO, and a count of all employees in her department. For example:

 
 select deptno,
         ename,
         row_number() over
                   (partition by deptno order by empno) rn,
         count(*) over (partition by deptno) cnt
   from emp

 DEPTNO ENAME      RN CNT
 ------ ---------- -- ---
     10 CLARK       1   3
     10 KING        2   3
     10 MILLER      3   3
     20 SMITH       1   5
     20 JONES       2   5
     20 SCOTT       3   5
     20 ADAMS       4   5
     20 FORD        5   5
     30 ALLEN       1   6
     30 WARD        2   6
     30 MARTIN      3   6
     30 BLAKE       4   6
     30 TURNER      5   6
     30 JAMES       6   6

The purpose of the rank (aliased RN in the query) is to allow you to walk the tree. Since the function ROW_NUMBER generates an enumeration starting from one with no duplicates or gaps, just subtract one (from the current value) to reference a prior (or parent) row. For example, the number prior to 3 is 3 minus 1, which equals 2. In this context, 2 is the parent of 3; you can observe this on line 12. Additionally, the lines

 start with rn = 1
 connect by prior deptno = deptno

identify the root for each DEPTNO as having RN equal to 1 and create a new list whenever a new department is encountered (whenever a new occurrence of 1 is found for RN).

At this point, it's important to stop and look at the ORDER BY portion of the ROW_NUMBER function. Keep in mind the names are ranked by EMPNO and the list will be created in that order. The number of employees per department is calculated (aliased CNT) and is used to ensure that the query returns only the list that has all the employee names for a department. This is done because SYS_CONNECT_ BY_PATH builds the list iteratively, and you do not want to end up with partial lists.

For heirarchical queries, the pseudocolumn LEVEL starts with 1 (for queries not using CONNECT BY, LEVEL is 0, unless you are on 10g and later when LEVEL is only available when using CONNECT BY) and increments by one after each employee in a department has been evaluated (for each level of depth in the hierarchy). Because of this, you know that once LEVEL reaches CNT, you have reached the last EMPNO and will have a complete list.

PostgreSQL

PostgreSQL's solution requires you to know in advance the maximum number of employees in any one department. Running the inline view by itself (lines 1118) generates a result set that includes (for each employee) his department, his name with a comma appended, the number of employees in his department, and the number of employees who have an EMPNO that is less than his:

 deptno  |  emps    | cnt | pos
 --------+----------+-----+-----
     20  |  SMITH,  |   5 |   1
     30  |  ALLEN,  |   6 |   1
     30  |  WARD,   |   6 |   2
     20  |  JONES,  |   5 |   2
     30  |  MARTIN, |   6 |   3
     30  |  BLAKE,  |   6 |   4
     10  |  CLARK,  |   3 |   1
     20  |  SCOTT,  |   5 |   3
     10  |  KING,   |   3 |   2
     30  |  TURNER, |   6 |   5
     20  |  ADAMS,  |   5 |   4
     30  |  JAMES,  |   6 |   6
     20  |  FORD,   |   5 |   5
     10  |  MILLER, |   3 |   3

The scalar subquery, POS (lines 14-15), is used to rank each employee by EMPNO. For example, the line

 max(case when pos = 1 then ename else '' end)||

evaluates whether or not POS equals 1. The CASE expression returns the employee name when POS is 1, and otherwise returns NULL.

You must query your table first to find the largest number of values that could be in any one list. Based on the EMP table, the largest number of employees in any one department is six, so the largest number of items in a list is six.

The next step is to begin creating the list. Do this by performing some conditional logic (in the form of CASE expressions) on the rows returned from the inline view.

You must write as many CASE expressions as there are possible values to be concatenated together.

If POS equals one, the current name is added to the list. The second CASE expression evaluates whether or not POS equals two; if it does, then the second name is appended to the first. If there is no second name, then an additional comma is appended to the first name (this process is repeated for each distinct value of POS until the last one is reached).

The use of the MAX function is necessary because you want to build only one list per department (you can also use MIN; it makes no difference in this case, since POS returns only one value for each case evaluation). Whenever an aggregate function is used, any items in the SELECT list not acted upon by the aggregate must be specified in the GROUP BY clause. This guarantees you will have only one row per item in the SELECT list not acted upon by the aggregate function.

Notice that you also need the function RTRIM to remove trailing commas; the number of commas will always be equal to the maximum number of values that could potentially be in a list (in this case, six).

** If you want the Full Table detail. Refer the SQL Table in Label List. Or Click here to View the Table

Problem

Solution

Discussion

Problem

Solution

Discussion

Problem

Solution

DB2

MySQL

Oracle

PostgreSQL

SQL Server

Discussion

Problem

Solution

DB2

MySQL

Oracle

PostgreSQL

SQL Server

Discussion

DB2 and SQL Server

MySQL

Oracle

PostgreSQL

Problem

Solution

DB2

MySQL

Oracle

PostgreSQL

SQL Server

Discussion

DB2, Oracle, and PostgreSQL

MySQL

Problem

Solution

DB2

MySQL

Oracle

PostgreSQL

SQL Server

Discussion

DB2 and SQL Server

MySQL

Oracle

PostgreSQL

Problem

Solution

DB2

MySQL

Oracle

Postgres

SQL Server

Discussion

DB2 and SQL Server

MySQL

Oracle

PostgreSQL

Problem

Solution

DB2

MySQL

Oracle

PostgreSQL

SQL Server

Discussion

DB2 and SQL Server

MySQL

Oracle

PostgreSQL

Labels

Other Sites

Blog Archive

About Me