Alphabetizing a String : SQL SERVER

Problem

You want alphabetize the individual characters within strings in your tables. Consider the following result set:

 ENAME
----------
ADAMS
ALLEN
BLAKE
CLARK
FORD
JAMES
JONES
KING
MARTIN
MILLER
SCOTT
SMITH
TURNER
WARD
You would like the result to be:
 OLD_NAME   NEW_NAME
---------- --------
ADAMS AADMS
ALLEN AELLN
BLAKE ABEKL
CLARK ACKLR
FORD DFOR
JAMES AEJMS
JONES EJNOS
KING GIKN
MARTIN AIMNRT
MILLER EILLMR
SCOTT COSTT
SMITH HIMST
TURNER ENRRTU
WARD ADRW

Solution

This problem is a perfect example of why it is crucial to understand your DBMS and what functionality is available to you. In situations where your DBMS does not provide built-in functions to facilitate this solution, you need to come up with something creative. Compare the MySQL solution with the rest.

DB2

To alphabetize rows of strings it is necessary to walk each string then order its characters:

  1 select ename,
2 max(case when pos=1 then c else '' end)||
3 max(case when pos=2 then c else '' end)||
4 max(case when pos=3 then c else '' end)||
5 max(case when pos=4 then c else '' end)||
6 max(case when pos=5 then c else '' end)||
7 max(case when pos=6 then c else '' end)
8 from (
9 select e.ename,
10 cast(substr(e.ename,iter.pos,1) as varchar(100)) c,
11 cast(row_number( )over(partition by e.ename
12 order by substr(e.ename,iter.pos,1))
13 as integer) pos
14 from emp e,
15 (select cast(row_number( )over( ) as integer) pos
16 from emp) iter
17 where iter.pos <= length(e.ename)
18 ) x
19 group by ename
MySQL

The key here is the GROUP_CONCAT function, which allows you to not only concatenate the characters that make up each name but also order them:

 1 select ename, group_concat(c order by c separator '')
2 from (
3 select ename, substr(a.ename,iter.pos,1) c
4 from emp a,
5 ( select id pos from t10 ) iter
6 where iter.pos <= length(a.ename)
7 ) x
8 group by ename

Oracle

The function SYS_CONNECT_BY_PATH allows you to iteratively build a list:

  1 select old_name, new_name
2 from (
3 select old_name, replace(sys_connect_by_path(c,' '),' ') new_name
4 from (
5 select e.ename old_name,
6 row_number() over(partition by e.ename
7 order by substr(e.ename,iter.pos,1)) rn,
8 substr(e.ename,iter.pos,1) c
9 from emp e,
10 ( select rownum pos from emp ) iter
11 where iter.pos <= length(e.ename)
12 order by 1
13 ) x
14 start with rn = 1
15 connect by prior rn = rn-1 and prior old_name = old_name
16 )
17 where length(old_name) = length(new_name)

PostgreSQL

PostgreSQL does not offer any built-in functions to easily sort characters in a string, so it is necessary not only to walk through each string but also to know in advance the largest length of any one name. View V is used in this solution for readability:

  create or replace view V as
select x.*
from (
select a.ename,
substr(a.ename,iter.pos,1) as c
from emp a,
(select id as pos from t10) iter
where iter.pos <= length(a.ename)
order by 1,2
) x
The following select statement leverages the view:
  1 select ename,
2 max(case when pos=1 then
3 case when cnt=1 then c
4 else rpad(c,cast(cnt as integer),c)
5 end
6 else ''
7 end)||
8 max(case when pos=2 then
9 case when cnt=1 then c
10 else rpad(c,cast(cnt as integer),c)
11 end
12 else ''
13 end)||
14 max(case when pos=3 then
15 case when cnt=1 then c
16 else rpad(c,cast(cnt as integer),c)
17 end
18 else ''
19 end)||
20 max(case when pos=4 then
21 case when cnt=1 then c
22 else rpad(c,cast(cnt as integer),c)
23 end
24 else ''
25 end)||
26 max(case when pos=5 then
27 case when cnt=1 then c
28 else rpad(c,cast(cnt as integer),c)
29 end
30 else ''
31 end)||
32 max(case when pos=6 then
33 case when cnt=1 then c
34 else rpad(c,cast(cnt as integer),c)
35 end
36 else ''
37 end)
38 from (
39 select a.ename, a.c,
40 (select count(*)
41 from v b
42 where a.ename=b.ename and a.c=b.c ) as cnt,
43 (select count(*)+1
44 from v b
45 where a.ename=b.ename and b.c
SQL Server

To alphabetize rows of strings it is necessary to walk each string, and then order their characters:

  1 select ename,
2 max(case when pos=1 then c else '' end)+
3 max(case when pos=2 then c else '' end)+
4 max(case when pos=3 then c else '' end)+
5 max(case when pos=4 then c else '' end)+
6 max(case when pos=5 then c else '' end)+
7 max(case when pos=6 then c else '' end)
8 from (
9 select e.ename,
10 substring(e.ename,iter.pos,1) as c,
11 row_number() over (
12 partition by e.ename
13 order by substring(e.ename,iter.pos,1)) as pos
14 from emp e,
15 (select row_number()over(order by ename) as pos
16 from emp) iter
17 where iter.pos <= len(e.ename) 18 ) x 19 group by ename

Discussion

DB2 and SQL Server

The inline view X returns each character in each name as a row. The function SUBSTR or SUBSTRING extracts each character from each name, and the function ROW_NUMBER ranks each character alphabetically:

 ENAME  C  POS
----- - ---
ADAMS A 1
ADAMS A 2
ADAMS D 3
ADAMS M 4
ADAMS S 5


To return each letter of a string as a row, you must walk the string. This is accomplished with inline view ITER.

Now that the letters in each name have been alphabetized, the last step is to put those letters back together, into a string, in the order they are ranked. Each letter's position is evaluated by the CASE statements (lines 27). If a character is found at a particular position it is then concatenated to the result of the next evaluation (the following CASE statement). Because the aggregate function MAX is used as well, only one character per position POS is returned, so that only one row per name is returned. The CASE evaluation goes up to the number 6, which is the maximum number of characters in any name in table EMP.

MySQL

The inline view X (lines 36) returns each character in each name as a row. The function SUBSTR extracts each character from each name:

 ENAME  C
----- -
ADAMS A
ADAMS A
ADAMS D
ADAMS M
ADAMS S

Inline view ITER is used to walk the string. From there, the rest of the work is done by the GROUP_CONCAT function. By specifying an order, the function not only concatenates each letter, it does so alphabetically.
Oracle

The real work is done by inline view X (lines 511), where the characters in each name are extracted and put into alphabetical order. This is accomplished by walking the string, then imposing order on those characters. The rest of the query merely glues the names back together.

The tearing apart of names can be seen by executing only inline view X:

 OLD_NAME          RN C
---------- --------- -
ADAMS 1 A
ADAMS 2 A
ADAMS 3 D
ADAMS 4 M
ADAMS 5 S


The next step is to take the alphabetized characters and rebuild each name. This is done with the function SYS_CONNECT_BY_PATH by appending each character to the ones before it:

 OLD_NAME   NEW_NAME
---------- ---------
ADAMS A
ADAMS AA
ADAMS AAD
ADAMS AADM
ADAMS AADMS

The final step is to keep only the strings that have the same length as the names they were built from.
PostgreSQL

For readability, view V is used in this solution to walk the string. The function SUBSTR, in the view definition, extracts each character from each name so that the view returns:

 ENAME C
----- -
ADAMS A
ADAMS A
ADAMS D
ADAMS M
ADAMS S

The view also orders the results by ENAME and by each letter in each name. The inline view X (lines 1518) returns the names and characters from view V, the number of times each character occurs in each name, and its position (alphabetically):
 ename | c | cnt | pos
------+---+-----+-----
ADAMS | A | 2 | 1
ADAMS | A | 2 | 1
ADAMS | D | 1 | 3
ADAMS | M | 1 | 4
ADAMS | S | 1 | 5
The extra columns CNT and POS, returned by the inline view X, are crucial to the solution. POS is used to rank each character and CNT is used to determine the number of times the character exists in each name. The final step is to evaluate the position of each character and rebuild the name. You'll notice that each case statement is actually two case statements. This is to determine whether or not a character occursmore than once in a name; if it does, then rather than return that character, what is returned is that character appended to itself CNT times. The aggregate function, MAX, is used to ensure there is only one row per name.** If you want the Full Table detail. Refer the SQL Table in Label List. Or Click here to View the Table

0 Comments: