SQL Server : Separating Numeric and Character Data

Problem

You have (unfortunately) stored numeric data along with character data together in one column. You want to separate the character data from the numeric data. Consider the following result set:

 DATA
---------------
SMITH800
ALLEN1600
WARD1250
JONES2975
MARTIN1250
BLAKE2850
CLARK2450
SCOTT3000
KING5000
TURNER1500
ADAMS1100
JAMES950
FORD3000
MILLER1300

You would like the result to be:

 ENAME             SAL
---------- ----------
SMITH 800
ALLEN 1600
WARD 1250
JONES 2975
MARTIN 1250
BLAKE 2850
CLARK 2450
SCOTT 3000
KING 5000
TURNER 1500
ADAMS 1100
JAMES 950
FORD 3000
MILLER 1300

Solution

Use the built-in functions TRANSLATE and REPLACE to isolate the character from the numeric data. Like other recipes in this chapter, the trick is to use TRANSLATE to transform multiple characters into a single character you can reference. This way you are no longer searching for multiple numbers or characters, rather one character to represent all numbers or one character to represent all characters.

DB2

Use the functions TRANSLATE and REPLACE to isolate and separate the numeric from the character data:

  1 select replace(
2 translate(data,'0000000000','0123456789'),'0','') ename,
3 cast(
4 replace(
5 translate(lower(data),repeat('z',26),
6 'abcdefghijklmnopqrstuvwxyz'),'z','') as integer) sal
7 from (
8 select ename||cast(sal as char(4)) data
9 from emp
10 ) x

Oracle

Use the functions TRANSLATE and REPLACE to isolate and separate the numeric from the character data:

  1 select replace(
2 translate(data,'0123456789','0000000000'),'0') ename,
3 to_number(
5 replace(
6 translate(lower(data),
7 'abcdefghijklmnopqrstuvwxyz',
8 rpad('z',26,'z')),'z')) sal
9 from (
10 select ename||sal data
11 from emp
12 )

PostgreSQL

Use the functions TRANSLATE and REPLACE to isolate and separate the numeric from the character data:

  1 select replace(
2 translate(data,'0123456789','0000000000'),'0','') as ename,
3 cast(
4 replace(
5 translate(lower(data),
6 'abcdefghijklmnopqrstuvwxyz',
7 rpad('z',26,'z')),'z','') as integer) as sal
8 from (
9 select ename||sal as data
10 from emp
11 ) x

Discussion

The syntax is a bit different for each DBMS, but the technique is the same. I will use the solution for Oracle in the discussion section. The key to solving this problem is to isolate the numeric and character data. You can use TRANSLATE and REPLACE to do this. To extract the numeric data, first isolate all character data using TRANSLATE:

 
select data,
translate(lower(data),
'abcdefghijklmnopqrstuvwxyz',
rpad('z',26,'z')) sal
from (select ename||sal data from emp)


DATA SAL
-------------------- -------------------
SMITH800 zzzzz800
ALLEN1600 zzzzz1600
WARD1250 zzzz1250
JONES2975 zzzzz2975
MARTIN1250 zzzzzz1250
BLAKE2850 zzzzz2850
CLARK2450 zzzzz2450
SCOTT3000 zzzzz3000
KING5000 zzzz5000
TURNER1500 zzzzzz1500
ADAMS1100 zzzzz1100
JAMES950 zzzzz950
FORD3000 zzzz3000
MILLER1300 zzzzzz1300

By using TRANSLATE you convert every non-numeric character into a lowercase Z. The next step is to remove all instances of lowercase Z from each record using REPLACE, leaving only numerical characters that can then be cast to a number:

 
select data,
to_number(
replace(
translate(lower(data),
'abcdefghijklmnopqrstuvwxyz',
rpad('z',26,'z')),'z')) sal
from (select ename||sal data from emp)


DATA SAL
-------------------- ----------
SMITH800 800
ALLEN1600 1600
WARD1250 1250
JONES2975 2975
MARTIN1250 1250
BLAKE2850 2850
CLARK2450 2450
SCOTT3000 3000
KING5000 5000
TURNER1500 1500
ADAMS1100 1100
JAMES950 950
FORD3000 3000
MILLER1300 1300

To extract the non-numeric characters, isolate the numeric characters using TRANSLATE:

 
select data,
translate(data,'0123456789','0000000000') ename
from (select ename||sal data from emp)


DATA ENAME
-------------------- ----------
SMITH800 SMITH000
ALLEN1600 ALLEN0000
WARD1250 WARD0000
JONES2975 JONES0000
MARTIN1250 MARTIN0000
BLAKE2850 BLAKE0000
CLARK2450 CLARK0000
SCOTT3000 SCOTT0000
KING5000 KING0000
TURNER1500 TURNER0000
ADAMS1100 ADAMS0000
JAMES950 JAMES000
FORD3000 FORD0000
MILLER1300 MILLER0000

By using TRANSLATE you convert every numeric character into a zero. The next step is to remove all instances of zero from each record using REPLACE, leaving only non-numeric characters:

 
select data,
replace(translate(data,'0123456789','0000000000'),'0') ename
from (select ename||sal data from emp)


DATA ENAME
-------------------- -------
SMITH800 SMITH
ALLEN1600 ALLEN
WARD1250 WARD
JONES2975 JONES
MARTIN1250 MARTIN
BLAKE2850 BLAKE
CLARK2450 CLARK
SCOTT3000 SCOTT
KING5000 KING
TURNER1500 TURNER
ADAMS1100 ADAMS
JAMES950 JAMES
FORD3000 FORD
MILLER1300 MILLER

Put the two techniques together and you have your solution.

** If you want the Full Table detail. Refer the SQL Table in Label List. Or Click here to View the Table

0 Comments: