Tuesday, May 24, 2011

Remove Multiple Spaces using T-SQL XML Functions


String manupulation is very frequent in programming and also in t-sql sql development.
T-SQL developers and SQL DBA 's frequently require to remove multiple spaces and replace multiple spaces with single space in string variables in sql data types like varchar or nvarchar.
I have developed a sql script which can be used in order to replace multiple spaces within a string in sql.

Here is the t-sql script to remove extra spaces in sql variable @str which is in SQL Server data type nvarchar(max).
In this script the sql code replace multiple spaces with single space.

DECLARE @str NVARCHAR(MAX)
DECLARE @xml XML
SET @str = N'  remove    extra   spaces    replace    multiple  spaces   excess  spaces '

SELECT @str = REPLACE('<r>' + @str + '</r>', SPACE(1), '</r><r>')
SELECT @xml = CAST(@str AS XML);

WITH CTE AS (
  SELECT
    val
  FROM (
    SELECT
      i.value('.','nvarchar(100)') val
    FROM @xml.nodes('/r') AS x(i)
  ) p
  WHERE
    val <> ''
)
SELECT @str =
LTRIM(RTRIM((
  SELECT SPACE(1) + val FROM CTE FOR XML PATH('')
)))

SELECT @str

As you see while the input of the above t-sql script is :
'  remove    extra   spaces    replace    multiple  spaces   excess  spaces '
the output of the sql code is as :
'remove extra spaces replace multiple spaces excess spaces'

We delete the extra spaces in the sql code line containing "val <> ''".
This where condition eliminated the spaces from the string sql variable @str which is converted into XML data type in variable @xml and then converted into rows using the FROM @xml.nodes() method.

The second task to remove extra spaces from character variable is concatenating the parsed words with a single delimiter value space(1) (' ').
For string concatenation I code in sql using the FOR XML PATH() method. And while concatenating I used the SPACE character as the delimiter between words.

Actually, I did not test the performance issues related with the execution of the above sql batch script.
You should test the above remove multiple spaces script with load according to your needs and decide to use this method to replace the multiple spaces in your string variables.



1 comment:

  1. Nice, sql's xml is awesome. How about:

    SELECT CONVERT(XML, '' + REPLACE(CONVERT(NVARCHAR(MAX), (SELECT @str FOR XML PATH(''), TYPE)), ' ', '') + '').query('for $v in /r[text()!='''']/text() return concat('''', $v)').value('/', 'NVARCHAR(MAX)')

    which does it in one statement and handles entitization properly (so weird characters don't gum up the works).

    ReplyDelete