RSS
 

Posts Tagged ‘T-SQL’

Tally Table – String Cleaning

13 Jan

Overview
This is something that most people eventually need for reporting purposes.  This function uses a Tally table to ‘clean’ a string, removing anything you don’t specify in the @Rep parameter.  In the case below, I remove everything but letters, numbers, spaces, commas and periods.   This is a slightly modified version of a function I wrote for something else, so the specifics like replacing a lot of the double spaces, and replacing carriage returns and line breaks with spaces were specifically for that.  Remove those pieces of code if you wish, or add those as additional parameters to the function.

Other Methods
I’ll mention that I’ve seen people claim a while loop will beat this method and others that say this method is still faster.  I haven’t tested this enough personally to make the claim one way or another, but I prefer this one.  This is also one of the accepted places where a CLR function can be superior to T-SQL, but for people who can’t or don’t want to use CLR on their servers, this will still work.

Tally Method

/*
SELECT dbo.SSC_fn_TallyClean('This will remove all of this---->!)#%*)^+__#@#$+_!)#~!!! <----.','[a-zA-Z0-9. ,]') Cleaned
*/


CREATE FUNCTION SSC_fn_TallyClean(
    @A VARCHAR(500),
    @Rep VARCHAR(100))
RETURNS VARCHAR(500)

AS
BEGIN
DECLARE @B VARCHAR(500)
SET @B = '' -- Initialize @B

-- Remove Line Feed / Carriage Returns (The tally code would have removed them, but I wanted to replace them with spaces for readability.)
SET @A = REPLACE(REPLACE(@A,CHAR(10),' '),CHAR(13),' ')

SELECT @B = @B + SUBSTRING(@A,N,1)
FROM Tally
WHERE N <= DATALENGTH(@A)
    AND SUBSTRING(@A,N,1) LIKE @Rep -- Remove everything but letters, numbers, spaces, period and comma.

RETURN REPLACE(REPLACE(@B,' ',' '),' ',' ') -- Removes some double spaces.
END
 
No Comments

Posted in All

 

Tally Table – Character Date Validation

14 Dec

Tally (or numbers) tables are one of my favorite query writing tools.  Such a simple premise that can be applied to a wide variety of problems.   If you don’t know what they are, I’d recommend this article.   Here is one problem I’ve seen a few times now that you can use a Tally table to easily solve.  This is by no means the only way to do something like this, nor is it necessarily even the *best* way, but it is *a* way.

The Problem:

You have a char/varchar column that holds dates in YYYYMMDD format.  You now need to cast these values to dates.  The problem is, with no validation on the column, some of these values aren’t actually valid dates and you get conversion errors.  You’d like to go through and identify anything that didn’t follow the pattern of YYYYMMDD or wasn’t a valid date so that it can be either fixed or removed.

The Tally Solution:

The following query uses a Tally table to generate a range of valid dates that you can check your varchar field against.  The below version works in 2000+.  (You can just as easily use a CTE tally in 2005/2008)

DECLARE @StartDate datetime
DECLARE @EndDate datetime

SET @StartDate = '19910101 00:00:00'
SET @EndDate = '20250101 00:00:00'

SELECT *
FROM YourTable
WHERE vcdatecolumn NOT IN (
SELECT CAST(YEAR(dateadd(d,n,@StartDate)) AS CHAR(4)) +
RIGHT('0' + CAST(MONTH(dateadd(d,n,@StartDate)) AS VARCHAR(2)),2) +
RIGHT('0' + CAST(DAY(dateadd(d,n,@StartDate)) AS VARCHAR(2)),2)
FROM Tally
WHERE DATEADD(d,n,@StartDate) &lt;= @EndDate
)

Other Methods:
As mentioned, there are a lot of ways to do this.  Because ISDATE() jumps out as being such an easy one, I’ll mention a couple of the problems with using that method.  The main gap left is that some of the junk values (fairly common with this setup) like ’9901′ are valid dates according to ISDATE().  However, it is unlikely that this date is really supposed to be 9901/01/01.  You can add a length check, but then you run into ’9901June’ still being valid, etc. etc.  Using the Tally method, all of these are simply flagged as invalid and you can make up your mind about what to do with them once they’ve been identified.

 
2 Comments

Posted in All

 

T-SQL Challenges

01 Dec
I just (at the time of initially writing this anyways) finished up my second attempt at one of the T-SQL Challenges ( http://beyondrelational.com/blogs/tc/default.aspx ), and feel much better about this submission.  The first one worked… but it just seemed too slow.  It finished in well under a second on my dev machine, but knowing that you’ll be putting your responses up against some of the greatest minds in the industry really makes you think twice.  I thought about it a bit more and decided to come back at the problem from a completely different angle.  The second attempt was an improvement of more than 100%.  Thinking that much about a problem makes me take a step back and look at all the things I take for granted in my day to day coding.  Had this been a query written in normal development, I likely would have been perfectly happy to leave it at the first attempt.  Reastically, it probably would have been even less efficient than my first attempt, because I was already trying to pull every ounce of speed out of it the first time.  Things that you do so often that they become routine aren’t always the best way to approach every problem, and these challenges force you to really sit back and think about all the different ways you could tackle a given problem.
I personally find the logic puzzles and challenges extremely rewarding, regardless of whether I win them or not.  I get the chance to sit at home with a beer and come up with a solution to a (usually) complex problem, and then see how my peers would handle the same situation.  I would encourage anyone who regularly writes T-SQL to give these a shot.   Even if you don’t think of the best solution yourself, thinking that hard about a problem makes looking at the winning solution that much more meaningful.  They have beginner challenges as well.
Also, I believe Phil Factor is doing similar challenges (http://www.sqlservercentral.com/blogs/philfactor/archive/2009/10/25/phil-factor-sql-speed-phreak-competition.aspx).  Although I haven’t had a chance to check those out yet, I look forward to doing so in the near future.

I just (at the time of initially writing this anyways) finished up my second attempt at one of the T-SQL Challenges on BeyondRelational.com, and feel much better about this submission.  The first one worked… but it just seemed too slow.  It finished in well under a second on my dev machine, but knowing that you’ll be putting your responses up against some of the greatest minds in the industry really makes you think twice.  I thought about it a bit more and decided to come back at the problem from a completely different angle.  The second attempt was an improvement of more than 100%.  Thinking that much about a problem makes me take a step back and look at all the things I take for granted in my day to day coding.  Had this been a query written in normal development, I likely would have been perfectly happy to leave it at the first attempt.  Realistically, it probably would have been even less efficient than my first attempt, because I was already trying to pull every ounce of speed out of it the first time.  Things that you do so often that they become routine aren’t always the best way to approach every problem, and these challenges force you to really sit back and think about all the different ways you could tackle a given problem.

I personally find the logic puzzles and challenges extremely rewarding, regardless of whether I win them or not.  I get the chance to sit at home with a beer and come up with a solution to a (usually) complex problem, and then see how my peers would handle the same situation.  I would encourage anyone who regularly writes T-SQL to give these a shot.   Even if you don’t think of the best solution yourself, thinking that hard about a problem makes looking at the winning solution that much more meaningful.  They have beginner challenges as well.

Also, I believe Phil Factor is doing similar challenges.  Although I haven’t had a chance to check those out yet, I look forward to doing so in the near future.

[Edit]  I originally wrote this post a couple weeks ago (I didn’t have my Blog online at the time) when the current T-SQL Challenge was 17.  Number 18 just came out and this one looks exceptionally interesting.  Usually I can look at the challenges and 3 or 4 ways to accomplish them immediately go through my head.  Not so in this case, this one will definitely take some thought.

 
1 Comment

Posted in All