Microsoft announced a new program last year to help you understand the skills a Data Scientist needs in their daily life.  It consists of nine courses and a final project; you can get all the details on the Microsoft Academy site.  I started working on the program at the end of 2016, when things were slow at work and at home.  I completed seven of the courses before things started to pick up again at work and home.  I’ve been mid-way through the eighth course for almost four months now, having to go back to the beginning of the course a few times after being pulled away.

The program has been very informative so far, providing courses on statistics and probability, Machine Learning, Power BI, R (and Python), and general data science concepts.  I’m hoping things will slow down a bit so I can complete the program by the end of summer.

If you are at all curious about what a Data Scientist does, I highly recommend this program.  The great thing about it is that you can take all the courses for free.  That’s right, I said free, gratis, no dough required.  However, if you do opt for the free route, you don’t earn that beautiful certificate you can share with others; you just get the satisfaction of completing the courses and broadening your horizons.  Either way, it’s a good way to get started in the field of Data Science.

Posted by: sqlswimmer | June 7, 2017

Import Export Wizard Mapping Files

Recently I had to copy data from our AS400 to a SQL Server database.  Normally I would whip up a fancy SQL Server Integration Services (SSIS) project and get it hammered out pretty quickly.  Unfortunately, there were over 4,000 tables I needed to pull data from, and there was no way in HELL I was going to manually create 4,000+ packages.  Most of my BIML friends would say, “I could BIML that for you in 2 hours,” and if my BIML weren’t so rusty, I probably could have too.  But I didn’t have to do any fancy transformations on the data; I just had to copy it.  So I decided to take the “easy way out” and use the Import Export Wizard in SQL Server Management Studio.  Shoot, all I would have to do is a few clicks and be done with it, right?  Boy, was I wrong.

This post talks about the issue I ran into with SSIS Mapping Files.

We currently run DB2 on an IBM iSeries AS400 for our ERP system.  I was tasked with copying data from the AS400 to a SQL Server database for some consultants to use.  The C-Suite didn’t want to give the consultants access to our AS400, so this was the workaround that was put forth and accepted (and no, no one asked me before I was “voluntold” for the task).  Since this would essentially be a “one-time” thing, I chose to use the Import Export Wizard, but I would save the package just in case they wanted the process repeated.

I fired up the Import Export Wizard and selected my source, the IBM DB2 for i IBMDA400 OLE DB Provider.  Before you can select this data source, you must install the IBM DB2 drivers.  You can find out more about them here; unfortunately, you have to have a maintenance contract and an account with IBM before you can download them <sigh>.  It’s a straightforward install once you have the installation package.

EZ peazy, lemon squeezy.

I selected my destination, SQL Server Native Client 11.0, of course!


On a roll now, should only be another 30 seconds and I can put this project to bed.  Well, we all know that’s not what happened, otherwise you wouldn’t be reading this.

When I clicked on the Edit mappings button in the wizard to make sure all the datatypes had been mapped successfully, I got “<unknown type>” for every single column.  WTH?!  This doesn’t happen when I’m working in Visual Studio with my SSIS projects.  After some frantic googling, I found a couple of very good articles on the Mapping Files for the Import Export Wizard.

Data Type Mapping

Import Export Wizard can show numbers..

I took the advice of the articles and made copies of my Mapping Files before I modified them.  I made my modifications to include the missing column types and their respective mappings, courtesy of the Data Type Mapping article, and saved my changes.  I made sure the Import Export Wizard was closed, then started it again.  This isn’t so hard, no big deal, they’ll all populate correctly now… WHAT?!  Still <unknown type> for all columns!  At that point it became a matter of principle: I was going to solve it using this method, and I was NOT going to resort to brushing up on my BIML.

After many attempts I finally figured out what was going on.  There were actually two issues.  First, the order in which the Import Export Wizard searches through the Mapping Files.  Second, the SourceType within the Mapping File.

According to the Import Export Wizard, my source provider is IBMDA400 and it can’t find any mapping file for it, yet it has no trouble finding the Mapping file for my destination.

For the first issue, a little background on how the Import Export Wizard works.  When you select a source and destination, the wizard has to know how to map the data types from source to destination so you don’t end up with gobbledygook in your destination.  To figure that out, it searches through all the files in the following directories, depending on your architecture (I’m using SQL Server 2016, hence the 130 in the path):

C:\Program Files\Microsoft SQL Server\130\DTS\MappingFiles

C:\Program Files (x86)\Microsoft SQL Server\130\DTS\MappingFiles

The key word there is ALL the files in the directory: if you just copy your original files to the same folder (with the famous “ – copy” so courteously appended by Windows Explorer), the wizard will search through both your original AND the modified copy of the respective Mapping File.  In my case it was finding the source/destination match in the original Mapping File and completely ignoring my modified copy <sigh>.  Easy enough to fix; I moved my original “copies” to a completely different folder.

For the second issue, the SourceType within the Mapping File.  Now I will admit that I had been working on this for about 5 hours straight and had become so obsessed with making this work with the Import Export Wizard that I stopped paying attention to detail.  I want to see how long it takes you to find the issue.  This is my file that I thought should work:

[Screenshot: the Mapping File I thought should work]

This is the file that actually works:

[Screenshot: the Mapping File that actually works]

Did you find it?  How long did it take you?  It took me about an hour to figure it out.

In case you still haven’t found the issue, the answer is: the SourceType in the first file is IBMDADB2*, while the second file uses IBMDA*.  Since our source provider is IBMDA400 and the first file only matches IBMDADB2* (* is used as a wildcard), there is no match on the source.  As soon as we change the SourceType to IBMDA*, we get a match and it works.  Three little letters, that’s all it took for me to waste half a day.
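If you’ve never cracked one of these Mapping Files open, here’s an abbreviated sketch of their shape.  The element and attribute names follow the files shipped in the MappingFiles folder, but the single data type entry shown is illustrative rather than copied from the real DB2 file.  The SourceType attribute on the root element is where those three little letters live:

<?xml version="1.0" encoding="utf-8"?>
<dtm:DataTypeMappings
    xmlns:dtm="http://www.microsoft.com/SqlServer/Dts/DataTypeMapping.xsd"
    SourceType="IBMDA*"
    MinSourceVersion="*"
    MaxSourceVersion="*"
    DestinationType="SQLOLEDB;SQLNCLI*"
    MinDestinationVersion="*"
    MaxDestinationVersion="*">
  <!-- one dtm:DataTypeMapping entry per source data type; INTEGER shown as an example -->
  <dtm:DataTypeMapping>
    <dtm:SourceDataType>
      <dtm:DataTypeName>INTEGER</dtm:DataTypeName>
    </dtm:SourceDataType>
    <dtm:DestinationDataType>
      <dtm:SimpleType>
        <dtm:DataTypeName>int</dtm:DataTypeName>
      </dtm:SimpleType>
    </dtm:DestinationDataType>
  </dtm:DataTypeMapping>
</dtm:DataTypeMappings>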

Now, what I ended up doing instead of modifying the original mapping file was creating a copy of it, renaming it to something meaningful to me (while still following the naming convention of the Mapping Files), changing the SourceType value to IBMDA* and adding all the data types that were missing.  This way there will be no conflict if I ever need a mapping file with the IBMDADB2 SourceType.

I hope this helps someone else.  There are tons of posts out there about data type mapping, but none of them tell you to pay special attention to the issues I ran into.  Granted, my issues were self-created, but they were issues nonetheless.

Posted by: sqlswimmer | May 25, 2017

Speaking at SQL Saturday Atlanta (#652)

I am so excited and honored to have been selected to speak at SQL Saturday Atlanta (#652) this year.  This is a huge event where I’ve been a volunteer and attendee in the past, but this will be my first time as a speaker.

I will be presenting my session, What is Power BI?  I’ve presented this session a couple of times in the past, but I will be updating it to cover the changes that go into effect June 1, 2017.

If you are close to Atlanta on July 15, 2017, please stop by and say “Hello”, I’d love to see you.

#SQLSatATL

Posted by: sqlswimmer | March 22, 2017

SQL Saturday Richmond (#610)

On Saturday, March 18, 2017, I spoke at my very first SQL Saturday.  I have been an attendee, helped organize and volunteered at many over the years, but this was the very first time I was a speaker.  My session was What is Power BI?

I have presented this session twice before, once to my local user group and once at the Triad Developers Conference, so I was fairly comfortable with my content.  Richmond had 245 attendees registered with 5 different session tracks.  I had about 25 people in my session.  Of those 25, I only saw one nod off, but it was the first session of the day (8:30am), so I’m going to chalk that one up to not enough caffeine.  There were some great questions, and several people approached me afterward with more detailed questions and to tell me how much they enjoyed the session.  Some were so excited that, based on what they learned, they would be able to take action when they got back to work on Monday.  It really doesn’t get any better than that.

With my session in the rearview mirror, I was excited to attend other sessions.  I was able to make it into two others, both of which were fabulous.  The organizing team for SQL Saturday Richmond did a great job; the event appeared to run like clockwork.  I think maybe they’ve done this before. ;)

One thing I really liked about this event is that they did not have a typical speaker dinner and gift; they did a speaker event.  It was at G-Force Karts, which was so exciting for me because I’d never driven a go kart before.  I’ve always fancied myself a race car driver in another life (much to Martin’s dismay), so this was my opportunity to see if it was true.  All I can say is, “Yes, it’s true.”  I had so much fun.  I wish more SQL Saturday organizers would consider doing something like this.  A nice dinner is always appreciated and a gift is a thoughtful gesture, but the memories I made with my #SQLFamily at G-Force Karts are something I will NEVER forget.

I just want to thank the organizers of SQL Saturday Richmond, all the volunteers, sponsors and spouses who made this event happen.  It was truly amazing and something I will remember my entire life.  Well done.

[Photo: go-karting at the SQL Saturday Richmond speaker event]

Photo courtesy of Doug Purnell (Blog |Twitter)

Posted by: sqlswimmer | March 16, 2017

Check Those Settings

Recently, I was tasked with “enhancing” a third party application.  This third party application (TPA) outputs a bunch of files to a file share in a way that makes sense to the application, but makes no sense to a human. 

The Task

Make a copy of these files in a new location that makes sense to humans.

The Rules of Engagement

  • Do not modify any of the existing files or file structures created by the TPA.
  • Do not modify any of the TPA database objects.
  • Do not add any objects to the TPA database.

The first thing that popped into my head was, “I can do that in PowerShell in less than 5 minutes.”  Kind of like the old game show Name That Tune, my confidence level was high.  Little did I know what was in store for me: less than 5 minutes turned into more than 5 hours.

In order to make the files and file structure make sense to a human, I had to use a stored procedure from the TPA to decode some bits.  Easy enough in PowerShell; just use my favorite cmdlet from dbatools.io, Invoke-SqlCmd2.  But wait, one minor detail: I was not allowed to install any snap-ins or any other tools on the server where this would run.  In fact, I couldn’t even use the latest version of PowerShell; I was stuck with PowerShell 2.0 <sigh>.

After I dusted off my PowerShell 2.0 documentation, I got my script written and started testing.  I processed several folders and their files before I received the following error while running my PowerShell script:

Invoke-Sqlcmd : String or binary data would be truncated.
The statement has been terminated.
At line:127 char:36
+ … MyResults = Invoke-Sqlcmd -ServerInstance $ServerName `
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [Invoke-Sqlcmd], SqlPowerShellSqlExecutionException
    + FullyQualifiedErrorId : SqlError,Microsoft.SqlServer.Management.PowerShell.GetScriptCommand

Interesting.  I added some Write-Host statements for troubleshooting and found the offending entry.  Like any good programmer, I tested my stored procedure call in SQL Server Management Studio (SSMS) to make sure it really was a SQL Server error, and guess what?  It worked just fine!  No errors whatsoever.  WTH?!  This is where my tunnel vision set in: if it works in SSMS but not in PowerShell, then PowerShell must be the problem, right?  Well, sort of.

After repeatedly running the same piece of code and expecting different results (yes, like I said, tunnel vision), I threw my hands up and quit for the day.  I had restless dreams that night; I was being chased by a giant SQLString Truncator (a very rare dinosaur from the Esoteric era).  I woke with a start at 4:30am.  I had to be missing something.  All I can say is, thank goodness for Twitter and #SQLHelp.  I tweeted my issue and got immediate responses from some very smart folks, but nothing that resolved my issue, until I read between the lines of a tweet from Robert Davis:

[Screenshot: tweet from Robert Davis]

That’s when the light bulb went on and Robert sent his follow up tweet:

[Screenshot: Robert Davis’s follow-up tweet]

I copied all my session settings from SSMS and added them to my PowerShell script.  One by one I commented them out until I was left with just one.  Lo and behold, that SQLString Truncator was really one of those pesky ARITHABORT Biters.
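For anyone who hasn’t been bitten by this one before: SSMS turns ARITHABORT on by default, while most other client connections (including Invoke-Sqlcmd) leave it off, so the same stored procedure call can get a different plan and behave differently depending on who’s calling.  Here’s a minimal sketch of the fix, with a hypothetical procedure name standing in for the TPA’s stored procedure:

-- Match the relevant SSMS session setting before calling the procedure.
-- SSMS sets ARITHABORT ON by default; most client libraries do not.
SET ARITHABORT ON;
EXEC dbo.DecodeTpaBits @FileId = 42;  -- hypothetical stand-in for the TPA procedure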

Lessons Learned

  • As soon as that tunnel vision kicks in, you need to stop what you are doing and take a break.
  • Ask for help, don’t keep beating your head against the wall.
  • Most importantly, don’t forget about your settings.  They can make all the difference in the world.
  • ARITHABORT Biters are much harder to catch in the wild than SQLString Truncators.
Posted by: sqlswimmer | March 11, 2017

Next Stop, SQL Saturday Richmond

I can’t believe it’s almost time for SQL Saturday #610.  I’ll be there presenting What is Power BI?  If you are in the Richmond area Saturday, March 18, 2017, please stop by and say “Hello”, I’d love to see you.  There are tons of other sessions as well, so sit a spell and get your SQL Learnin’ On!

Posted by: sqlswimmer | March 10, 2017

Triad Developers Conference – My Debut

I did it!  I did my first “real world” presentation this morning at the Triad Developers Conference in Winston-Salem.  What I mean by “real world” is not a PASS audience.  These were total strangers off the street that I didn’t know.  Well, there were some familiar faces and even a friend or two, but for the most part they were total and complete strangers with varying backgrounds, not all technical in nature.

The feedback I received was very positive and even helpful, so I can make this presentation even better when I present it in Richmond, VA next weekend at SQL Saturday #610.

Huge thank you goes out to the organizers, volunteers and sponsors who made this event happen.  And a special thank you goes out to Doug Purnell (Blog | Twitter) for recommending me in the first place.

Posted by: sqlswimmer | February 9, 2017

Speaking at Triad Developers Conference

I am honored to have been recommended by a colleague and selected as a speaker for the Triad Developers Conference in Winston-Salem on March 10, 2017.

The Triad Developers Conference is a low-cost, one-day learning event put on by a myriad of local user groups in the Piedmont Triad area of North Carolina.  I attended the inaugural conference two years ago and it was fantastic.  So excited to be a speaker this year.

If you are in the Winston-Salem area on Friday, March 10, 2017, stop and say “Hi”.  I’d love to see you.

Posted by: sqlswimmer | January 31, 2017

Pesky Percent File Growth

As DBAs, we all know that setting your file growth to grow by percent is not optimal.  It can cause all kinds of issues, which rear their ugly heads as performance problems (see these articles by Brent Ozar and Tim Ford).  So when I have to support a third party application that automatically adds data files using percent growth instead of a fixed size, it really irritates me.  I got tired of seeing these new files show up on my daily exceptions report, so I decided to do something about it.  This post explains what I did.

I have a home grown process that collects all kinds of information about my servers on a daily basis.  Once that process is complete, it sends reports via email so I can get a quick look at things when I first arrive at work in the morning.  One of those reports is my file exception report.  It flags things like excessive data/log file growth, data/log files that are almost full, data/log files that use percent file growth, etc.  The first time a file showed up on my exceptions report with percent file growth, I decided I needed to be notified before the report landed in my inbox, so I created a server level trigger that fires on the ALTER DATABASE command.  This trigger captures the relevant information and sends me an email.  Here’s the code I used for my trigger:

CREATE TRIGGER [ddl_trig_alterdatabase]
ON ALL SERVER
FOR ALTER_DATABASE
AS
   DECLARE @Subject nvarchar(255)
      , @Body nvarchar(MAX)
   SELECT @Subject = N'A database was altered on ' + @@Servername
      , @Body = EVENTDATA().value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]','nvarchar(max)')
   exec msdb.dbo.sp_send_dbmail
      @recipients = 'myemail@emaildomain.com', -- varchar(max)
      @subject = @Subject, -- nvarchar(255)
      @body = @Body
GO
ENABLE TRIGGER [ddl_trig_alterdatabase] ON ALL SERVER
GO


This worked great: I found out before my report showed up and I could address the issue when it happened.  Unfortunately, I discovered that one of the applications was making this change in the middle of the night.  I certainly didn’t want to be woken in the middle of the night to address the issue, since it really isn’t a “production down” type of problem (and let’s face it, no DBA wants to be woken in the middle of the night for anything, let alone something that is not production down).

I decided I needed to do something other than just send an email notification; I needed to take corrective action when it occurred.  So I wrote a little stored procedure that takes the ALTER DATABASE statement as a parameter, parses it and takes the appropriate corrective action.

Simple enough, right?  Now I just need to add the call to my newly created stored procedure in my server level trigger and we are good to go.  But wait, you can’t ALTER a database within an ALTER DATABASE statement (don’t believe me?  Use this as a learning exercise to see what happens when you try).  So what could I do?  There are several things you could do, but I chose to create a table to hold the newly constructed ALTER DATABASE statement and insert a record there.  Then I created a SQL Agent job that runs once every hour, reads that table, executes any entries it finds and deletes them after they execute successfully, as sketched below.
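Here’s a minimal sketch of that table and the hourly job step.  Only the SQLText column is required by the stored procedure below; the Id and CreatedOn columns are my additions for ordering and auditing:

-- Holding table for the corrective ALTER DATABASE statements
CREATE TABLE dbo.DBAAlterDatabase
(
   Id INT IDENTITY(1,1) NOT NULL PRIMARY KEY
   , SQLText NVARCHAR(MAX) NOT NULL
   , CreatedOn DATETIME NOT NULL DEFAULT (GETDATE())
)
GO

-- Hourly SQL Agent job step: execute each queued statement,
-- then delete it once it has run successfully
DECLARE @Id INT
   , @SQLText NVARCHAR(MAX)
WHILE EXISTS (SELECT 1 FROM dbo.DBAAlterDatabase)
BEGIN
   SELECT TOP (1) @Id = Id, @SQLText = SQLText
   FROM dbo.DBAAlterDatabase
   ORDER BY Id

   EXEC sys.sp_executesql @SQLText

   DELETE FROM dbo.DBAAlterDatabase
   WHERE Id = @Id
END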

Here’s the code for my stored procedure:

CREATE PROCEDURE [dbo].[ChangePercentGrowthMaxSizeUnlimited]
@SQLText nvarchar(max)
AS
SET NOCOUNT ON

/* We start with something like this
ALTER DATABASE [DatabaseName]
ADD FILE (NAME = N'DataLogFileName'
         ,FILENAME = N'X:\DataLogFileName.ndf'
         , SIZE = 20
         , FILEGROWTH = 5%
         , MAXSIZE = UNLIMITED)
*/

/*  We want to produce something like this
ALTER DATABASE [DatabaseName]
MODIFY FILE ( NAME = N'DataLogFileName'
            , MAXSIZE = 102400KB
            , FILEGROWTH = 10240KB )
*/

-- Local Vars
DECLARE @AddFileText VARCHAR(8) = 'ADD FILE'
   , @ContainsAddFileText BIT = 0
   , @AddFileStartPosition BIGINT
   , @FileGrowthPercentText VARCHAR(16) = 'FILEGROWTH = %!%' -- !% is an escaped literal percent sign
   , @ContainsFileGrowthPercent BIT = 0
   , @MaxSizeUnlimitedText VARCHAR(19) = 'MAXSIZE = UNLIMITED'
   , @ContainsMaxSizeUnlmitedText BIT = 0
   , @StartPosition INT
   , @EndPosition INT
   , @Length INT
   , @DatabaseName VARCHAR(128)
   , @FileName VARCHAR(128)
   , @AlterDatabaseLength INT = LEN('ALTER DATABASE ')
   , @AlterDatabaseSQL NVARCHAR(MAX)

   -- Is it an ADD FILE operation?
   SELECT @AddFileStartPosition = PATINDEX('%' + @AddFileText + '%', @SQLText)

   IF @AddFileStartPosition > 0
   BEGIN
      -- It's an ADD FILE operation
      SET @ContainsAddFileText = 1

      IF @SQLText LIKE '%' + @FileGrowthPercentText + '%' ESCAPE '!'
      BEGIN
         -- It's adding a file using percent file growth
         SET @ContainsFileGrowthPercent = 1

         -- Is it setting MAXSIZE to UNLIMITED?
         IF PATINDEX('%' + @MaxSizeUnlimitedText + '%', @SQLText) > 0
         BEGIN
            SET @ContainsMaxSizeUnlmitedText = 1
         END

         -- Now we need to parse the ADD FILE expression and build a MODIFY FILE operation from the parts
         -- Get database name
         SELECT @StartPosition = @AlterDatabaseLength + 1
         SELECT @Length = CHARINDEX('ADD FILE', @SQLText, @AlterDatabaseLength) - @AlterDatabaseLength
         SELECT @DatabaseName = LTRIM(RTRIM(SUBSTRING(@SQLText, @AlterDatabaseLength + 2, @AddFileStartPosition - 1 - @AlterDatabaseLength - 2)))

         -- Get filename
         -- Start by finding the start of the logical file name
         SELECT @StartPosition = CHARINDEX('''', @SQLText, PATINDEX('%' + '[^FILE]NAME %''' + '%', @SQLText)) + 1
         SELECT @Length = CHARINDEX('''', @SQLText, @StartPosition) - @StartPosition
         SELECT @FileName = SUBSTRING(@SQLText, @StartPosition, @Length)

         -- Now create the ALTER DATABASE operation
         SELECT @AlterDatabaseSQL = N'ALTER DATABASE ' + @DatabaseName + N' MODIFY FILE ( NAME = N''' + @FileName + N''', '
         IF @ContainsFileGrowthPercent = 1
            SELECT @AlterDatabaseSQL = @AlterDatabaseSQL + N'FILEGROWTH = 10240KB'

         IF @ContainsMaxSizeUnlmitedText = 1 AND @ContainsFileGrowthPercent = 1
            SELECT @AlterDatabaseSQL = @AlterDatabaseSQL + N', MAXSIZE = 102400KB'
         ELSE
            IF @ContainsMaxSizeUnlmitedText = 1 AND @ContainsFileGrowthPercent = 0
               SELECT @AlterDatabaseSQL = @AlterDatabaseSQL + N'MAXSIZE = 102400KB'

         SELECT @AlterDatabaseSQL = @AlterDatabaseSQL + N' )'

         INSERT dbo.DBAAlterDatabase
         (SQLText)
         VALUES
         (@AlterDatabaseSQL)
      END
   END
   ELSE
   BEGIN
      -- It's not an ADD FILE operation
      PRINT 'It''s not an ADD FILE operation, no work to do.'
   END

RETURN 0


Here’s the code for my modified server level trigger:

CREATE TRIGGER [ddl_trig_alterdatabase]
ON ALL SERVER
FOR ALTER_DATABASE
AS
   DECLARE @Subject nvarchar(255)
      , @Body nvarchar(MAX)
   SELECT @Subject = N'A database was altered on ' + @@Servername
      , @Body = EVENTDATA().value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]','nvarchar(max)')
   exec msdb.dbo.sp_send_dbmail
      @recipients = 'myemail@emaildomain.com', -- varchar(max)
      @subject = @Subject, -- nvarchar(255)
      @body = @Body
   EXEC dbo.ChangePercentGrowthMaxSizeUnlimited @SQLText = @Body
GO
ENABLE TRIGGER [ddl_trig_alterdatabase] ON ALL SERVER
GO

Works like a charm!  But wait, you might notice that I’m making more than a few assumptions in my stored procedure, and you would be correct.  I feel like I need to add a disclaimer to this post, the same way they add disclaimers to pharmaceutical commercials.

Here are my assumptions: 

  1. Only one file is being created at a time. 
  2. We always want to change our file growth to 10MB. 
  3. We don’t want a max file size of unlimited and we always want to set our max file size to 100MB. 
  4. There will be no errors.
  5. I am in NO way responsible if this code breaks something on your server.

Explanation of assumptions:

  1. What fun would it be if I did all the hard work for you?  This can easily be adapted to work with multiple files being created at the same time.  You can do it, I have faith in you.
  2. For my particular instance, I know the model database settings for this server and I hard coded them, because that’s what I wanted.  You could easily adapt the code to use your model database settings or any other value for that matter (HINT: think sys.sysfiles; see the sketch after this list).
  3. See explanation of assumption 2 above.
  4. I removed all my standard stored procedure framework code (which includes error checking) for brevity.  You should ALWAYS have error checking in your stored procedures!
  5. You should NEVER assume code is not malicious in nature and add it to production without a thorough understanding of what it’s doing.  Shame on you if you did.
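To expand on the hint in assumption 2, here’s one hedged way to read the model database’s file settings instead of hard coding 10240KB and 102400KB.  This sketch uses the newer sys.database_files catalog view rather than the deprecated sys.sysfiles:

-- growth and max_size are in 8KB pages when is_percent_growth = 0,
-- so multiply by 8 to get KB; max_size = -1 means UNLIMITED
SELECT name
   , CASE WHEN is_percent_growth = 1
        THEN CAST(growth AS VARCHAR(10)) + '%'
        ELSE CAST(growth * 8 AS VARCHAR(10)) + 'KB'
     END AS FileGrowth
   , CASE WHEN max_size = -1
        THEN 'UNLIMITED'
        ELSE CAST(max_size * 8 AS VARCHAR(10)) + 'KB'
     END AS MaxSize
FROM model.sys.database_files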


Posted by: sqlswimmer | January 31, 2017

My First SQL Saturday – As A Speaker!

I am so honored and excited to have been selected to speak at SQL Saturday Richmond on March 18, 2017.

I will be presenting my What is Power BI? session.  There have been a ton of changes to Power BI since I last presented this, so I’m off to start updating my slide deck. 

If you are close to Richmond, VA on March 18, 2017 and want to learn more about what Power BI is, please register for this event and stop by and see me.  I’d love to have you as would the SQL Saturday Richmond team.
