Recently I was working on a project that had many different input tables that needed to be filtered into one. I’ve worked on similar projects before and it’s usually not too bad to compare a few tables. But this time, with so many different tables, using my normal approach would have been too time-consuming. I created a more effective method for comparing a large number of different SQL Server tables.
This blog reviews a script used in MSSQL to compare all columns and datatypes per table in an easy-to-use format. Here’s an example of the output:
The first thing to point out is the @Orientation variable; it can be set to either H or V. That is to determine whether column names will be displayed as rows (H) or columns (V). After that, temp tables are needed to store table and column data to keep later queries simple and clean.
1 2 3 4 5 6 |
CREATE TABLE #Tables( ID INT IDENTITY(1,1), TableName VARCHAR(MAX), ColumnName VARCHAR(MAX), ColumnType VARCHAR(MAX), ColumnLength INT, ColumnPercision INT, ColumnScale INT, DataType VARCHAR(MAX) ); CREATE TABLE #Columns( ID INT IDENTITY(1,1), TableName VARCHAR(MAX), ColumnName VARCHAR(MAX), ColumnType VARCHAR(MAX), COLTXT VARCHAR(MAX) ); |
#Tables is used for storing all of the tables you wish to compare along with their column information. #Columns is used for storing the combined columns of all of the tables. Be mindful of the number of tables you’re comparing because there is a max column limit to MSSQL table (1024 columns).
Below, #table is filled by selecting the table and column data. Then pull in the length, precision, and scale to see if one table column could be different than another. The DataType column is used for the display of the datatype fill to change it how you like. #Column is simply a distinct select on the values for the columns, which can be switched based on the @orientation variable.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
INSERT INTO #Tables( TableName, ColumnName, ColumnType, ColumnLength, ColumnPercision, ColumnScale ) SELECT T.name, C.name, CT.name, C.max_length, C.[precision], C.scale FROM SYS.tables T INNER JOIN SYS.columns C ON T.object_id = C.object_id INNER JOIN SYS.types CT ON C.system_type_id = CT.system_type_id WHERE T.name LIKE '%REPORT[0-9]%'; UPDATE #Tables SET DataType = CASE WHEN ColumnType IN ('varchar','char') THEN ColumnType + '( ' + CAST(ColumnLength AS VARCHAR) + ' )' WHEN ColumnType IN ('datetime2') THEN ColumnType + '( ' + CAST(ColumnScale AS VARCHAR) + ' )' WHEN ColumnType IN ('decimal') THEN ColumnType + '( ' + CAST(ColumnPercision AS VARCHAR) + ', ' + CAST(ColumnScale AS VARCHAR) + ' )' ELSE ColumnType END; INSERT INTO #Columns( TableName, ColumnName, ColumnType, COLTXT ) SELECT 'tmpTblComparison', ColumnName, 'varchar', '[' + ColumnName + '] [varchar](100)' COLTXT FROM ( SELECT DISTINCT CASE WHEN @Orientation = 'H' THEN ColumnName ELSE TableName END ColumnName FROM #Tables ) A ORDER BY ColumnName; |
After the temp tables are full, create a physical table to store the values. Since the columns are dynamic based on the comparison, the table needs to be made through dynamic SQL. This could also be done by using another temp table.
1 2 3 4 5 6 7 8 9 10 11 12 |
SET @SQL = (SELECT 'CREATE TABLE [dbo].[tmpTblComparison] ([ComparisonValue] [varchar](100) NOT NULL' + ( SELECT ', ' + B.COLTXT FROM #Columns B WHERE B.TableName = A.TableName ORDER BY B.ColumnType DESC, B.ColumnName FOR XML PATH('') ) + ');' FROM #Columns A WHERE A.TableName = 'tmpTblComparison' GROUP BY A.TableName); EXEC( @SQL ); |
Next, the table needs to be filled, but this requires building more Dynamic SQL and even a PIVOT select. If you are not familiar with PIVOT statements in MSSQL or how to create Dynamic ones, please refer to an earlier article: SQL Server PIVOT and UNPIVOT.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
SELECT @Columns_Pivot = ( SELECT ', [' + B.ColumnName + ']' FROM #Columns B ORDER BY B.ColumnName FOR XML PATH('') ), @Columns_Select = CASE WHEN @Orientation = 'H' THEN 'TableName' ELSE 'ColumnName' END + ( SELECT ', ISNULL(MAX([' + B.ColumnName + ']), '''') ' + B.ColumnName FROM #Columns B ORDER BY B.ColumnName FOR XML PATH('') ); SET @Columns_Insert = @Columns_Pivot; SET @Columns_Pivot = RIGHT(@Columns_Pivot, LEN(@Columns_Pivot) - 2); SET @SQL = N'INSERT INTO tmpTblComparison( [ComparisonValue]' + @Columns_Insert + ')'; SET @SQL += N' SELECT ' + @Columns_Select; SET @SQL += N' FROM #Tables A'; SET @SQL += N' PIVOT( MAX(DATATYPE) FOR ' + CASE WHEN @Orientation = 'H' THEN 'ColumnName' ELSE 'TableName' END + ' IN ( ' + @Columns_Pivot + ' ) ) P' SET @SQL += N' GROUP BY ' + CASE WHEN @Orientation = 'H' THEN 'TableName' ELSE 'ColumnName' END; SET @SQL += N' ORDER BY ' + CASE WHEN @Orientation = 'H' THEN 'TableName' ELSE 'ColumnName' END + ';'; EXEC( @SQL ); SELECT * FROM [tmpTblComparison]; |
You now have an easy format to compare multiple tables and their columns. For the full script, please click here. I hope this helps you as much as it has helped me. I really enjoy writing scripts like this to help myself and anyone else who has similar struggles.
Save