SQL survival kit

Here are basic queries to collect data from SQL.

Data collecting

SELECT * FROM DATA.TABLE;

Check NULL

  • ISNULL: replace NULL with the specified replacement value.
  • NULLIF: return NULL value IF … = …

Sort

  • TOP N / TOP N Percent: select N rows or N percentage, OFFSET, FETCH
SELECT TOP 10 Percent Name
FROM SalesLT.Product
ORDER BY Weight DESC;

SELECT Name
FROM SalesLT.Product
ORDER BY Weight DESC
OFFSET 10 ROWS FETCH NEXT 100 ROWS ONLY;

Filtering

  • WHERE, DISTINCT, BETWEEN, IN, <>=, IN, AND, OR, NOT
SELECT DISTINCT City, StateProvince
FROM SalesLT.Address;

SELECT ProductNumber, Name
FROM SalesLT.Product
WHERE Color IN ('Black', 'Red', 'White') AND Size IN ('S', 'M');

SELECT ProductNumber, Name, ListPrice
FROM SalesLT.Product
WHERE ProductNumber LIKE 'BK-[^R]%-[0-9][0-9]';

Format

  • ‘AB%’: value begin with AB string

  • ‘%AB’: value end with AB string

  • ‘%AB%’: value contain AB string

  • CONCAT: combine string

JOIN

  • [INNER] JOIN: return row matched 2 tables.

  • [OUTER] JOIN: return all rows in one table and matching rows from other tables. LEFT/RIGHT JOINS

  • CROSS JOIN: returns a Cartesian product that includes every combination of the selected columns from both tables.

  • SELF JOIN: A self-join is an inner, outer, or cross join that matches rows in a table to other rows in the same table. When defining a self-join, you must specify an alias for at least one instance of the table being joined.

figure_pic

UNION

  • Use UNION to combine the rowsets returned by mulitple queries.

  • Each unioned query must return the same number of columns with compatible data types.

  • By default, UNION eliminates duplicate rows. Specify the ALL option to include duplicates (or to avoid the overhead of checking for duplicates when you know in advance that there are none).

INTERCEPT / EXCEPT

  • Use INTERSECT to return only rows that are returned by both queries.

  • Use EXCEPT to return rows from the first query that are not returned by the second query.

  • Scalar functions return a single value based on zero or more input parameters.

  • Logical functions return Boolean values (true or false) based on an expression or column value.

SELECT ProductID, UPPER(Name) AS ProductName, ROUND(Weight, 0) AS ApproxWeight,
       YEAR(SellStartDate) as SellStartYear, DATENAME(m, SellStartDate) as SellStartMonth,
       LEFT(ProductNumber, 2) AS ProductType
FROM SalesLT.Product
WHERE ISNUMERIC(Size) = 1;
  • Window functions are used to rank rows across partitions or “windows”. Window functions include RANK, DENSE_RANK, NTILE, and ROW_NUMBER.
SELECT CompanyName, TotalDue AS Revenue,
       RANK() OVER (ORDER BY TotalDue DESC) AS RankByRevenue
FROM SalesLT.SalesOrderHeader AS SOH
JOIN SalesLT.Customer AS C
ON SOH.CustomerID = C.CustomerID;
  • Aggregate functions are used to provide summary values for mulitple rows - for example, the total cost of products or the maximum number of items in an order. Commonly used aggregate functions include SUM, COUNT, MIN, MAX, and AVG.
  • HAVING after result > < Where before result.
  • You can use GROUP BY with aggregate functions to return aggregations grouped by one or more columns or expressions.
  • All columns in the SELECT clause that are not aggregate function expressions must be included in a GROUP BY clause.
  • The order in which columns or expressions are listed in the GROUP BY clause determines the grouping hierarchy.
  • You can filter the groups that are included in the query results by specifying a HAVING clause.
SELECT Name, SUM(LineTotal) AS TotalRevenue
FROM SalesLT.SalesOrderDetail AS SOD
JOIN SalesLT.Product AS P ON SOD.ProductID = P.ProductID
WHERE P.ListPrice > 1000
GROUP BY P.Name
HAVING SUM(LineTotal) > 20000
ORDER BY TotalRevenue DESC;
  • The APPLY operator enables you to execute a table-valued function for each row in a rowset returned by a SELECT statement. Conceptually, this approach is similar to a correlated subquery.
  • CROSS APPLY returns matching rows, similar to an inner join. OUTER APPLY returns all rows in the original SELECT query results with NULL values for rows where no match was found.
SELECT ProductID, Name, StandardCost, ListPrice,
(SELECT AVG(UnitPrice)
 FROM SalesLT.SalesOrderDetail AS SOD
 WHERE P.ProductID = SOD.ProductID) AS AvgSellingPrice
FROM SalesLT.Product AS P
WHERE StandardCost >
(SELECT AVG(UnitPrice)
 FROM SalesLT.SalesOrderDetail AS SOD
 WHERE P.ProductID = SOD.ProductID)
ORDER BY P.ProductID;

SELECT CA.CustomerID, CI.FirstName, CI.LastName, A.AddressLine1, A.City
FROM SalesLT.Address AS A
JOIN SalesLT.CustomerAddress AS CA
ON A.AddressID = CA.AddressID
CROSS APPLY dbo.ufnGetCustomerInformation (CA.CustomerID) AS CI
ORDER BY CA.CustomerID;
  • Temporary Tables and Tables Variables

  • Temporary tables are prefixed with a # symbol (You can also create global temporary tables that can be accessed by other processes by prefixing the name with ##)
  • Local temporary tables are automatically deleted when the session in which they were created ends. Global temporary tables are deleted when the last user sessions referencing them is closed.
  • Table variables are prefixed with a @ symbol.
  • Table variables are scoped to the batch in which they are created.
DECLARE @Colors AS Table (Color nvarchar(15));

INSERT INTO @Colors
SELECT DISTINCT Color FROM SalesLT.Product;

SELECT ProductID, Name, Color
FROM SalesLT.Product
WHERE * IN (SELECT Color FROM @Colors);

SELECT CompanyContact, SUM(SalesAmount) AS Revenue
FROM
	(SELECT CONCAT(c.CompanyName, CONCAT(' (' + c.FirstName + ' ', c.LastName + ')')), SOH.TotalDue
	 FROM SalesLT.SalesOrderHeader AS SOH
	 JOIN SalesLT.Customer AS c
	 ON SOH.CustomerID = c.CustomerID) AS CustomerSales(CompanyContact, SalesAmount)
GROUP BY CompanyContact
ORDER BY CompanyContact;
  • Grouping Sets

  • Use GROUPING SETS to define custom groupings.
  • Use ROLLUP to include subtotals and a grand total for hierarchical groupings.
  • Use CUBE to include all possible groupings.
SELECT a.CountryRegion, a.StateProvince, a.City,
CHOOSE (1 + GROUPING_ID(a.CountryRegion) + GROUPING_ID(a.StateProvince) + GROUPING_ID(a.City),
        a.City + ' Subtotal', a.StateProvince + ' Subtotal',
        a.CountryRegion + ' Subtotal', 'Total') AS Level,
SUM(soh.TotalDue) AS Revenue
FROM SalesLT.Address AS a
JOIN SalesLT.CustomerAddress AS ca
ON a.AddressID = ca.AddressID
JOIN SalesLT.Customer AS c
ON ca.CustomerID = c.CustomerID
JOIN SalesLT.SalesOrderHeader as soh
ON c.CustomerID = soh.CustomerID
GROUP BY ROLLUP(a.CountryRegion, a.StateProvince, a.City)
ORDER BY a.CountryRegion, a.StateProvince, a.City;
  • Use PIVOT to re-orient a rowset by generating mulitple columns from values in a single column.

  • Use UNPIVOT to re-orient mulitple columns in a an existing rowset into a single column.

SELECT * FROM
(SELECT cat.ParentProductCategoryName, CompanyName, LineTotal
 FROM SalesLT.SalesOrderDetail AS sod
 JOIN SalesLT.SalesOrderHeader AS soh ON sod.SalesOrderID = soh.SalesOrderID
 JOIN SalesLT.Customer AS cust ON soh.CustomerID = cust.CustomerID
 JOIN SalesLT.Product AS prod ON sod.ProductID = prod.ProductID
 JOIN SalesLT.vGetAllCategories AS cat ON prod.ProductcategoryID = cat.ProductCategoryID) AS catsales
PIVOT (SUM(LineTotal) FOR ParentProductCategoryName
IN ([Accessories], [Bikes], [Clothing], [Components])) AS pivotedsales
ORDER BY CompanyName;
  • Use the INSERT statement to insert one or more rows into a table.
  • Use the UPDATE statement to modify the values of one or more columns in specified rows of a table.
  • Use the DELETE statement to delete specified rows in a table.
  • Use the MERGE statement to insert, update, and delete rows in a target table based on data in a source table.
INSERT INTO SalesLT.Product (Name, ProductNumber, StandardCost, ListPrice, ProductCategoryID, SellStartDate)
VALUES
('LED Lights', 'LT-L123', 2.56, 12.99, 37, GETDATE());

-- Get last identity value that was inserted
SELECT SCOPE_IDENTITY();

-- Finish the SELECT statement
SELECT * FROM SalesLT.Product
WHERE ProductID = SCOPE_IDENTITY();
  • Declare variables by using the DECLARE keyword, specifying a name (prefixed with @) and a data type. You can optionally assign an initial value.
  • Assign values to variables by using the SET keyword or in a SELECT statement.
  • Use the IF keyword to execute a task based on the results of a conditional test.
  • Enclose mulitple statements in an IF or ELSE clause between BEGIN and END keywords.
  • Use a WHILE loop if you need to repeat one or more statements until a specified condition is true.
  • Use BREAK and CONTINUE to exit or restart the loop.
  • Avoid using loops to iteratively update or retrieve single records - in most cases, you should use set-based operations to retrieve and modify data.
DECLARE @OrderDate datetime = GETDATE();
DECLARE @DueDate datetime = DATEADD(dd, 7, GETDATE());
DECLARE @CustomerID int = 1;
INSERT INTO SalesLT.SalesOrderHeader (OrderDate, DueDate, CustomerID, ShipMethod)
VALUES (@OrderDate, @DueDate, @CustomerID, 'CARGO TRANSPORT 5');
DECLARE @OrderID int = SCOPE_IDENTITY();

-- Additional script to complete
DECLARE @ProductID int = 760;
DECLARE @Quantity int = 1;
DECLARE @UnitPrice money = 782.99;

IF EXISTS (SELECT * FROM SalesLT.SalesOrderDetail WHERE SalesOrderID = @OrderID)
BEGIN
	INSERT INTO SalesLT.SalesOrderDetail (SalesOrderID, OrderQty, ProductID, UnitPrice)
	VALUES (@OrderID, @Quantity, @ProductID, @UnitPrice)
END
ELSE
BEGIN
	PRINT 'The order does not exist'
END
  • System errors have pre-defined numbers, messages, severity levels, and other characteristics that you can use to troubleshoot issues.
  • Use RAISERROR and THROW to raise custom errors.
  • Use TRY…CATCH blocks in your Transact-SQL code to catch and handle exceptions.
  • A common exception handling pattern is to log the error, and then if the operation cannot be completed successfully, throw it (or a new custom error) to the calling application.
DECLARE @OrderID int = 0
DECLARE @error varchar(25) = 'Order #' + cast(@OrderID as varchar) + ' does not exist';

-- Wrap IF ELSE in a TRY block
BEGIN TRY
    IF NOT EXISTS (SELECT * FROM SalesLT.SalesOrderHeader WHERE SalesOrderID = @OrderID)
    BEGIN
    	THROW 50001, @error, 0
    END
    ELSE
    BEGIN
    	DELETE FROM SalesLT.SalesOrderDetail WHERE SalesOrderID = @OrderID;
      DELETE FROM SalesLT.SalesOrderHeader WHERE SalesOrderID = @OrderID;
    END
END TRY
-- Add a CATCH block to print out the error
BEGIN CATCH
	-- Catch and print the error
	PRINT ERROR_MESSAGE();
END CATCH
  • Transactions are used to protect data integrity by ensuring that all data changes within a transaction succeed or fail as a unit.
  • Individual Transact-SQL statements are inherently treated as transactions, and you can define explicit transactions that encompass mulitple statements.
  • Use the BEGIN TRANSACTION, COMMIT TRANSACTION, and ROLLBACK TRANSACTION statements to manage transactions.
  • Enable the XACT_ABORT option to automatically rollback all transactions if an exception occurs.
  • Use the @@TRANCOUNT system variable and XACT_STATE system function to determine transaction status.
DECLARE @OrderID int = 0
DECLARE @error varchar(25) = 'Order #' + cast(@OrderID as varchar) + ' does not exist';

BEGIN TRY
	IF NOT EXISTS (SELECT * FROM SalesLT.SalesOrderHeader WHERE SalesOrderID = @OrderID)
	BEGIN
		THROW 50001, @error, 0
	END
	ELSE
	BEGIN
    -- Add code to treat as single transactional unit of work
	  BEGIN TRANSACTION
		DELETE FROM SalesLT.SalesOrderDetail
		WHERE SalesOrderID = @OrderID;
		DELETE FROM SalesLT.SalesOrderHeader
		WHERE SalesOrderID = @OrderID;
	  COMMIT TRANSACTION
	END
END TRY
BEGIN CATCH
	IF @@TRANCOUNT > 0
	BEGIN
		-- Rollback the transaction
		ROLLBACK TRANSACTION;
	END
	ELSE
	BEGIN
		-- Report the error
		PRINT ERROR_MESSAGE();
	END
END CATCH
  • CASE or INDEX to change the format of variable
    CAST(SC.ArtNr as varchar(35))
    
  • MOD() works in dividing big file
  • Rank(), ROW_NUMBER(), PARTITION()
WITH table1 as (
SELECT Eingangsdatum, Rabattcode, COUNT(Rabattcode) as NO1, Rank() OVER (PARTITION BY Eingangsdatum ORDER BY NO1 DESC) rank#
FROM OHBI.CUBE_Sales_Costs_TOT
WHERE Eingangsdatum between 1180105 and 1180109
GROUP BY 1,2
)
Select * from table1 where rank# < 4
Written on May 16, 2017