Wednesday, March 25, 2009

What’s ownership chaining?

Edit: 04-27-2009 K. Brian Kelley correctly pointed out in a comment that objects can still be owned by users in 2005 and later.  See his blog post to see how.

Edit:  While the code section is not as nice as I would like, it is better.  Thanks for sticking with me.

I recently answered a question on SSC about how granting execute rights on a stored procedure allows you to access tables without having any permissions granted on the tables.  SQL Server implements this through ownership chaining.  Here is the BOL (SQL Server 2005) definition of Ownership Chains:
When multiple database objects access each other sequentially, the sequence is known as a chain. Although such chains do not independently exist, when SQL Server 2005 traverses the links in a chain, SQL Server evaluates permissions on the constituent objects differently than it would if it were accessing the objects separately. These differences have important implications for managing security.
Ownership chaining enables managing access to multiple objects, such as multiple tables, by setting permissions on one object, such as a view. Ownership chaining also offers a slight performance advantage in scenarios that allow for skipping permission checks.

How Permissions Are Checked in a Chain

When an object is accessed through a chain, SQL Server first compares the owner of the object to the owner of the calling object. This is the previous link in the chain. If both objects have the same owner, permissions on the referenced object are not evaluated.
Prior to SQL Server 2005, objects were owned by users, and, in most cases, designers and developers used the dbo user to own common objects.  In my experience it was very rare for objects to be owned by a “real” user. Thus if the customers and orders tables are both owned by dbo, you can create a customer_orders_get stored procedure owned by dbo, grant user Joe execute permissions on the stored procedure, and Joe will be able to use the stored procedure to query the customers and orders tables without having select permissions on the tables.  This is pretty easy to understand: as long as all the objects are dbo.object_name, I can access them through ownership chaining.
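A minimal sketch of that scenario, using the hypothetical table, procedure, and user names from the paragraph above:

```sql
-- both tables and the procedure are owned by dbo
CREATE TABLE dbo.customers (customer_id INT, name VARCHAR(50)) ;
CREATE TABLE dbo.orders (order_id INT, customer_id INT) ;
GO

CREATE PROCEDURE dbo.customer_orders_get
AS
    SELECT c.customer_id, c.name, o.order_id
    FROM dbo.customers AS c
    INNER JOIN dbo.orders AS o ON o.customer_id = c.customer_id ;
GO

-- Joe gets EXECUTE only; because dbo owns the procedure
-- and both tables, SELECT on the tables is never checked
GRANT EXECUTE ON dbo.customer_orders_get TO Joe ;
```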
SQL Server 2005 introduced schemas, which removed users as the direct owners of objects.  Now objects belong to schemas, not users.  From BOL:
Beginning in SQL Server 2005, each object belongs to a database schema. A database schema is a distinct namespace that is separate from a database user. You can think of a schema as a container of objects. Schemas can be created and altered in a database, and users can be granted access to a schema. A schema can be owned by any user, and schema ownership is transferable.
The question now is, how does ownership chaining work when using schemas?  As the last line of the BOL quote states, a schema is owned as well, and ownership chaining in SQL Server 2005 and greater is based on schema ownership.  So if schema A is owned by dbo then you would be able to use ownership chaining to access objects in dbo from schema A and vice versa.  I think the best way to explain is with an example:
USE [master]
GO

-- drop the DB if it exists
IF EXISTS ( SELECT name
            FROM sys.databases
            WHERE name = N'oc_test' )
    DROP DATABASE [oc_test]
GO

USE [master]
GO

-- create the test database
CREATE DATABASE [oc_test] ON PRIMARY
(
    NAME = N'oc_test',
    FILENAME = N'C:\Data\oc_test.mdf',
    SIZE = 194560KB,
    MAXSIZE = 1024000KB,
    FILEGROWTH = 10240KB
)
LOG ON
(
    NAME = N'oc_test_log',
    FILENAME = N'C:\Logs\oc_test_log.ldf',
    SIZE = 204800KB,
    MAXSIZE = 204800KB,
    FILEGROWTH = 10240KB
)
GO

-- switch to the new database
USE oc_test ;

GO

-- drop the user if it exists
IF EXISTS ( SELECT 1
            FROM sys.database_principals AS DP
            WHERE DP.[name] = 'test_user' )
BEGIN
    DROP USER test_user ;
END

-- create our test user    
CREATE USER test_user WITHOUT LOGIN 
WITH DEFAULT_SCHEMA = dbo ;
GO

-- drop proc and schema
IF OBJECT_ID('ownership_chaining.table_one_ins') IS NOT NULL
BEGIN
    DROP PROCEDURE ownership_chaining.table_one_ins ;
END

GO

IF SCHEMA_ID('ownership_chaining') IS NOT NULL
BEGIN
    DROP SCHEMA ownership_chaining ;
END
GO

-- create our test schema, notice it is "owned" by dbo
CREATE SCHEMA ownership_chaining AUTHORIZATION dbo ;

GO

-- drop our test table if it exists
IF OBJECT_ID('dbo.table_one') IS NOT NULL
BEGIN
    DROP TABLE dbo.table_one ;
END

/*
create a test table NOTICE it is in the dbo 
schema which is owned by dbo
*/
CREATE TABLE dbo.table_one
(
    column_one INT IDENTITY(1, 1),
    column_two VARCHAR(10) NOT NULL
) ;

GO

/* 
Create the test procedure in the 
ownership_chaining schema which 
is owned by dbo
*/    
CREATE PROCEDURE ownership_chaining.table_one_ins
    @column_two VARCHAR(10)
AS
    INSERT INTO dbo.table_one (column_two)
    VALUES (@column_two) ;

    RETURN ;

GO

/*
give our test user execute permissions on the
procedure
*/
GRANT EXECUTE ON ownership_chaining.table_one_ins
    TO test_user ;

GO

-- change user context
EXECUTE AS USER = 'test_user' ;

-- execute the procedure
EXEC ownership_chaining.table_one_ins
    @column_two = 'Test 1' ;

-- go back to my login
REVERT ;

-- verify that the insert succeeded, which it should
SELECT *
FROM dbo.table_one AS _TO ;

/*
Now change the owner of the ownership_chaining 
schema so that it is no longer owned by dbo 
thus breaking the chain
*/
ALTER AUTHORIZATION ON SCHEMA::ownership_chaining 
TO db_owner ;

-- change user context
EXECUTE AS USER = 'test_user' ;

/*
using TRY...CATCH so the REVERT still runs,
as I expect this call to fail
*/
BEGIN TRY
    EXEC ownership_chaining.table_one_ins
        @column_two = 'Test 1' ;
END TRY
BEGIN CATCH
    -- show the error message
    SELECT ERROR_MESSAGE() AS error ;

    -- return to my user
    REVERT ;
END CATCH

-- verify data
SELECT *
FROM dbo.table_one AS _TO ;

I hope this helps clear up what I found to be a somewhat confusing topic.
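If you want to see which principal owns each schema in a database, and therefore predict how chains will behave, a query against the catalog views like this sketch should work in 2005 and later:

```sql
-- list each schema with its owning database principal
SELECT s.name AS [schema],
       dp.name AS [owner]
FROM sys.schemas AS s
INNER JOIN sys.database_principals AS dp
    ON s.principal_id = dp.principal_id ;
```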

PASS Summit 2009 Website Up

PASS has put up the PASS Summit web site early this year.  Check it out.  There is good information about the Summit and why you should attend.  I must admit that I have never been, but I know many people who have, and they all love it.  Now I want to go, so I need to work on making it happen.

Make sure you check out the ROI Justification page.

Friday, March 20, 2009

PASS Summit 2009 Call for Speakers

The PASS Summit 2009 Call for Speakers is open.  It will remain open until April 10th.  If you have already set up a PASSPort profile you can submit sessions here.

Make sure you check out the Speaker Resource page before you submit. 

Hope to see you there.

Thursday, March 19, 2009

Bug in Scope_Identity() Function

Aaron Bertrand (@AaronBertrand) tweeted this tonight.  Apparently there is a bug in SQL Server 2005/2008 with Scope_Identity() related to parallelism.  You can see the bug on Connect, here.  If a fix is important to you, go and vote it up.

This is a shock to me as I use Scope_Identity() regularly, and now I have to re-evaluate its use.  I typically use stored procedures to do single-row inserts and use the Scope_Identity() function to return the id to the calling application via an output parameter.  Now it looks like I’ll have to use the OUTPUT clause, which requires a table variable to hold its results.  Within the last 2 weeks I told someone on SSC that it had been bullet-proof and bug free for me, and it has.  I’ll have to go back and link to this.
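For anyone making the same switch, here is a rough sketch of the OUTPUT-clause pattern; the table and column names are made up for illustration:

```sql
-- table variable to catch the identity value(s)
DECLARE @new_ids TABLE (id INT) ;

-- OUTPUT captures the identity from this statement only,
-- regardless of triggers or (per the bug) parallelism
INSERT INTO dbo.some_table (some_column)
OUTPUT inserted.id INTO @new_ids (id)
VALUES ('some value') ;

SELECT id FROM @new_ids ;
```

Unlike Scope_Identity(), this also handles multi-row inserts, since every inserted row lands in the table variable.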

Thursday, March 12, 2009

OPASS March Meeting

I attended the March OPASS meeting last night, March 11. The speaker was Kendal Van Dyke and the subject was Performance Impact of Disk Configuration (partition offset, stripe size, and allocation unit size). Kendal had the opportunity to use some hardware at work to test the effects of properly aligning your direct attached storage, and of different stripe sizes and allocation unit sizes, on RAID 1, RAID 10, and RAID 5 arrays. It was a highly technical but interesting session. I have to admit that it was a little over my head as I am not a RAID expert, although I do understand how it works. I've always been in a position where I have had a system administrator who handled configuring my storage, most recently according to my recommendations. The basic theme was that properly aligning your disks, with a 64 KB partition offset, will provide a solid 10% or more performance improvement. Kendal's tests also showed that, for direct attached storage, a 64 KB offset, 64 KB stripe size, and 64 KB allocation unit size was the best overall configuration for an OLTP system. You can see all the results on his blog, where he has a 7-part series detailing his tests starting with this post. If you are interested in more, here are some other resources relating to disk alignment from Linchi Shea, TechNet, and Denny Cherry.

I have to say that I was disappointed with the turnout. We had about 15 attendees, possibly because we met on a Wednesday instead of our usual Tuesday, but it was still disappointing. I was "lucky" enough to get tagged in a discussion going around a group of regular SQL bloggers about "Things I Wish I Had Known", and a common theme in most of the posts was that we wish we had gotten involved in the community sooner. One of the main reasons is that you get the opportunity to learn something on a regular basis, and you also get to interact with some really smart people. I know that since I have been attending I have learned a lot about SQL Server, professionalism, and professional development, mostly from conversations before and after the actual meetings. Last night about half the attendees stayed for at least 30 minutes after the meeting, and Kendal, Andy Warren, and I had a great discussion until a little after 10pm. These are the reasons I go to meetings, events like SQLSaturday, and, hopefully, the PASS Summit this fall.

Monday, March 2, 2009

SSIS Variables and the Script Component

About 2 weeks ago (yeah, I know, why wait 2 weeks to blog on something?) I was working with a script component in SSIS and I thought I needed some Package- or Dataflow-level variables in order to accomplish the task before me. Key words in that last sentence: "I thought". They are key not because they show I actually thought about what I was doing, but because I had thought incorrectly. Not exactly a new thing for me either, but I digress. The task I was attempting, and eventually succeeded at, was taking a delimited string of ministries, or job titles, and breaking them out into individual rows for each ministry while also creating a column on which to group. The format of the source data is something like this:
PersonNo Location StartDate EndDate  Ministries
-------- -------- --------- -------- ----------
12346    A        08/01/92  08/31/92 Teacher
12346    A        09/01/92  12/14/92 Teacher, Language Consultant
12346    A        12/15/92  12/31/95 Teacher, Language Consultant, Administrator
12346    A        01/01/96  05/31/96 Teacher, Language Consultant, Administrator, Bookkeeper
12346    A        06/01/96  01/06/08 Bookkeeper
The desired output of the script component is:
PersonNo Location StartDate EndDate  Ministry            Groups
-------- -------- --------- -------- ------------------- ------
12346    A        08/01/92  08/31/92 Teacher             1
12346    A        09/01/92  12/14/92 Teacher             1
12346    A        09/01/92  12/14/92 Language Consultant 2
12346    A        12/15/92  12/31/95 Teacher             3
12346    A        12/15/92  12/31/95 Language Consultant 4
12346    A        12/15/92  12/31/95 Administrator       5
12346    A        01/01/96  05/31/96 Teacher             6
12346    A        01/01/96  05/31/96 Language Consultant 7
12346    A        01/01/96  05/31/96 Administrator       8
12346    A        01/01/96  05/31/96 Bookkeeper          9
12346    A        06/01/96  01/06/08 Bookkeeper          9
The desired final output of the Dataflow is this:
PersonNo Location StartDate EndDate  Ministry
-------- -------- --------- -------- -------------------
12346    A        08/01/92  12/14/92 Teacher
12346    A        09/01/92  12/14/92 Language Consultant
12346    A        12/15/92  12/31/95 Teacher
12346    A        12/15/92  12/31/95 Language Consultant
12346    A        12/15/92  12/31/95 Administrator
12346    A        01/01/96  05/31/96 Teacher
12346    A        01/01/96  05/31/96 Language Consultant
12346    A        01/01/96  05/31/96 Administrator
12346    A        01/01/96  01/06/08 Bookkeeper
In order to get the Groups I needed to be able to compare the current row to the previous row. In a script component you only have access to the current row, so I assumed (yes, I know what that means) that I would need dataflow-scoped variables to hold the pertinent values from the previous row. I created the needed variables and then entered them in the ReadWriteVariables property of the script component to simplify access to them. Well, when I ran the dataflow it failed at the script component with the error:
The collection of variables locked for read and write access is not available outside of PostExecute
The issue with this is that PostExecute only runs after ALL the rows have been processed, which did not meet my needs, as I needed to modify the variables whenever the values changed. So I googled the error message trying to find a way around it. First, I removed the variables from the ReadWriteVariables property of the script component. Then I used the variable dispenser to access the variables. Believe it or not, this works! You can lock the variables for read and write using the variable dispenser and make the modifications. I thought I had solved my problem. Then over the weekend I realized that since PostExecute only runs after ALL the rows have been processed, the script component is created once within the dataflow. So instead of needing variables scoped to the dataflow, I only needed class-level variables within the script component. Lesson learned.

The only reason you would need variables scoped outside the script component is if you need the data OUTSIDE the script component.

The great part about the whole process was that I learned how to use an asynchronous script component and more details about how the script component works.