sobota, 16 grudnia 2017

[EN] SQL Server 2014 SP2 - surrogate keys

What is it surrogate key? Let us quote the definition contained in wikipedia:

"A surrogate key (or synthetic key, entity identifier, system-generated key, database sequence number, factless key, technical key, or arbitrary unique identifier[citation needed]) in a database is a unique identifier for either an entity in the modeled world or an object in the database. The surrogate key is not derived from application data, unlike a natural (or business) key which is derived from application data." (https://en.wikipedia.org/wiki/Surrogate_key)

In other words, surrogate key is not natural key for rows in the table. The natural key is - for example - social security number or growing number such as 1, 2, 3 etc.  Surrogate key provides unique identifier for rows on database level and not only, not just a single table.

To put on a surrogate key in the table, we create a uniqueidentifier column with default action NEWSEQUENTIALID(). 

USE tempdb
GO

CREATE TABLE table_test01 (
ColumnA uniqueidentifier DEFAULT NEWSEQUENTIALID(),
ColumnB varchar(100)
); 

Then, insert a new row into the table.

INSERT INTO table_test01 (ColumnB) VALUES ('test row');

We check the contents of the table and our eyes will see our surrogate key.

SELECT * FROM table_test01



Second option surrogate key is NEWID() instead NEWSEQUENTIALID(). 

USE tempdb
GO

CREATE TABLE table_test02 (
ColumnA uniqueidentifier DEFAULT NEWID(),
ColumnB varchar(100)
); 

INSERT INTO table_test02 (ColumnB) VALUES ('test row');

SELECT * FROM table_test02


Microsoft not recommended using NEWSEQUENTIALID if  privacy is a concern, because is possible to guess the value of the next generated GUID, but NEWSEQUENTIALID may it can run faster than NEWID().

The disadvantage of surrogate key is its size. One NEWID() value need 16 byes, one BIGINT value need 8 bytes, one INT value need only 4 bytes. 

DECLARE @test03 uniqueidentifier;
SET @test03 = NEWID();
SELECT DATALENGTH(@test03) AS '@test03 - NEWID size in bytes';

DECLARE @test04 bigint;
SET @test04 = 701;
SELECT DATALENGTH(@test04) AS '@bigint - BIGINT size in bytes';

DECLARE @test05 int;
SET @test05 = 701;
SELECT DATALENGTH(@test05) AS '@int - BIGINT size in bytes';


Another option than NEWID and NEWSEQUENTIALID in the uniqueidentifier data type is using SEQUENCE OBJECT. I call them generators. Below is an example.

-- we use command CREATE SEQUENCE 
-- and specify the name of the object 
-- (in our case "first")
-- and the name of the schema in which 
-- it is to be create
-- (in our case: "dbo")
CREATE SEQUENCE dbo.first
-- we define the type of data 
-- for this SEQUENCE OBJECT
-- if not specified, then 
-- BIGINT will be used 
AS INT
-- we define the initial value
START WITH 1
-- we determine how much to grow of object
-- 1 means that it will increase by 1, e.g. 1, 2, 3
INCREMENT BY 1
-- minimum supported value 
-- default it is as minimum data type value
MINVALUE 1
-- we specify that we do not 
-- want after reaching MAXVALUE 
-- (the same as the given data 
-- type predicts by default) 
-- to re-use the object from 
-- the beginning
NO CYCLE;

CREATE TABLE dbo.test05
(
ColumnA INT NOT NULL PRIMARY KEY,
ColumnB VARCHAR(255)
);

INSERT INTO dbo.test05
(ColumnA, ColumnB)
VALUES
(NEXT VALUE FOR dbo.first, 'test');

SELECT * FROM test05


We can use one sequene object in many tables. This will ensure the uniqueness of the rows at the database level. In one sequence object we can save as much data as the BIGINT data type allows - from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. 

Brak komentarzy:

Prześlij komentarz