Archive

Archive for December, 2006

How do I estimate database growth?

December 2, 2006 5 comments

Unfortunately, SQL Server does not come with any built-in tools to estimate database size.
If you don’t have a third-party estimating tool, you have to do it the hard way, which is by calculating the space taken up by each table, and then adding the total space needed for each table to get a grand total for the database.

Here are some guidelines for estimating the space needed for a database.

  1. For each table, find out how many bytes each row will use up on average. It is easy to calculate the size of fixed length columns, but calculating the space used by variable length fields is much more difficult. About the only way to do this is to get some of the actual data that will be stored in each variable length column, and then based on the data you have, estimate the average byte length of each variable length column. Once you know the typical sizes of each column, you can then calculate the size of each typical row in your table.
  2. The above step is a good start, but it usually significantly underestimates the amount of space you need. Besides the actual data, you must also estimate the type and number of indexes you will use for each table. Indexes can use a huge amount of space, and you must estimate how much space you think they will take. This is a function of the type of index (clustered or non-clustered, the number of indexes, and the width of the indexes).
  3. Besides estimating the size of the indexes, you also must take into consideration the Fillfactor and Pad Index used when the indexes are created. Both of these affect how much empty space is left in an index, and this empty space must be included in your estimate.
  4. And one more factor affecting how much space it takes to store data in a table is how many rows can be fitted onto one SQL Server 8K data page. Depending on the size of each row, it is likely that not all of the space in each data page is fully used. This must also be accounted for when estimating page size.
  5. While tables, and their associated indexes take up most of the physical space in most databases, keep in mind that every object in SQL Server takes up space, and must be accounted for.

As you can see, without a tool to help out, manually estimating database size is not a fun task, and it is, at best, only a rough estimate.

Another option you might consider, assuming that you already have an existing database with data, is to extrapolate the current size of your database on a table by table basis. For example, if you know that a particular table has 100,000 rows, and it is 1MB in size, then assuming that neither indexing or the fillfactor changes, than when the table gets to 200,000 rows, that it should be about 2MB in size. If you do this for every table, then you can get a fairly good idea on how much disk space you will need in the future. To find out how much space a particular table uses, use this command:

sp_spaceused ' '

Categories: Database, SQL Server

What happens when my integer IDENTITY runs out of scope?

December 1, 2006 2 comments

Before we actually look at the answer, let’s recall some basics of the IDENTITY property and SQL Server’s numerical data types.

You can define the IDENTITY property on columns of the INT data type and on DECIMAL with scale 0. This gives you a range of:

TINYINT

0 – 255

SMALLINT

-32.768 – 32.767

INT

-2.147.483.648 – 2.147.483.647

BIGINT

-2^63 – 2^63-1

When you decide to use the DECIMAL datatype you have a potential range from -10^38 to 10^38-1.

So, keeping this in mind, we’re now ready to answer the original question here. What happens when an INTEGER IDENTITY value is about to run out of scope?

CREATE TABLE id_overflow ( col1 INT IDENTITY(2147483647,1) )

GO

INSERT INTO id_overflow DEFAULT VALUES INSERT INTO id_overflow DEFAULT VALUES SELECT * FROM id_overflow

DROP TABLE id_overflow

(1 row(s) affected) Server: Msg 8115, Level 16, State 1, Line 2 Arithmetic overflow error converting IDENTITY to data type int. Arithmetic overflow occurred.

This script creates a simple table with just one column of type INT. We have also created the IDENTITY property for this column. But instead of now adding more than 2 billion rows to the table, we rather set the seed value to the positive maximum value for an INTEGER. The first row inserted is assigned that value. Nothing unusual happens. The second insert, however, fails with the above error. Apparently SQL Server does not start all over again or tries to fill the maybe existing gaps in the sequence. Actually, SQL Server does nothing automatically here. You have to do this by yourself. But what can you do in such a case?

Probably the easiest solution is to alter the data type of the column to BIGINT, or maybe right on to DECIMAL(38,0) like so:

CREATE TABLE id_overflow ( col1 INT IDENTITY(2147483647,1) )

GO

INSERT INTO id_overflow DEFAULT VALUES ALTER TABLE id_overflow ALTER COLUMN col1 BIGINT INSERT INTO id_overflow DEFAULT VALUES

SELECT * FROM id_overflow

DROP TABLE id_overflow

col1

——————–

2147483647 2147483648

(2 row(s) affected)

If you know in advance that your table needs to keep that many rows, you can do:

CREATE TABLE bigint_t ( col1 BIGINT IDENTITY(-9223372036854775808, 1) )

GO

INSERT INTO bigint_t DEFAULT VALUES

SELECT * FROM bigint_t

DROP TABLE bigint_t

col1

——————–

-9223372036854775808

(1 row(s) affected)

Or the DECIMAL(38,0) variation:

CREATE TABLE decimal_t ( col1 DECIMAL(38,0) IDENTITY(-99999999999999999999999999999999999999, 1) )

GO

INSERT INTO decimal_t DEFAULT VALUES

SELECT * FROM decimal_t

DROP TABLE decimal_t

col1

—————————————-

-99999999999999999999999999999999999999

(1 row(s) affected)

One might be distressed in one’s aesthetical taste by those negative numbers, but it’s a fact, that one now shouldn’t have to worry about running out of scope for quite some time.

Categories: Database, SQL Server