Databases can use a lot of storage space. It is always good to know how the space is distributed among the tables and indexes. SQL Server allows us to query all relevant data and create our desired reports. In this article I will show different queries and alternative options to access the storage sizes at various levels of detail.
What requires storage space?
A database has a total size which includes the actual data, transaction log data and
reserved storage space, since the database is expected to grow. The data is stored in
one primary data file with the file extension .mdf
, optional user-defined
data files with the file extension .ndf
and at least one transaction log
file with the file extension .ldf
. See this
article about database files and filegroups
to learn more about it.
The data is basically separated into two types: tables and indexes. A table has either a clustered index with all data or it is a heap table. Indexes are copies of the data and need space too. Note that a view can have indexes and therefore can require storage space too. Additional space is used to store the references to the actual allocations (IAM Pages).
Pages and extents are used to structure the data. All data pages are the same size: 8 KB. An extent is eight physically contiguous pages, or 64 KB. See this article about pages and extents to learn more about it.
Getting storage sizes from properties with SSMS
An easy way to get the total size of database is by using SQL Server Management Studio, right clicking on a database in the Object Explorer and open the properties window. There are the properties “Size” and “Space available”. In the “Files” page are all files listed including the reserved size by file.
In the property window of tables on the “Storage” page are the “Data space” and the “Index space” listed.
Using SSMS Reports
SQL Server Management Studio also offers reports to get the storage size of tables. These can be accessed by right clicking on a database and then go to “Reports”. The most interesting report is “Disk Usage by Table” which uses the following SQL query:
SELECT
(row_number() over(order by a3.name, a2.name)) % 2 as l1,
a3.name AS [schemaname],
a2.name AS [tablename],
a1.rows as row_count,
(a1.reserved + ISNULL(a4.reserved,0))* 8 AS reserved,
a1.data * 8 AS data,
(CASE WHEN (a1.used + ISNULL(a4.used,0)) > a1.data THEN (a1.used + ISNULL(a4.used,0)) - a1.data ELSE 0 END) * 8 AS index_size,
(CASE WHEN (a1.reserved + ISNULL(a4.reserved,0)) > a1.used THEN (a1.reserved + ISNULL(a4.reserved,0)) - a1.used ELSE 0 END) * 8 AS unused
FROM (
SELECT
ps.object_id,
SUM(CASE WHEN (ps.index_id < 2) THEN row_count ELSE 0 END) AS [rows],
SUM(ps.reserved_page_count) AS reserved,
SUM(CASE WHEN (ps.index_id < 2) THEN (ps.in_row_data_page_count + ps.lob_used_page_count + ps.row_overflow_used_page_count) ELSE (ps.lob_used_page_count + ps.row_overflow_used_page_count) END) AS data,
SUM(ps.used_page_count) AS used
FROM sys.dm_db_partition_stats ps
WHERE ps.object_id NOT IN (SELECT object_id FROM sys.tables WHERE is_memory_optimized = 1)
GROUP BY ps.object_id
) AS a1
LEFT OUTER JOIN (SELECT
it.parent_id,
SUM(ps.reserved_page_count) AS reserved,
SUM(ps.used_page_count) AS used
FROM sys.dm_db_partition_stats ps
INNER JOIN sys.internal_tables it ON (it.object_id = ps.object_id)
WHERE it.internal_type IN (202,204)
GROUP BY it.parent_id
) AS a4 ON (a4.parent_id = a1.object_id)
INNER JOIN sys.all_objects a2 ON (a1.object_id = a2.object_id)
INNER JOIN sys.schemas a3 ON (a2.schema_id = a3.schema_id)
WHERE a2.type <> N'S'
and a2.type <> N'IT'
ORDER BY a3.name, a2.name
The data is read from the sys.dm_db_partition_stats
view. The LFET JOIN
with the sub query is used to get the size for columns with the xml
type.
The result shows the count of rows, the reserved size for the whole table (or indexed view),
the size of the data (this is the size of the cluster index or heap without the indexes),
the size of all indexes including xml
columns and the size of unused space.
Using sp_spaceused
sp_spaceused
is a stored procedure to get the size of an object. You can either provide the name
of a table or nothing to get the size of the whole database.
The @updateusage
parameter with the value 'true'
will run
DBCC UPDATEUSAGE
for the given object or the whole database. This can be used
to update the stats, since the stats are not guaranteed to be up to date.
-- show the storage space of the whole database
EXECUTE sp_spaceused
-- show the storage space of a single table
EXECUTE sp_spaceused 'Person.Address'
-- update the stats for the given table and show the storage space
EXECUTE sp_spaceused 'Person.Address', 'true'
Getting file sizes with SQL query
The sys.database_files
view contains all files of the selected database. The
table contains the count of pages per file and in combination with the FILEPROPERTY
function and 'SpaceUsed'
argument we can get the used pages. By multiplying
the count of pages with 8 we get the kilobytes. The final query looks like this:
SELECT
F.file_id AS FileId,
F.type_desc AS FileType,
F.name AS FileName,
F.physical_name AS PhysicalFileName,
CASE WHEN F.max_size <= 0 THEN -1 ELSE CAST(CAST(F.max_size AS DECIMAL(18, 2)) * 8 / 1024 AS DECIMAL(18, 2)) END AS MaxSizeMb,
CAST(CAST(F.size AS DECIMAL(18, 2)) * 8 / 1024 AS DECIMAL(18, 2)) AS TotalMb,
CAST(CAST(FILEPROPERTY(F.name, 'SpaceUsed') AS DECIMAL(18, 2)) * 8 / 1024 AS DECIMAL(18, 2)) AS UsedMb,
CAST(CAST((F.size - FILEPROPERTY(F.name, 'SpaceUsed')) AS DECIMAL(18, 2)) * 8 / 1024 AS DECIMAL(18, 2)) AS UnusedMb
FROM sys.database_files AS F
Getting index sizes with SQL query
The sys.allocation_units
view contains the allocations in the partitions
(container_id
). It has a column for the total pages and one for the used
pages. The view can be joined with sys.partitions
to get the index it
belongs to.
The sys.partitions
view does contain the allocations by index,
the sys.dm_db_partition_stats
view, which is used by the “Disk Usage by Table”,
is only for tables.
The final query looks like this:
SELECT
S.name AS SchemaName,
T.name AS TableName,
ISNULL(I.name, '') AS IndexName,
T.object_id AS TableObjectId,
T.type_desc AS TableType,
I.index_id AS IndexId,
I.type_desc AS IndexType,
A.Rows AS Rows,
A.TotalPages + ISNULL(X.XmlTotalPages, 0) AS TotalPages,
A.UsedPages + ISNULL(X.XmlUsedPages, 0) AS UsedPages,
(A.TotalPages + ISNULL(X.XmlTotalPages, 0)) - (A.UsedPages + ISNULL(X.XmlUsedPages, 0)) AS UnusedPages,
(A.TotalPages + ISNULL(X.XmlTotalPages, 0)) * 8 AS TotalKb,
(A.UsedPages + ISNULL(X.XmlUsedPages, 0)) * 8 AS UsedKb,
((A.TotalPages + ISNULL(X.XmlTotalPages, 0)) - (A.UsedPages + ISNULL(X.XmlUsedPages, 0))) * 8 AS UnusedKb,
CAST(CAST((A.TotalPages + ISNULL(X.XmlTotalPages, 0)) * 8 AS DECIMAL(18, 4)) / 1024 AS DECIMAL(18, 4)) AS TotalMb,
CAST(CAST((A.UsedPages + ISNULL(X.XmlUsedPages, 0)) * 8 AS DECIMAL(18, 4)) / 1024 AS DECIMAL(18, 4)) AS UsedMb,
CAST(CAST(((A.TotalPages + ISNULL(X.XmlTotalPages, 0)) - (A.UsedPages + ISNULL(X.XmlUsedPages, 0))) * 8 AS DECIMAL(18, 4)) / 1024 AS DECIMAL(18, 4)) AS UnusedMb,
CAST(CASE
WHEN A.TotalPages + ISNULL(X.XmlTotalPages, 0) = 0 THEN
0
ELSE
CAST((A.UsedPages + ISNULL(X.XmlUsedPages, 0)) AS DECIMAL(18, 4)) / CAST((A.TotalPages + ISNULL(X.XmlTotalPages, 0)) AS DECIMAL(18, 4)) * 100
END AS DECIMAL(18, 4)) AS PercentUsed,
CAST(CASE
WHEN A.TotalPages + ISNULL(X.XmlTotalPages, 0) = 0 THEN
0
ELSE
100 - (CAST((A.UsedPages + ISNULL(X.XmlUsedPages, 0)) AS DECIMAL(18, 4)) / CAST((A.TotalPages + ISNULL(X.XmlTotalPages, 0)) AS DECIMAL(18, 4)) * 100)
END AS DECIMAL(18, 4)) AS PercentUnused,
CAST(CASE
WHEN A.TotalPages + ISNULL(X.XmlTotalPages, 0) = 0 THEN
0
ELSE
CAST((A.TotalPages + ISNULL(X.XmlTotalPages, 0)) AS DECIMAL(18, 4)) / SUM(CAST((A.TotalPages + ISNULL(X.XmlTotalPages, 0)) AS DECIMAL(18, 4))) OVER() * 100
END AS DECIMAL(18, 4)) AS PercentFromTotal
FROM sys.objects AS T
INNER JOIN sys.schemas AS S ON S.schema_id = T.schema_id
INNER JOIN sys.indexes AS I ON I.object_id = T.object_id
INNER JOIN (SELECT
PA.object_id,
PA.index_id,
SUM(CASE WHEN AU.type_desc = 'IN_ROW_DATA' THEN PA.rows ELSE 0 END) AS Rows,
SUM(AU.total_pages) AS TotalPages,
SUM(AU.used_pages) AS UsedPages
FROM sys.allocation_units AS AU
INNER JOIN sys.partitions AS PA ON PA.partition_id = AU.container_id
GROUP BY PA.object_id, PA.index_id
) AS A ON A.object_id = I.object_id AND A.index_id = I.index_id
LEFT JOIN (SELECT
IT.parent_id,
SUM(AU.total_pages) AS XmlTotalPages,
SUM(AU.used_pages) AS XmlUsedPages
FROM sys.allocation_units AS AU
INNER JOIN sys.partitions AS PA ON PA.partition_id = AU.container_id
INNER JOIN sys.internal_tables AS IT ON IT.object_id = PA.object_id
WHERE IT.internal_type_desc = 'XML_INDEX_NODES'
GROUP BY IT.parent_id
) AS X ON X.parent_id = T.object_id AND I.type_desc IN ('HEAP', 'CLUSTERED')
WHERE T.type_desc IN ('USER_TABLE', 'VIEW')
ORDER BY PercentFromTotal DESC, S.name ASC, T.name ASC, I.index_id ASC
The result does contain the storage sizes in different units and percentages. The
xml
columns are appended to the clustered index or heap. Indexes on
views are also included in the result. The PercentFromTotal
column
can be used to identify big tables or indexes in the database.
Getting table sizes with SQL query
The query for the indexes is the basis for the table size query. The grouping by index is removed in the allocations sub query and the xml size is appended to the entire table:
SELECT
S.name AS SchemaName,
T.name AS TableName,
T.object_id AS TableObjectId,
T.type_desc AS TableType,
A.Rows AS Rows,
A.TotalPages + ISNULL(X.XmlTotalPages, 0) AS TotalPages,
A.UsedPages + ISNULL(X.XmlUsedPages, 0) AS UsedPages,
(A.TotalPages + ISNULL(X.XmlTotalPages, 0)) - (A.UsedPages + ISNULL(X.XmlUsedPages, 0)) AS UnusedPages,
(A.TotalPages + ISNULL(X.XmlTotalPages, 0)) * 8 AS TotalKb,
(A.UsedPages + ISNULL(X.XmlUsedPages, 0)) * 8 AS UsedKb,
((A.TotalPages + ISNULL(X.XmlTotalPages, 0)) - (A.UsedPages + ISNULL(X.XmlUsedPages, 0))) * 8 AS UnusedKb,
CAST(CAST((A.TotalPages + ISNULL(X.XmlTotalPages, 0)) * 8 AS DECIMAL(18, 4)) / 1024 AS DECIMAL(18, 4)) AS TotalMb,
CAST(CAST((A.UsedPages + ISNULL(X.XmlUsedPages, 0)) * 8 AS DECIMAL(18, 4)) / 1024 AS DECIMAL(18, 4)) AS UsedMb,
CAST(CAST(((A.TotalPages + ISNULL(X.XmlTotalPages, 0)) - (A.UsedPages + ISNULL(X.XmlUsedPages, 0))) * 8 AS DECIMAL(18, 4)) / 1024 AS DECIMAL(18, 4)) AS UnusedMb,
CAST(CASE
WHEN A.TotalPages + ISNULL(X.XmlTotalPages, 0) = 0 THEN
0
ELSE
CAST((A.UsedPages + ISNULL(X.XmlUsedPages, 0)) AS DECIMAL(18, 4)) / CAST((A.TotalPages + ISNULL(X.XmlTotalPages, 0)) AS DECIMAL(18, 4)) * 100
END AS DECIMAL(18, 4)) AS PercentUsed,
CAST(CASE
WHEN A.TotalPages + ISNULL(X.XmlTotalPages, 0) = 0 THEN
0
ELSE
100 - (CAST((A.UsedPages + ISNULL(X.XmlUsedPages, 0)) AS DECIMAL(18, 4)) / CAST((A.TotalPages + ISNULL(X.XmlTotalPages, 0)) AS DECIMAL(18, 4)) * 100)
END AS DECIMAL(18, 4)) AS PercentUnused,
CAST(CASE
WHEN A.TotalPages + ISNULL(X.XmlTotalPages, 0) = 0 THEN
0
ELSE
CAST((A.TotalPages + ISNULL(X.XmlTotalPages, 0)) AS DECIMAL(18, 4)) / SUM(CAST((A.TotalPages + ISNULL(X.XmlTotalPages, 0)) AS DECIMAL(18, 4))) OVER() * 100
END AS DECIMAL(18, 4)) AS PercentFromTotal
FROM sys.objects AS T
INNER JOIN sys.schemas AS S ON S.schema_id = T.schema_id
INNER JOIN (SELECT
PA.object_id,
SUM(CASE WHEN AU.type_desc = 'IN_ROW_DATA' AND PA.index_id IN (0, 1) THEN PA.rows ELSE 0 END) AS Rows,
SUM(AU.total_pages) AS TotalPages,
SUM(AU.used_pages) AS UsedPages
FROM sys.allocation_units AS AU
INNER JOIN sys.partitions AS PA ON PA.partition_id = AU.container_id
GROUP BY PA.object_id
) AS A ON A.object_id = T.object_id
LEFT JOIN (SELECT
IT.parent_id,
SUM(AU.total_pages) AS XmlTotalPages,
SUM(AU.used_pages) AS XmlUsedPages
FROM sys.allocation_units AS AU
INNER JOIN sys.partitions AS PA ON PA.partition_id = AU.container_id
INNER JOIN sys.internal_tables AS IT ON IT.object_id = PA.object_id
WHERE IT.internal_type_desc = 'XML_INDEX_NODES'
GROUP BY IT.parent_id
) AS X ON X.parent_id = T.object_id
WHERE T.type_desc IN ('USER_TABLE', 'VIEW')
ORDER BY PercentFromTotal DESC, S.name ASC, T.name ASC
Summary
I have shown how to get the storage sizes by file, table, and index with different options. Knowing how the data is distributed in the database can help to identify large tables that may need special attention. Feel free to use the queries for your database.