It seems easy enough. An email pops through from your manager and he’s concerned about the space utilization in the database. Then comes the question:
“So what exactly is using up most of the space?”
For the sake of simplicity, lets assume that you already know that most of the space comes from a single database schema, so you connect to that schema and throw a query against USER_SEGMENTS. And that is when the frustration starts..
USER_SEGMENTS is about segments
That statement might seem right up there in a the list of all-time most obvious statements, but it is important to understand that strictly speaking, it is not tables and indexes and the like that consume database space. They can be thought of as a logical construct. The only thing that consumes database space is a segment, and even that really is just a logical grouping of extents. The Database Concepts guide contains an thorough explanation of these terms, but in a nutshell, a database table could consist of no segment (ie, it does not yet contain any data), a single segment, or even multiple segments (for partitioned table). The same goes for indexes. Thus if you query USER_TABLES or USER_INDEXES to find the largest space consumer you are not getting a true indicator of the space usage on the database. You can only get that by looking at segments. But when you query USER_SEGMENTS, you might end up with a result like this:
SQL> select segment_name, bytes
2 from user_segments
3 order by bytes desc
4 fetch first 1 row only;
SEGMENT_NAME BYTES
------------------------------ ----------
SYS_LOB0000089108C00006$ 268435456
It doesn’t take a lot of imagination to see how that will play out with your manager:
- You: “Yes, I’ve taken a look at the database as you requested”
- Manager: “So what is the the biggest consumer? Is it our SALES table? Our CUSTOMERS table?”
- You: “It is SYS_LOB0000089108C00006$$”
- Manager: “The….what?!?!?!”
Suitably chastened you head back to the database. Since this looks like a LOB segment, you dive into USER_LOBS to see what the table is for that particular LOB segment.
SQL> select table_name
2 from user_lobs
3 where segment_name = 'SYS_LOB0000089108C00006
Now you can go back to the manager.
- You: “Whoops…sorry about that. My mistake. That is a LOB segment – it is part of one our tables”
- Manager: “No problems. So the biggest table is?”
- You: “It is DR$TEXT_COL_IDX$X”
- Manager: “The….what?!?!?!”
Segment Hierarchy
Whilst the database is quite happy to manage and report segments in this way, as humans (and particular managers ) we have an expectation that everything that consume space in the database, ultimately must roll up to the tables that we hold our true data in. The indexes, the LOBs, the LOB indexes, the tables automatically generated to support things like Text indexes, all really just contribute to our understanding of the space a table consumes.
With that in mind, I’ve put together a script which hopefully will assist. It will attribute space to a table in the database based on:
- the data for the table itself,
- any partitions for that table,
- any indexes for that table,
- any index partitions for that table,
- any LOB segments,
- any secondary tables,
- any indexes on those secondary tables,
- any LOB segments on those secondary tables
For example, the AskTOM schema contains tables, partitioned tables, text indexes, LOBs and the like. A straight query from USER_SEGMENTS looks like this:
SQL> select segment_name, bytes
2 from user_segments
3 order by bytes desc;
SEGMENT_NAME BYTES
------------------------------ ----------
ATE_QUESTION_REVIEWS 268435456
SYS_LOB0000089108C00006$ 151191552
DR$ATE_SEARCH_IDX$I 134217728
ATE_SUBMITTED_QUESTIONS 109051904
SYS_LOB0000089079C00013$ 67305472
BIN$VuDAwtUUQHSNPYBK64buUQ==$0 54525952
SYS_LOB0000089079C00014$ 50528256
SYS_LOB0000089108C00014$ 50528256
DR$ATE_SEARCH_IDX$X 30408704
but with my space distribution script, we get a much better idea of where the true space consumption lies
SEG SZ
------------------------------ ------------
ATE_QUESTION_REVIEWS 474M
ATE_SUBMITTED_QUESTIONS 388M
$RECYCLE_BIN$ 53M
ATE_ERROR_CAPTURE 40M
ATE_QUESTION_VIEWS 17M
ATE_QUESTION_REVIEW_LINKFIX 12M
ATE_PRESENTATIONS 6M
ATE_SUBMITTED_QUESTION_LINKFIX 6M
ATE_SUPPORTING_FILES 4M
SOLVER 2M
ATE_TRIBUTE 2M
ATE_QUESTION_STATUS_HIST 2M
ATE_ADMINS 2M
The script is below and you can also get from my github miscellaneous scrips repos. Hope you find it useful
with
indexes_with_secondary as
( --
-- normal indexes
--
select index_name, table_name
from user_indexes
where table_name not in ( select secondary_object_name from user_secondary_objects)
union all
--
-- secondary tables
--
select uso.secondary_object_name, ui.table_name
from user_secondary_objects uso,
user_tables ut,
user_indexes ui
where ut.table_name = uso.secondary_object_name
and uso.index_name = ui.index_name
union all
--
-- indexes on secondary tables
--
select ui.index_name, ui_parent.table_name
from user_indexes ui,
user_secondary_objects uso,
user_tables ut,
user_indexes ui_parent
where ut.table_name = uso.secondary_object_name
and ut.table_name = ui.table_name
and uso.index_name = ui_parent.index_name
),
lobs_with_secondary as
( --
-- normal lobs
--
select segment_name, table_name
from user_lobs
where table_name not in ( select secondary_object_name from user_secondary_objects)
union all
--
-- secondary tables
--
select ul.segment_name, ut.table_name
from user_lobs ul,
user_secondary_objects uso,
user_tables ut,
user_indexes ui
where ul.table_name = uso.secondary_object_name
and uso.index_name = ui.index_name
and ui.table_name = ut.table_name
),
seg_space as
(
select
coalesce(
i.table_name,
l.table_name,
case when s.segment_name like 'BIN$%' then '$RECYCLE_BIN$' else s.segment_name end) seg,
sum(s.bytes) byt
from user_segments s,
indexes_with_secondary i,
lobs_with_secondary l
where s.segment_name = i.index_name(+)
and s.segment_name = l.segment_name(+)
group by coalesce(i.table_name,l.table_name,
case when s.segment_name like 'BIN$%' then '$RECYCLE_BIN$' else s.segment_name end)
)
select seg,
lpad(case
when byt > 1024*1024*1024 then round(byt/1024/1024/1024)||'G'
when byt > 1024*1024 then round(byt/1024/1024)||'M'
else round(byt/1024)||'K'
end,8) sz
from seg_space s
order by byt desc;
Hi Connor,
one request, probably when you write new blog post, please have simple words to explain the stuff.
it is very difficult for non-english speaking people to understand about these kind of words (chastened, whilst)
we can’t go / switch back to dictionary to understand the meaning of these words, while reading this blog post & doing so will lose the continuity for us
so kindly accept this request.
( please read some of the articles from Tom Kytes blog post, it’s all explained in very simple words for any one to understand and easy to go )