MEMORY MANAGEMENT 11g
what are the shared memory IDs for my instance
corresponding SysV SHM segments:
$ ipcs -m
mapped memory for an Oracle instance process – as the SGA should be definitely mapped there!
$ pmap `pgrep -f lgwr`
$ ls -l /dev/shm
Removing Shared Memory
Sometimes after an instance crash you may have to remove Oracle’s shared memory segment(s) manually.
To see all shared memory segments that are allocated on the system, execute:
$ ipcs -m
—— Shared Memory Segments ——–
key shmid owner perms bytes nattch status
0x8f6e2129 98305 oracle 600 77694523 0
0x2f629238 65536 oracle 640 2736783360 35
0x00000000 32768 oracle 640 2736783360 0 dest
In this example you can see that three shared memory segments have been allocated. The output also shows that shmid 32768 is an abandoned shared memory segment from a past ungraceful Oracle shutdown. Status “dest” means that this memory segment is marked to be destroyed. To find out more about this shared memory segment you can run:
$ ipcs -m -i 32768
Shared memory Segment shmid=32768
uid=500 gid=501 cuid=500 cgid=501
bytes=2736783360 lpid=3688 cpid=3652 nattch=0
att_time=Sat Oct 29 13:36:52 2005
det_time=Sat Oct 29 13:36:52 2005
change_time=Sat Oct 29 11:21:06 2005
To remove the shared memory segment, you could copy/paste shmid and execute:
$ ipcrm shm 32768
pmap output reveals that Oracle 11g likes to use /dev/shm for shared memory implementation instead.
There are multiple 16MB “files” mapped to Oracle server processes address space
Btw, if your MEMORY_MAX_TARGET parameter is larger than 1024 MB then Oracle’s memory granule size is 16MB on Linux,
otherwise it’s 4MB
top, vmstat, sar, ps, pstree, ipcs
top, mpstat, tload, /proc/cpuinfo, x86info
free, /proc/meminfo, slabtop, /proc/slabinfo, ipcs
iostat, vmstat, sar
To display CPU utilization:
sar 3 100
To display paging activity:
sar -B 3 100
To display swapping activity:
sar -W 3 100
To display block I/O activity:
sar -b 3 100
To display block I/O activity for each block device:
sar -d 3 100
To display network activity:
sar -n DEV 3 100
PGA_AGGREGATE_TARGET itself is just a recommended number, leaving over from MEMORY_TARGET – SGA_TARGET (if it’s set).
And Oracle uses that number to decide how big PGAs it will “recommend” for sessions that are using WORKAREA_SIZE_POLICY=AUTO.
Manual PGA Memory Management
In manual PGA memory management, the parameters that have the largest impact on the size of your
PGA, outside of the memory allocated by your session for PL/SQL tables and other variables, are:
• SORT_AREA_SIZE: The total amount of RAM that will be used to sort information
before swapping out to disk.
• SORT_AREA_RETAINED_SIZE: The amount of memory that will be used to hold sorted
data after the sort is complete. That is, if SORT_AREA_SIZE is 512KB and
SORT_AREA_RETAINED_SIZE is 256KB, your server process would use up to 512KB of
memory to sort data during the initial processing of the query. When the sort was
complete, the sorting area would “shrink” down to 256KB, and any sorted data
that did not fit in that 256KB would be written out to the temporary tablespace.
• HASH_AREA_SIZE: The amount of memory your server process can use to store hash
tables in memory. These structures are used during a hash join, typically when
joining a large set with another set. The smaller of the two sets would be hashed
into memory and anything that didn’t fit in the hash area region of memory would
be stored in the temporary tablespace by the join key.
These parameters control the amount of space Oracle will use to sort or hash data before writing
(swapping) it to disk, and how much of that memory segment will be retained after the sort is done. The
SORT_AREA_SIZE-SORT_AREA_RETAINED_SIZE calculated value is generally allocated out of your PGA, and
the SORT_AREA_RETAINED_SIZE value will be in your UGA. You can discover your current use of PGA and
UGA memory and monitor its size by querying special Oracle V$ views, also referred to as dynamic
In Oracle9i, direct I/O to temporary space was not labeled as such, so we used a WHERE clause that
included (and a.name like ‘%ga %’ or a.name like ‘%physical % direct%’) in it.
Automatic PGA Memory Management
Starting with Oracle9i Release 1, a new way to manage PGA memory was introduced that avoids using
the SORT_AREA_SIZE, BITMAP_MERGE_AREA_SIZE, and HASH_AREA_SIZE parameters. It was introduced to
address a few issues:
• Ease of use: Much confusion surrounded how to set the proper *_AREA_SIZE
parameters. There was also much confusion over how those parameters actually
worked and how memory was allocated.
• Manual allocation was a “one-size-fits-all” method: Typically, as the number of
users running similar applications against a database went up, the amount of
memory used for sorting and hashing went up linearly as well. If 10 concurrent
users with a sort area size of 1MB used 10MB of memory, 100 concurrent users
would probably use 100MB, 1,000 would probably use 1000MB, and so on. Unless
the DBA was sitting at the console continually adjusting the sort/hash area size
settings, everyone would pretty much use the same values all day long. Consider
the previous example, where you saw for yourself how the physical I/O to temp
decreased as the amount of RAM we allowed ourselves to use went up. If you run
that example for yourself, you will almost certainly see a decrease in response time
as the amount of RAM available for sorting increases. Manual allocation fixes the
amount of memory to be used for sorting at a more or less constant number,
regardless of how much memory is actually available. Automatic memory
management allows us to use the memory when it is available; it dynamically
adjusts the amount of memory we use based on the workload.
• Memory control: As a result of the previous point, it was hard, if not impossible, to
keep the Oracle instance inside a “box” memory-wise. You couldn’t control the
amount of memory the instance was going to use, as you had no real control over
the number of simultaneous sorts and hashes taking place. It was far too easy to
attempt to use more real memory (actual physical free memory) than was
available on the machine.
Enter automatic PGA memory management. Here, you first simply set up and size the SGA. The SGA
is a fixed-size piece of memory so you can very accurately see how big it is, and that will be its total size
(unless and until you change it). You then tell Oracle, “This is how much memory you should try to limit
yourself to across all work areas” (a new umbrella term for the sorting and hashing areas you use). Now,
you could in theory take a machine with 2GB of physical memory and allocate 768MB of memory to the
SGA and 768MB of memory to the PGA, leaving 512MB of memory for the OS and other processes. I say
“in theory” because it doesn’t work exactly that cleanly, but it’s close. Before I discuss why that’s true,
let’s take a look at how to set up automatic PGA memory management and turn it on.
The process of setting this up involves deciding on the proper values for two instance initialization
WORKAREA_SIZE_POLICY: This parameter may be set to either MANUAL, which will
use the sort area and hash area size parameters to control the amount of
memory allocated, or AUTO, in which case the amount of memory allocated will
vary based on the current workload in the database. The default and
recommended value is AUTO.
• PGA_AGGREGATE_TARGET: This parameter controls how much memory the instance
should allocate, in total, for all work areas used to sort or hash data. Its default
value varies by version and may be set by various tools such as the DBCA. In
general, if you are using automatic PGA memory management, you should
explicitly set this parameter.
PGA usage statistics:
select * from v$pgastat;
where name in (‘aggregate PGA target parameter’,
‘aggregate PGA auto target’,
‘total PGA inuse’,
‘total PGA allocated’,
‘over allocation count’,
‘extra bytes read/written’,
‘cache hit percentage’);
Determine a good setting for pga_aggregate_target:
select * from v$pga_target_advice order by pga_target_for_estimate;
Show the maximum PGA usage per process:
select max(pga_used_mem), max(pga_alloc_mem), max(pga_max_mem) from v$process;
V$SQL_WORKAREA, what we need to examine for tuning PGA
Optimal_executions — number of times this work area ran in
optimal mode without interacting with temporary tablespace
Onepass_executions — number of times this work area ran in
one-pass mode, which means sql execution need to interact with
temporary tablespace once to get it finished
Multipasses_executions — number of times this work area ran
below the one-pass memory requirement, which means sql
execution need to interact with temporary tablespace more than
once to get it finished
Here, we’d like to get summary information about executions criteria
from view v$sql_workarea.
SQL> select sum(OPTIMAL_EXECUTIONS) OPTIMAL,
sum(ONEPASS_EXECUTIONS) ONEPASS ,
select n.name,sum(s.value) value
from v$sesstat s,v$statname n
and n.name like ‘workarea executions%’
group by n.name;
The Database Buffer Cache
The database buffer cache is Oracle’s work area for executing SQL. When updating data, users’ sessions don’t directly update the data on disk. The data blocks containing the data of interest are first copied into the database buffer cache. Changes (such inserting new rows and deleting or modifying existing rows) are applied to these copies of the data blocks in the database buffer cache. The blocks will remain in the cache for some time afterward, until the buffer they are occupying is needed for caching another block.
When querying data, the data also goes via the cache. The session works out which blocks contain the rows of interest and copies them into the database buffer cache; the relevant rows are then transferred into the session’s PGA for further processing. And again, the blocks remain in the database buffer cache for some time afterward.
Take note of the term block. Datafiles are formatted into fixed-sized blocks. Table rows, and other data objects such as index keys, are stored in these blocks. The database buffer cache is formatted into memory buffers each sized to hold one block. Unlike blocks, rows are of variable length; the length of a row will depend on the number of columns defined for the table, whether the columns actually have anything in them, and if so, what. Depending on the size of the blocks (which is chosen by the DBA) and the size of the rows (which is dependent on the table design and usage), there may be several rows per block or possibly a row may stretch over several blocks. The structure of a data block will be described in the section “The Datafiles” later in this chapter.
Ideally, all the blocks containing data that is frequently accessed will be in the database buffer cache, therefore minimizing the need for disk I/O. As a typical use of the database buffer cache, consider an end user retrieving an employee record and updating it, with these statements:
select last_name, salary, job_id from employees where employee_id=100;
update employees set salary=salary * 1.1 where employee_id=100;
The user process will have prompted the user for the employee number and constructed the SELECT statement. The SELECT retrieves some details to be sent to the user process, where they will be formatted for display. To execute this statement, the session’s server process will read the data block containing the relevant row from a datafile into a buffer. The user process will then initiate a screen dialogue to prompt for some change to be made and verified; then the UPDATE statement and the COMMIT statement will be constructed and sent to the server process for execution. Provided that an excessive period of time has not elapsed, the block with the row will still be available in the cache when the UPDATE statement is executed. In this example, the buffer cache hit ratio will be 50 percent: two accesses of a block in the cache, but only one read of the block from disk. A well-tuned database buffer cache can result in a cache hit ratio well over 90 percent.
A buffer storing a block whose image in the cache is not the same as the image on disk is often referred to as a dirty buffer. A buffer will be clean when a block is first copied into it: at that point, the block image in the buffer is the same as the block image on disk. The buffer will become dirty when the block in it is updated. Eventually, dirty buffers must be written back to the datafiles, at which point the buffer will be clean again. Even after being written to disk, the block remains in memory; it is possible that the buffer will not be overwritten with another block for some time.
Note that there is no correlation between the frequency of updates to a buffer (or the number of COMMITs) and when it gets written back to the datafiles. The write to the datafiles is done by the database writer background process.
The size of the database buffer cache is critical for performance. The cache should be sized adequately for caching all the frequently accessed blocks (whether clean or dirty), but not so large that it caches blocks that are rarely needed. An undersized cache will result in excessive disk activity, as frequently accessed blocks are continually read from disk, used, overwritten by other blocks, and then read from disk again. An oversized cache is not so bad (so long as it is not so large that the operating system is having to swap pages of virtual memory in and out of real memory) but can cause problems; for example, startup of an instance is slower if it involves formatting a massive database buffer cache.
The database buffer cache is allocated at instance startup time. Prior to release 9i of the database it was not possible to resize the database buffer cache subsequently without restarting the database instance, but from 9i onward it can be resized up or down at any time. This resizing can be either manual, or (from release 10g onward) automatic according to workload, if the automatic mechanism has been enabled.
The Log Buffer
The log buffer is a small, short-term staging area for change vectors before they are written to the redo log on disk. A change vector is a modification applied to something; executing DML statements generates change vectors applied to data. The redo log is the database’s guarantee that data will never be lost: whenever a data block is changed, the change vectors applied to the block are written out to the redo log, from where they can be extracted and applied to datafile backups if it is ever necessary to restore a datafile.
Redo is not written directly to the redo log files by session server processes. If it were, the sessions would have to wait for disk I/O operations to complete whenever they executed a DML statement. Instead, sessions write redo to the log buffer, in memory. This is much faster than writing to disk. The log buffer (which may contain change vectors from many sessions, interleaved with each other) is then written out to the redo log files. One write of the log buffer to disk may therefore be a batch of many change vectors from many transactions. Even so, the change vectors in the log buffer are written to disk in very nearly real time—and when a session issues a COMMIT statement, the log buffer write really does happen in real time. The writes are done by the log writer background process, the LGWR.
The log buffer is small (in comparison with other memory structures) because it is a very short-term storage area. Change vectors are inserted into it and are streamed to disk in near real time. There is no need for it to be more than a few megabytes at the most, and indeed making it much bigger than the default value can be seriously bad for performance. The default is determined by the Oracle server and is based on the number of CPUs on the server node.
It is not possible to create a log buffer smaller than the default. If you attempt to, it will be set to the default size anyway. It is possible to create a log buffer larger than the default, but this is often not a good idea. The problem is that when a COMMIT statement is issued, part of the commit processing involves writing the contents of the log buffer to the redo log files on disk. This write occurs in real time, and while it is in progress, the session that issued the COMMIT will hang. Commit processing is a critical part of the Oracle architecture. The guarantee that a committed transaction will never be lost is based on this: the commit-complete message is not returned to the session until the data blocks in the cache have been changed (which means that the transaction has been completed) and the change vectors have been written to the redo log on disk (and therefore the transaction could be recovered if necessary). A large log buffer means that potentially there is more to write when a COMMIT is issued, and therefore it may take a longer time before the commit-complete message can be sent, and the session can resume work.
The log buffer is allocated at instance startup, and it can never be resized subsequently without restarting the instance. It is a circular buffer. As server processes write change vectors to it, the current write address moves around. The log writer process writes the vectors out in batches, and as it does so, the space they occupied becomes available and can be overwritten by more change vectors. It is possible that at times of peak activity, change vectors will be generated faster than the log writer process can write them out. If this happens, all DML activity will cease (for a few milliseconds) while the log writer clears the buffer.
The process of flushing the log buffer to disk is one of the ultimate bottlenecks in the Oracle architecture. You cannot do DML faster than the LGWR can flush the change vectors to the online redo log files.
The Shared Pool
The shared pool is the most complex of the SGA structures. It is divided into dozens of substructures, all of which are managed internally by the Oracle server. This discussion of architecture will only mention four of the shared pool components, and these only briefly:
The library cache
The data dictionary cache
The PL/SQL area
The SQL query and PL/SQL function result caches
Some other structures will be described in later chapters. All the structures within the shared pool are automatically managed. Their size will very according to the pattern of activity against the instance, within the overall size of the shared pool The shared pool itself can be resized dynamically, either in response to the DBA’s instructions or through being managed automatically.
The Library Cache
The library cache is a memory area for storing recently executed code, in its parsed form. Parsing is the conversion of code written by programmers into something executable, and it is a process which Oracle does on demand. By caching parsed code in the shared pool so that it can be reused without reparsing, performance can be greatly improved. Parsing SQL code takes time. Consider a simple SQL statement:
select * from employees where last_name=’KING’;
Before this statement can be executed, the Oracle server has to work out what it means, and how to execute it. To begin with, what is employees? Is it a table, a synonym, or a view? Does it even exist? Then the “*”—what are the columns that make up the employees table (if it is a table)? Does the user have permission to see the table? Answers to these questions and many others have to be found by querying the data dictionary.
Having worked out what the statement actually means, the server has to decide out how best to execute it. Is there an index on the last_name column? If so, would it be quicker to use the index to locate the row, or to scan the whole table? More queries against the data dictionary … It is quite possible for a simple one-line query against a user table to generate dozens of queries against the data dictionary, and for the parsing of a statement to take many times longer than eventually executing it. The purpose of the library cache of the shared pool is to store statements in their parsed form, ready for execution. The first time a statement is issued, it has to be parsed before execution—the second time, it can be executed immediately. In a well-designed application, it is possible that statements may be parsed once and executed millions of times. This saves a huge amount of time
The Data Dictionary Cache
The data dictionary cache is sometimes referred to as the row cache. Whichever term you prefer, it stores recently used object definitions: descriptions of tables, indexes, users, and other metadata definitions. Keeping such definitions in memory in the SGA, where they are immediately accessible to all sessions, rather than each session having to read them repeatedly from the data dictionary on disk, enhances parsing performance.
The data dictionary cache stores object definitions so that when statements do have to be parsed, they can be parsed fast—without having to query the data dictionary. Consider what happens if these statements are issued consecutively:
select sum(salary) from employees;
select * from employees where last_name=’KING’;
Both statements must be parsed because they are different statements—but parsing the first SELECT statement will have loaded the definition of the employees table and its columns into the data dictionary cache, so parsing the second statement will be faster than it would otherwise have been, because no data dictionary access will be needed.
The PL/SQL Area
Stored PL/SQL objects are procedures, functions, packaged procedures and functions, object type definitions, and triggers. These are all stored in the data dictionary, as source code and also in their compiled form. When a stored PL/SQL object is invoked by a session, it must be read from the data dictionary. To prevent repeated reading, the objects are then cached in the PL/SQL area of the shared pool.
The first time a PL/SQL object is used, it must be read from the data dictionary tables on disk, but subsequent invocations will be much faster, because the object will already be available in the PL/SQL area of the shared pool
The SQL Query and PL/SQL Function Result Cache
The result cache is a release 11g new feature. In many applications, the same query is executed many times, by either the same session or many different sessions. Creating a result cache lets the Oracle server store the results of such queries in memory. The next time the query is issued, rather than running the query the server can retrieve the cached result.
The result cache mechanism is intelligent enough to track whether the tables against which the query was run have been updated. If this has happened, the query results will be invalidated and the next time the query is issued, it will be rerun. There is therefore no danger of ever receiving an out-of-date cached result.
The PL/SQL result cache uses a similar mechanism. When a PL/SQL function is executed, its return value can be cached ready for the next time the function is executed. If the parameters passed to the function, or the tables that the function queries, are different, the function will be reevaluated, but otherwise, the cached value will be returned.
By default, use of the SQL query and PL/SQL function result cache is disabled, but if enabled programmatically, it can often dramatically improve performance. The cache is within the shared pool: unlike the other memory areas described previously, it does afford the DBA some control: he/she can specify a maximum size.
The Large Pool
The large pool is an optional area that, if created, will be used automatically by various processes that would otherwise take memory from the shared pool. One major use of the large pool is by shared server processes, described in Chapter 6 in the discussion of the shared (or multithreaded) server. Parallel execution servers will also use the large pool, if there is one. In the absence of a large pool, these processes will use memory on the shared pool. This can cause bad contention for the shared pool: if shared servers or parallel servers are being used, a large pool should always be created. Some I/O processes may also make use of the large pool, such as the processes used by the Recovery Manager when it is backing up to a tape device.
Sizing the large pool is not a matter for performance. If a process needs large pool of memory, it will fail with an error if that memory is not available. Allocating more memory than is needed will not make statements run faster. Furthermore, if a large pool exists, it will be used: it is not possible for a statement to start off by using the large pool, and then revert to the shared pool if the large pool is too small.
The Java Pool
The Java pool is only required if your application is going to run Java-stored procedures within the database: it is used for the heap space needed to instantiate the Java objects. However, a number of Oracle options are written in Java, so the Java pool is considered standard nowadays. Note that Java code is not cached in the Java pool: it is cached in the shared pool, in the same way that PL/SQL code is cached.
The optimal size of the Java pool is dependent on the Java application, and how many sessions are running it. Each session will require heap space for its objects. If the Java pool is undersized, performance may degrade due to the need to continually reclaim space. In an EJB (Enterprise JavaBean) application, an object such as a stateless session bean may be instantiated and used, and then remain in memory in case it is needed again: such an object can be reused immediately. But if the Oracle server has had to destroy the bean to make room for another, then it will have to be reinstantiated next time it is needed. If the Java pool is chronically undersized, then the applications may simply fail
The Streams Pool
The Streams pool is used by Oracle Streams. This is an advanced tool that is beyond the scope of the OCP examinations or this book, but for completeness a short description follows.
The mechanism used by Streams is to extract change vectors from the redo log and from these reconstruct the statements that were executed—or statements that would have the same effect. These statements are executed at the remote database. The processes that extract changes from redo and the processes that apply the changes need memory: this memory is the Streams pool. From database release 10g it is possible to create and to resize the Streams pool after instance startup; this creation and sizing can be completely automatic. With earlier releases it had to be defined at startup and was a fixed size.
The parameters that have the greatest effect on the overall size of the SGA are as follows:
• JAVA_POOL_SIZE: Controls the size of the Java pool.
• SHARED_POOL_SIZE: Controls the size of the shared pool (to some degree).
• LARGE_POOL_SIZE: Controls the size of the large pool.
• DB_*_CACHE_SIZE: Eight of these cache_size parameters control the sizes of the
various buffer caches available.
• LOG_BUFFER: Controls the size of the redo buffer (to some degree).
• SGA_TARGET: Used with automatic SGA memory management in Oracle 10g and
• SGA_MAX_SIZE: Used to control the maximum size to which the SGA can be resized
while the database is up and running.
• MEMORY_TARGET: Used with automatic memory management (both PGA and SGA
automatic memory management).
• MEMORY_MAX_SIZE: Used to control the maximum amount of memory Oracle should
strive to use over both the PGA and SGA sizes under automatic memory
management. This is really just a target; the PGA may exceed the optimum size if
the number of users increases beyond some level or a session(s) allocates large
untunable bits of memory as demonstrated above.
Multiple Block Sizes
ops$tkyte%ORA11GR2> create tablespace ts_16k
2 datafile ‘/tmp/ts_16k.dbf’
3 size 5m
4 blocksize 16k;
CHAPTER 4 MEMORY STRUCTURES
create tablespace ts_16k
ERROR at line 1:
ORA-29339: tablespace block size 16384 does not match configured block sizes
ops$tkyte%ORA11GR2> show parameter 16k
select pool, name, bytes
order by pool, name;
show parameter sga_target
select component, granule_size from v$sga_dynamic_components;
select tch, file#, dbablk,
case when obj = 4294967295
then ‘rbs/compat segment’
else (select max( ‘(‘||object_type||’) ‘ ||
owner || ‘.’ || object_name ) ||
decode( count(*), 1, ”, ‘ maybe!’ )
where data_object_id = X.OBJ )
select tch, file#, dbablk, obj
where state <> 0
order by tch desc
where rownum <= 5