
Thursday, August 24, 2017

The last secret tweak for improving datapump performance (Part 1)

There are 10,000 blog posts and Oracle docs on the internets (thank you, Mr. Bush) about improving Data Pump performance.  It's one of the most frequently used features in Oracle shops around the world, and DBAs are under huge stress to meet impossible downtime SLAs for their exports and imports.  I think they all miss the best tweak (ADOP)...but they basically boil down to one simple fact: outside of parallelism, if you have a well-tuned database on fast storage, there's not much you can do to improve performance by more than a few percentage points.  The obvious improvements are:




1. Parallelize! To quote Vizzini, "I do not think it means what you think it means."




This will create multiple worker processes, and each one will take one object and work on it, usually serially.  This is great, and if all your objects are equally sized, it's perfect...but that's not typically reality.  Usually you have a few objects that are much bigger than the rest, and by default each of them will only get a single process.  This limit really hurts during imports, when you need to rebuild large indexes...more on this later.

Check that it's working as expected by hitting Ctrl-C and typing status at the prompt.  Ideally, you should see all parallel workers busy.  Check not just that they exist, but that they're working (if they aren't, verify you used %U in your dump filename).  e.g.: DUMPFILE=exaexp%U.dmp PARALLEL=100
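A minimal export parameter file along these lines might look like the sketch below (the directory object, schema name, and parallel degree are hypothetical; adjust for your environment):

```
# expdp system parfile=exp_app.par
DIRECTORY=DPUMP_DIR          # hypothetical directory object
DUMPFILE=exaexp%U.dmp        # %U lets each parallel worker write its own file
LOGFILE=exaexp.log
PARALLEL=16                  # roughly match available CPU
SCHEMAS=APPDATA              # hypothetical schema
```

Without %U in DUMPFILE, the workers contend for a single dump file and the extra parallelism buys you little.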




2. If you're importing into a new database, you have the flexibility to make some temporary changes to tweak things.  Verify asynchronous disk I/O is enabled (DISK_ASYNCH_IO=TRUE) and disable block verification (DB_BLOCK_CHECKING=FALSE, DB_BLOCK_CHECKSUM=FALSE).  These aren't "game changers," but they'll give you 10-20% improvements, depending on how you had them set previously.
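Assuming you can bounce or reconfigure the new database, a sketch of those temporary settings (revert them once the import finishes):

```sql
-- Temporary settings for a fresh database being loaded; revert afterward.
ALTER SYSTEM SET disk_asynch_io = TRUE SCOPE=SPFILE;     -- static; needs restart
ALTER SYSTEM SET db_block_checking = FALSE SCOPE=BOTH;
ALTER SYSTEM SET db_block_checksum = FALSE SCOPE=BOTH;
```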




3. Memory Settings - Data Pump parallelization uses some of the Streams APIs, so the streams pool is used.  Make sure you have enough memory for the shared_pool_size, streams_pool_size and db_cache_size parameters.  Consult your gv$streams_pool_advice, gv$shared_pool_advice, gv$db_cache_advice and gv$sga_target_advice views.  I like to tune it so that, as the delta in ESTD_DB_TIME_FACTOR from one row to the next approaches zero, the corresponding size factor of the pool is close to 1.  (Any more than that is a waste; any less is lost performance.)


Sometimes you'll see a huge dropoff and it's clearer than in this example...but you get the idea.  If you're importing into a new database, you'll need to do a dry run of the import and then check these views to make sure this is tuned well.


SGA_SIZE  SGA_SIZE_FACTOR  ESTD_DB_TIME  ESTD_DB_TIME_FACTOR  TIME_FACTOR_DIFF
  158720           0.5000      12162648               1.2045
  178560           0.5625      11421479               1.1311            0.0734
  198400           0.6250      10937800               1.0832            0.0479
  218240           0.6875      10607608               1.0505            0.0327
  238080           0.7500      10604578               1.0502            0.0003
  257920           0.8125      10375361               1.0275            0.0227
  277760           0.8750      10222886               1.0124            0.0151
  297600           0.9375      10101716               1.0004            0.0120
  317440           1.0000      10097674               1.0000            0.0004
  337280           1.0625      10017905               0.9921            0.0079
  357120           1.1250       9955300               0.9859            0.0062
  376960           1.1875       9908850               0.9813            0.0046
  396800           1.2500       9905821               0.9810            0.0003
  416640           1.3125       9870479               0.9775            0.0035
  436480           1.3750       9844225               0.9749            0.0026
  456320           1.4375       9826049               0.9731            0.0018
  476160           1.5000       9821001               0.9726            0.0005
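The "Time Factor Diff" column above can be generated straight from the advice view with an analytic function; a sketch (the column alias is mine, the rest are real gv$sga_target_advice columns):

```sql
-- Diminishing-returns check: how much ESTD_DB_TIME_FACTOR improves per step.
SELECT sga_size,
       sga_size_factor,
       estd_db_time,
       estd_db_time_factor,
       LAG(estd_db_time_factor) OVER (ORDER BY sga_size)
         - estd_db_time_factor AS time_factor_diff
FROM   gv$sga_target_advice
WHERE  inst_id = 1
ORDER  BY sga_size;
```

Look for the row where time_factor_diff flattens out near zero; that's roughly where extra SGA stops paying for itself.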






4. Something often missed when importing into a new database: size your redo logs to be relatively huge.  The redo logs work like a circular cache.  If you're adding data extremely fast, eventually the current log will fill and can't switch until the next log has been cleared.  "Huge" is relative to the size and speed of your database and hardware.  While your import is running, select * from v$log and make sure you always see at least one INACTIVE log in front of the current log.
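A quick way to watch for this while the import runs (re-run it periodically):

```sql
-- If no group ahead of the CURRENT log is INACTIVE, log switches will stall
-- and the import will wait on checkpoint/archiver.
SELECT group#, thread#, bytes/1024/1024 AS size_mb, status
FROM   v$log
ORDER  BY thread#, sequence#;
```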


The best, virtually unused tweak is in the next post....







This post:  The last secret tweak for improving datapump performance (Part 1)
Next post: The last secret tweak for improving datapump performance (Part 2)
