Re: RAC Experiences

  • From: "Don Granaman" <granaman@xxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Fri, 14 May 2004 12:27:38 -0700

Since I have posted some of the things that might make one think that any
consideration of RAC is on the lunatic fringe, I'll make some amends
(perhaps)...

OPS/RAC can be useful and relatively stable.  Sometimes you just hit the
limit (either financially or physically) of vertical scalability and have to
go horizontal.  The nature of the application and its RAC implementation is,
IMHO, crucial.  Simply trying to prop up a poorly-designed application with
a truck load of RAC hardware can be something of a disaster.

One such (painful) OPS experience of mine was an application that

(1) used no bind variables for queries (DML did and was tight code, but we
had over 2500 users submitting queries)

(2) had no coherent locking strategy - leading to constant deadlocks (in
RAC/OPS deadlocks can be global - and, if frequent enough, sometimes deadly)

(3) forced on *every* query on all the largest tables a hint on a
sequence-generated primary key index - and a stop key (e.g. "select /*+
index (ORDERS ORDER_PK) */ ... where ACCOUNT_NO = 10289 and rownum <= 50")
to sort the data in chronological order and limit the result set to what
could be shown on the screen, even for queries that did not use the primary
key column in the where clause - because the outsourced designers/developers
knew that "Oracle sorts are too expensive".  Sure.  This is far better than
fetching the 27 records for that account via the index on ACCOUNT_ID and
sorting them.  [If you want the absolute ultimate in inefficiency, try
querying on COL1 and/or COL2 against a multi-million row table using a hint
on the index on COLPK!]  By the way, the application designers' eventual
"solution" for this year+ long lunacy (after finally having to admit, in the
light of a pickup load of hard evidence, that the code might *possibly* have
something to do with performance) was to build the hint dynamically based on
the where clause (still using a stopkey), and append the ORDER_PK column to
every index - all 13 of them!  [e.g. Primary key was ORDER_PK, index on
(ACCOUNT_NO, ORDER_PK), index on (CLIENT_ID, ORDER_PK), etc.]

(4) when a user submitted a query, the designers' homegrown "muddleware"
would submit the query on the user's behalf, but their GUI would time out
after 3 seconds if the query wasn't finished - and tell the user to "Try
Again".  The query submitted by the muddleware would continue to run against
the database though - sometimes for 45 minutes or more.  Of course, the user
didn't know this, so they would [loop until lunch] resubmit the query, it
would "time out" but keep running against Oracle [end loop].  I once found
38 exactly identical copies of the same 40+ minute query (using the
aforementioned PK hint method) running, all submitted by the same user -
after the user called and complained that, even after repeated attempts, it
always timed out.  The eventual "solution":  They changed the GUI timeout to
10 seconds.. Then 20 seconds.  Ad nauseum...

(5) Led the users and others to call me at least a half-dozen times per day
to ask "Why is the database so slow?  Can't you tune it?"  [Answer: "No"]

If you have something even vaguely like this, run for the door!  Now!
Putting in on RAC will likely only make it worse.

However, if you have a fairly well-designed and tuned application, some
influence over its construction, and (most importantly) the respect and ears
of development, you can make RAC (or OPS) work.

-Don Granaman
OraSaurus - "Entropy ain't what it used to be."

----- Original Message ----- 
From: "John P Weatherman" <asahoshi@xxxxxxxxxxxxxx>
To: "oracle-l" <oracle-l@xxxxxxxxxxxxx>
Sent: Thursday, May 13, 2004 7:51 AM
Subject: RAC Experiences


> All,
> Given some of the recent postings on RAC, we are probably crazy to
continue looking at it.  Still, we have an app that we can see scaling to
40+ CPUs in the next year and want some way to avoid buying ever bigger
servers (with the attendant ever bigger $$) to handle it.  Currently we are
looking at IBM and are left with a few questions:
>
> 1) Is anyone running production RAC on AIX and what has the experience
been from a reliability standpoint? Administrative difficulties? Etc?
>
> 2) Is anyone running production RAC on Linux on IBM hardware and what has
the experience been from a reliability standpoint? Administrative
difficulties? Etc?
>
> 3) IBM claims that their cluster software handles heterogeneous boxes
quite well.  We are considering the possibility of getting 3 4-way 530s to
"prove out" and get RAC running.  Later we would add in 650s for scaling up.
Eventually the 530s go away and become something else, like a test RAC.  Has
anyone been running RAC in a heterogeneous environment and what have your
experiences been?
>
> Any practical experience feedback would be greatly appreciated.
>
> TIA,
>
> John P Weatherman
> Oracle Database Administrator
> Advance America


----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to:  oracle-l-request@xxxxxxxxxxxxx
put 'unsubscribe' in the subject line.
--
Archives are at //www.freelists.org/archives/oracle-l/
FAQ is at //www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------

Other related posts: