Friday, December 24, 2004

Determining the IP address of the cluster interconnect in 9i

In Oracle Database 9i there is no database view that exposes the IP address of the cluster interconnect. The only way to determine this IP address is with the oradebug command (keep in mind that oradebug is not a supported tool, so if it causes problems or crashes you are on your own). The oradebug ipc command writes the information to a trace file. The following example shows the process:
SQL> oradebug setmypid
SQL> oradebug ipc
SQL> oradebug tracefile_name
The last command shows the name of the trace file that was created. The next step is to open the trace file and look for the string SKGXPCTX. The following is a typical excerpt.
SKGXPCTX: 0xcd1e730 ctx

admno 0x7a0be402 admport:
SSKGXPT 0xcd1e884 flags SSKGXPT_READPENDING info for network 0
socket no 8 IP 140.87.79.67 UDP 9152
sflags SSKGXPT_WRITESSKGXPT_UP
info for network 1
The cluster interconnect IP address appears on the line that starts with "socket no"; in the example above it is 140.87.79.67.
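Since the trace file can be fairly large, a quick way to pull out the relevant lines without opening it is a simple grep for the context shown above; the trace file name below is just a placeholder for the name printed by oradebug tracefile_name:
$ grep -A 4 SSKGXPT <trace_file_name>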

How to determine SQL statements that cause hard parses

From a tuning point of view, hard parses can severely limit the scalability of a database and an application. If the number of hard parses is high, this is a serious problem. The first step in tackling it is to determine which SQL statements are causing the hard parses. This can be done with the column FIRST_LOAD_TIME of the view V$SQL. The following example shows how this knowledge can be used: it reports all statements that have been hard parsed in the last ten minutes.
SELECT first_load_time, sql_text
  FROM v$sql
 WHERE first_load_time > TO_CHAR(sysdate - 1/(24*6), 'YYYY-MM-DD/HH24:MI:SS');
Keep in mind that not every hard parse is bad; in fact, every statement needs at least one hard parse. But if, on the other hand, the query returns a lot of SQL statements that do not use bind variables, the problem at hand is obvious.
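A related check, as a minimal sketch (the 60-character prefix length and the threshold of 10 are arbitrary assumptions): statements that differ only in their literal values usually share a common prefix, so grouping V$SQL by a prefix of the SQL text highlights candidates for bind variables.
SELECT SUBSTR(sql_text, 1, 60) AS sql_prefix, COUNT(*) AS copies
  FROM v$sql
 GROUP BY SUBSTR(sql_text, 1, 60)
HAVING COUNT(*) > 10
 ORDER BY copies DESC;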

Wednesday, December 22, 2004

Deleting the wrong cluster interconnect information from the OCR

The current cluster interconnect configuration can be checked with the oifcfg command. The following example shows this:
$ oifcfg getif
eth0 142.2.166.0 global public
ib0 192.169.1.0 global cluster_interconnect
Let's assume that the address for the cluster interconnect was specified incorrectly, which will prevent cluster communication. The situation can be resolved by deleting the cluster interconnect entry from the OCR configuration. The command below deletes that entry.
$ oifcfg delif -global ib0
Afterwards the configuration can be verified for correctness.
$ oifcfg getif
eth0 142.2.166.0 global public
Now everything looks fine. The correct interconnect information can now be added with the oifcfg setif command.
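A hedged sketch of that step, reusing the interface name and subnet from the getif output above (substitute the values for your own environment):
$ oifcfg setif -global ib0/192.169.1.0:cluster_interconnect
$ oifcfg getif
eth0 142.2.166.0 global public
ib0 192.169.1.0 global cluster_interconnect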

Tuesday, December 21, 2004

Determining what IP address was configured in the OCR

In 10gR1 the IP address for the cluster interconnect is by default determined from the OCR (Oracle Cluster Registry). The configured address can be determined with the following command:
# oifcfg getif
eth1 140.87.81.0 global cluster_interconnect
If the database parameter cluster_interconnects is specified, the value that is obtained from OCR is not used.
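As an illustration, a minimal sketch of such an override (the IP address and instance name are made-up examples; cluster_interconnects is a static parameter, so the change requires an spfile and an instance restart):
SQL> ALTER SYSTEM SET cluster_interconnects = '172.16.193.1' SCOPE=SPFILE SID='rac1';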

Friday, December 17, 2004

Determining what IP address was picked for the cluster interconnect

In the past it was difficult for a user or DBA to determine which cluster interconnect was picked; it was only possible to obtain that information with an oradebug command. Starting with Oracle 10gR1, this information is available in the X$ table X$KSXPIA.

The following SQL shows which information is returned by querying that view.
SQL> SELECT * FROM x$ksxpia;


ADDR INDX INST_ID P PICK NAME_KSXPIA IP_KSXPIA
-------- ---------- ---------- - ---- --------------- ----------------
B0757098 0 10 OSD eth0 140.87.79.74
It shows that the OSD (operating system dependent) layer picked the NIC and to which network device it belongs; in this case it is eth0.
SQL> SELECT * FROM x$ksxpia;


ADDR INDX INST_ID P PICK NAME_KSXPIA IP_KSXPIA
---------------- ---------- ---------- - ---- --------------- ----------------
00000004FEC82C40 0 2 CI 172.16.193.1
This example shows the output when the cluster_interconnects database initialization parameter has been set. The PICK column shows CI, which stands for cluster_interconnects. If the interconnect is set that way, no name is determined for the NIC; the name is only filled in when the NIC is picked by the OSD.

OOW on Thursday

Thursday was my second day at OOW 2004. Yesterday we presented our first session, and everything went fine. Today we presented two more sessions: "Project MegaGrid: Performance Management in Large Clusters" and "MegaGrid: Capacity Planning for Large Commodity Clusters". Both presentations were a success. If you plan on migrating from a single instance to a grid/cluster environment, the capacity planning presentation in particular provides you with a methodology for achieving that task.

Abstract: "Project MegaGrid: Performance Management in Large Clusters"
In database clusters consisting of a large number of Oracle instances on small machines that provide the resources for one or more applications, the collection and analysis of performance data are challenging. With functions such as dynamic workload distribution and dynamic Oracle Database 10g Real Application Clusters (RAC) kernel adjustments to optimize certain workloads, as well as the trend of using nodes with different capacity, this task becomes more complex. This session presents techniques and methods leveraging the specific features for RAC built into the Automatic Workload Repository (AWR) and other functions of the Server Manageability framework for RAC. The approach is practical and uses examples from real-world applications. The presentation also attempts to integrate SQL advisories, execution plan, and data access optimizations into the method. It provides a view of future requirements for complete and easy performance management in large database clusters. The presentation can be downloaded from here

Abstract: "MegaGrid: Capacity Planning for Large Commodity Clusters"
Customers are evaluating the deployment of low-cost, high-performance servers and networked storage to build an enterprise grid computing environment in which resources can be dynamically added and removed to meet processing demands. The primary resources in an enterprise grid computing environment consist of CPUs, servers for database and application processing, networks and network devices such as routers, switches or IP virtualization devices, disks, and networked storage arrays. The expected performance or service-level objectives of the business should determine the server hardware, network topology, networked storage systems, OS, and application configuration requirements and drive the initial configuration for deployment. The planning also should involve consideration of the future growth of the services hosted in the environment. Using a workload model designed for clusters of Oracle Database 10g Real Application Cluster instances, this session reviews design characteristics and how to predict response times and throughput with Oracle statistics and suggests the optimal way to balance load in an existing or simulated system and how to translate service-level requirements and workload characteristics into required hardware resources. The presentation can be downloaded from here

Thursday, December 16, 2004

OOW 2004 on Wednesday

I didn't attend OOW on Monday and Tuesday, so Wednesday was my first day. Altogether we presented three sessions. Our first session, "Project MegaGrid: Deploying large clusters", was on Wednesday afternoon.

Abstract:
The successful deployment of an enterprise grid computing environment requires careful and thorough planning in order to build a flexible, easily scalable architecture. This architecture must be designed across application and database layers that can be realized by the concrete infrastructure of servers, networks, and network storage based on the enterprise grid computing model. At the core of this effort is an integrated application, database, server, and network storage management framework. This framework must accurately provision for growth and capacity expansion, involving multiple steps, including the fast and simple addition and removal of servers, storage volumes, and redundant networks for high availability. It is also important to develop a set of practical guidelines to deal with ongoing maintenance tasks such as software and patch upgrades, configuration and change tracking, health checking, and monitoring. Last, responses to unplanned outages, such as repair or replace and space capacity, need to be considered. This session reviews best-practice approaches to managing all of these aspects across an end-to-end enterprise grid computing environment consisting of applications, databases, low-cost and high-performance Linux based servers, and networked storage.

More people attended than we had initially expected. At the end of the session we answered a lot of questions. This definitely seems to be a topic a lot of people are interested in or are planning to deploy very soon. The presentation can be downloaded from the Oracle web site (Presentation).

Sunday, December 05, 2004

OOW 2004

Oracle OpenWorld 2004 is just around the corner. I had the pleasure of going to SF on Sunday afternoon to set up the MegaGrid demo. I arrived together with a colleague and received my speaker's pass. But as we learned, the speaker's pass does not allow us onto the demo grounds, so we went back to registration and additionally obtained the demo grounds pass.

We were lucky, everything went fine, and after three hours I was on my way back to the Peninsula.