In Oracle RAC (Real Application Clusters), the components CSSD, CRS, EVMD, and DIAG are core parts of the Clusterware stack. They work together to ensure cluster stability, node coordination, failover, and diagnostics.
1. CSSD (Cluster Synchronization Services Daemon)
🔹 What it is
CSSD (ocssd.bin) is the heartbeat and cluster membership manager.
🔹 Core Responsibilities
✅ Node Membership Management
- Tracks which nodes are alive or dead
- Maintains the cluster node list
✅ Heartbeat Mechanism
Uses two types:
- Network heartbeat → via private interconnect
- Disk heartbeat → via voting disks
👉 If a node misses heartbeats beyond the configured timeout (the CSS misscount setting for the network heartbeat) → it is considered dead.
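To make the timeout idea concrete, here is a minimal Python sketch of heartbeat-based liveness checking. It is not how ocssd.bin is implemented; the function name `missed_heartbeat`, the node names, and the 30-second timeout value are all assumptions chosen for illustration.

```python
# Illustrative timeout in seconds; an assumption for this sketch, not a value
# read from any cluster.
MISSCOUNT_SECONDS = 30


def missed_heartbeat(last_seen, now, timeout=MISSCOUNT_SECONDS):
    """Return the nodes whose last heartbeat is older than the timeout."""
    return sorted(node for node, ts in last_seen.items() if now - ts > timeout)


# Hypothetical timestamps (seconds) of the last heartbeat seen from each node.
last_seen = {"node1": 100.0, "node2": 95.0, "node3": 60.0}
print(missed_heartbeat(last_seen, now=101.0))  # ['node3']
```

Only node3 is reported, because its last heartbeat is 41 seconds old while the other two are well inside the window.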
🔹 Split-Brain Prevention
This is CSSD’s most critical job.
- Uses voting disks
- Requires majority quorum
- If a node loses quorum → it gets evicted (the Clusterware stack is restarted or, failing that, the node is rebooted)
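The majority rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, with a hypothetical helper name `has_quorum`; the real decision is made inside CSSD.

```python
def has_quorum(accessible_voting_disks, total_voting_disks):
    """True when the node can access a strict majority of the voting disks."""
    if total_voting_disks <= 0:
        raise ValueError("total_voting_disks must be positive")
    return 2 * accessible_voting_disks > total_voting_disks


print(has_quorum(2, 3))  # True
print(has_quorum(1, 3))  # False
print(has_quorum(2, 4))  # False
```

The last line shows why an odd number of voting disks is the usual choice: four disks tolerate the loss of only one disk, the same as three.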
🔹 Failure Scenario
Example:
- Node loses network connectivity
- CSSD checks voting disks
- If quorum lost → node is killed to protect data integrity
🔹 Key Files / Logs
$GRID_HOME/log/<node>/cssd/ocssd.log (11gR2)
$ORACLE_BASE/diag/crs/<node>/crs/trace/ocssd.trc (12c and later, under ADR)
🔹 Key Process
ocssd.bin
⚙️ 2. CRS (Cluster Ready Services)
🔹 What it is
CRS (crsd.bin) is the resource manager of Oracle Clusterware.
🔹 Core Responsibilities
✅ Resource Management
Manages:
- Databases
- Instances
- Listeners
- VIPs
- ASM
✅ Start/Stop Resources
- Starts resources in correct order
- Handles dependencies
Example:
ASM → Database → Services
✅ Failover Management
- If a resource fails → CRS restarts it
- If node fails → relocates resources to another node
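The restart-or-relocate behaviour can be pictured with a small Python sketch. It assumes a policy of trying a limited number of local restarts before relocating; the function `recover`, the `try_restart` callback, and the `max_restarts` value are hypothetical names for illustration, not part of any Oracle API.

```python
def recover(resource, current_node, healthy_nodes, try_restart, max_restarts=2):
    """Try local restarts first, then relocate to another healthy node."""
    # A local restart only makes sense while the hosting node is still healthy.
    if current_node in healthy_nodes:
        for _ in range(max_restarts):
            if try_restart(resource, current_node):
                return ("restarted", current_node)
    # Otherwise pick the first other healthy node as the new home.
    for node in healthy_nodes:
        if node != current_node:
            return ("relocated", node)
    return ("offline", None)


def always_fail(resource, node):
    """Stand-in restart action that never succeeds."""
    return False


# node1 has failed, so the resource moves to node2.
print(recover("db1", "node1", ["node2"], always_fail))  # ('relocated', 'node2')
```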
🔹 Resource Dependency Example
Database depends on ASM
Listener depends on network
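One way to picture dependency-aware startup is a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+) on the dependencies named above; the resource names are simplified labels, and this is an illustration of the ordering idea rather than how crsd.bin resolves dependencies internally.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each resource maps to the set of resources it depends on.
dependencies = {
    "ASM": set(),
    "Database": {"ASM"},
    "Services": {"Database"},
    "Network": set(),
    "Listener": {"Network"},
}


def start_order(deps):
    """Return an order in which every resource follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = start_order(dependencies)
print(order)
```

Several valid orders exist, so the exact output may vary, but ASM always precedes Database and Database always precedes Services. Reversing the list gives a safe stop order.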
🔹 Logs
$GRID_HOME/log/<node>/crsd/crsd.log (11gR2)
$ORACLE_BASE/diag/crs/<node>/crs/trace/crsd.trc (12c and later, under ADR)
🔹 Key Process
crsd.bin
📡 3. EVMD (Event Manager Daemon)
🔹 What it is
EVMD (evmd.bin) is the event notification system.
🔹 Core Responsibilities
✅ Event Publishing
- Publishes cluster events:
- Node up/down
- Instance start/stop
- Failover events
✅ Event Subscription
- Applications/scripts can subscribe to events
✅ FAN (Fast Application Notification)
- Cluster events feed FAN, which reaches clients (e.g., JDBC, OCI) through Oracle Notification Service (ONS)
- Helps apps react quickly to failures
🔹 Example Use Case
- Node crashes
- EVMD sends event
- Application connection pool drops dead connections immediately
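The publish/subscribe pattern behind this use case can be sketched in plain Python. Nothing here is the Oracle event API: `EventBus`, `ConnectionPool`, and the `NODE_DOWN` event name are hypothetical stand-ins that only show how a subscriber can drop connections as soon as a node-down event arrives.

```python
from collections import defaultdict


class EventBus:
    """Tiny in-process publish/subscribe hub."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, callback):
        self._subscribers[event_type].append(callback)

    def publish(self, event_type, payload):
        for callback in self._subscribers[event_type]:
            callback(payload)


class ConnectionPool:
    """Holds (connection_id, node_name) pairs."""

    def __init__(self, connections):
        self.connections = list(connections)

    def on_node_down(self, event):
        # Drop every connection that pointed at the failed node.
        self.connections = [c for c in self.connections if c[1] != event["node"]]


bus = EventBus()
pool = ConnectionPool([(1, "node1"), (2, "node2"), (3, "node1")])
bus.subscribe("NODE_DOWN", pool.on_node_down)
bus.publish("NODE_DOWN", {"node": "node1"})
print(pool.connections)  # [(2, 'node2')]
```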
🔹 Logs
$GRID_HOME/log/<node>/evmd/evmd.log (11gR2)
$ORACLE_BASE/diag/crs/<node>/crs/trace/evmd.trc (12c and later, under ADR)
🔹 Key Process
evmd.bin
🩺 4. DIAG (Diagnostic Daemon)
🔹 What it is
DIAG (diagdaemon / integrated diag framework) is responsible for diagnostics and health monitoring.
🔹 Core Responsibilities
✅ Log Collection
- Collects logs from all cluster components
✅ Health Monitoring
- Tracks component health
- Works with Cluster Health Monitor (CHM)
✅ Incident Detection
- Detects critical issues
- Generates trace files and dumps
🔹 Integration
- Works with:
- ADR (Automatic Diagnostic Repository)
- Trace infrastructure
🔹 Logs Location
$GRID_HOME/log/<node>/diag/
🔗 How They Work Together
🔄 Startup Flow
- OHASD starts
- CSSD starts → establishes cluster membership
- CRS starts → manages resources
- EVMD starts → enables event system
- DIAG runs in background
🔄 Failure Flow Example
Node Crash:
- CSSD detects heartbeat loss
- Node evicted
- CRS relocates resources
- EVMD sends notifications
- DIAG logs everything
🧩 Architecture Relationship
OHASD (Oracle High Availability Service)
│
├── CSSD → Cluster membership + heartbeat
├── CRS → Resource management
├── EVMD → Event system
└── DIAG → Diagnostics
🚨 Quick Comparison Table
Component   Role               Critical Function
CSSD        Cluster control    Node membership, heartbeat
CRS         Resource manager   Start/stop/failover
EVMD        Event system       FAN notifications
DIAG        Diagnostics        Logs, health, incidents
🛠️ Important Commands
Check cluster status
crsctl stat res -t
Check cluster health
crsctl check cluster -all
Check CSS
crsctl check css
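For scripted monitoring, the output of the cluster health check can be summarized per node. The Python sketch below parses text shaped like typical `crsctl check cluster -all` output; the sample text, the node names, and the helper `parse_cluster_check` are assumptions for illustration, and the exact messages may differ between versions.

```python
def parse_cluster_check(text):
    """Map each node to True only if it reports services and all are online."""
    lines_by_node = {}
    node = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or set(line) == {"*"}:
            continue  # skip blank lines and the asterisk separators
        if line.startswith("CRS-"):
            if node is not None:
                lines_by_node[node].append(line)
        elif line.endswith(":"):
            node = line[:-1]  # a "nodename:" header starts a new section
            lines_by_node[node] = []
    return {
        name: bool(lines) and all(l.endswith("is online") for l in lines)
        for name, lines in lines_by_node.items()
    }


# Assumed sample output; node names are hypothetical.
sample = """\
**************************************************************
node1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
node2:
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
"""
print(parse_cluster_check(sample))  # {'node1': True, 'node2': False}
```

On a cluster node, the text could be captured with `subprocess.run(["crsctl", "check", "cluster", "-all"], capture_output=True, text=True).stdout` and passed to the same function.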