Introduction
Migrating large Oracle databases to modern data platforms like Azure DataBricks requires a well-structured approach to ensure data consistency, minimal downtime, and operational reliability. In this article, we walk through a real-world example of using Oracle GoldenGate 23.5.1 for an initial load migration from an on-premises Oracle 19c source to Azure DataBricks, using SCN-based extraction to ensure transactional consistency. We also include untrimmed logs from the extract process, highlighting key configuration points, SCN usage, warnings, and errors encountered.
Source and Target Environment
Source:
- Oracle Database 19c (Version 19.24.0.0.0)
- Linux x86_64, Oracle UEK 5.4.17
- GoldenGate Version: 23.5.1.24.07
- Character Set: AL32UTF8
- Deployment: SAMPLEEXT1 Extract running as Initial Load
- Source Catalog: SRCDB001
- Extract Method: SQLPREDICATE using a fixed SCN
Target:
- Azure DataBricks (Delta Lake-based sink)
- Data integrated via GoldenGate Distribution and Receiver Services
- Batch ingestion from EXTFILE output (format RELEASE 19.1/21.1/23.1)
GoldenGate Initial Load Parameters
Below are the key extract parameters used for the Initial Load with a consistent SCN (11537541176857
) across all tables:
EXTRACT SAMPLEEXT1
EXTFILE sample_output
USERIDALIAS GG_USER_ALIAS DOMAIN OracleGoldenGate
GETTRUNCATES
SOURCECATALOG SRCDB001
TABLE APPDATA.CUSTOMER_INFO_T; SQLPREDICATE "AS OF SCN 11537541176857";
TABLE APPDATA.ORDER_HISTORY_T; SQLPREDICATE "AS OF SCN 11537541176857";
TABLE APPDATA.ORDER_LINE_ITEMS_T; SQLPREDICATE "AS OF SCN 11537541176857";
TABLE APPDATA.PRODUCT_MASTER_T; SQLPREDICATE "AS OF SCN 11537541176857";
TABLE APPDATA.SUPPLIER_T; SQLPREDICATE "AS OF SCN 11537541176857";
TABLE STAGING.FINANCIAL_TXN_T; SQLPREDICATE "AS OF SCN 11537541176857";
TABLE STAGING.TXN_AUDIT_LOG_T; SQLPREDICATE "AS OF SCN 11537541176857";
Extract Process Highlights and Logs
GoldenGate processed all tables with SQLPREDICATE AS OF SCN
and resolved their keys as shown in the logs:
OGG-01017 Wildcard resolution set to IMMEDIATE because SOURCEISTABLE Extract is used.
OGG-01360 EXTRACT is running in Initial Load mode.
OGG-06507 MAP (TABLE) resolved (entry SRCDB001.APPDATA.CUSTOMER_INFO_T)
OGG-06509 Using the following key columns for source table SRCDB001.APPDATA.CUSTOMER_INFO_T: CUST_ID.
...
Error Encountered: ORA-01555 (Snapshot Too Old)
Despite a clean setup, the extract process failed on one large table:
OGG-00663 OCI Error ORA-01555: snapshot too old
SQL <SELECT ... FROM "APPDATA"."ORDER_HISTORY_T" AS OF SCN 11537541176857>
Root Cause
- A single SCN was applied to multiple large and active tables, leading to undo segment overflow and rollback data loss.
- The database was unable to retain sufficient undo for the duration of the consistent read.
Final Solution: Remove SQLPREDICATE
After evaluating options such as:
- Capturing dynamic SCNs per table
- Tuning undo retention
- Splitting large tables
- Parallel extract (not feasible in current infra)
We concluded that the most effective and simple resolution was to remove SQLPREDICATE entirely.
Updated Extract Configuration (Without SCN)
EXTRACT SAMPLEEXT1
EXTFILE sample_output
USERIDALIAS GG_USER_ALIAS DOMAIN OracleGoldenGate
GETTRUNCATES
SOURCECATALOG SRCDB001
TABLE APPDATA.CUSTOMER_INFO_T;
TABLE APPDATA.ORDER_HISTORY_T;
TABLE APPDATA.PRODUCT_MASTER_T;
TABLE STAGING.FINANCIAL_TXN_T;
...
GoldenGate re-evaluated each table at runtime and captured consistent snapshots based on Oracle’s default behavior, which is optimized for initial load scenarios and avoids long undo dependencies.
Results
- The extract completed successfully without ORA-01555 errors.
- Data consistency was preserved, as the target was downstream processed post-load.
- Undo pressure was drastically reduced, and GoldenGate handled internal SCN queries automatically.
- No further changes were needed to undo retention, segment sizing, or extract batching.
Key Takeaways
- Avoid SQLPREDICATE “AS OF SCN” for large or active tables during initial loads.
- Let GoldenGate manage SCNs natively unless your data consistency model demands strict point-in-time extraction.
- Use EXTFILE format to write full data snapshots to file-based staging when targeting platforms like Azure DataBricks.
- Monitor GoldenGate logs closely for any ORA-01555 errors.
- Reserve SCN-based filtering for CDC (change data capture), not bulk extraction.
References
- Oracle GoldenGate 23c Documentation: https://docs.oracle.com/en/middleware/goldengate/
- Oracle ORA-01555 Error Guide: https://docs.oracle.com/error-help/db/ora-01555/
- Azure DataBricks Ingestion via File Load: https://learn.microsoft.com/en-us/azure/databricks/