- spark session object
- metadata object
- additional parameters
Used by a streaming query to add a DataFrame to the Hive ACID table.
- dataframe to insert
- Transaction under which the operation is being done
Delete rows from the table based on a conditional expression.
Note: This API is transactional in nature.
- Boolean SQL Expression filtering rows to be deleted
- Transaction under which DELETE will be performed
Return an RDD on top of Hive ACID table
- columns needed
- filters that can be pushed down to file format
- read conf
- current transaction under which the read will be done
- metadata object
Appends the given DataFrame df to the Hive ACID table.
Note: This API is transactional in nature.
- dataframe to insert
- current transaction under which the INSERT operation needs to run
Optional. Within a single transaction, multiple statements such as INSERT/UPDATE/DELETE (as in the case of MERGE) can be issued. Each statement must use a different statementId so that their deltas do not collide during writes.
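The collision that statementId guards against can be illustrated with Hive's delta directory naming, where the statement id is the last component of the directory name. The following is a minimal sketch, assuming the zero-padded layout `delta_<writeId>_<writeId>_<statementId>`; `delta_dir` and the id values are hypothetical and not part of this API.

```python
def delta_dir(write_id: int, statement_id: int) -> str:
    """Build a delta directory name.

    Assumed layout: delta_<writeId>_<writeId>_<statementId>, zero-padded.
    """
    return f"delta_{write_id:07d}_{write_id:07d}_{statement_id:04d}"


# Two statements of one MERGE running under the same write id (hypothetical values).
update_dir = delta_dir(write_id=5, statement_id=0)
insert_dir = delta_dir(write_id=5, statement_id=1)

# Distinct statement ids yield distinct directories, so the writes do not collide.
assert update_dir != insert_dir

# Reusing a statement id would point both writes at the same directory.
assert delta_dir(write_id=5, statement_id=0) == update_dir
```

With the same statementId, both statements would target one directory, which is the delta collision the parameter description warns about.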
Overwrites the Hive ACID table with the given DataFrame df.
Note: This API is transactional in nature.
- dataframe to insert
- current transaction under which insertOverWrite needs to run
Optional. Within a single transaction, multiple statements such as INSERT/UPDATE/DELETE (as in the case of MERGE) can be issued. Each statement must use a different statementId so that their deltas do not collide during writes.
Merge from sourceDf into the current table.
NOT TO BE USED EXTERNALLY. Not protected by transactionality; it is assumed that curTxn is already set before calling this.
The DataFrame used for the update is expected to have the same schema as tableSchemaWithRowId.
Transaction under which this operation will be performed
Optional. Within a single transaction, multiple statements such as INSERT/UPDATE/DELETE (as in the case of MERGE) can be issued. Each statement must use a different statementId so that their deltas do not collide during writes.
The updated DataFrame is written to the table.
The DataFrame used for the update is expected to have the same schema as tableSchemaWithRowId.
- Transaction under which this operation will be performed
Optional. Within a single transaction, multiple statements such as INSERT/UPDATE/DELETE (as in the case of MERGE) can be issued. Each statement must use a different statementId so that their deltas do not collide during writes.
- additional parameters
- spark session object
Update rows in the Hive ACID table based on condition and newValues.
Note: This API is transactional in nature.
- Optional condition string identifying the rows to update; if not specified, the update applies to the entire table.
- Map of (column, value) to set
Update rows in the Hive ACID table based on condition and newValues.
Note: This API is transactional in nature.
- condition string identifying the rows to update
- Map of (column, value) to set
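The relationship between the condition string and the (column, value) map can be pictured as the SQL UPDATE statement they correspond to. This is only an illustrative sketch: `to_update_sql` is a hypothetical helper, the table and column names are made up, and it assumes the map values are SQL expression strings. It does not show how the library performs the update.

```python
from typing import Mapping, Optional


def to_update_sql(table: str,
                  new_values: Mapping[str, str],
                  condition: Optional[str] = None) -> str:
    """Render the UPDATE statement an update call corresponds to.

    Assumption: values in new_values are SQL expression strings.
    """
    if not new_values:
        raise ValueError("new_values must name at least one column")
    assignments = ", ".join(f"{col} = {expr}" for col, expr in new_values.items())
    sql = f"UPDATE {table} SET {assignments}"
    # Without a condition, the update applies to every row in the table.
    if condition:
        sql += f" WHERE {condition}"
    return sql


print(to_update_sql("orders", {"status": "'shipped'"}, "id = 42"))
print(to_update_sql("orders", {"status": "'archived'"}))
```

The second call omits the condition, matching the variant in which an unspecified condition means the complete table is updated.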
HiveAcidTable uses the delegate pattern to delegate its API to this object.
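The split between a public, transactional API and a delegate that assumes a transaction is already set can be sketched as follows. This is a simplified model, not the library's code: `Table`, `OperationDelegate`, `add_batch`, and the dictionary used as a transaction are all hypothetical names chosen for illustration.

```python
class OperationDelegate:
    """Hypothetical delegate: does the work and assumes a transaction is set."""

    def __init__(self):
        self.rows = []

    def add_batch(self, rows, txn):
        # No transaction handling here; the caller must supply an open one.
        if not txn.get("open"):
            raise RuntimeError("transaction must be open before calling the delegate")
        self.rows.extend(rows)


class Table:
    """Hypothetical public facade: owns the transaction and forwards the call."""

    def __init__(self):
        self._delegate = OperationDelegate()

    def insert_into(self, rows):
        txn = {"open": True}  # begin the transaction
        try:
            self._delegate.add_batch(rows, txn)
        finally:
            txn["open"] = False  # end the transaction (commit/abort elided)
        return len(self._delegate.rows)


table = Table()
print(table.insert_into(["a", "b"]))
```

Keeping transaction handling in the facade lets several delegate operations share one transaction, which is how the members marked as not for external use are meant to be called.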