Package

com.qubole.spark.hiveacid

merge

package merge

Type Members

  1. case class MergeCondition(expression: Expression) extends Product with Serializable

  2. class MergeImpl extends Logging

    Implements the algorithm for Merge. It performs a right outer join of the target with the source on the specified merge condition:

    MergeJoin = target right outer join source

    Filter(MergeJoin, target.rowId != null) provides the matched rows for UPDATE/DELETE.
    Filter(MergeJoin, target.rowId == null) provides the non-matched rows for INSERT.

    DataFrames with the rows to UPDATE/DELETE/INSERT are created, and the corresponding operations are performed on HiveAcidTable. Within the same transaction, a different statementId is assigned to each operation so that they don't collide while writing delta files.
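
The split of the join result into matched and non-matched rows can be sketched with plain Scala collections. This is a minimal illustration, not the MergeImpl code: the row types, the key-equality merge condition, and all names in `MergeJoinSketch` are hypothetical, and `Option` stands in for a nullable target.rowId.

```scala
object MergeJoinSketch {
  // Hypothetical row shapes, for illustration only.
  final case class TargetRow(rowId: Long, key: Int, value: String)
  final case class SourceRow(key: Int, value: String)

  // One row of "target right outer join source". The target side is None
  // when no target row satisfies the merge condition.
  final case class Joined(target: Option[TargetRow], source: SourceRow)

  // Every source row is kept; target rows appear only where keys are equal
  // (the assumed merge condition).
  def rightOuterJoin(targetRows: Seq[TargetRow], sourceRows: Seq[SourceRow]): Seq[Joined] =
    sourceRows.flatMap { s =>
      val matches = targetRows.filter(_.key == s.key)
      if (matches.isEmpty) Seq(Joined(None, s))
      else matches.map(t => Joined(Some(t), s))
    }

  val targetRows = Seq(TargetRow(1L, 10, "a"), TargetRow(2L, 20, "b"))
  val sourceRows = Seq(SourceRow(20, "b2"), SourceRow(30, "c"))

  val mergeJoin = rightOuterJoin(targetRows, sourceRows)

  // target.rowId != null: matched rows, candidates for UPDATE/DELETE
  val matched = mergeJoin.filter(_.target.isDefined)

  // target.rowId == null: non-matched rows, candidates for INSERT
  val notMatched = mergeJoin.filter(_.target.isEmpty)
}
```

Here the source row with key 20 lands in `matched` (paired with target rowId 2), while the source row with key 30 lands in `notMatched`.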

    Performance considerations:

    Special handling applies when only an INSERT clause is provided. In that case, source left anti join target gives the rows to be inserted, avoiding the expensive right outer join.
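
As a rough sketch of the insert-only shortcut, a left anti join keeps only the source rows that have no match in the target. The `Row` type, the key-equality condition, and the names in `InsertOnlySketch` are assumptions made for illustration, using plain Scala collections rather than DataFrames.

```scala
object InsertOnlySketch {
  // Hypothetical row shape, for illustration only.
  final case class Row(key: Int, value: String)

  // source left anti join target: keep the source rows for which no target
  // row satisfies the (assumed) key-equality merge condition.
  def leftAntiJoin(sourceRows: Seq[Row], targetRows: Seq[Row]): Seq[Row] = {
    val targetKeys = targetRows.map(_.key).toSet
    sourceRows.filterNot(s => targetKeys.contains(s.key))
  }

  val targetRows = Seq(Row(10, "a"), Row(20, "b"))
  val sourceRows = Seq(Row(20, "b2"), Row(30, "c"))

  // Only the source row with key 30 has no counterpart in the target.
  val toInsert = leftAntiJoin(sourceRows, targetRows)
}
```

No target columns are carried in the result, which is all an INSERT needs.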

    We do not want the right outer join to run for every operation, so the join DataFrame is converted to an RDD and back to a DataFrame. This ensures that transformations on the converted DataFrame do not recompute the RDD, i.e., the join is executed just once.
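
The "compute the join once, derive every operation from it" idea can be illustrated with a plain-Scala analogy. This does not use Spark and does not show the RDD round trip itself; `expensiveJoin`, the counter, and the tuple layout are hypothetical stand-ins.

```scala
object JoinOnceSketch {
  // Counts how often the pretend expensive join is evaluated.
  var joinRuns = 0

  // Stand-in for the right outer join; each element is
  // (target rowId if matched, source key).
  def expensiveJoin(): Vector[(Option[Long], Int)] = {
    joinRuns += 1
    Vector((Some(2L), 20), (None, 30))
  }

  // Evaluate the join a single time and keep the result ...
  val joined = expensiveJoin()

  // ... then derive each operation's rows from the stored result instead of
  // calling expensiveJoin() again for every operation.
  val forUpdateOrDelete = joined.filter(_._1.isDefined)
  val forInsert = joined.filter(_._1.isEmpty)
}
```

Both derived row sets read from `joined`, so `joinRuns` stays at 1 no matter how many operations consume the result.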

    According to the SQL standard, an error must be raised when multiple source rows match the same target row. The same join performed above for the other operations is reused to detect this, instead of running more joins.
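
One way to picture the cardinality check: in the join result, a target row matched by several source rows shows up with the same rowId more than once. The sketch below assumes a simplified joined-row layout and hypothetical names (`CardinalityCheckSketch`, `validate`); how MergeImpl actually reports the error is not shown in this documentation.

```scala
object CardinalityCheckSketch {
  // (target rowId if matched, source key), as produced by the merge join.
  type Joined = (Option[Long], Int)

  // Target rowIds that occur in more than one joined row, i.e. target rows
  // matched by multiple source rows.
  def multiplyMatched(joined: Seq[Joined]): Set[Long] =
    joined
      .collect { case (Some(rowId), _) => rowId }
      .groupBy(identity)
      .collect { case (rowId, hits) if hits.size > 1 => rowId }
      .toSet

  // Left carries an error message; Right passes the joined rows through.
  def validate(joined: Seq[Joined]): Either[String, Seq[Joined]] = {
    val duplicates = multiplyMatched(joined)
    if (duplicates.isEmpty) Right(joined)
    else Left("multiple source rows match target rowIds: " + duplicates.toSeq.sorted.mkString(", "))
  }
}
```

Because the check only inspects rows already produced by the merge join, it needs no additional join of its own.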

  3. sealed trait MergeWhenClause extends AnyRef

  4. case class MergeWhenDelete(matchCondition: Option[Expression]) extends MergeWhenClause with Product with Serializable

  5. case class MergeWhenNotInsert(condition: Option[Expression], insertValues: Seq[Expression]) extends MergeWhenClause with Product with Serializable

  6. case class MergeWhenUpdateClause(matchCondition: Option[Expression], setExpression: Map[String, Expression], isStar: Boolean) extends MergeWhenClause with Product with Serializable

Value Members

  1. object MergeWhenClause
