recover_partitions()
: Recovers all the partitions in the directory of a
table and updates the catalog. This only works for partitioned tables, not
for unpartitioned tables or views.
refresh_by_path()
: Invalidates and refreshes all the cached data (and the
associated metadata) for any Dataset that contains the given data source
path. Paths are matched by prefix, so "/" invalidates everything that
is cached.
refresh_table()
: Invalidates and refreshes all the cached data and
metadata of the given table. For performance reasons, Spark SQL or the
external data source library it uses might cache certain metadata about a
table, such as the location of blocks. When those change outside of Spark
SQL, users should call this function to invalidate the cache. If the table
is cached as an InMemoryRelation, the original cached version is dropped and
the new version is cached lazily.
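The prefix matching described for refresh_by_path() can be illustrated without a Spark connection. The sketch below is plain base R, not Spark's implementation: the cached paths and the helper `matches_prefix()` are hypothetical, and a simple string-prefix test stands in for the real path comparison.

```r
# Hypothetical cached data source paths, for illustration only.
cached_paths <- c("/data/events/2024", "/data/users", "/tmp/scratch")

# Plain string-prefix check. This approximates the documented behaviour;
# it is an assumption, not how Spark compares paths internally.
matches_prefix <- function(paths, prefix) startsWith(paths, prefix)

# Entries that a refresh of "/data" would touch under this approximation.
affected_by_data <- cached_paths[matches_prefix(cached_paths, "/data")]

# Entries that a refresh of "/" would touch: every path starts with "/".
affected_by_root <- cached_paths[matches_prefix(cached_paths, "/")]
```

Under this approximation, a narrow path limits the refresh to the data beneath it, while "/" covers every cached entry.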
Usage
recover_partitions(sc, table)
refresh_by_path(sc, path)
refresh_table(sc, table)
Arguments
- sc: A spark_connection.
- table: character(1). The name of the table.
- path: character(1). The path to refresh.
Value
NULL, invisibly. These functions are called mostly for their side effects.
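A minimal sketch of how the three functions might be called together. It assumes a local Spark installation reachable through sparklyr, that the functions are exported by a package attached as `catalog` (an assumption, since the package name is not stated here), and that a partitioned table named "events" stored under "/data/events" already exists; the table name and path are hypothetical.

```r
library(sparklyr)
library(catalog)  # assumption: the package that exports these functions

# Connect to a local Spark instance (assumes Spark is installed locally).
sc <- spark_connect(master = "local")

# "events" and "/data/events" are hypothetical; substitute your own table
# and path. recover_partitions() requires a partitioned table.
recover_partitions(sc, "events")     # register partitions found on disk
refresh_by_path(sc, "/data/events")  # refresh anything cached under this path
refresh_table(sc, "events")          # refresh cached data and metadata

# Each call returns NULL invisibly, so the result is rarely worth capturing.
spark_disconnect(sc)
```

Calling recover_partitions() first makes newly written partition directories visible in the catalog before the cached data and metadata are refreshed.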