Configuration

Property name	Default	Meaning	Since Version
spark.hyperspace.system.path	“(value of `spark.sql.warehouse.dir`)/indexes”	Root directory to store Hyperspace index files.	0.1.0
spark.hyperspace.index.numBuckets	value of `spark.sql.shuffle.partitions`, which defaults to 200	Number of buckets to use when creating covering indexes.	0.1.0
spark.hyperspace.index.cache.expiryDurationInSeconds	300	Number of seconds since the last index modification action before index metadata cache is marked as stale.	0.1.0
spark.hyperspace.explain.displayMode	“plaintext”	Display mode for Hyperspace explain() output. The valid set of values is: “console”, “plaintext”, “html”.	0.1.0
spark.hyperspace.explain.displayMode.highlight.beginTag	”” (An empty string)	Tag to mark beginning of highlight portion in explain() output according to the display mode.	0.1.0
spark.hyperspace.explain.displayMode.highlight.endTag	”” (An empty string)	Tag to mark ending of highlight portion in explain() output according to the display mode.	0.1.0
spark.hyperspace.index.lineage.enabled	false	Add lineage column to index upon creation to track source data file for each index record. Lineage is required to handle deleted files in Hybrid Scan, or to refresh an index in the incremental mode. Adding lineage will increase the size of the index, proportional to the number of distinct source data files the index is built on.	0.3.0
spark.hyperspace.index.optimize.fileSizeThreshold	256MB	Threshold of size of index files in bytes to optimize. Files with size below this threshold are eligible for merge during index optimization.	0.3.0
spark.hyperspace.index.hybridscan.enabled	false	Enable Hybrid Scan at query execution time. If enabled, Hyperspace considers an index with appended and/or deleted source data files as a candidate during query optimization, with some additional optimization overhead.	0.3.0
spark.hyperspace.index.hybridscan.maxAppendedRatio	0.3	Ratio threshold of total bytes of appended files to apply Hybrid Scan. If there’s more appended data than this threshold, Hybrid Scan won’t be applied. It’s 30% (0.3) by default.	0.4.0
spark.hyperspace.index.hybridscan.maxDeletedRatio	0.2	Ratio threshold of total bytes of deleted files to apply Hybrid Scan. If there’s more deleted data than this threshold, Hybrid Scan won’t be applied. It’s 20% (0.2) by default. Lineage is required to handle deleted files with Hybrid Scan.	0.4.0
spark.hyperspace.source.globbingPattern	N/A	DataFrameReader option (not a spark config) to allow indexes on globbing patterns. E.g. `spark.read.option("spark.hyperspace.source.globbingPattern", "/temp//").parquet("/temp//")`	0.4.0