conditions
condition_and(*conditions)
Combines multiple conditions using logical AND.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `*conditions` | `ColumnOrName` | Multiple PySpark Column objects or SQL expression strings representing conditions. | `()` |
Returns:
| Name | Type | Description |
|---|---|---|
| `Column` | `Column` | A single PySpark Column object representing the combined condition. |
Example
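A minimal plain-Python sketch of the AND-combination semantics, assuming the conditions are folded pairwise with the `&` operator. It uses Python booleans as stand-ins for PySpark Columns and is not pysparky's implementation; the name `and_all` and the behavior for an empty argument list are assumptions.

```python
from functools import reduce
import operator


def and_all(*conditions):
    """Fold conditions pairwise with `&` (hypothetical stand-in for condition_and)."""
    # Assumption: with no conditions, fall back to True, the identity of AND.
    return reduce(operator.and_, conditions, True)


print(and_all(True, True, False))  # False
print(and_all(True, True))         # True
```

The same fold applies to any objects that overload `&`, which PySpark Columns do.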
Source code in pysparky/functions/conditions.py
condition_or(*conditions)
Combines multiple conditions using logical OR.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `*conditions` | `ColumnOrName` | Multiple PySpark Column objects or SQL expression strings representing conditions. | `()` |
Returns:
| Name | Type | Description |
|---|---|---|
| `Column` | `Column` | A single PySpark Column object representing the combined condition. |
Example
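A minimal plain-Python sketch of the OR-combination semantics, assuming the conditions are folded pairwise with the `|` operator. Python booleans stand in for PySpark Columns; the name `or_any` and the empty-argument behavior are assumptions, not taken from pysparky.

```python
from functools import reduce
import operator


def or_any(*conditions):
    """Fold conditions pairwise with `|` (hypothetical stand-in for condition_or)."""
    # Assumption: with no conditions, fall back to False, the identity of OR.
    return reduce(operator.or_, conditions, False)


print(or_any(False, False, True))  # True
print(or_any(False, False))        # False
```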
Source code in pysparky/functions/conditions.py
is_all_numbers_only(column_or_name)
Checks if the given column or string contains only numeric characters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `column_or_name` | `ColumnOrName` | The column or string to be checked. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `Column` | `Column` | A column of boolean values indicating whether each entry contains only numeric characters. |
Example
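A plain-Python sketch of the check on a single string value, assuming "numeric characters" means the ASCII digits 0-9 and that an empty string does not qualify. The function name `all_digits` and the pattern are assumptions; the library's actual regular expression may differ.

```python
import re


def all_digits(value):
    """Return True if value is a non-empty string of ASCII digits (assumed pattern)."""
    return re.fullmatch(r"[0-9]+", value) is not None


for sample in ["12345", "12a45", ""]:
    print(repr(sample), all_digits(sample))
```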
Source code in pysparky/functions/conditions.py
is_array_monotonic(col, cmp_fn, null_policy='forbid')
Check if an array column is monotonic according to a comparator and a NULL policy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `col` | `ColumnOrName` | Array column to check. | required |
| `cmp_fn` | `callable` | Binary function (x, y) -> Column[bool]-like. Typical choices: operator.lt (strictly increasing), operator.le (non-decreasing), operator.gt (strictly decreasing), operator.ge (non-increasing). | required |
| `null_policy` | `str` | How to treat NULL elements inside the array. "forbid": any NULL inside the array makes the result False; "ignore": drop all NULLs before checking monotonicity; "allow_first": allow multiple NULLs at the first positions (ignored in check); "allow_last": allow multiple NULLs at the last positions (ignored in check); "allow_ends": allow multiple NULLs at the first and/or last positions. | `'forbid'` |
Returns:
| Name | Type | Description |
|---|---|---|
| `Column` | `Column` | Boolean column: True if the array is monotonic under the chosen comparator and NULL policy. Empty and single-element arrays return True. |
Example
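A plain-Python reference model of the comparator and NULL-policy semantics described above, with `None` standing in for NULL and a list standing in for an array value. This is a sketch for reasoning about expected results, not the library's Spark implementation; the name `is_monotonic`, the `ValueError` on unknown policies, and the treatment of an array made up entirely of NULLs are assumptions.

```python
import operator

POLICIES = {"forbid", "ignore", "allow_first", "allow_last", "allow_ends"}


def is_monotonic(values, cmp_fn, null_policy="forbid"):
    """Model of the documented semantics; None plays the role of NULL."""
    if null_policy not in POLICIES:
        raise ValueError(f"unknown null_policy: {null_policy!r}")
    items = list(values)
    if null_policy == "ignore":
        # Drop every NULL before comparing.
        items = [v for v in items if v is not None]
    else:
        if null_policy in ("allow_first", "allow_ends"):
            # Strip the run of leading NULLs.
            while items and items[0] is None:
                items.pop(0)
        if null_policy in ("allow_last", "allow_ends"):
            # Strip the run of trailing NULLs.
            while items and items[-1] is None:
                items.pop()
    # Any NULL still present sits where the policy does not permit it.
    if any(v is None for v in items):
        return False
    # Every adjacent pair must satisfy the comparator; empty and
    # single-element lists pass vacuously.
    return all(cmp_fn(a, b) for a, b in zip(items, items[1:]))


print(is_monotonic([1, 2, 3], operator.lt))                       # True
print(is_monotonic([1, None, 3], operator.lt))                    # False
print(is_monotonic([None, 1, 2, None], operator.lt, "allow_ends"))  # True
```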
Source code in pysparky/functions/conditions.py
is_array_non_decreasing(col, null_policy='forbid')
Check if an array column is non-decreasing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `col` | `ColumnOrName` | Array column to check. | required |
| `null_policy` | `str` | How to treat NULL elements inside the array. | `'forbid'` |
Returns:
| Name | Type | Description |
|---|---|---|
| `Column` | `Column` | Boolean column: True if the array is non-decreasing, False otherwise. |
Example
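The four ordering checks on this page line up with the comparator choices listed for `cmp_fn` in `is_array_monotonic`. The sketch below shows that correspondence on plain Python lists, assuming each convenience function is equivalent to the general check with the matching operator (the source does not state how they are implemented). The names `COMPARATORS` and `check` are hypothetical, and NULL handling is left out.

```python
import operator

# Comparator for each ordering, following the "typical choices"
# listed for cmp_fn in is_array_monotonic.
COMPARATORS = {
    "strictly_increasing": operator.lt,
    "non_decreasing": operator.le,
    "strictly_decreasing": operator.gt,
    "non_increasing": operator.ge,
}


def check(values, kind):
    """Return True if every adjacent pair satisfies the comparator for `kind`."""
    cmp_fn = COMPARATORS[kind]
    return all(cmp_fn(a, b) for a, b in zip(values, values[1:]))


print(check([1, 2, 2, 3], "non_decreasing"))       # True
print(check([1, 2, 2, 3], "strictly_increasing"))  # False
```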
Source code in pysparky/functions/conditions.py
is_array_non_increasing(col, null_policy='forbid')
Check if an array column is non-increasing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `col` | `ColumnOrName` | Array column to check. | required |
| `null_policy` | `str` | How to treat NULL elements inside the array. | `'forbid'` |
Returns:
| Name | Type | Description |
|---|---|---|
| `Column` | `Column` | Boolean column: True if the array is non-increasing, False otherwise. |
Example
Source code in pysparky/functions/conditions.py
is_array_strictly_decreasing(col, null_policy='forbid')
Check if an array column is strictly decreasing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `col` | `ColumnOrName` | Array column to check. | required |
| `null_policy` | `str` | How to treat NULL elements inside the array. | `'forbid'` |
Returns:
| Name | Type | Description |
|---|---|---|
| `Column` | `Column` | Boolean column: True if the array is strictly decreasing, False otherwise. |
Example
Source code in pysparky/functions/conditions.py
is_array_strictly_increasing(col, null_policy='forbid')
Check if an array column is strictly increasing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `col` | `ColumnOrName` | Array column to check. | required |
| `null_policy` | `str` | How to treat NULL elements inside the array. | `'forbid'` |
Returns:
| Name | Type | Description |
|---|---|---|
| `Column` | `Column` | Boolean column: True if the array is strictly increasing, False otherwise. |
Example
Source code in pysparky/functions/conditions.py
is_n_character_only(column_or_name, n)
Checks if the given column or string contains exactly n alphabetic characters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `column_or_name` | `Column` | The column or string to be checked. | required |
| `n` | `int` | The exact number of alphabetic characters to match. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `Column` | `Column` | A column of boolean values indicating whether each entry matches the regular expression. |
Example
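A plain-Python sketch of the check on a single string value, assuming "alphabetic characters" means ASCII letters in either case and that the whole value must match. The function name `n_letters` and the pattern are assumptions rather than the library's actual regular expression.

```python
import re


def n_letters(value, n):
    """Return True if value consists of exactly n ASCII letters (assumed pattern)."""
    # The doubled braces produce a literal {n} quantifier, e.g. [a-zA-Z]{3}.
    return re.fullmatch(rf"[a-zA-Z]{{{n}}}", value) is not None


for sample in ["abc", "abcd", "ab1"]:
    print(repr(sample), n_letters(sample, 3))
```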
Source code in pysparky/functions/conditions.py
is_n_numbers_only(column_or_name, n)
Checks if the given column or string contains exactly n numeric characters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `column_or_name` | `ColumnOrName` | The column or string to be checked. | required |
| `n` | `int \| str` | The exact number of numeric characters to match, or "+" to match a number of any length. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `Column` | `Column` | A column of boolean values indicating whether each entry matches the regular expression. |
Example
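A plain-Python sketch of how the `n` argument could translate into a regular-expression quantifier, assuming "numeric characters" means the ASCII digits 0-9 and that "+" means one or more digits. The function name `n_digits` and the pattern are assumptions, not the library's code.

```python
import re


def n_digits(value, n):
    """Return True if value is exactly n ASCII digits, or any positive number of digits when n is "+"."""
    # "+" keeps the regex quantifier as-is; an integer becomes {n}.
    quantifier = "+" if n == "+" else f"{{{int(n)}}}"
    return re.fullmatch(rf"[0-9]{quantifier}", value) is not None


print(n_digits("123", 3))      # True
print(n_digits("1234", 3))     # False
print(n_digits("1234567", "+"))  # True
```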
Source code in pysparky/functions/conditions.py
is_printable_only(column_or_name)
Checks if the given column or string contains only printable characters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `column_or_name` | `ColumnOrName` | The column or string to be checked. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `Column` | `Column` | A column of boolean values indicating whether each entry contains only printable characters. |
Example
Source code in pysparky/functions/conditions.py
is_two_character_only(column_or_name)
Checks if the given column or string contains exactly two alphabetic characters (either lowercase or uppercase).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `column_or_name` | `ColumnOrName` | The column or string to be checked. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `Column` | `Column` | A boolean column indicating whether the input matches the pattern of exactly two alphabetic characters. |
Example
Source code in pysparky/functions/conditions.py
startswiths(column_or_name, list_of_strings)
Creates a PySpark Column expression to check if the given column starts with any string in the list.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `column_or_name` | `ColumnOrName` | The column to check. | required |
| `list_of_strings` | `List[str]` | A list of strings to check if the column starts with. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `Column` | `Column` | A PySpark Column expression that evaluates to True if the column starts with any string in the list, otherwise False. |
Example
>>> from pyspark.sql import functions as F
>>> df = spark.createDataFrame([("apple",), ("banana",), ("cherry",)], ["fruit"])
>>> df.select("fruit", startswiths(F.col("fruit"), ["ap", "ch"]).alias("starts_with")).show()
+------+-----------+
| fruit|starts_with|
+------+-----------+
| apple|       true|
|banana|      false|
|cherry|       true|
+------+-----------+
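
The per-row logic in the example above can be modeled in plain Python as an OR over the individual prefix checks. This is a sketch of the semantics only, not the library's Spark expression; the name `starts_with_any` is hypothetical, and returning False for an empty prefix list is an assumption.

```python
def starts_with_any(value, prefixes):
    """Return True if value starts with at least one of the given prefixes."""
    # any() over an empty list is False (assumed behavior for that case).
    return any(value.startswith(prefix) for prefix in prefixes)


for fruit in ["apple", "banana", "cherry"]:
    print(fruit, starts_with_any(fruit, ["ap", "ch"]))
```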