conditions
all_numbers_only(column_or_name)
Checks if the given column or string contains only numeric characters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`column_or_name` | `ColumnOrName` | The column or string to be checked. | required |
Returns:
Name | Type | Description |
---|---|---|
`Column` | `Column` | A column of boolean values indicating whether each entry contains only numeric characters. |
Examples:
>>> df = spark.createDataFrame([("123",), ("4567",), ("89a",), ("",), ("0",)], ["value"])
>>> df.select(all_numbers_only(df["value"]).alias("is_all_numbers")).show()
+--------------+
|is_all_numbers|
+--------------+
|          true|
|          true|
|         false|
|         false|
|          true|
+--------------+
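The check can be pictured as a full-string match against a digits-only pattern. The sketch below reproduces the same five inputs in plain Python with the standard `re` module, so it runs without a Spark session. The helper name `is_all_digits` and the pattern `[0-9]+` are assumptions for illustration, not the library's implementation.

```python
import re

# Assumed pattern: one or more ASCII digits, so the empty string does not match.
DIGITS_ONLY = re.compile(r"[0-9]+")


def is_all_digits(value: str) -> bool:
    """Return True when the whole string is ASCII digits (hypothetical helper)."""
    return DIGITS_ONLY.fullmatch(value) is not None


values = ["123", "4567", "89a", "", "0"]
results = [is_all_digits(v) for v in values]
print(results)  # [True, True, False, False, True]
```

Using `fullmatch` rather than `search` is what rejects `"89a"`: a partial match on `89` is not enough.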
Source code in pysparky/functions/conditions.py
condition_and(*conditions)
Combines multiple conditions using logical AND.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`*conditions` | `ColumnOrName` | Multiple PySpark Column objects or SQL expression strings representing conditions. | `()` |
Returns:
Name | Type | Description |
---|---|---|
`Column` | `Column` | A single PySpark Column object representing the combined condition. |
Examples:
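As a rough illustration of AND-combination, the sketch below folds any number of conditions with `functools.reduce` and the `&` operator. Plain Python booleans stand in for conditions so the snippet runs without a Spark session; PySpark `Column` objects overload `&` as well, so the same fold would apply to them in principle. The name `combine_and` and the `True` seed value are assumptions, not the library's code, and converting SQL expression strings to columns is left out.

```python
from functools import reduce
from operator import and_


def combine_and(*conditions):
    """Fold conditions with `&` (hypothetical helper, not pysparky's code).

    Seeding the fold with True is an assumption; it makes an empty call return True.
    """
    return reduce(and_, conditions, True)


# Plain booleans stand in for PySpark Column conditions here.
age, country = 34, "NZ"
is_match = combine_and(age >= 18, age < 65, country == "NZ")
print(is_match)  # True
print(combine_and(age >= 18, country == "AU"))  # False
```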
condition_or(*conditions)
Combines multiple conditions using logical OR.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`*conditions` | `ColumnOrName` | Multiple PySpark Column objects or SQL expression strings representing conditions. | `()` |
Returns:
Name | Type | Description |
---|---|---|
`Column` | `Column` | A single PySpark Column object representing the combined condition. |
Examples:
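A combined condition is an expression that Spark evaluates later, not an immediate boolean. The sketch below shows that idea with a small stand-in class whose `|` operator builds a larger expression string. `Expr`, `combine_or`, and the `false` seed are all hypothetical names and choices made for illustration; they are not part of pysparky or PySpark.

```python
from dataclasses import dataclass
from functools import reduce
from operator import or_


@dataclass(frozen=True)
class Expr:
    """Hypothetical stand-in for a lazily evaluated column expression."""

    sql: str

    def __or__(self, other: "Expr") -> "Expr":
        # Build a new expression instead of evaluating anything.
        return Expr(f"({self.sql} OR {other.sql})")


def combine_or(*conditions: Expr) -> Expr:
    """Fold conditions with `|` (illustrative only; the false seed is an assumption)."""
    return reduce(or_, conditions, Expr("false"))


combined = combine_or(Expr("age < 18"), Expr("age >= 65"))
print(combined.sql)  # ((false OR age < 18) OR age >= 65)
```

Seeding an OR fold with a false value leaves the result unchanged for any non-empty input and gives the empty call a defined result.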
n_character_only(column_or_name, n)
Checks if the given column or string consists of exactly `n` alphabetic characters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`column_or_name` | `Column` | The column or string to be checked. | required |
`n` | `int` | The exact number of alphabetic characters to match. | required |
Returns:
Name | Type | Description |
---|---|---|
`Column` | `Column` | A column of boolean values indicating whether each entry consists of exactly `n` alphabetic characters. |
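A minimal sketch of the length-bounded letter check, written in plain Python so it runs without Spark. It assumes the underlying pattern is equivalent to `^[a-zA-Z]{n}$` applied to the whole string; the helper name `has_n_letters` is hypothetical.

```python
import re


def has_n_letters(value: str, n: int) -> bool:
    """Return True when `value` is exactly `n` ASCII letters (hypothetical helper)."""
    # Assumed pattern: [a-zA-Z]{n}, matched against the entire string.
    return re.fullmatch(rf"[a-zA-Z]{{{n}}}", value) is not None


samples = ["abc", "XyZ", "ab", "abcd", "ab1"]
checks = {v: has_n_letters(v, 3) for v in samples}
print(checks)
# {'abc': True, 'XyZ': True, 'ab': False, 'abcd': False, 'ab1': False}
```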
n_numbers_only(column_or_name, n)
Checks if the given column or string consists of exactly `n` numeric characters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`column_or_name` | `ColumnOrName` | The column or string to be checked. | required |
`n` | `int \| str` | The exact number of numeric characters to match. | required |
Returns:
Name | Type | Description |
---|---|---|
`Column` | `Column` | A column of boolean values indicating whether each entry consists of exactly `n` numeric characters. |
Examples:
>>> df = spark.createDataFrame([("123",), ("4567",), ("89a",), ("",), ("0",)], ["value"])
>>> df.select(n_numbers_only(df["value"], 3).alias("is_n_numbers")).show()
+------------+
|is_n_numbers|
+------------+
|        true|
|       false|
|       false|
|       false|
|       false|
+------------+
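Because `n` is typed as `int | str`, the count can presumably be interpolated into a pattern in either form. The sketch below shows one way this could work in plain Python; the helper name `has_n_digits`, the `int(n)` validation step, and the pattern `[0-9]{n}` are assumptions rather than the library's code.

```python
import re
from typing import Union


def has_n_digits(value: str, n: Union[int, str]) -> bool:
    """Return True when `value` is exactly `n` ASCII digits (hypothetical helper)."""
    count = int(n)  # accepts 3 as well as "3"; raises ValueError for non-numeric text
    return re.fullmatch(rf"[0-9]{{{count}}}", value) is not None


values = ["123", "4567", "89a", "", "0"]
with_int = [has_n_digits(v, 3) for v in values]
with_str = [has_n_digits(v, "3") for v in values]
print(with_int)  # [True, False, False, False, False]
print(with_int == with_str)  # True
```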
printable_only(column_or_name)
Checks if the given column or string contains only printable characters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`column_or_name` | `ColumnOrName` | The column or string to be checked. | required |
Returns:
Name | Type | Description |
---|---|---|
`Column` | `Column` | A column of boolean values indicating whether each entry contains only printable characters. |
Examples:
>>> df = spark.createDataFrame([("Hello!",), ("World",), ("123",), ("",), ("Non-printable\x01",)], ["value"])
>>> df.select(printable_only(df["value"]).alias("is_printable")).show()
+------------+
|is_printable|
+------------+
|        true|
|        true|
|        true|
|       false|
|       false|
+------------+
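What counts as "printable" is not spelled out here. The sketch below assumes it means the ASCII range from space (`0x20`) to tilde (`0x7E`) and requires at least one character, which is consistent with the empty string being rejected. It is plain Python so it runs without Spark; the helper name `is_printable_only` and the sample string with a trailing control character are illustrative choices.

```python
import re

# Assumed definition of "printable": ASCII space (0x20) through tilde (0x7E).
PRINTABLE_ONLY = re.compile(r"[\x20-\x7E]+")


def is_printable_only(value: str) -> bool:
    """Return True when every character is printable ASCII (hypothetical helper)."""
    return PRINTABLE_ONLY.fullmatch(value) is not None


values = ["Hello!", "World", "123", "", "Non-printable\x01"]
flags = [is_printable_only(v) for v in values]
print(flags)  # [True, True, True, False, False]
```

Under this assumption, tabs, newlines, and non-ASCII text such as accented letters would also be reported as not printable.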
two_character_only(column_or_name)
Checks if the given column or string contains exactly two alphabetic characters (either lowercase or uppercase).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`column_or_name` | `ColumnOrName` | The column or string to be checked. | required |
Returns:
Name | Type | Description |
---|---|---|
`Column` | `Column` | A boolean column indicating whether the input matches the pattern of exactly two alphabetic characters. |
Examples:
>>> df = spark.createDataFrame([("aa",), ("ZZ",), ("a1",), ("abc",)], ["value"])
>>> df.select(two_character_only(df["value"]).alias("is_two_char")).show()
+-----------+
|is_two_char|
+-----------+
|       true|
|       true|
|      false|
|      false|
+-----------+