Spark SQL- operators Cheatsheet- Explanation with Usecases

Spark SQL Operators Cheatsheet

1. Arithmetic Operators

OperatorSyntaxDescriptionExample
+a + bAdds two valuesSELECT 5 + 3;
-a - bSubtracts one value from anotherSELECT 5 - 3;
*a * bMultiplies two valuesSELECT 5 * 3;
/a / bDivides one value by anotherSELECT 6 / 2;
%a % bReturns the remainder of divisionSELECT 5 % 2;

2. Comparison Operators

OperatorSyntaxDescriptionExample
=a = bEqual toSELECT * FROM table WHERE col = 10;
!= or <>a != b or a <> bNot equal toSELECT * FROM table WHERE col != 10;
>a > bGreater thanSELECT * FROM table WHERE col > 10;
<a < bLess thanSELECT * FROM table WHERE col < 10;
>=a >= bGreater than or equal toSELECT * FROM table WHERE col >= 10;
<=a <= bLess than or equal toSELECT * FROM table WHERE col <= 10;

3. Logical Operators

OperatorSyntaxDescriptionExample
ANDa AND bReturns true if both conditions are trueSELECT * FROM table WHERE a > 10 AND b < 20;
ORa OR bReturns true if any condition is trueSELECT * FROM table WHERE a > 10 OR b < 20;
NOTNOT aReverses the boolean value of a conditionSELECT * FROM table WHERE NOT a > 10;

4. String Operators

OperatorSyntaxDescriptionExample
```ab`Concatenates two strings`SELECT ‘Hello’‘ World’;`
LIKEa LIKE patternMatches a string to a patternSELECT * FROM table WHERE name LIKE 'A%';
RLIKEa RLIKE patternMatches a string to a regex patternSELECT * FROM table WHERE name RLIKE '^A';
ILIKEa ILIKE patternCase-insensitive LIKE pattern matchingSELECT * FROM table WHERE name ILIKE 'a%';

5. Null Handling Operators

OperatorSyntaxDescriptionExample
IS NULLa IS NULLChecks if a value is NULLSELECT * FROM table WHERE col IS NULL;
IS NOT NULLa IS NOT NULLChecks if a value is NOT NULLSELECT * FROM table WHERE col IS NOT NULL;
NULLIFNULLIF(a, b)Returns NULL if a equals b, otherwise aSELECT NULLIF(10, 10);
COALESCECOALESCE(a, b, ...)Returns the first non-NULL valueSELECT COALESCE(NULL, 'default');

6. Set Operators

OperatorSyntaxDescriptionExample
UNIONquery1 UNION query2Combines results of two queries (removes duplicates)SELECT col FROM table1 UNION SELECT col FROM table2;
UNION ALLquery1 UNION ALL query2Combines results of two queries (includes duplicates)SELECT col FROM table1 UNION ALL SELECT col FROM table2;
INTERSECTquery1 INTERSECT query2Returns common rows from both queriesSELECT col FROM table1 INTERSECT SELECT col FROM table2;
EXCEPTquery1 EXCEPT query2Returns rows from the first query not in the secondSELECT col FROM table1 EXCEPT SELECT col FROM table2;

7. Aggregate Functions (Operators)

FunctionSyntaxDescriptionExample
COUNTCOUNT(*) or COUNT(col)Counts rows or non-NULL valuesSELECT COUNT(*) FROM table;
SUMSUM(col)Sums up numeric valuesSELECT SUM(sales) FROM table;
AVGAVG(col)Calculates the averageSELECT AVG(price) FROM table;
MINMIN(col)Finds the minimum valueSELECT MIN(price) FROM table;
MAXMAX(col)Finds the maximum valueSELECT MAX(price) FROM table;
GROUP BYGROUP BY colGroups rows based on a column valueSELECT category, SUM(sales) FROM table GROUP BY category;

8. Window Functions

FunctionSyntaxDescriptionExample
ROW_NUMBERROW_NUMBER() OVER (PARTITION BY col ORDER BY col2)Assigns a unique number to each row within a partitionSELECT ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales) AS rank FROM table;
RANKRANK() OVER (PARTITION BY col ORDER BY col2)Assigns rank to rows with gaps for duplicatesSELECT RANK() OVER (PARTITION BY category ORDER BY sales DESC) AS rank FROM table;
DENSE_RANKDENSE_RANK() OVER (PARTITION BY col ORDER BY col2)Assigns rank to rows without gapsSELECT DENSE_RANK() OVER (PARTITION BY category ORDER BY sales DESC) AS rank FROM table;
NTILENTILE(n) OVER (PARTITION BY col ORDER BY col2)Divides rows into n bucketsSELECT NTILE(4) OVER (ORDER BY sales) AS quartile FROM table;
LEADLEAD(col, offset, default) OVER (PARTITION BY col ORDER BY col2)Accesses the value of the next rowSELECT LEAD(sales, 1, 0) OVER (ORDER BY date) FROM table;
LAGLAG(col, offset, default) OVER (PARTITION BY col ORDER BY col2)Accesses the value of the previous rowSELECT LAG(sales, 1, 0) OVER (ORDER BY date) FROM table;

9. Miscellaneous Operators

OperatorSyntaxDescriptionExample
CASTCAST(expr AS type)Converts a value to a specified typeSELECT CAST(price AS STRING) FROM table;
CASECASE WHEN condition THEN result ELSE result ENDConditional logicSELECT CASE WHEN sales > 100 THEN 'High' ELSE 'Low' END AS category FROM table;
DISTINCTDISTINCT colReturns unique valuesSELECT DISTINCT category FROM table;

This cheatsheet provides a comprehensive overview of commonly used Spark SQL operators and functions with their syntax, descriptions, and examples. Use it as a reference to efficiently write and optimize Spark SQL queries!

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Latest Posts

Discover more from HintsToday

Subscribe now to keep reading and get access to the full archive.

Continue reading