SQL is one of the most important skills for a data analyst. If you want a data analyst job, SQL is not optional. Most analyst roles expect you to pull data from databases, clean messy records, filter business parameters, combine multiple tables, calculate business metrics, and answer stakeholder questions.
Think of it this way: if Excel helps you work with individual spreadsheets, SQL helps you interact directly with entire company databases. This guide explains what SQL is, how SQL syntax works, and the core SQL commands every beginner should learn.
What Is SQL?
SQL stands for Structured Query Language. It is the standard language used to work with relational databases. You use SQL to read, filter, sort, group, and aggregate data, as well as create and update records.
SELECT name, salary
FROM employees
WHERE salary > 50000;Query Logic: This returns the name and salary attributes from the employees table, filtering out anyone who earns less than or equal to 50,000.
Why SQL Matters for Data Analysts
SQL is one of the most requested skills in data analyst job descriptions. Companies store massive amounts of business data in relational databases (customer profiles, transaction history, website click events, subscription metrics, and marketing performance).
Using SQL, a data analyst can independently answer questions like:
- Which products sold the most last month?
- How many active subscribers do we currently support?
- Which marketing acquisition channels convert best?
- Which customers have not purchased in the last 90 days?
SQL vs Excel
Beginners often ask whether they should learn Excel first or SQL. The short answer is: learn both. They serve different analytical scales and purposes.
| Feature | Microsoft Excel | SQL Databases |
|---|---|---|
| Ideal Dataset Size | Small datasets (up to 1M rows max) | Millions to billions of rows |
| Workflow Style | Manual analysis & quick charts | Relational querying & automation |
| Data Connection | Static file uploads | Live production server connection |
| Best Used For | Ad hoc calculations & reporting | Data cleaning, merging, and ingestion |
How SQL Works
Think of a database like a digital filing cabinet. Inside this database, tables store the data, rows store individual records, and columns store specific attributes.
Here is how a simple customer table looks:
| customer_id | name | city | signup_date |
|---|---|---|---|
| 101 | Rahul | Mumbai | 2025-01-10 |
| 102 | Priya | Delhi | 2025-02-14 |
| 103 | Arjun | Bangalore | 2025-03-01 |
Basic SQL Syntax Structure
Most SQL queries follow a consistent, linear structure:
SELECT column_name
FROM table_name
WHERE condition;Chooses the columns you want to view.
Specifies the table containing the target columns.
Filters records based on specific criteria or conditionals.
SQL Syntax Rules for Beginners
SQL syntax is highly readable and resembles plain English, but precise formatting rules must be followed:
1. SQL Keywords
Keywords (SELECT, FROM, JOIN) are case-insensitive, but writing them in UPPERCASE is standard practice for readability.
2. Semicolons
Used to signal the end of a query statement. Though optional in some database shells, it is a recommended syntax standard.
3. Quote Usage
Text string parameters require single quotes (e.g. 'Mumbai'), whereas numerical inputs must not be quoted (e.g. salary > 50000).
4. Comments
Single line comments begin with --. Multi-line blocks can be wrapped inside /* ... */ comments.
Core SQL Commands
Here are the fundamental commands used by every data analyst to request and modify tables:
Retrieves specified columns
Used to select target attributes. Using SELECT * pulls all fields, which can slow queries and should generally be avoided in production environments.
Applies conditional filters
Allows filtering rows to return only those matching conditional inputs (e.g. WHERE amount > 1000).
Sorts database outputs
Sorts results based on specific columns. Supports ascending order (ASC, lowest first) and descending order (DESC, highest first).
Limits rows / Deduplicates
LIMIT 10 restricts output logs to 10 rows for fast pre-views. DISTINCT filters out duplicate rows to return unique values.
SQL Operators & Wildcards
Operators expand your power to query and filter tables based on multi-parameter structures:
AND / OR
Combines multiple filters. AND expects all conditions true; OR requires only one to be true.
IN (val1, val2)
A clean alternative to writing multiple nested OR conditions (e.g. WHERE city IN ('Mumbai', 'Delhi')).
BETWEEN min AND max
Filters value ranges, inclusive of boundary values (e.g. WHERE salary BETWEEN 30000 AND 70000).
LIKE (Pattern matching)
Finds custom text patterns using wildcards. WHERE name LIKE 'A%' matches names starting with 'A'.
SQL Aggregate Functions
Aggregations perform calculations on multiple rows to return a single summarized value. These are heavily used in sales, revenue, and KPI modeling:
GROUP BY & HAVING Clauses
The GROUP BY clause arranges similar data rows into groups before aggregation. For example, grouping sales transactions by city.
SELECT city, COUNT(*)
FROM customers
GROUP BY city;Output Table:
| city | count |
|---|---|
| Mumbai | 120 |
| Delhi | 95 |
Crucially, if you want to filter these grouped results, you must use HAVING instead of WHERE.
Key Interview Distinction:
WHERE filters records before grouping occurs.
HAVING filters records after the aggregation is calculated.
SQL Joins Explained
Joins combine columns from one or more tables using values common to each. This is essential, as transaction and customer profile records are usually normalized in separate tables.
INNER JOIN
Returns records with matching keys in both tables. Unmatched rows are excluded.
SELECT c.name, o.order_id
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;LEFT JOIN (or LEFT OUTER JOIN)
Returns all rows from the left table, plus matched rows from the right. If no match exists, NULLs are returned for right-side fields.
SELECT c.name, o.order_id
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;Common SQL Beginner Mistakes
Avoid these typical design patterns and syntax bugs when getting started:
Missing Quotes on Strings: Text search conditions must use single quotes (e.g. WHERE city = 'Mumbai', not WHERE city = Mumbai).
Forgetting GROUP BY: Selecting a normal column along with an aggregate function requires adding that column to the GROUP BY clause.
Aggregates in WHERE clause: Using filters like WHERE COUNT(*) > 10 is invalid. You must filter aggregates using HAVING COUNT(*) > 10.
Data Analyst SQL Learning Roadmap
To become job-ready, we recommend mastering SQL syntax blocks in this specific order:
Basic Queries
SELECT, FROM, and WHERE queries.
Sorting & Deduplication
ORDER BY, LIMIT, and DISTINCT commands.
Summary Metrics
COUNT, SUM, AVG, MIN, and MAX aggregations.
Advanced Grouping
GROUP BY and HAVING statements.
Relational Merges
INNER, LEFT, RIGHT, and FULL outer JOINs.
Complex Querying
Subqueries, Common Table Expressions (CTEs), and Window Functions.
Grit Over Excuses.
— The Grito Team