Linkedin Learning

By Bill Weinman

Installation

  • SQL database manager
    • Using SQLite Studio v3.2.1
  • Text editor
    • BBEdit for Mac
    • Notepad++ for Windows

SQL Overview

SQL is a standard query language for relational databases. The main operations on a database are Create, Read, Update, and Delete (CRUD).

-- SQL statement
SELECT * FROM Countries WHERE Continent = 'Europe';

SQL is made up of statements. Each statement contains one or more clauses. A clause starts with a keyword like SELECT followed by an argument or expression. Statements are terminated by semicolons. Semicolons are optional, but it’s considered good practice to use them.

  • Create: INSERT INTO
  • Read: SELECT
  • Update: UPDATE
  • Delete: DELETE

Relational databases are organized into tables. A single database can have one or more tables. Each table has columns and rows. A column is often referred to as a field. Each field typically has single type. A row is also called a record. Each row can have one or more fields.

One fundamental idea is the primary key. This is almost always a field of unique integers that identifies each row in a table. The primary key is used to create relations between tables in a database. When a key in a table refers to the primary key in a different table, the key is called a foreign key.

Reading the database

Most or all example SQL statements in this section will query a “World” table where each row is a country and each column is a basic piece of information about each country (e.g. name of the country, what continent it’s located, etc.)

-- Simple select statement
SELECT 'Ethiopia';

This statement returns a row containing Ethiopia. Single quotes are string literals.

SELECT Name, Capital AS "Capital City" FROM Countries ORDER BY Name;

“Name” and “Capital” are columns in the “Countries” table. The AS clause can be used to rename the column in the returned table. ORDER BY is used to sort the query result by the column “Name”.

The order of keywords matters. See the sample query below:

SELECT Name FROM Country WHERE Continent = 'North America' LIMIT 5 OFFSET 10;

This statement returns the “Name” column from the “Country” table, but only returns countries on the continent of ‘North America’. LIMIT truncates the result to five rows. OFFSET can be used to specify where in the result to return. So, according to the query below, we want to return five rows beginning with the eleventh row in the result.

SELECT COUNT(*) FROM Country WHERE Population > 100000 AND Continent = 'Europe';

You can count the number of rows in a query by using the COUNT() function. Using * as the input counts all possible rows. If a column name is specified, it only counts over rows that have data in the named column.

Creating records

The following query creates a record in the customer table:

INSERT INTO customer (name, address, city, state, zip)
  VALUES ('Josiah Gibbs', '3141 Tau way', 'Philadelphia', 'PA', 31415)

It’s not necessary to specify all fields for a record; any missing field will be filled with a NULL value.

Updating records

-- **Important**: Use a WHERE clause when updating.  Otherwise, all records
will be updated.
UPDATE customer SET address = '628 Pi Ave', zip = 62831 WHERE id = 1;

This updates the record where id=1. It’s extremely important to specify a WHERE clause. If you don’t, all the records in the table will be updated.

Deleting records

DELETE FROM customer WHERE id = 2;

Pretty simple. Always specify a WHERE clause.

Fundamental Concepts

Creating and Deleting Tables

CREATE TABLE periodic (
  symbol TEXT,
  name TEXT,
  -- Can't put comma after last declaration
  "atomic number" INTEGER
);

This statement creates a table named periodic. The stuff between the parentheses is the table’s schema, which describes the fields of the table.

You can delete tables like this:

DROP TABLE periodic

-- or...
DROP TABLE IF EXISTS periodic

The first statement will give an error if periodic doesn’t exist; DROP TABLE IF EXISTS will not. Most of the time, you’ll probably use the latter since it avoids the error.

Rows

-- 1
INSERT INTO periodic VALUES ('H', 'Hydrogen', 1);

-- 2
INSERT INTO periodic (symbol, name) VALUES ('He', 'Helium');

-- 3
INSERT INTO periodic DEFAULT VALUES;

-- 4
INSERT INTO periodic (symbol, name, "atomic number") SELECT col1, col2, col3 FROM othertable;

There are a couple ways to add records to a table. Statement 1 just adds the specified row. The syntax in Statement 2 can be used to add a partial record. The missing field will be filled with a NULL value. Statement 3 creates a new row where all fields are NULL values. Finally, Statement 4 uses a SELECT clause potentially add multiple rows from a different table.

NULL keyword

The NULL keyword represents a lack of value. It is not itself a value, therefore you can’t test for it in a regular conditional expression using the equal sign.

-- 1
SELECT * FROM elements WHERE symbol IS NULL;

-- 2
SELECT * FROM elements WHERE symbol IS NOT NULL;

-- 3
CREATE TABLE physics_symbols (
  symbol TEXT NOT NULL,
  meaning TEXT
);

Statements 1 and 2 show how you would test for the NULL keyword. Statement 3 demonstrates creating a schema using a NULL constraint. When a field is specified with the NOT NULL constraint, SQL will emit an error if you try to add a record with a NULL keyword in the symbol field.

Column constraints

CREATE TABLE elements (
  symbol TEXT UNIQUE NOT NULL,
  name TEXT DEFAULT 'element name'
)

In addition to the NOT NULL constraint from the last section, you can use other column constraints. UNIQUE will ensure that no duplicate values are put into a column. In the example above, symbol will be constrained to have only unique values AND not be a NULL value. DEFAULT can be used to fill in a value if one isn’t given. You can also use multiple constraints at the same time.

Changing the schema

This syntax is for SQLite. Other databases may use different syntax.

ALTER TABLE elements ADD "atomic number" TEXT UNIQUE;

You can add new fields using the ALTER TABLE ADD keywords. After the ADD you can specify a field definition.

ID columns

This syntax is for SQLite. Other databases may use different syntax.

CREATE TABLE elements (
  id INTEGER PRIMARY KEY;
  symbol TEXT;
)

The PRIMARY KEY column constraint will generate a unique, sequential integer for each new record starting from 1. It implicitly enforces the unique constraint.

Filtering data

Besides using AND and OR in boolean expressions, SQL also has the LIKE keyword. This is a sort of regular expression clause. Underscores, _, can be used to specify any character at a specific place in text. In statement 2, the query will match element names where the second letter is y (first letter can be anything). The percent sign, %, is a wildcard where it can be zero or more characters.

SELECT name FROM elements
  WHERE "atomic number" > 8 OR "atomic number" < 80;

SELECT name FROM elements
  WHERE "atomic number" > 8 AND name LIKE '_y%';

SELECT name FROM elements
  WHERE name IN ('beryllium', 'oxygen');

You can filter duplicate values using SELECT DISTINCT. It can also be used with multiple columns.

SELECT DISTINCT Continent FROM country;

There is a way to specify conditional expressions. In SQL, the integer zero is equal to false; other numbers are true. You can either test the “truthiness” of a value implicitly or test explicitly for a value. The WHEN clause can be any boolean expression.

-- Implicitly test truthiness
SELECT
  CASE WHEN a THEN 'true' ELSE 'false' END AS value_a,
  CASE WHEN b THEN 'true' ELSE 'false' END AS value_a
FROM test_table;

-- Explicitly test for a value
SELECT
  CASE a WHEN 1 THEN 'true' ELSE 'false' END AS value_a,
  CASE b WHEN 1 THEN 'true' ELSE 'false' END AS value_a
FROM test_table;

Sorting data

SELECT name, continent FROM Country ORDER BY continent, population DESC;

You can sort results using ORDER BY. When you sort on multiple columns, the order in which you specify columns matters: The data will be sorted by column 1 then by column 2, etc. You can sort in descending order using the DESC keyword (ASC is also available, but is the default ordering).

Relationships: JOIN

More complex data can be modeled by connecting data in different tables using a primary key. This primary key can be used with the JOIN keyword to query and return data from the multiple tables.

The default JOIN is an inner join. This means it returns data that occurs in both left AND right tables. A LEFT OUTER JOIN returns the data that meet the INNER JOIN condition as well as returning all data in the left table. RIGHT OUTER JOIN can be rewritten as a LEFT OUTER JOIN by switching the order of the tables. Therefore, it’s not implemented in many database systems.

SELECT l.description AS left, r.description AS right
FROM left AS l
JOIN right AS r ON l.id = r.id
;