Jump to: Navigation
A gentle Introduction to SQL (Part I)
This is a gentle introduction to SQL (Structured Query Language), a language used to query data stored in a relational database management system (RDBMS).
Chances are, if you are programming using one of the popular frameworks, you will not be writing raw SQL, instead your framework will have an ORM (Object Relation Mapping) layer to manage data access for you (i.e Rails has ActiveRecord, Java has Hibernate etc). In fact, most times you should not write raw SQL. However, I feel that knowing at least the basics of writing SQL is a useful life skill.
This is part one of a multi part tutorial, and covers the basics of querying. This tutorial is aimed at people who have little to no knowledge/experience with SQL.
For my example, I’m going to use MySQL, a popular free/open source (for the Community Edition) database, but it will mostly work for any popular database (Postgress, Microsoft SQL Server etc) or any other ANSI SQL 92 compliant database. I spent part of my misspent youth (well, 20s) working with Microsoft SQL Server, both as a developer and Administrator (I have Microsoft SQL 6.5 Developer and Administrator Certifications gathering dust somewhere in our house), so lets jump right in.
Prerequisites
There are little to no prerequisites, though access to a server running MySQL would be helpful. The scripts I use to create the tables and insert the data we are going to use in the rest of the tutorial are in github here.
The examples I’m using is that of a small College that maintains a small amount of data in a MySQL database. The tables are;
- accounts (information about each students balance)
- enrollments (courses that students are enrolled in)
- CourseInstructors (which courses are taught by which instructors)
- instructors (a list of all the instructors)
- courses (a listing of all the courses)
- students (a listing of all the students)
- coursePricings (a listing of course pricings by course types)
Simple SELECT
At its root, SQL is about answering questions given some data. For our small college, I would like to answer some basic questions. Lets start with queries that can be answered by the data in a single table. For each query, Ill list the question and then the SQL query to get that information (if you create the tables in this script, you can follow along by running the queries at the sql command line or in SQL Workbench (a graphical UI for MySQL)
I’d like to know if my college has any students called Kamaru
SELECT * FROM students WHERE lastName = ‘Kamaru’;
The * operator is a shorthand way to ask for all the columns in the table (or tables) being queried.
Tell me what courses we offer that have “Calculus” in their name
SELECT * FROM courses WHERE name LIKE "%Calculus%";
The “%” character specifies to match any characters. So the above would match “Introduction to Calculus” and “Calculus II”. If we changed the clause to “Calculus%” that would only match the latter (words beginning with Calculus)
Is Samira Khan one of the course instructors?
SELECT * FROM instructors where firstName = ‘Samira’ AND lastName = ‘Khan’;
Is Jacques Anquetil up to date on his course payment?
SELECT a.* FROM accounts a
JOIN students s
ON s.studentId = a.studentId
WHERE s.firstName = 'Jacques'
AND s.lastName = 'Anquetil';
JOIN
The slightly more complex version of a SELECT statement is one where you ask a question that cannot be answered by only looking at a single table.
What courses is Miriam Schuyler enrolled in?
SELECT s.* FROM students s
JOIN enrollments e
ON e.studentId = s.studentId
WHERE s.firstName = ‘Miriam’
AND s.lastName = ‘Schuyler’;
The s and e above are simply aliases, they allow us to give shorter designations to the enrollments and students table to make referring to them easier in the rest of the query. The query above is functionally similar to
SELECT students.* FROM students
JOIN enrollments
ON enrollments.studentId = students.studentId
WHERE students.firstName = 'Miriam'
AND students.lastName = 'Schuyler';
What courses does Wanjiku Njeri teach? (I only want the course name, hence the use of c.name rather than c.*)
SELECT c.name FROM courses c
JOIN CourseInstructors ci
ON ci.courseId = c.courseId
JOIN instructors i
ON i.instructorId = ci.instructorId
WHERE i.firstName = 'Wanjiku'
AND i.lastName ='Njeri';
Which courses and students does Tina Ho teach? (Keep in mind that she could teach multiple courses)
SELECT c.name, s.firstName, s.lastName FROM students s
JOIN enrollments e
ON e.studentId = s.studentId
JOIN CourseInstructors ci
ON ci.courseInstructorId = e.courseInstructorId
JOIN instructors i
ON i.instructorId = ci.instructorId
JOIN courses c
ON c.courseId = ci.courseId
WHERE i.firstName = 'Tina'
AND i.lastName ='Ho';
Small aside on table relationships
A one-to-one relationship means that a row in table A is related to a single Row in Table B. In the database schema above, there is a one to one relationship between a student and a row in the studentsAccounts table
A one-to-many relationship means that a row in table A is related to multiple rows in Table B. students have a one to many relationship to the enrollments table, since a single student can be enrolled in multiple courses
A many-to-many relationship normally involves an intermediary table (sometimes called a cross-walk table), and means that multiple rows in Table A correlate to multiple rows in Table C via Table B. A student has a many to many relationship with instructors (through the enrollments table), because a student can have multiple instructors, and an instructor can have multiple students).
In part II of this series, I’ll go into more complex queries (ie. LEFT JOIN, GROUP BY/ORDER BY)
Article originally published here