Python HOW: Connect to, and Manage a Database

I’ve put my best effort to provide you with a clear, concise, and detailed description on how to connect to, and manage a SQL/SQLite database using Python
Introduction
A database model determines the logical structure of a database. This in turn determines how data can be stored, organized and manipulated. The Relational Model (RM) is the most popular database model since the 1980s. RM uses a table-based format, where tables are related by common columns
Database management system (DBMS)
DBMS is the software that you -or applications- use to interact with the database to create, read, update and manage data. The Relational DBMS (RDBMS) is the DBMS based on RM. According to DB-Engines, the most widely used RDBMS are: Oracle, MySQL, Microsoft SQL Server, PostgreSQL, IBM DB2, Microsoft Access, and SQLite
A database is often referred to by the DBMS used to manipulate it. For example, a database that is manipulated using Microsoft SQL Server is referred to as: Microsoft SQL Server-Database
Although not required, all RDBMS use SQL as a standard data access language
Database connectivity interface
A database connectivity interface allows an application to access data from a variety of DBMSs, using a specific driver for a specific DBMS and operating system. This means that the application can be written without depending on a specific DBMS or the operating system
Open DataBase Connectivity (ODBC) is a standard Microsoft Windows interface that enables applications (typically written in C or C++) to connect to DBMSs. While, Java DataBase Connectivity (JDBC) is a standard Oracle interface that enables applications written in Java to connect to DBMSs
PyODBC: Open DataBase Connectivity for Python
PyODBC is the Python package for ODBC. Through pyodbc, you can easily connect Python applications to a DBMS-database, using the required ODBC driver
Install pyodbc using your preferred package manager:
Then, install the required driver for the DBMS-database you want to connect to. For example, if you want to connect to a Microsoft SQL Server-Database, you need to download and install the driver from Microsoft, after choosing your operating system
To make a connection to a database, we need to pass a connection string to the connect
method of pyodbc. The most important thing to remember is that: pyodbc passes the connection string directly to the DBMS-database driver unmodified. Therefore, connection strings are driver-specific
For example, to connect to a Microsoft SQL Server-Database, we provide the following connection string:
The connection string can also be passed as arguments, which are then concatenated into a connection string:
In the connection string above, we assume that the SQL Server Authentication mode is set to Windows Authentication Mode, so we did not have to provide credentials to get access permissions to the server.
However, if it was a Mixed Authentication Mode (i.e. SQL Server Authentication + Windows Authentication), we can still use the above connection string, or the following one where we provide our User ID and Password to get access permissions to the server:
Pyodbc also supports Data Source Name connections. A DSN contains information about a specific database that an ODBC driver needs in order to connect to it (information such as the database driver, the server and database names, etc):
Lastly, depending on the database you are trying to access, and the version of Python you are using, you might need to set the connection encoding/decoding settings, which you can find here
For Microsoft SQL Server-Database and Python 3.x, using pyodbc default encoding/decoding settings is recommended (i.e. no need to do anything)
SQLite3: SQLite for Python
SQLite3 is the Python package for SQLite, which is the most widely deployed RDBMS in the world (arguably). The main features of SQLite are:
- An embedded database. SQLite does not have a separate server process (i.e. server-less). It reads and writes directly to ordinary disk files
- Free for use for any purpose, commercial or private
SQLite3 is included with Python by default. To create a connection to a disk-based database, we just need to pass the name of the database to the connect
method of sqlite3:
Executing SQL statements
Once we have a connection to the database, either from pyodbc or sqlite3, we can then create a Cursor object which represents a database cursor. We can do this using the cursor
method:
Now we have a Cursor object, we can execute any valid SQL query string using the Cursor execute
method:
Let’s assume that the database we are connected to (database.db
), has 2 tables T_CUSTOMERS
and T_ADDRESSES
as such:
SELECT statement
To select the id
, fname
, and lname
columns from T_CUSTOMERS
, we use SQL SELECT
:
The executed SELECT
statement returns rows, which we can retrieve using one of the Cursor fetch functions:
fetchone
: retrieves one row only, and move the Cursor to the next rowfetchall
: retrieves all rows, and move the Cursor to the endfetchmany(size)
: retrieves a number of rows, and move the Cursor to the next row
Note: if there are no rows left,
fetchone
will return None, whereasfetchall
andfetchmany
will both return empty lists
For example, to retrieve one row only:
row will be a Tuple of values, one for each of the columns in the SELECT
statement:
To get each column’s description, the Cursor description
attribute can be used:
This description will be a list of Tuples, one for each column. Each Tuple has 7 items to describe the column, the one that matters to us is the first item, which is the column’s name (or alias, if specified in the SQL SELECT
):
To retrieve all the remaining rows as a list, we can use fetchall
:
UPDATE statement
To update an existing record in T_CUSTOMERS
, we use SQL UPDATE
with a WHERE
clause
Note: if you do not use a
WHERE
clause in theUPDATE
statement to specify which record(s) should be updated, all records will be updated!
Let’s update lname='Trujillo'
to 'Taqueria'
:
If we want to know how many records were modified by the last SQL statement, we can use the Cursor rowcount
attribute. This will return -1
if no SQL has been executed or if the number of rows is unknown:
Let’s check if that record has been updated successfully:
DELETE statement
To delete a customer’s record from T_CUSTOMERS
, we use SQL DELETE
with a WHERE
clause
Note: if you do not use a
WHERE
clause in theDELETE
statement to specify which record(s) should be deleted, all records will be deleted!
Let’s delete the record with id=4
:
Let’s check how many records were modified:
Let’s check if that record has been deleted successfully:
INSERT INTO statement
To insert a new record into T_CUSTOMERS
, we use SQL INSERT INTO
Let’s insert the same record we have deleted:
Let’s check how many records were modified:
Let’s check if that record has been inserted successfully:
Long SQL statement
Let’s say that we want to select the full name, and city and postal code of a customer with the last name 'Trujillo'
. We (or someone else) has written the following SQL statement to do this:
The easiest way to pass this long SQL statement to excute
is to use the triple-quote string format to encapsulate the statement. White-spaces (including tabs and newlines) should be ignored by the SQL database engine, but if we want to remove them from the left, we can use the dedent
function from the built-in textwrap
module:
Finally, we must call commit
on the connection that created this cursor, otherwise our changes will be lost if we close the Script, and close
the connection:
What is next?
Start using SqlAlchemy and Pandas for a high-level connecting and execution of queries and parameterized queries 👉
Happy coding!