Learn how to run Prodigy with a connection to a remote database as a Docker container. This will let you deploy it with ease to any web-app service (e.g. Azure Web App Service or AWS Elastic Beanstalk)

source: author


This article assumes that you have already migrated the default Prodigy database schema to a remote MySQL server. If you have not done this yet, the process is described in detail in 👉 this article

Project structure 👷🏻

Your project structure should look like this once you are done:

Please note that the file provided in my GitHub is a dummy 😬

For our…

Setting up Python with pyenv, venv, pipx, and VS Code on Windows and macOS

Photo by Kevin Ku from Pexels

Anaconda and Miniconda are amazing Python distributions that get you up and running out-of-the-box. Once you start deploying your projects into production, however, you will definitely need more control. In this article, I go through some of the best tools to make that transition happen! 🤘

All the CLI commands used in this article are given for both Windows and macOS
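Besides the CLI tools the article covers, a virtual environment can also be created programmatically with the standard-library `venv` module — the sketch below does exactly that (the `demo-env` directory name is just a placeholder):

```python
import venv
from pathlib import Path

# Create an isolated environment in ./demo-env (hypothetical name).
# with_pip=False skips bootstrapping pip, which keeps the build fast and offline.
env_dir = Path("demo-env")
builder = venv.EnvBuilder(with_pip=False)
builder.create(env_dir)

# The marker file pyvenv.cfg identifies the directory as a virtual environment.
print((env_dir / "pyvenv.cfg").exists())  # → True
```

From the command line, `python -m venv demo-env` does the same thing.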

Before starting up 🚿

If you have Anaconda or Miniconda installed, uninstall it as detailed here. For Windows, also:

  • Delete

Learn how to save Prodigy annotations in a remote database for collaborative annotating

Photo edited by the author from prodi.gy

Prodigy is developed by Explosion AI, the folks behind spaCy, so it integrates with it organically 🍻. This article describes in detail how to migrate the default local SQLite database schema into a remote MySQL or PostgreSQL database

TL;DR: full code

Default Database 📍

The first time you run Prodigy, it will create a folder in your home directory with 2 files:

structure of folder

By default, Prodigy looks for its configuration in a file that sets the default database. So, by default, annotations are saved in…
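Per the Prodigy docs, the configuration file is `prodigy.json` and the database is chosen via its `db` / `db_settings` keys. The sketch below writes a minimal such file pointing at a remote MySQL server — the host, user, and password values are placeholders you would replace with your own:

```python
import json
from pathlib import Path

# Hypothetical prodigy.json pointing Prodigy at a remote MySQL database.
# The "db" / "db_settings" keys follow Prodigy's config format; the
# credential values below are placeholders.
config = {
    "db": "mysql",
    "db_settings": {
        "mysql": {
            "host": "my-remote-host",
            "user": "prodigy_user",
            "passwd": "change-me",
            "db": "prodigy",
        }
    },
}

path = Path("prodigy.json")
path.write_text(json.dumps(config, indent=2))

# Round-trip check: the file parses back to the same settings
print(json.loads(path.read_text())["db"])  # → mysql
```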

Photo by panumas nikhomkhai from Pexels

This is an extension to a previous article 👀 that covers the low-level methods of establishing a connection to a SQL database and executing queries. We cover here the equivalent high-level methods in SQLAlchemy and Pandas to do the same in fewer lines of code

TL;DR: full code


Install pyodbc, sqlalchemy and pandas using your preferred package manager

You might want to create a virtual environment first

Then, install the required driver for the DBMS you want to connect to. …
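To give a feel for the high-level approach, here is a minimal sketch using an in-memory SQLite engine so it runs anywhere; for a remote server you would swap the connection URL for one matching your driver (e.g. a `pyodbc`-style URL with your own credentials):

```python
import pandas as pd
from sqlalchemy import create_engine

# In-memory SQLite engine for illustration; a remote server would use a URL
# such as "mssql+pyodbc://user:password@dsn" (placeholder credentials).
engine = create_engine("sqlite://")

# Write a DataFrame to a table, then read it back with a query —
# no manual cursor management or fetchall() needed.
df = pd.DataFrame({"id": [1, 2], "name": ["ada", "alan"]})
df.to_sql("people", engine, index=False)

result = pd.read_sql("SELECT name FROM people ORDER BY id", engine)
print(result["name"].tolist())  # → ['ada', 'alan']
```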

Disclaimer: This article is kept as an archive only as it seems that pipenv is dead. For alternatives, learn how to take full control over your development environment in my other article 👉🏼 that includes how to create a requirements file using venv

Photo by cottonbro from Pexels

A requirements.txt file lists all the Python dependencies required for a project. It’s a snapshot of all the packages you’ve used. You will need this for building a Docker image for example, or for creating Serverless Functions or Web Apps
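As a quick illustration of what such a snapshot contains, the standard-library `importlib.metadata` module can list every installed distribution and pin its version, much like `pip freeze` does from the command line:

```python
from importlib.metadata import distributions
from pathlib import Path

# Build a requirements-style snapshot of every package installed in the
# current environment (pip freeze produces the same "name==version" lines).
lines = sorted(
    f"{dist.metadata['Name']}=={dist.version}"
    for dist in distributions()
    if dist.metadata["Name"]
)
Path("requirements.txt").write_text("\n".join(lines))

print(lines[:3])  # first few pinned packages
```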

What’s pipenv?

pipenv is currently the dependency manager recommended by Python for collaborative projects. It uses…

Photo by Kaique Rocha from Pexels

Docker helps you package up your project with all of the dependencies needed to run it anywhere

“Build, share and run any application, anywhere!”

Is it a Docker image or a container? 😕

Let’s clear this up straight away. You first build a Docker image by reading a set of instructions from a Dockerfile. Once you run this image, it’s called a container

Docker Engine 🚒

To do anything with Docker, you first need to install the Docker Engine. Docker Engine is available on a variety of Linux platforms, on Mac and Windows through Docker Desktop, on Windows Server, and as a static binary installation. …

Photo by Pixabay from Pexels

By the end of this article, you’ll be a master Sklearn plumber. You’ll know how to pipe in numerical and categorical attributes without having to use Pandas get_dummies or Sklearn FeatureUnion

TL;DR: full code

1. Install/Update

At the time of writing this post, 0.21.2 was the latest release of sklearn. Check the docs for dependencies and either install or update

conda install scikit-learn==0.21.2
conda update scikit-learn==0.21.2

2. Toy dataset

We’ll use a sample dataset of audience churn with 1,000 instances and 19 attributes: 10 numerical and 9 categorical. You can download
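To preview the plumbing, here is a sketch using `ColumnTransformer` (available since sklearn 0.20), which routes numerical and categorical columns through separate transformers without `get_dummies` or `FeatureUnion`. The tiny frame below is a made-up stand-in for the churn dataset:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny made-up stand-in for the churn dataset
df = pd.DataFrame({
    "age": [23, 35, 47, 52],
    "monthly_fee": [9.9, 14.9, 9.9, 19.9],
    "plan": ["basic", "premium", "basic", "family"],
})

num_cols = ["age", "monthly_fee"]
cat_cols = ["plan"]

# One transformer per column group, combined without FeatureUnion
preprocess = ColumnTransformer([
    ("num", StandardScaler(), num_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), cat_cols),
])

X = preprocess.fit_transform(df)
print(X.shape)  # 4 rows; 2 scaled numeric + 3 one-hot plan columns → (4, 5)
```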

Photo by Pixabay from Pexels

By the end of this article, you will know how to get valuable data and insights from a page you administer. It assumes you’ve already obtained a permanent page token. If you have not, check my 👉 article first

TL;DR: full code

The Graph API 🍇

Data in FB is represented using the idea of a “social graph”. To interact with the graph, we use an HTTPS-based API called the Graph API. To return a graph object using the obtained page token:

Check the latest version of the official facebook-sdk release
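Under the hood, facebook-sdk just issues HTTPS requests against graph.facebook.com. As a sketch of what a node request looks like, the snippet below builds such a URL with the standard library only — the page id, token, and API version are placeholders, and no network call is made:

```python
from urllib.parse import urlencode

GRAPH_URL = "https://graph.facebook.com"
VERSION = "v9.0"                # placeholder; check the latest Graph API version

page_id = "1234567890"          # placeholder page id
page_token = "EAAB...TOKEN"     # placeholder page access token

# A node request: ask for a few fields on the page node
params = urlencode({"fields": "name,fan_count", "access_token": page_token})
url = f"{GRAPH_URL}/{VERSION}/{page_id}?{params}"
print(url)
```

Fetching this URL (e.g. with `requests.get(url).json()`) would return the page node as JSON.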

A graph is made up of 3 hierarchical components:

  1. A node which is…

Learn how to obtain a permanent Page Access Token for your pages using the official facebook-sdk package. Updated Oct 2020

Photo captured by the author from developers.facebook.com

Our aim in this article is to get a Permanent Token. We will get a Short-Lived Token manually only once, then use Python to exchange that Token for a Permanent one. FB states that:

You should not depend on these tokens’ lifetimes remaining the same, as Access Tokens can be invalidated or revoked anytime 💀

This might be the case if we make frequent requests using the same Token. …
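The exchange itself is a single GET against the Graph API's `oauth/access_token` endpoint. The sketch below only constructs that request URL with the standard library; the app id, app secret, and short-lived token are placeholders you would fill in:

```python
from urllib.parse import urlencode

# Graph API endpoint that exchanges a short-lived token for a long-lived one.
# The app id/secret and the short-lived token below are placeholders.
params = urlencode({
    "grant_type": "fb_exchange_token",
    "client_id": "APP_ID",
    "client_secret": "APP_SECRET",
    "fb_exchange_token": "SHORT_LIVED_TOKEN",
})
exchange_url = f"https://graph.facebook.com/oauth/access_token?{params}"
print(exchange_url)
```

Issuing this request (e.g. `requests.get(exchange_url).json()`) would return a long-lived token in the `access_token` field; the permanent page token is then read from the page's accounts edge, as the article goes on to show.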

Gabriel Harris Ph.D.

I’m an end-to-end data scientist and a Python educator. Most of my articles start with me saying “I wish someone had written about this!”, so maybe I should?
