Python HOW: Getting Facebook Data and Insights Using facebook-sdk

By the end of this article, you will know how to get valuable data and insights for a page you administer. It assumes you’ve already obtained a permanent page token. If you haven’t, check my 👉 article first
The Graph API 🍇
Data in FB is represented using the idea of a “social graph”. To interact with a graph we use an HTTPS-based API called the Graph API. To return a graph object using the obtained page token:
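A minimal sketch of creating that graph object with facebook-sdk. The make_graph helper and the PAGE_TOKEN placeholder are illustrative names, not part of the library; version is left unset here so the library falls back to its own default.

```python
def make_graph(page_token, version=None):
    """Return a facebook-sdk GraphAPI client authenticated with a page token."""
    # Lazy import: facebook-sdk (pip install facebook-sdk) is only needed
    # when a client is actually created.
    import facebook

    # version is optional, e.g. "3.1"; None lets the library pick its default.
    return facebook.GraphAPI(access_token=page_token, version=version)


# Usage (PAGE_TOKEN is a placeholder for your permanent page token):
# graph = make_graph("PAGE_TOKEN")
```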
A graph is made up of 3 hierarchical components:
- A node which is an individual object with a unique ID
- An edge which is a connection between one node and another
- Fields which are node properties
Example 1: a page object is a node, all its posts are edges, and the page’s about and category are some of its fields
Example 2: a post object is a node, all its comments are edges, and the post’s creation time and message are some of its fields
Page Data 🎯
To read the page as a node, we use the get_object method with the page ID:
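As a rough sketch, the call could look like this. Here graph is assumed to be a facebook.GraphAPI instance created with the page token, and read_page and PAGE_ID are illustrative names.

```python
def read_page(graph, page_id):
    """Read the page node; with no fields requested, only defaults come back."""
    return graph.get_object(id=page_id)


# Usage (PAGE_ID is a placeholder):
# page = read_page(graph, "PAGE_ID")
# page is a dict keyed by field name
```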
By default, this returns the name and id fields, but we can use the fields parameter to request specific fields:
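A small sketch of requesting specific fields, using the about and category fields mentioned earlier. graph is assumed to be a facebook.GraphAPI instance, and read_page_fields is a hypothetical helper name.

```python
def read_page_fields(graph, page_id, fields=("about", "category")):
    """Read the page node, asking only for the listed fields."""
    # The Graph API expects the field names as one comma-separated string.
    return graph.get_object(id=page_id, fields=",".join(fields))


# Usage (PAGE_ID is a placeholder):
# page = read_page_fields(graph, "PAGE_ID")
```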
Page Insights 🔍
The insights object returns very valuable data for a single metric over a day, a week, 28 days, or the lifetime (depending on the metric). Insights are available for more than 100 metrics, so check them out here
To get the page insights for a specific metric, we use get_connections with insights as the connection_name, and specify the metric
The following insights return the number of times any content from my Page or about my Page entered a person’s screen during the week before yesterday:
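One way this request could be written is sketched below. The metric name page_impressions and the period and date_preset values are my assumptions about which Graph API options match the description above, so check them against the metrics list for your API version. graph is assumed to be a facebook.GraphAPI instance.

```python
def weekly_page_impressions(graph, page_id):
    """Fetch weekly impressions for the page, for the period ending yesterday."""
    response = graph.get_connections(
        id=page_id,
        connection_name="insights",
        metric="page_impressions",  # assumed metric name for this description
        period="week",              # aggregate over one week
        date_preset="yesterday",    # assumed preset for "up to yesterday"
    )
    # "data" holds one entry per metric/period pair, each with its values.
    return response["data"]


# Usage (PAGE_ID is a placeholder):
# insights = weekly_page_impressions(graph, "PAGE_ID")
```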
Latest Posts IDs
To get data from any post, we need to get the post’s ID, and also the ID of any uploaded photo or video attached to the post (i.e. object ID). Also, it’s very useful to get the post’s type, name, and creation time
To do this, we can use either:
1. get_object with the page ID and posts as the fields parameter:
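A sketch of this first option, assuming graph is a facebook.GraphAPI instance. The helper name is illustrative. When an edge is requested through fields, its items come back nested under that edge's name.

```python
def latest_posts_via_object(graph, page_id):
    """Read the page node and pull the posts edge out of the response."""
    page = graph.get_object(id=page_id, fields="posts")
    # Posts are nested as {"posts": {"data": [...], "paging": {...}}}.
    return page.get("posts", {}).get("data", [])


# Usage (PAGE_ID is a placeholder):
# posts = latest_posts_via_object(graph, "PAGE_ID")
```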
This will only return the latest 25 posts. To specify the number of returned posts, we can apply the limit method to the field:
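For illustration, the limit can be chained onto the field name inside the fields string. graph is assumed to be a facebook.GraphAPI instance, and latest_n_posts is a hypothetical helper.

```python
def latest_n_posts(graph, page_id, n=5):
    """Read the page node with the posts edge capped at n items."""
    # posts.limit(n) is Graph API field-expansion syntax, passed as a string.
    page = graph.get_object(id=page_id, fields=f"posts.limit({n})")
    return page.get("posts", {}).get("data", [])


# Usage (PAGE_ID is a placeholder):
# posts = latest_n_posts(graph, "PAGE_ID", n=10)
```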
2. get_connections with the page ID and posts as the connection_name. We use the fields parameter here as usual:
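A sketch of this second option. The field names follow the post attributes listed above (ID, object ID, type, name, creation time); some of them may not be available in newer Graph API versions, so treat the list as an assumption. graph is assumed to be a facebook.GraphAPI instance.

```python
# Assumed field names for the post attributes discussed in the text.
POST_FIELDS = "id,object_id,type,name,created_time"


def latest_posts(graph, page_id):
    """Read the posts edge of the page with the chosen fields."""
    response = graph.get_connections(
        id=page_id,
        connection_name="posts",
        fields=POST_FIELDS,
    )
    return response["data"]


# Usage (PAGE_ID is a placeholder):
# posts = latest_posts(graph, "PAGE_ID")
```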
This will only return the latest 25 posts
All/Specific Posts IDs
To iterate over all the posts of the page, get_all_connections can be used to create a generator object that yields all individual posts. However, we would probably be more interested in posts during a specific period of time
For this task we can use the since and until parameters, which expect a datetime object built as datetime(year, month, day). For example, to get info for all posts from this year:
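A sketch of collecting this year's posts. As a design choice of this example, the datetime objects are converted to Unix timestamps before being sent, since that is one of the formats the Graph API accepts for since and until. graph is assumed to be a facebook.GraphAPI instance; the helper names and field list are illustrative.

```python
from datetime import datetime

# Assumed field names; adjust to what your API version supports.
POST_FIELDS = "id,object_id,type,name,created_time"


def posts_between(graph, page_id, since, until):
    """Collect every post created between two datetime objects."""
    posts = graph.get_all_connections(
        id=page_id,
        connection_name="posts",
        fields=POST_FIELDS,
        since=int(since.timestamp()),  # Unix timestamp, by choice of this sketch
        until=int(until.timestamp()),
    )
    # get_all_connections is a generator that follows paging links lazily.
    return list(posts)


def posts_this_year(graph, page_id):
    """Collect posts from January 1 of the current year up to now."""
    now = datetime.now()
    return posts_between(graph, page_id, datetime(now.year, 1, 1), now)
```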
Post data
Let’s first get info for the 2 most recent posts:
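A possible way to fetch them, using the standard limit paging parameter on the posts edge. graph is assumed to be a facebook.GraphAPI instance; the helper name and field list are illustrative.

```python
def two_latest_posts(graph, page_id):
    """Return info for the 2 most recent posts of the page."""
    response = graph.get_connections(
        id=page_id,
        connection_name="posts",
        fields="id,object_id,type,name,created_time",  # assumed field names
        limit=2,
    )
    return response["data"]


# Usage (PAGE_ID is a placeholder):
# posts = two_latest_posts(graph, "PAGE_ID")
# first_post_id = posts[0]["id"]
```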
Now let’s get the first post data using either:
1. get_object with the first post ID and the required fields:
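For example, reading the message and creation time mentioned earlier as post fields. graph is assumed to be a facebook.GraphAPI instance, and read_post is a hypothetical helper.

```python
def read_post(graph, post_id, fields=("message", "created_time")):
    """Read a post node with the requested fields."""
    return graph.get_object(id=post_id, fields=",".join(fields))


# Usage, with posts being the list of recent posts fetched earlier:
# post = read_post(graph, posts[0]["id"])
```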
2. get_connections with the first post ID and the required connection_name. This is preferable as it has extra parameters for some edges. For example, for comments, we can include hidden comments, order them from newest to oldest, include sub-comments, and show a total count summary:
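A sketch of such a comments request. The parameter names and values below are my best reading of the Graph API comments edge and should be verified against its reference; include_hidden in particular is an assumption. graph is assumed to be a facebook.GraphAPI instance.

```python
def post_comments(graph, post_id):
    """Return (comments, total_count) for a post, newest first."""
    response = graph.get_connections(
        id=post_id,
        connection_name="comments",
        include_hidden="true",          # assumed name for including hidden comments
        order="reverse_chronological",  # newest to oldest
        filter="stream",                # include replies to comments
        summary="true",                 # ask for a summary block
    )
    total = response.get("summary", {}).get("total_count")
    return response["data"], total


# Usage, with posts being the list of recent posts fetched earlier:
# comments, total = post_comments(graph, posts[0]["id"])
```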
You can do something similar using likes and reactions as the connection_name, along with the parameters for likes or reactions
Post Insights
This works the same way as getting the Page Insights, but with the post ID instead of the page ID, and with a metric for posts chosen from the available metrics
The following insights return the total count of stories created about the most recent post, by action type since the post was created up to yesterday:
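A sketch of what this request might look like. The metric name post_activity_by_action_type is an assumption on my part (metric names have changed between Graph API versions), as are the period and date_preset values; check them against the metrics list. graph is assumed to be a facebook.GraphAPI instance.

```python
def post_stories_by_action(graph, post_id, metric="post_activity_by_action_type"):
    """Fetch a lifetime insight for a post, up to yesterday."""
    response = graph.get_connections(
        id=post_id,
        connection_name="insights",
        metric=metric,            # assumed metric name, see the note above
        period="lifetime",        # since the post was created
        date_preset="yesterday",  # assumed preset for "up to yesterday"
    )
    return response["data"]


# Usage, with posts being the list of recent posts fetched earlier:
# insights = post_stories_by_action(graph, posts[0]["id"])
```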
Video Post Insights
There are special metrics for video insights. To use them, we need video_insights as the connection_name with the object ID (NOT the post or page ID as before)
The following insights return the total number of times the video in the second most recent post has been viewed so far:
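A sketch of the video request. The metric name total_video_views is an assumption about which video metric matches this description, so verify it against the video metrics list. graph is assumed to be a facebook.GraphAPI instance, and the ID passed in is the object ID of the video, not the post ID.

```python
def video_total_views(graph, object_id, metric="total_video_views"):
    """Fetch a video insight using the video's object ID."""
    response = graph.get_connections(
        id=object_id,  # the object ID of the uploaded video
        connection_name="video_insights",
        metric=metric,  # assumed metric name, see the note above
    )
    return response["data"]


# Usage, with posts being the list of recent posts fetched earlier
# and the second most recent post being a video post:
# views = video_total_views(graph, posts[1]["object_id"])
```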
TL;DR: full code
Two full scripts that extract insights are on my GitHub here. The repository includes:
- handle_tokens.py: check my 👉 article first to learn about tokens
- PageApp.py, which extracts daily insights for your page over a one-month period that you can specify in the main function. Insights include demographics, impressions, engagement, and reactions. The result is a csv file
- PostApp.py, which gets all the posts published over a one-month period that you can specify in the main function. For each post, a number of insights are extracted. For video posts, additional insights are automatically extracted. The result is a csv file
Happy coding!