First steps with Pelican to set up this GitHub Page

Making first steps in a virtual environment

I did a little reading to see whether it is possible to set up my own GitHub page without resorting to Ruby/Jekyll, since I'm much more comfortable doing things in Python.

As I'm trying to use reStructuredText more, with the aim of eventually writing all the documentation for my Python code in a Sphinx-friendly syntax, I wanted the option of writing the content in rst files instead of plain HTML.

Given the assumptions above, I decided to use Pelican and see how far I would get with it. The first step was to install it and see how many other plugins and/or dependencies I would need to get the job done.

The first thing to do is to install all the dependencies in a virtual environment so that if this experiment fails, I can just remove the folder with the environment in it instead of going through dependencies in my system and cleaning them one by one.

As I was setting up a typical virtual environment, an idea came to mind. I was about to create a project that focuses on doing one thing only and always runs in the same environment, which is pretty much the definition of an application. Since there is a great tool for keeping all of an application's dependencies under control, I thought about managing the project with pipenv instead. I had used it a bit in the past but never really committed to it. The idea sounded like a good one, so I decided to run with it:

pip install pipenv
pipenv install pelican
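To confirm that the environment really contains what was just installed, a quick check can be run with pipenv run python. This is only a minimal sketch: the helper name missing_modules is made up for illustration, and it assumes that the import name of the package is pelican.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be imported in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "pelican" is assumed to be the import name of the Pelican package.
    missing = missing_modules(["pelican"])
    print("missing:", missing if missing else "nothing")
```

The same helper can be reused later with other import names as more dependencies are added to the Pipfile.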

Creating the project and figuring out its structure

After installing Pelican in the virtual environment, I checked how to get the project rolling. Fortunately, there is a quickstart step similar to the one you can find in Sphinx:

pipenv run pelican-quickstart

During the quickstart I customized the title and author and made sure that the project correctly refers to my GitHub page.

Once the quickstart was done, I looked at the structure of the project. I created a simple About me page under content/pages/about.rst and this blog post under content/blog/first-steps-with-pelican-to-set-up-this-github-page.rst. I followed the typical File metadata structure for the blog post and a simplified version for the About me page.
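For reference, the file metadata in an rst post is a title followed by a field list (:date:, :category:, :tags:, :slug: and so on). The following is a minimal sketch that generates such a skeleton. The function names and the example values are hypothetical, and the slug rule shown here is a simplification rather than Pelican's own slug logic.

```python
import re
from datetime import date


def slugify(title):
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def post_skeleton(title, published, category, tags):
    """Return rst text for a post: underlined title, metadata fields, body stub."""
    lines = [
        title,
        "#" * len(title),
        "",
        ":date: {}".format(published.isoformat()),
        ":category: {}".format(category),
        ":tags: {}".format(", ".join(tags)),
        ":slug: {}".format(slugify(title)),
        "",
        "Post body goes here.",
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # The date, category and tags below are placeholder values.
    print(post_skeleton("My first post", date(2020, 1, 2), "blog", ["pelican"]))
```

With the title of this post, slugify produces the same name as the file mentioned above, minus the .rst extension.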

After that everything should be simple, right?

pipenv run pelican

I looked at the generated HTML files stored in the output folder, which is the project's default output path. To my surprise, opening just the index.html file wasn't as pleasant as I expected, since the static content such as CSS was not loaded by the browser. After checking the documentation, I noticed that I was supposed to serve the pages from a web server instead of just browsing them directly. Sounds logical. The immediate suggestion was to move to the output directory and run the HTTP server from Python 3's standard library:

/output $ python -m http.server

After that, I just opened my browser and checked the page served at localhost:8000.
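The same standard-library server can also be started from a script that is told which folder to serve, so the current directory no longer matters. This is a minimal sketch, assuming Python 3.7 or newer (for the directory argument and ThreadingHTTPServer); the function name make_server is made up for illustration.

```python
import functools
import http.server


def make_server(directory, port=8000):
    """Create an HTTP server that serves `directory`, whatever the current folder is."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Bind to localhost only; passing port=0 asks the OS for any free port.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


# Typical use from the project's main folder (blocks until interrupted):
#     make_server("output").serve_forever()
```

Calling serve_forever() on the returned object keeps the site available at localhost:8000 until the process is stopped with Ctrl+C.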

Simplifying serving procedure with invoke and its tasks

The page worked correctly, but I was not satisfied with this approach as it seemed error-prone. First I needed to make sure that I was inside the correct folder and then remember to invoke the server module from the http library. So I looked for another approach in the files autogenerated by Pelican's quickstart procedure.

Since I've worked a bit with the invoke package and used fabfiles in the past, I immediately noticed the tasks.py file in the main directory of the project. After skimming through the code, I found the build and serve tasks, which would let me generate and serve the output files without having to think about the platform/OS on which the tasks are run. This sounded great, as I use Windows whenever I can and resort to a Linux virtual machine, and suffer the performance hit, only if I have to.

To use this approach, I had to install the invoke package:

pipenv install invoke

After that, the procedure to run invoke tasks was wonderfully simple:

pipenv run invoke build
pipenv run invoke serve

Can we make it shorter? Sure! As this use case is probably one of the most frequent ones, there is already a reserve task prepared. And because typing invoke is such a hassle, we can use the shorthand inv to save the time needed to type oke at the end of it.

pipenv run inv reserve

Works as expected. How great is that, huh?

Uploading the page to GitHub

After all that hassle, it would be great to finally share this page with others. The creation process of a GitHub page is described in great detail in one of the official guides titled Getting Started with GitHub Pages so I'll save some time and skip it.

Let's connect the main folder of our project to the remote repository:

git init
git remote add origin git@github.com:rotocki/rotocki.github.io.git
git fetch --all
git checkout origin/master -b master

After doing that, we should be able to commit our changes on top of whatever GitHub autogenerated during the creation of the repository.

git add .
git commit -m "Initial page."
git push

The project is now version controlled and stored on GitHub... but the pages are not visible! What is going on?

Rearranging the repository

If we think for a moment about a possible root cause of this problem, we can quickly notice that we don't have an index.html file in the main folder of our project. Because of that, the README.md file gets served in place of the index.html page.

I went back to the tasks.py file to see if there is another task that I can use out of the box, and I found one called gh_pages which is documented as Publish to GitHub Pages. Jackpot? Looks like it, but I need to understand what this ghp-import command, on which the task relies, actually is.

The best way to understand it is to check the official documentation on the ghp-import GitHub page. After going through the readme and making sure the Big Fat Warning was understood, I wanted to make sure that the gh-pages branch would be the one GitHub uses by default for serving content. This use case is so common that it has its own section in an article on GitHub Help: Enabling GitHub Pages to publish your site from master or gh-pages.

Unfortunately, this applies only to project pages and not to personal GitHub pages. So we will need to store the content in a non-master branch, while our master branch will contain only the output, in a form that can be presented directly to the viewer.

First of all, let's change the branch:

git checkout -b content
git push --set-upstream origin content

Now the default configuration will make sure that the output generated from the content stored in a separate branch, content, will be versioned under the master branch.

Publishing the page

Now I should be able to get away with invoking the gh_pages task, but first I have to install ghp-import as one of the dependencies:

pipenv install ghp-import

After that, I can publish my work:

pipenv run inv gh_pages

Nope. Error message: No idea what 'gh_pages' is! There is such a task in tasks.py, but it cannot be found. Maybe invoke's help can help us?

pipenv run invoke

There seems to be an -l parameter which should tell us what tasks are available and how we should call them. After running

pipenv run invoke -l

we get this:

Available tasks:

    build        Build local version of site
    clean        Remove generated files
    gh-pages     Publish to GitHub Pages
    preview      Build production version of site
    publish      Publish to production via rsync
    rebuild      `build` with the delete switch
    regenerate   Automatically regenerate site upon file modification
    reserve      `build`, then `serve`
    serve        Serve site at http://localhost:8000/

Do you notice the difference? The underscore magically became a hyphen! Let's finish the task for today by invoking the gh-pages task!
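The renaming can be pictured with a few lines of Python. This is not invoke's own code, just a minimal sketch under the assumption that every underscore inside a task function's name is shown as a hyphen on the command line, while a leading or trailing underscore is left alone. The function name cli_task_name is hypothetical.

```python
def cli_task_name(function_name):
    """Approximate the command-line name shown for a task function."""
    # Names shorter than three characters have no inner characters to change.
    if len(function_name) < 3:
        return function_name
    # Assumption: only underscores that are neither first nor last are replaced.
    inner = function_name[1:-1].replace("_", "-")
    return function_name[0] + inner + function_name[-1]


if __name__ == "__main__":
    for name in ["build", "gh_pages", "reserve"]:
        print(name, "->", cli_task_name(name))
```

So a function defined as gh_pages in tasks.py is the one listed, and called, as gh-pages.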

pipenv run inv gh-pages

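Since I'm running a Windows 10 machine, the generated command is not accepted as is: the commit message is wrapped in single quotes, which the Windows command prompt does not treat as quoting. I had to rearrange the quotation marks in the commit_message entry of CONFIG in tasks.py to get it working.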

CONFIG = {
    ...
    # Before: "'Publish site on {}'" (single quotes reach the shell)
    # After: '"Publish site on {}"' (double quotes reach the shell)
    'commit_message': '"Publish site on {}"'.format(datetime.date.today().isoformat()),
    ...
}
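A different way around shell quoting, which is not what the generated tasks.py does, is to build the command as a list of arguments so that no quotation marks are needed at all. This is a minimal sketch: the function names are hypothetical, the branch and folder names are only example values, and it assumes the ghp-import options -b (branch), -m (commit message) and -p (push).

```python
import datetime
import shlex


def ghp_import_args(branch, deploy_path, today=None):
    """Build the ghp-import call as a list, one item per argument."""
    today = today or datetime.date.today()
    message = "Publish site on {}".format(today.isoformat())
    # -b selects the target branch, -m sets the commit message, -p pushes.
    return ["ghp-import", "-b", branch, "-m", message, deploy_path, "-p"]


def posix_split(command_line):
    """Show how a POSIX shell would split a command line into arguments."""
    return shlex.split(command_line)


if __name__ == "__main__":
    print(ghp_import_args("master", "output"))
    # On a POSIX shell the single-quoted message becomes one argument.
    print(posix_split("ghp-import -b master -m 'Publish site on today' output -p"))
```

A list like this could be passed to subprocess.run(args, check=True), which hands each item over as a separate argument on both Windows and Linux, provided that ghp-import can be found on the PATH of the environment.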

After introducing this change, I was able to push the output to the master branch, but the page would still show README.md instead of the expected pages.

Making sure we're on the same page as GitHub

I've done some reading to understand the problem...

... and understood that the cause might be that GitHub assumes the pages were generated using Ruby/Jekyll, so we need to inform the server that we're running something different. I added a .nojekyll file in the main folder of the project, committed the changes, pushed them to the remote repository, executed the gh-pages task and now you can read this article. Cheers!
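The .nojekyll file is just an empty marker, so creating it is easy to script. This is a minimal sketch with a hypothetical helper name. Which folder the marker has to live in depends on how the output gets published, so the directory passed in is left to the caller.

```python
from pathlib import Path


def ensure_nojekyll(directory):
    """Create an empty .nojekyll marker in `directory` if it is not there yet."""
    marker = Path(directory) / ".nojekyll"
    # touch() leaves an existing file untouched apart from its timestamp.
    marker.touch(exist_ok=True)
    return marker
```

Calling it a second time is harmless, so it can be run before every publish.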
