I've done a little bit of reading to see if it is possible to set up
my own GitHub page without having to resort to Ruby/Jekyll since
I'm much more comfortable with doing stuff in Python.
As I'm trying to use a bit more reStructuredText to eventually write
all of my documentation for Python code in a syntax that is Sphinx-friendly,
I wanted to have that option to write the content in rst files instead of
plain HTML.
Given the assumptions above, I decided to use Pelican and see how far
I would get with it. The first step was to install it and see how many other
plugins and/or dependencies I would need to get the job done.
The first thing to do is to install all the dependencies in a virtual
environment so that if this experiment fails, I can just remove
the folder with the environment in it instead of going through
dependencies in my system and cleaning them one by one.
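The throwaway-environment idea can be sketched with nothing but the standard library. This is only an illustration, assuming the built-in venv module rather than pipenv; the helper names create_env and remove_env are made up for this example.

```python
import shutil
import venv
from pathlib import Path


def create_env(env_dir):
    """Create a bare virtual environment in env_dir and return its path."""
    env_dir = Path(env_dir)
    # with_pip=False keeps the sketch fast and offline; a real setup would want pip.
    venv.create(env_dir, with_pip=False)
    return env_dir


def remove_env(env_dir):
    """Delete the whole environment; nothing outside env_dir is touched."""
    shutil.rmtree(env_dir)
```

Calling create_env on a scratch folder and later remove_env on the same folder is the whole cleanup story: no system-wide packages are involved.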
While I was setting up a typical virtual environment, an idea came to my mind.
I was about to create a project that focuses on doing one thing only and always
runs in the same environment, which is pretty much the definition of an application.
Since there is a great tool to keep all dependencies for an application
in a controlled fashion, I thought about managing it using pipenv instead.
I've used it in the past for a bit but never really committed to it.
Because this idea sounded like a good one, I decided to run with it:
pip install pipenv
pipenv install pelican
After installing Pelican in the virtual environment, I checked how to get
the project rolling. Fortunately, there is a quickstart step similar to the one
you can find in Sphinx:
pipenv run pelican-quickstart
I configured the project with a custom title and author and made sure
that it correctly refers to my GitHub page.
Once the quickstart was done, I looked at the structure of the project.
I created a simple About me page under content/pages/about.rst
and this blog post under content/blog/first-steps-with-pelican-to-set-up-this-github-page.rst.
I followed the typical File metadata structure for the blog post
and a simplified version for the About me page.
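For reference, the file metadata of a reStructuredText post is a title followed by a field list. The sketch below builds such a header in Python. The field names (date, category, tags, slug) follow Pelican's documented rst metadata, while the helper name render_post_header and any values passed to it are made up for illustration.

```python
def render_post_header(title, date, category=None, tags=(), slug=None):
    """Return the rst title block and metadata field list for a post."""
    # The title is underlined with a row of '#' of the same length.
    lines = [title, "#" * len(title), ""]
    lines.append(":date: {}".format(date))
    if category:
        lines.append(":category: {}".format(category))
    if tags:
        lines.append(":tags: {}".format(", ".join(tags)))
    if slug:
        lines.append(":slug: {}".format(slug))
    return "\n".join(lines) + "\n"
```

A page such as About me can get away with the title alone, which is what the simplified version amounts to.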
After that everything should be simple, right?
I looked at the generated HTML files stored in the output folder under the project's default path.
To my surprise, opening just the index.html file wasn't as pleasant as I expected,
since none of the static content, such as the CSS, was loaded by the browser.
After checking the documentation, I noticed that I was supposed to serve the pages
from a web server instead of just browsing them directly. Sounds logical. The immediate
suggestion was to do it by moving to the output directory and running the HTTP server
from Python 3's standard library:
/output $ python -m http.server
After that, I just opened my browser and checked the page served at localhost:8000.
The page worked correctly but I was not satisfied with this approach as it seemed error-prone.
First I needed to make sure that I was inside the correct folder and then remember to invoke
the server module from the http library. So I looked for another approach in the files autogenerated
by Pelican's quickstart procedure.
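As an aside, the "correct folder" problem can also be sidestepped in plain Python by telling the request handler which directory to serve. This is a minimal sketch, assuming Python 3.7 or newer (where SimpleHTTPRequestHandler accepts a directory argument); make_server is a made-up helper name, not something Pelican provides.

```python
import functools
import http.server


def make_server(directory, port=8000):
    """Build an HTTP server for `directory` without changing the working directory."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    # Bind to localhost only; this is a preview server, not a production one.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


# Usage sketch (blocks until interrupted with Ctrl+C):
#   make_server("output").serve_forever()
```

On the same Python versions, python -m http.server --directory output should achieve a similar effect from the command line.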
Since I've worked a bit with the invoke package and used fabfiles in the past,
I immediately noticed the tasks.py file in the main directory of the project.
After skimming through the code, I found build and serve tasks which would enable me
to generate and serve the output files without needing to think about the platform/OS on which
the tasks would be run. This sounded great as I'm using Windows whenever I can and resort to
a Linux virtual machine, and suffer the performance hit, only if I have to.
To use this approach, I had to install the invoke package:
pipenv install invoke
After that, the procedure to run invoke tasks was wonderfully simple:
pipenv run invoke build
pipenv run invoke serve
Can we make it shorter? Sure! As this use case is probably one of the most frequent ones,
there is already a reserve task prepared, and because typing invoke is such a hassle,
we can use the shorthand inv to save the time needed to type oke at the end of it:
pipenv run inv reserve
Works as expected. How great is that, huh?
After all that hassle, it would be great to finally share this page with others.
The creation process of a GitHub page is described in great detail in one of the official
guides titled Getting Started with GitHub Pages so I'll save some time and skip it.
Let's clone the remote repository in the main folder of our project:
git init
git remote add origin git@github.com:rotocki/rotocki.github.io.git
git fetch --all
git checkout origin/master -b master
After doing that, we should be able to commit our changes on top of whatever
we have autogenerated by GitHub during creation of the repository.
git add .
git commit -m "Initial page."
git push
The project is now version controlled and stored on GitHub...
but the pages are not visible! What is going on?
If we think for a moment about a possible root cause of this problem,
we can quickly notice that we don't have an index.html file
in the main folder of our project. Because of that, the README.md file
gets served as the index.html page instead.
I went back to the tasks.py file to see if there was another task that
I could use out of the box, and I found one called gh_pages, which
is documented as Publish to GitHub Pages. Jackpot? Looks like it,
but I need to understand what this ghp-import command, on which
the task relies, actually does.
The best way to understand it is to check the official documentation
on the ghp-import GitHub page. After going through the readme
and making sure I understood the Big Fat Warning, I wanted to make sure
that the gh-pages branch would be the one used by GitHub as the default one
for serving content. This use case is so common that
it has its own section in an article on GitHub Help:
Enabling GitHub Pages to publish your site from master or gh-pages
Unfortunately, this applies only to project pages and not to personal
GitHub pages. So we will need to store the content in a non-master branch,
while our master branch will only contain the output in a form
that can be easily presented to the viewer.
First of all, let's change the branch:
git checkout -b content
git push --set-upstream origin content
Now the default configuration will make sure that the output
generated from the content stored in the separate content branch
will be versioned under the master branch.
Now I should be able to get away with invoking the gh_pages task,
but first I have to install ghp-import as one of the dependencies:
pipenv install ghp-import
After that, I can publish my work:
pipenv run invoke gh_pages
Nope. Error message: No idea what 'gh_pages' is!
There is a task in tasks.py but it cannot be found. Maybe invoke's help can assist us?
There seems to be an -l parameter which should tell us what tasks
are available and how we should call them. After running
pipenv run invoke -l
we get this:
Available tasks:

  build        Build local version of site
  clean        Remove generated files
  gh-pages     Publish to GitHub Pages
  preview      Build production version of site
  publish      Publish to production via rsync
  rebuild      `build` with the delete switch
  regenerate   Automatically regenerate site upon file modification
  reserve      `build`, then `serve`
  serve        Serve site at http://localhost:8000/
Do you notice the difference? Underscore magically became a hyphen!
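The rename is not really magic: by default, invoke exposes task names on the command line with dashes in place of underscores. Here is a tiny stand-in for that mapping. It is not invoke's actual code, and cli_name is a made-up helper that simply treats every underscore the same way.

```python
def cli_name(function_name):
    """Mimic how a Python task function name shows up on the command line."""
    return function_name.replace("_", "-")


# Compare the names defined in tasks.py with the names the CLI expects.
for name in ("build", "gh_pages", "reserve"):
    print("{:<10} -> {}".format(name, cli_name(name)))
```

So the function is still called gh_pages in tasks.py, but on the command line it has to be spelled gh-pages.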
Let's finish the task for today by invoking the gh-pages task!
Since I'm running a Windows 10 machine, the default quoting of the commit message
in tasks.py is not accepted and I have to rearrange the quotation marks to get it working.
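CONFIG = {
    ...
    # Before: "'Publish site on {}'" (single quotes inside, not honored by the Windows shell)
    # After: double quotes inside, single quotes outside
    'commit_message': '"Publish site on {}"'.format(datetime.date.today().isoformat()),
    ...
}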
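To see why the order of the quotation marks matters, here is a small sketch using shlex, which follows POSIX shell rules. The command template is a simplified, hypothetical stand-in for the one built in tasks.py. The assumption, stated plainly, is that the Windows command shell only treats double quotes as grouping characters, so the double-quoted form is the one that works in both worlds.

```python
import datetime
import shlex

today = datetime.date.today().isoformat()

# Inner single quotes: grouped by POSIX shells, but not by the Windows shell.
posix_only = "'Publish site on {}'".format(today)
# Inner double quotes: grouped by POSIX shells and by the Windows shell alike.
portable = '"Publish site on {}"'.format(today)

# Hypothetical, simplified command line; the real one lives in tasks.py.
template = "ghp-import -m {message} output"

for message in (posix_only, portable):
    # shlex.split applies POSIX quoting rules, so both variants yield one message argument.
    args = shlex.split(template.format(message=message))
    print(args)
```

Under POSIX rules both variants produce the same argument list, which suggests that switching to the double-quoted form should not break the task on Linux or macOS.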
After introducing this change, I was able to push the output to the master branch,
but the page would still show README.md instead of the expected pages.
I've done some reading to understand the problem...
... and understood that there might be a problem with the fact that GitHub assumes
the pages were generated using Ruby/Jekyll and we need to inform the server
that we're running something different. I added a .nojekyll file in the main folder
of the project, committed the changes, pushed them to the remote repository, executed the gh-pages task
and now you can read this article. Cheers!
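For completeness, the marker file can also be created from Python. This is a minimal sketch, assuming an empty file is enough (GitHub's documentation describes .nojekyll as an empty file); add_nojekyll is a made-up helper name.

```python
from pathlib import Path


def add_nojekyll(folder):
    """Create an empty .nojekyll marker file in `folder` and return its path."""
    marker = Path(folder) / ".nojekyll"
    # touch() creates the file if it is missing and leaves an existing one alone.
    marker.touch(exist_ok=True)
    return marker
```

The ghp-import readme also lists an -n option for including a .nojekyll file in the published branch, which may be worth checking as an alternative.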