---
title: "How this blog is made, Archivarix, some PHP and Hugo"
date: 2022-11-06T15:07:16+01:00
draft: true
toc: true
tags: ["youpi"]
---
## The legacy of my blog, recovered with Archivarix
For me this year was like a rollercoaster, and I forgot that my blog was
hosted on a very old Dedibox server at Online.net, now called Scaleway.
This server was in the process of being decommissioned and I missed the 3 emails
announcing the end of my services. Then Online.net decided that it was a
good idea to also delete the backup spaces attached to these machines.
To sum up, I lost my blog and the recent backups. But I wanted to keep it
and at least serve the existing content that was linked from search
engines and other websites. I looked on the Wayback Machine and my blog
was in it. I found a cool all-in-one service called Archivarix that restores an entire
website from the Wayback Machine; the cost was around
10€.
I recovered a 300MB zip archive with a lot of content; some
images are missing but all the articles were there.
## Running the Archivarix Loader
The Archivarix loader is a single PHP file backed by a SQLite database that lists all
your URLs; the content itself is stored as files in `www/.content.EZtzwPjb/binary/`.
Each time an HTTP request is processed, the script looks in the database for
a matching URL and serves the content linked to it. This mini CMS is
licensed under the GPL, and I put a copy [here](https://home.hugopoi.net/gitea/hugopoi/blog.hugopoi.net/src/branch/master/www/index.php).
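
To make the lookup idea concrete, here is a minimal Python sketch of "find the URL in SQLite, then serve the linked file". It is not the loader's actual PHP code: the `pages` table, its columns, and the file names are assumptions made up for the illustration, since the real database layout is not described here.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical URL index."""
    db = sqlite3.connect(":memory:")
    # Assumed schema: one row per archived URL, pointing at a stored file.
    db.execute(
        "CREATE TABLE pages (url TEXT PRIMARY KEY, mimetype TEXT, filename TEXT)"
    )
    db.executemany(
        "INSERT INTO pages VALUES (?, ?, ?)",
        [
            ("/", "text/html", "1.html"),
            ("/about/", "text/html", "2.html"),
        ],
    )
    return db


def lookup(db: sqlite3.Connection, url: str):
    """Return (mimetype, filename) for an archived URL, or None if unknown."""
    return db.execute(
        "SELECT mimetype, filename FROM pages WHERE url = ?", (url,)
    ).fetchone()


db = build_demo_db()
hit = lookup(db, "/about/")     # a row means: read the file and send it back
miss = lookup(db, "/missing/")  # None means: answer with a 404
```

A real loader would then read the matching file from the content directory and send it with the stored MIME type.
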
### With Docker
You need PHP with the SQLite extension; the PHP Docker image already
includes it. I wrote a small docker-compose file to run
Archivarix.
It is as simple as running `docker compose up`.
* [docker-compose.yml](https://home.hugopoi.net/gitea/hugopoi/blog.hugopoi.net/src/branch/master/docker-compose.yml)
<pre data-src="https://home.hugopoi.net/gitea/hugopoi/blog.hugopoi.net/raw/branch/master/docker-compose.yml" data-range="1," class="line-numbers"></pre>
* [nginx.conf](https://home.hugopoi.net/gitea/hugopoi/blog.hugopoi.net/src/branch/master/nginx.conf)
<pre data-src="https://home.hugopoi.net/gitea/hugopoi/blog.hugopoi.net/raw/branch/master/nginx.conf" data-range="1," class="line-numbers language-nginx"></pre>
{{< figureCupper
img="Screenshot 2022-11-20 at 18-54-58 HugoPoi Internet Hardware et Bidouille.png"
caption="First run of my old website"
command="Fill"
options="1024x500 Top" >}}
### With YunoHost
I mostly self-host with the YunoHost project; the next steps show
how to easily add a small PHP app to your YunoHost instance.
1. Install the application `My Webapp` from the YunoHost admin panel
{{< figureCupper
img="Screenshot from 2022-11-21 18-40-13.png"
caption="TODO"
command="Fit"
options="1024x500" >}}
1. Fill in the setup form
{{< figureCupper
img="Screenshot 2022-11-21 at 18-41-13 Install my_webapp _ Catalog YunoHost Admin.png"
caption="TODO"
command="Fill"
options="1024x500" >}}
1. You now have an empty app in `/var/www/my_webapp/www/`
{{< figureCupper
img="Screenshot from 2022-11-21 18-42-30.png"
caption="TODO"
command="Fit"
options="1024x500" >}}
1. Copy your files over; I use rsync with the YunoHost admin account
`rsync -rlgoD --checksum --verbose www/ admin@home.hugopoi.net:/var/www/my_webapp/www/`
1. Then you might need to run `chmod 664 /var/www/my_webapp/www/.content.*/structure.*`, because Archivarix requires write access to the SQLite files.
## Modding Archivarix Loader
### Fixing missing versioned WordPress files
The homepage looked good, but some WordPress CSS and JavaScript assets were missing. WordPress appends a `?ver=` query parameter to asset URLs, so a request for a version that was not archived ends in a 404.
{{< figureCupper
img="Screenshot 2022-11-20 at 18-55-57 Linky opendata my ass HugoPoi.png"
caption="First run of my old website, with some broken CSS"
command="Fill"
options="1024x500 Top" >}}
{{< figureCupper
img="Screenshot-Firefix-debugger-404-ver-wordpress-archivarix.png"
caption="Missing files because of the WordPress `?ver=` query parameter with Archivarix"
command="Fit"
options="1024x500" >}}
So I wrote a little function that looks up any available version of a
given URL and picks the most recent one.
* [`getOtherWordpressVersionUrls` function in www/index.php](https://home.hugopoi.net/gitea/hugopoi/blog.hugopoi.net/src/commit/2a154a6eea510e08b2608fd55f6729056c363b25/www/index.php#L295-L305)
<pre data-src="https://home.hugopoi.net/gitea/hugopoi/blog.hugopoi.net/raw/commit/2a154a6eea510e08b2608fd55f6729056c363b25/www/index.php" data-range="295,305" class="line-numbers"></pre>
* [The call in www/index.php](https://home.hugopoi.net/gitea/hugopoi/blog.hugopoi.net/src/commit/2a154a6eea510e08b2608fd55f6729056c363b25/www/index.php#L609-L614)
<pre data-src="https://home.hugopoi.net/gitea/hugopoi/blog.hugopoi.net/raw/commit/2a154a6eea510e08b2608fd55f6729056c363b25/www/index.php" data-range="608,620" class="line-numbers"></pre>
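
The fallback idea can be sketched in a few lines of Python. This is only an illustration, not a port of `getOtherWordpressVersionUrls`: the function names and sample URLs are made up, and "most recent" is assumed here to mean the highest `ver` number, while the PHP function may order candidates differently.

```python
from urllib.parse import parse_qs, urlsplit


def version_key(url: str) -> tuple:
    """Turn the `ver` query value into a tuple of ints, e.g. '1.11.0' -> (1, 11, 0)."""
    ver = parse_qs(urlsplit(url).query).get("ver", ["0"])[0]
    # Non-numeric parts (like 'beta') are ignored in this simplified comparison.
    return tuple(int(part) for part in ver.split(".") if part.isdigit())


def most_recent_version(requested: str, archived_urls: list):
    """Pick the archived URL with the same path and the highest `ver` value."""
    path = urlsplit(requested).path
    candidates = [u for u in archived_urls if urlsplit(u).path == path]
    return max(candidates, key=version_key, default=None)


# Hypothetical archive content: two captures of the same script, one stylesheet.
archived = [
    "/wp-includes/js/jquery/jquery.js?ver=1.8.3",
    "/wp-includes/js/jquery/jquery.js?ver=1.11.0",
    "/wp-content/themes/demo/style.css?ver=3.5",
]
best = most_recent_version("/wp-includes/js/jquery/jquery.js?ver=1.10.2", archived)
```

Comparing versions as tuples of integers avoids the string-comparison trap where `1.8.3` would sort after `1.11.0`.
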
### Cleaning existing pages
After getting my backed-up blog running, I wanted to modify some
content.
* Replace the Twitter widget
* Replace the hoster widget
* Add a legacy warning that redirects visitors to the new blog
Archivarix has an `ARCHIVARIX_INCLUDE_CUSTOM` option that relies
on regular expressions to replace content, but I needed a more precise approach. I used the PHP XML extension, which has a built-in DOM parser that can
parse HTML pages.
* [The easy config to add/replace/delete some html parts](https://home.hugopoi.net/gitea/hugopoi/blog.hugopoi.net/src/commit/cd94d82c1a3dad22b026c9c26311b366f76dcd54/www/index.php#L97-L133)
<pre data-src="https://home.hugopoi.net/gitea/hugopoi/blog.hugopoi.net/raw/commit/cd94d82c1a3dad22b026c9c26311b366f76dcd54/www/index.php" data-range="115,134" class="line-numbers"></pre>
* [The clever mod](https://home.hugopoi.net/gitea/hugopoi/blog.hugopoi.net/commit/598d28551d071774007172782541e1d140b8a3c1#diff-eb630ac88267e24589fd94de0826721dff38beb4)
<pre data-src="https://home.hugopoi.net/gitea/hugopoi/blog.hugopoi.net/raw/commit/598d28551d071774007172782541e1d140b8a3c1/www/index.php" data-range="411,439" class="line-numbers"></pre>
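
To show why a DOM approach is more precise than a regular expression, here is a small Python sketch that removes an element by its `id` and adds a warning banner. It is not the PHP code linked above: the element ids, the banner text, and the `clean_page` helper are hypothetical, and `xml.etree.ElementTree` only accepts well-formed markup, so real archived pages would need a lenient HTML parser instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for this illustration.
PAGE = (
    "<html><body>"
    '<div id="content"><p>Old post</p></div>'
    '<div id="twitter-widget"><p>Latest tweets</p></div>'
    "</body></html>"
)


def clean_page(markup: str, remove_ids: set, banner_text: str) -> str:
    """Drop elements whose id is in remove_ids and prepend a warning banner."""
    root = ET.fromstring(markup)
    # ElementTree has no parent pointers, so visit each element as a parent
    # and inspect its direct children. The list() copies make removal safe.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("id") in remove_ids:
                parent.remove(child)
    banner = ET.Element("div", {"class": "legacy-warning"})
    banner.text = banner_text
    body = root.find("body")
    body.insert(0, banner)  # the banner becomes the first element of <body>
    return ET.tostring(root, encoding="unicode", method="html")


cleaned = clean_page(PAGE, {"twitter-widget"}, "This is an archived page.")
```

Because the widget is selected as a node, everything nested inside it goes away with it, which is hard to guarantee with a regular expression.
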
## The new blog with Hugo
Go Hugo!