title | date | draft | toc | tags | |
---|---|---|---|---|---|
How this blog is made, Archivarix, some PHP and Hugo | 2022-11-06T15:07:16+01:00 | true | true |
|
The legacy of my blog, recovered with Archivarix
For me, this year was a rollercoaster, and I forgot that my blog was hosted on a very old Online.net Dedibox server (Online.net is now called Scaleway). This server was being decommissioned and I missed the 3 emails announcing the end of my service. Online.net then decided it was a good idea to also delete the backup spaces attached to these machines. To sum up, I lost my blog and its recent backups. But I wanted to keep it and at least serve the existing content that was linked from search engines and other websites. I looked on the Wayback Machine and my blog was in it. I found a cool all-in-one service called Archivarix that restores an entire website from the Wayback Machine; the cost was around 10€.
I recovered a 300MB zip archive with a lot of content. Some images are missing, but all the articles were there.
Running the Archivarix Loader
The Archivarix loader is a single PHP file that uses a SQLite database holding all your URLs; the content itself is stored as files in www/.content.EZtzwPjb/binary/.
Each time an HTTP request is processed, the script looks in the database for a matching URL and serves the content linked to it. This mini CMS is licensed under the GPL, and I put a copy here.
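To make the lookup idea concrete, here is a minimal sketch of it in Python (the real loader is PHP). The table name, columns and sample rows are invented for the illustration and are not Archivarix's actual schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a made-up schema (not Archivarix's real one)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages (url TEXT PRIMARY KEY, filename TEXT, mimetype TEXT)"
    )
    conn.executemany(
        "INSERT INTO pages VALUES (?, ?, ?)",
        [
            ("/", "index.html", "text/html"),
            ("/wp-content/style.css", "a1b2.css", "text/css"),
        ],
    )
    return conn


def lookup(conn, url):
    """Return (filename, mimetype) for a URL, or None when nothing is archived."""
    return conn.execute(
        "SELECT filename, mimetype FROM pages WHERE url = ?", (url,)
    ).fetchone()


conn = build_demo_db()
print(lookup(conn, "/"))         # a hit: the file to read from the content folder
print(lookup(conn, "/missing"))  # a miss: the loader would answer with a 404
```

A real loader would then read the matching file from the content folder and send it with the stored MIME type.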
With docker
You need PHP and the SQLite extension; the PHP Docker image already contains them. I wrote a small docker-compose file for running Archivarix.
It is as simple as running docker compose up.
{{< figureCupper img="Screenshot 2022-11-20 at 18-54-58 HugoPoi – Internet Hardware et Bidouille.png" caption="First run of my old website" command="Fill" options="1024x500 Top" >}}
With Yunohost
I mostly self-host with the Yunohost project. The next steps show how to easily add a small PHP app to your Yunohost instance.
- Install the application My Webapp inside the Yunohost admin panel.
{{< figureCupper img="Screenshot from 2022-11-21 18-40-13.png" caption="TODO" command="Fit" options="1024x500" >}}
- Fill in the setup form.
{{< figureCupper img="Screenshot 2022-11-21 at 18-41-13 Install my_webapp _ Catalog YunoHost Admin.png" caption="TODO" command="Fill" options="1024x500" >}}
- You now have an empty app inside /var/www/my_webapp/www/
{{< figureCupper img="Screenshot from 2022-11-21 18-42-30.png" caption="TODO" command="Fit" options="1024x500" >}}
- Copy your files over. I use rsync with the Yunohost admin account:
rsync -rlgoD --checksum --verbose www/ admin@home.hugopoi.net:/var/www/my_webapp/www/
- Then you might need to run chmod 664 /var/www/my_webapp/www/.content.*/structure.* because Archivarix requires write access to the SQLite files.
Modding Archivarix Loader
Fixing Wordpress version missing files
The homepage looked good, but some WordPress CSS and JavaScript assets were missing. WordPress appends a ?ver= query parameter to asset URLs, and the exact versioned URL being requested was not always in the archive.
{{< figureCupper img="Screenshot 2022-11-20 at 18-55-57 Linky opendata my ass – HugoPoi.png" caption="First run of my old website, some broken css looking wrong" command="Fill" options="1024x500 Top" >}}
{{< figureCupper
img="Screenshot-Firefix-debugger-404-ver-wordpress-archivarix.png"
caption="Missing files because of the Wordpress ?ver= query params with Archivarix"
command="Fit"
options="1024x500" >}}
So I coded a little function that loads any version available for a given URL and takes the most recent one.
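The idea can be sketched roughly like this in Python (my actual change lives in the PHP loader). The archive index, its timestamps and the function name are made up for the example, and "most recent" is assumed to mean the latest capture.

```python
from urllib.parse import urlsplit

# Hypothetical archive index: (archived URL, capture timestamp as YYYYMMDDhhmmss).
ARCHIVED = [
    ("/wp-includes/js/jquery.js?ver=1.11.0", "20150301120000"),
    ("/wp-includes/js/jquery.js?ver=1.12.4", "20190712080000"),
    ("/wp-content/themes/x/style.css?ver=4.0", "20160101000000"),
]


def find_latest_version(requested_url, archived=ARCHIVED):
    """Return the most recently captured archived URL with the same path, or None."""
    # Compare paths only, so the ?ver= value in the request no longer matters.
    path = urlsplit(requested_url).path
    candidates = [(ts, url) for url, ts in archived if urlsplit(url).path == path]
    if not candidates:
        return None
    # Fixed-width timestamps sort correctly as plain strings.
    return max(candidates)[1]


print(find_latest_version("/wp-includes/js/jquery.js?ver=3.6.0"))
```

The request asks for a version that was never archived, yet it still resolves to the newest captured copy of the same file.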
Cleaning existing pages
After getting my backed-up blog running, I wanted to modify some content.
- Replace the Twitter widget
- Replace the hosting provider widget
- Add a legacy warning that redirects visitors to the new blog
Archivarix has ARCHIVARIX_INCLUDE_CUSTOM, which relies on regular expressions to replace content, but I needed a more precise approach. I used the PHP XML extension, which has a DOM parser built in and can parse HTML pages.
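Here is a rough sketch of the DOM approach in Python, to show why a tree is more precise than regular expressions. It is not the code running on the blog: the class names, the sample page and the URL are invented. It also assumes well-formed markup, because Python's standard library has no forgiving HTML DOM parser, whereas real archived pages usually need a lenient one.

```python
import xml.etree.ElementTree as ET

# Tiny, well-formed sample page (hypothetical markup, not my real theme).
PAGE = """<html><body>
<div id="content"><p>Old article</p></div>
<div class="twitter-widget"><a href="https://twitter.com/example">Tweets</a></div>
</body></html>"""


def clean_page(markup, new_blog_url):
    """Drop the widget element and prepend a legacy warning to <body>."""
    root = ET.fromstring(markup)

    # ElementTree has no parent pointers, so walk parents and remove matching children.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get("class") == "twitter-widget":
                parent.remove(child)

    # Build the warning banner and insert it as the first child of <body>.
    banner = ET.Element("div", {"class": "legacy-warning"})
    banner.text = "This is an archived copy. The new blog lives at "
    link = ET.SubElement(banner, "a", {"href": new_blog_url})
    link.text = new_blog_url
    root.find("body").insert(0, banner)

    return ET.tostring(root, encoding="unicode", method="html")


print(clean_page(PAGE, "https://example.org/"))
```

Working on the tree means the whole widget element goes away in one step, nested tags included, which is awkward to get right with a regular expression.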
The new blog with Hugo
Go Hugo!