feat(post): new post about add archivarix archives to hugo
parent 492a060487
commit 567cc42c9c
107
v2/content/post/add-archivarix-archives-to-hugo/index.md
Normal file
@ -0,0 +1,107 @@
---
title: "Add Archivarix archives to Hugo"
date: 2022-11-06T14:27:04+01:00
draft: true
---

I want to add all my old articles to the Hugo posts list page.

Let's write some code.

* I can use the Archivarix sitemap as a source
* Or I can use the SQLite database as a source
* I want to add all the canonical pages to the list
* Sorted by reverse date of publication
* With the title

First, I discover that GoHugo handles file overrides: if you have a file
in `/themes/<THEME>/static/js/jquery.min.js`, you can override it with a
file in `/static/js/jquery.min.js`. So I don't think I need a custom
theme; let's remove it.

## Proof of concept with a sitemap

1. First, I change `index.php` to add a sitemap path, which enables
sitemap generation in the Archivarix loader.

1. Generate a sitemap: `wget http://localhost:8080/sitemap.xml`

1. Then I discover that the sitemap specification has no title field, so
it's a dead end.

1. Place `sitemap.xml` in `/data/legacyblog/sitemap.xml`
1. Let's PoC the change in our Hugo theme, in `layouts/_default/list.html`

```html
{{/* Hugo loads data/legacyblog/sitemap.xml and parses it */}}
{{ range $.Site.Data.legacyblog.sitemap.url }}
<li>
  <h2>
    <a href="{{ .loc }}">
      <svg
        class="bookmark"
        aria-hidden="true"
        viewBox="0 0 40 50"
        focusable="false"
      >
        <use href="#bookmark"></use>
      </svg>
      {{ .loc }}
    </a>
  </h2>
</li>
{{ end }}
```
I will not use this solution because it can't give us titles.
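
To make the dead end concrete, here is a minimal Node.js sketch that lists which fields appear inside the `<url>` entries of a sitemap. The sample sitemap (its URL and date) is made up for illustration, and `listUrlFields` is a hypothetical helper, not part of Archivarix or Hugo. The sitemap protocol only defines `loc`, `lastmod`, `changefreq` and `priority` per URL, so no title can show up here.

```javascript
// Hypothetical sitemap excerpt: the URL and the date are invented for this example.
const sampleSitemap = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://localhost:8080/en/example-post/</loc>
    <lastmod>2021-12-28</lastmod>
  </url>
</urlset>`;

// Collect the distinct child tag names used inside <url> entries.
// A regex is enough for this flat structure; a real XML parser would be safer.
function listUrlFields(xml) {
  const fields = new Set();
  for (const [, entry] of xml.matchAll(/<url>([\s\S]*?)<\/url>/g)) {
    for (const [, tag] of entry.matchAll(/<([a-z]+)>/g)) {
      fields.add(tag);
    }
  }
  return [...fields];
}

console.log(listUrlFields(sampleSitemap)); // [ 'loc', 'lastmod' ]
```

Whatever the generator puts in the file, the title has to come from another source.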

## Proof of concept with a web crawl CSV file

In another life, I developed a small web crawler (or spider) that lists
all the URLs and robots metadata of a given website.

1. `git clone `
1. `npm install`
1. `node console.js http://localhost:8080 --noindex --nofollow --progress` will create a file called `localhost_urls.csv`

```csv
"url","statusCode","metas.title","metas.robots","metas.canonical","metas.lang","parent.url"
"http://localhost:8080/",200,"HugoPoi – Internet, Hardware et Bidouille","max-image-preview:large",,"fr-FR",
"http://localhost:8080/v2/",200,"HugoPoi Blog",,"http://localhost:1313/v2/","en","http://localhost:8080/"
"http://localhost:8080/en/",200,"How to decrypt flows_cred.json from NodeRED data ? – HugoPoi","max-image-preview:large","http://localhost:8080/en/2021/12/28/how-to-decrypt-flows_cred-json-from-nodered-data/","en-US","http://localhost:8080/"
```
1. Then we put this file outside of the `data` directory, as mentioned
in the Hugo documentation
1. Modify the template to use the CSV parsing function
```html
<!-- Loop against csv lines -->
{{ range $i,$line := getCSV "," "./localhost_urls.csv" }}
<!-- Fill variables with columns -->
{{ $url := index $line 0 }}
{{ $title := index $line 2 }}
<!-- Skip csv head line and replytocom wordpress urls -->
{{ if and (ne $i 0) (eq (len (findRE `replytocom` $url 1)) 0)}}
<li>
  <h2>
    <a href="{{ $url }}">
      <svg
        class="bookmark"
        aria-hidden="true"
        viewBox="0 0 40 50"
        focusable="false"
      >
        <use href="#bookmark"></use>
      </svg>
      {{ $title }}
    </a>
  </h2>
</li>
{{ end }}
{{ end }}
```

This solution is promising.

// TODO IMAGE
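
The goals at the top of this post also ask for canonical pages only, sorted newest first, which the template above does not do yet. One possible pre-processing step, sketched in Node.js under several assumptions: `parseCsvLine`, `dateFromUrl` and `canonicalPosts` are hypothetical helpers, the sample rows are made up (only the column layout matches `localhost_urls.csv`), and the publication date is read from a WordPress-style `/YYYY/MM/DD/` permalink because the CSV has no date column.

```javascript
// Minimal CSV line parser: handles quoted fields and doubled quotes ("").
function parseCsvLine(line) {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"';
        i++; // skip the second quote of the escaped pair
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ',') {
      fields.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Assumption: permalinks carry the date as /YYYY/MM/DD/.
// URLs without a date get an empty string and sort last.
function dateFromUrl(url) {
  const m = url.match(/\/(\d{4})\/(\d{2})\/(\d{2})\//);
  return m ? `${m[1]}-${m[2]}-${m[3]}` : '';
}

// Keep rows whose URL equals its own canonical URL, then sort newest first.
function canonicalPosts(csvText) {
  const [, ...rows] = csvText.trim().split('\n').map(parseCsvLine); // drop header
  return rows
    .map(([url, , title, , canonical]) => ({ url, title, canonical, date: dateFromUrl(url) }))
    .filter((row) => row.canonical === row.url && !row.url.includes('replytocom'))
    .sort((a, b) => b.date.localeCompare(a.date));
}

// Made-up rows using the same columns as localhost_urls.csv.
const sample = [
  '"url","statusCode","metas.title","metas.robots","metas.canonical","metas.lang","parent.url"',
  '"http://localhost:8080/2020/01/15/older-post/",200,"Older, with a comma",,"http://localhost:8080/2020/01/15/older-post/","fr-FR","http://localhost:8080/"',
  '"http://localhost:8080/2021/12/28/newer-post/",200,"Newer post",,"http://localhost:8080/2021/12/28/newer-post/","en-US","http://localhost:8080/"',
  '"http://localhost:8080/en/",200,"Newer post",,"http://localhost:8080/2021/12/28/newer-post/","en-US","http://localhost:8080/"',
].join('\n');

console.log(canonicalPosts(sample).map((p) => p.title));
// [ 'Newer post', 'Older, with a comma' ]
```

The strict `canonical === url` comparison only works if the crawl host matches the host in the canonical tags; if they differ, comparing paths instead of full URLs would be needed.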

@ -2,7 +2,7 @
title: "How this blog is made, Archivarix, some PHP and Hugo"
date: 2022-12-03T17:17:00+01:00
toc: true
tags: ["youpi"]
tags: ["this blog", "PHP", "gohugo"]
---

## The legacy of my blog, recover with Archivarix
