feat(post): new post about add archivarix archives to hugo

2022-12-05 20:01:43 +01:00 · 2022-12-05 20:01:43 +01:00 · 567cc42c9c
commit 567cc42c9c
parent 492a060487
2 changed files with 108 additions and 1 deletions
--- a/v2/content/post/add-archivarix-archives-to-hugo/index.md
+++ b/v2/content/post/add-archivarix-archives-to-hugo/index.md
@ -0,0 +1,107 @@
+---
+title: "Add Archivarix archives to Hugo"
+date: 2022-11-06T14:27:04+01:00
+draft: true
+---
+
+I want to add all my old articles to the Hugo posts list page.
+
+Let's write some code.
+
+* I can use the Archivarix sitemap as the source
+* Or I can use the SQLite database as the source
+* I want to add all the canonical pages to the list
+* Sorted by reverse date of publication
+* With the title
+
+First, I discover that Hugo lets project files override theme files: if
+the theme ships `/themes/<THEME>/static/js/jquery.min.js`, you can override
+it with a file in `/static/js/jquery.min.js`. So I don't think I need a
+custom theme; let's remove it.
+
+
+## Proof of concept with a sitemap
+
+1. First I change `index.php` and add a sitemap path to enable
+sitemap generation in the Archivarix loader.
+
+1. Download the generated sitemap with `wget http://localhost:8080/sitemap.xml`
+
+1. Then I discover that the sitemap specification has no title field, so
+it's a dead end for listing titles.
+
+1. Place `sitemap.xml` in `/data/legacyblog/sitemap.xml`
+1. Let's prototype the change in our Hugo theme, in `layouts/_default/list.html`
+
+  ```html
+      <!-- Hugo loads and parses data/legacyblog/sitemap.xml -->
+      {{ range $.Site.Data.legacyblog.sitemap.url }}
+      <li>
+        <h2>
+          <a href="{{ .loc }}">
+            <svg
+              class="bookmark"
+              aria-hidden="true"
+              viewBox="0 0 40 50"
+              focusable="false"
+            >
+              <use href="#bookmark"></use>
+            </svg>
+            {{ .loc }}
+          </a>
+        </h2>
+      </li>
+      {{ end }}
+  ```
+I will not use this solution because it can't give us the titles.
+
+## Proof of concept with a web crawl CSV file
+
+In another life, I developed a little web crawler (or spider) that lists
+all the URLs and robots metadata for a given website.
+
+1. `git clone `
+1. `npm install`
+1. `node console.js http://localhost:8080 --noindex --nofollow --progress` will create a file called `localhost_urls.csv`
+
+  ```csv
+  "url","statusCode","metas.title","metas.robots","metas.canonical","metas.lang","parent.url"
+  "http://localhost:8080/",200,"HugoPoi – Internet, Hardware et Bidouille","max-image-preview:large",,"fr-FR",
+  "http://localhost:8080/v2/",200,"HugoPoi Blog",,"http://localhost:1313/v2/","en","http://localhost:8080/"
+  "http://localhost:8080/en/",200,"How to decrypt flows_cred.json from NodeRED data ? – HugoPoi","max-image-preview:large","http://localhost:8080/en/2021/12/28/how-to-decrypt-flows_cred-json-from-nodered-data/","en-US","http://localhost:8080/"
+  ```
+1. Then we put this file outside the data directory, as mentioned in the
+Hugo documentation
+1. Modify the template to use the CSV parsing function
+  ```html
+      <!-- Loop over the CSV lines -->
+      {{ range $i,$line := getCSV "," "./localhost_urls.csv" }}
+      <!-- Fill variables from the columns (0: url, 2: metas.title) -->
+      {{ $url := index $line 0 }}
+      {{ $title := index $line 2 }}
+      <!-- Skip the CSV header line and WordPress replytocom URLs -->
+      {{ if and (ne $i 0) (eq (len (findRE `replytocom` $url 1)) 0)}}
+      <li>
+        <h2>
+          <a href="{{ $url }}">
+            <svg
+              class="bookmark"
+              aria-hidden="true"
+              viewBox="0 0 40 50"
+              focusable="false"
+            >
+              <use href="#bookmark"></use>
+            </svg>
+            {{ $title }}
+          </a>
+        </h2>
+      </li>
+      {{ end }}
+      {{ end }}
+  ```
+
+  This solution is promising.
+  // TODO IMAGE
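+
+A minimal, untested sketch of a possible next step: the CSV also has a
+`metas.canonical` column (index 4), so the loop could keep only rows that
+answer 200 and are their own canonical URL. Treating an empty canonical as
+"already canonical" is an assumption here, not something the crawler promises.
+
+  ```html
+      <!-- Sketch only: keep rows that look like canonical pages -->
+      {{ range $i, $line := getCSV "," "./localhost_urls.csv" }}
+      {{ $url := index $line 0 }}
+      <!-- getCSV returns strings, so the status code is compared as "200" -->
+      {{ $status := index $line 1 }}
+      {{ $title := index $line 2 }}
+      {{ $canonical := index $line 4 }}
+      <!-- Assumption: an empty canonical means the page is its own canonical -->
+      {{ if and (ne $i 0) (eq $status "200") (or (eq $canonical "") (eq $canonical $url)) }}
+      <li>
+        <h2><a href="{{ $url }}">{{ $title }}</a></h2>
+      </li>
+      {{ end }}
+      {{ end }}
+  ```
+
+With the sample rows above, this would keep the home page and skip `/en/`,
+whose canonical points at the full article URL.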
+
+
+
--- a/v2/content/post/how-this-blog-is-made/index.md
+++ b/v2/content/post/how-this-blog-is-made/index.md
@ -2,7 +2,7 @@
 title: "How this blog is made, Archivarix, some PHP and Hugo"
 date: 2022-12-03T17:17:00+01:00
 toc: true
-tags: ["youpi"]
+tags: ["this blog", "PHP", "gohugo"]
 ---

 ## The legacy of my blog, recover with Archivarix