Keep track of what you've read online with remwharead

Today I’d like to talk to you about how I archive articles I read online and how I find them again.

I’ve repeatedly found myself in situations where I wanted to reference an article I knew I had read, but couldn’t find it anymore, either because I didn’t remember the right search terms or because the article had gone offline. I searched for solutions to my problem, but could only find web services, nothing that would allow me to keep an archive on my local computer. So I decided to fill that gap and write remwharead. It runs on Linux, probably BSD and maybe macOS.

What is remwharead?

remwharead is a tool that allows you to save URIs of things you want to remember in a local database, along with a URI to the archived version, the current date and time, title, description, the full text of the page and optional tags. You can then export all or a portion of your aggregated hyperlinks to different formats, including AsciiDoc, RSS, JSON and Netscape Bookmark File Format.

AsciiDoc output of remwharead, formatted to HTML
Figure 1. Output of remwharead -e asciidoc | asciidoctor --backend=html5 -o file.html -

Get remwharead

You can download the latest release from https://schlomp.space/tastytea/remwharead/releases. If your CPU architecture is x86_64 (if you don’t know, it probably is) and you use Debian, Ubuntu, or a distribution based on Debian or Ubuntu, you can use the attached .deb package. Download it and install it with apt install ./remwharead_*.deb. Gentoo users can use my repository as described in the readme.

If there is no package for your distribution / operating system yet, you have to compile it yourself, as described in the readme.

The extension for Firefox is available from addons.mozilla.org.

How to use it

Adding an entry

Screenshot of remwhareadFF
Figure 2. remwhareadFF

Saving things is simple: Just type remwharead followed by the URI into your terminal and press “Enter”. To add tags, use the command-line switch -t or --tags.

But most of the time you’ll probably want to use remwhareadFF, the Firefox extension.

Example: Save this article with the tags remwharead, bookmarks and archive.
remwharead -t remwharead,bookmarks,archive https://blog.tastytea.de/posts/keep-track-of-what-you-have-read-online-with-remwharead/

remwharead will automatically ask the Wayback Machine of the Internet Archive to archive the page and store the URI of the archived page, unless you run it with -N or --no-archive.

Retrieving / Exporting entries

To display the saved things using the export format “simple”, type remwharead -e simple. You can filter by date and time with -T or --time-span, filter by tags with -s or --search-tags and perform a full-text search with -S or --search-all. You can also use --search-tags and --search-all with regular expressions, with -r or --regex.

Example: Display all things you saved on 2019-09-23.
% remwharead -e simple -T 2019-09-23,2019-09-24
2019-09-23: Keep track of what you've read online with remwharead
            <https://blog.tastytea.de/posts/keep-track-of-what-you-have-read-online-with-remwharead/>
2019-09-23: Another good article
            <https://example.com/good-article.html>

Times are in the format YYYY-MM-DDThh:mm:ss. 2019-09-23 is short for 2019-09-23T00:00:00.
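
Since 2019-09-23 means midnight at the start of that day, "everything from one day" is the span from that midnight to the next one, as in the example above. Here is a small sketch of a helper that builds such a span. It assumes GNU date, and the function name day_span is made up for this example; it is not part of remwharead.

```shell
#!/bin/sh
# day_span DATE: print "DATE,NEXT-DAY" for use with -T / --time-span.
# Assumes GNU date. day_span is a hypothetical helper, not part of remwharead.
day_span() {
    printf '%s,%s\n' "$1" "$(date -d "$1 + 1 day" -I)"
}

# Usage (requires remwharead):
#   remwharead -e simple -T "$(day_span 2019-09-23)"
```

The helper only prints the argument, so you can check what it produces before handing it to remwharead.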

Example: Display all things you tagged with “apple” or “onion”.
% remwharead -e simple -s "apple OR onion"
2019-08-03: The best onion soup recipe of the whole internet!
            <https://example.com/onion-soup.html>
2019-04-12: 5 funny faces you can carve into YOUR apple today!
            <https://example.com/apple-faces.html>

Most export formats show only a portion of the available data for readability reasons. If you want the full datasets, use -e json or -e csv. You can also access the SQLite-database at ${XDG_DATA_HOME}/remwharead/database.sqlite, for example with sqlitebrowser.

Note
${XDG_DATA_HOME} is usually ~/.local/share.
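
If you want to use the database path in your own scripts, keep in mind that XDG_DATA_HOME is often not set at all. A minimal sketch that falls back to ~/.local/share in that case; db_path is a name I made up for this example.

```shell
#!/bin/sh
# Print the database location, falling back to ~/.local/share when
# XDG_DATA_HOME is unset or empty. db_path is a hypothetical helper name.
db_path() {
    printf '%s\n' "${XDG_DATA_HOME:-${HOME}/.local/share}/remwharead/database.sqlite"
}

# Usage:
#   sqlitebrowser "$(db_path)"
```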

Create an RSS feed

Want to share what you read? With the “rss” export you can create an RSS feed for your friends to subscribe to. Unfortunately remwharead can’t create a valid RSS feed out of the box, because it can’t know what content the “link”-element should have. You probably also want to change the title from “Visited things” to something more descriptive.

Example: Shell script to create a valid RSS feed of the last week.
#!/bin/sh

remwharead -e rss -T $(date -d "-1 week" -I),$(date -Iminutes) \
    | sed -e 's|<link/>|<link>https://example.com/</link>|' \
    -e "s|<title>Visited things|<title>My hyperlink archive|" \
    > /var/www/feed.rss
Tip
Put that script into /etc/cron.hourly/ to update your feed once an hour.

Editing remote files with Emacs, comfortably

It took me a long time to collect all the bits and pieces I needed to make editing remote files with Emacs work the way I want, with a simple command that works via SSH. I hope I can save you some time by stitching it here together into a tutorial. I assume you use use-package in my examples.

Emacs server & TRAMP

We start with Emacs’s good old inbuilt server. The default is to use a UNIX domain socket; we have to change that to TCP to be able to receive input from our remote hosts. The server will bind to 127.0.0.1. Pick a strong password that is exactly 64 characters long and a port above 1023. I chose 51313 because if we replace the numbers 5, 13, 1 and 3 with the corresponding letters of the Latin alphabet, we get E M A C. The server will create the file ~/.emacs.d/server/server with the IP, port and password in it. This file needs to be distributed to every host that should be able to access the server.

;; Run server if:
;; - Our EUID is not 0,
;; - We are not logged in via SSH,
;; - It is not already running.
(unless (equal (user-real-uid) 0)
  (unless (getenv "SSH_CONNECTION")
    (use-package server
      :init
      (setq server-use-tcp t
            server-port 51313
            server-auth-key ; 64 chars, saved in ~/.emacs.d/server/server.
            "looph8oow3Aph5ahje1eek1aish3Ohthu4Paengae0iketohGhaemi2iek5ae4ee")
      :config
      (unless (eq (server-running-p) t) ; Run server if not t.
          (server-start)))))
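
If you don't want to make up 64 characters by hand, something like the following sketch can generate a value for server-auth-key. It assumes /dev/urandom is available, and gen_auth_key is just a name I picked for this example. Emacs's own server-generate-key command should give you an equivalent result.

```shell
#!/bin/sh
# Print a random 64-character alphanumeric string, suitable as a value for
# server-auth-key. gen_auth_key is a hypothetical helper name.
gen_auth_key() {
    LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 64
    echo
}

# Usage:
#   gen_auth_key
```

Paste the output between the quotes in the configuration above.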

The server expects file names as input; we can’t just feed it the contents of a file. The package TRAMP allows us to use remote file paths in Emacs with the help of SSH. I have modified tramp-password-prompt-regexp to look for verification code prompts from the Google Authenticator PAM module.

Note
My modification overwrites the original value of tramp-password-prompt-regexp, which has a bunch of localized variants of “password” in it. You can view the original value with C-h v tramp-password-prompt-regexp.
(use-package tramp
  :custom
  (tramp-use-ssh-controlmaster-options nil) ; Don't override SSH config.
  (tramp-default-method "ssh")    ; ssh is faster than scp and supports ports.
  (tramp-password-prompt-regexp   ; Add verification code support.
   (concat
    "^.*"
    (regexp-opt
     '("passphrase" "Passphrase"
       "password" "Password"
       "Verification code")
     t)
    ".*:\0? *")))

SSH

In order to avoid having to enter our password again and again, we can edit our SSH configuration to reuse existing connections. The following configuration will create a UNIX domain socket per host and reuse it for all further connections to this host. It will also forward the Emacs server port that we picked earlier to every host we connect to. We will have to create ~/.ssh/sockets/ before we use the new configuration.

Warning
These sockets allow for unauthenticated access to every host you are connected to. While this is very convenient, it is also a security risk. The sockets are only usable by your user and root (file mode 0600).
Warning
Everyone on the remote host can connect to the port you forward. They will still need the password, but you might not want to do this if you don’t trust the other users.
Host fc??:* fd??:* 192.168.* server1.example.com server2.example.com
    # Reuse connections.
    ControlMaster auto
    # Close socket 600s after last connection closes.
    ControlPersist 600
    # Set path for sockets.
    ControlPath ~/.ssh/sockets/%r@%h-%p
    # Forward Emacs-server port.
    RemoteForward 127.0.0.1:51313 127.0.0.1:51313
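
Creating the socket directory is a one-liner, but since the sockets grant access to your connections, it makes sense to restrict the directory to your user right away. A small sketch; the function name and the optional argument are my own additions so that it can be tried on a scratch path.

```shell
#!/bin/sh
# Create the directory that ControlPath points into, accessible only by its
# owner. The optional argument allows trying the function on another path.
# make_socket_dir is a hypothetical helper name.
make_socket_dir() {
    dir="${1:-${HOME}/.ssh/sockets}"
    mkdir -p "${dir}" && chmod 700 "${dir}"
}

# Usage:
#   make_socket_dir
```

Once a connection is open, ssh -O check server1.example.com tells you whether the shared connection is alive and ssh -O exit server1.example.com closes it before the 600 seconds are over.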

Wrapper for emacsclient

Using file paths in TRAMP notation gets annoying really quickly. Thankfully Andy Skelton created a wrapper script; I extended it with the ability to become root using sudo and an option to use it with local servers. This file needs to be distributed to every host that should be able to access the server.

#!/bin/bash
# Open file on a remote Emacs server.
# https://andy.wordpress.com/2013/01/03/automatic-emacsclient/ with added sudo.

params=()
sudo=0
local=0

for p in "${@}"; do
    if [[ "${p}" == "-n" ]]; then
        params+=( "${p}" )
    elif [[ "${p:0:1}" == "+" ]]; then
        params+=( "${p}" )
    elif [[ "${p}" == "--sudo" ]]; then
        sudo=1
    elif [[ "${p}" == "--local" ]]; then
        # Use local server, for use with --sudo.
        local=1
    else
        # Setting field separator to newline so that filenames with spaces will
        # not be split up into 2 array elements.
        OLDIFS=${IFS}
        IFS=$'\n'

        if [[ $(id -u) -eq 0 || ${sudo} -eq 1 ]]; then
            if [[ ${local} -eq 0 ]]; then
                params+=( "/ssh:$(hostname -f)|sudo:$(hostname -f):"$(realpath -m "${p}") )
            else
                params+=( "/sudo:localhost:"$(realpath -m "${p}") )
            fi
        else
            params+=( "/ssh:$(hostname -f):"$(realpath "${p}") )
        fi

        IFS=${OLDIFS}
    fi
done

emacsclient -f ~/.emacs.d/server/server "${params[@]}"
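
To make the path translation easier to follow, here is a stripped-down sketch of just that part of the wrapper. It takes the host name as an argument instead of calling hostname -f, and tramp_path is a name I made up for this example. Like the wrapper, it assumes realpath from GNU coreutils.

```shell
#!/bin/sh
# tramp_path HOST FILE [sudo]: print the TRAMP file name that would be
# passed to emacsclient. Simplified sketch; tramp_path is a hypothetical name.
tramp_path() {
    host="$1"
    file="$(realpath -m "$2")"   # -m: the file does not have to exist
    if [ "${3:-}" = "sudo" ]; then
        # Hop to the host via SSH first, then become root there.
        printf '/ssh:%s|sudo:%s:%s\n' "${host}" "${host}" "${file}"
    else
        printf '/ssh:%s:%s\n' "${host}" "${file}"
    fi
}

# Usage:
#   tramp_path server1.example.com notes.txt
#   tramp_path server1.example.com /etc/fstab sudo
```

The part before the last colon tells TRAMP how to reach the file, the rest is the absolute path on that host.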

I had to add [[ "${TERM}" = "dumb" ]] && unsetopt zle to my Zsh configuration to prevent TRAMP connections from hanging all the time. Thanks to Darius for their answer on StackExchange.

Shell configuration

Now we should set VISUAL and EDITOR to the wrapper and set up some nice, short aliases. In my examples I assume we called our wrapper emacsremote. The argument -f points emacsclient at the TCP server file, so it doesn’t try to use a UNIX domain socket first (and print an error message).

Note
I wrote the following code for Zsh, but it should also work for Bash.
# Set preferred editor.
if command -v emacsclient > /dev/null; then
    VISUAL="$(command -v emacsclient) -f ~/.emacs.d/server/server -a emacs"
    if [[ -n "${SSH_CONNECTION}" ]]; then # Logged in via SSH.
        if command -v emacsremote > /dev/null; then
            VISUAL="$(command -v emacsremote)"
        fi
    elif [[ $(id -u) -eq 0 ]] && command -v emacsremote > /dev/null; then
        # Edit files as root in the Emacs instance run by the current user.
        VISUAL="$(command -v emacsremote) --sudo --local"
    fi
elif command -v emacs > /dev/null; then
    VISUAL="$(command -v emacs)"
elif command -v vim > /dev/null; then
    VISUAL="$(command -v vim)"
elif command -v nano > /dev/null; then
    VISUAL="$(command -v nano)"
fi
export VISUAL
export EDITOR="${VISUAL}"
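# Keep the patterns in variables: Bash treats a quoted right-hand side of =~
# as a literal string, while Zsh treats it as a regular expression.
re_emacs='emacs(client|remote)'
re_remote='emacsremote$'
if [[ "${VISUAL}" =~ ${re_emacs} ]]; then
    alias e="${VISUAL} -n"
    if [[ "${VISUAL}" =~ ${re_remote} ]]; then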
        # Don't block the terminal until the file is closed.
        alias se="${VISUAL} -n --sudo"
    elif command -v emacsremote >/dev/null &&  [[ -z "${SSH_CONNECTION}" ]]; then
        # Edit files as root in the Emacs instance run by the current user.
        alias se="$(command -v emacsremote) -n --sudo --local"
    fi
else
    alias e="${VISUAL}"
    alias se="sudo ${VISUAL}"
fi

To detect SSH connections after using sudo -i, we have to tell sudo to preserve the environment variable SSH_CONNECTION.

echo 'Defaults env_keep += "SSH_CONNECTION"' >> /etc/sudoers.d/ssh_vars

Updates

  • Updated 2019-05-12: Add -f argument to emacsclient.

  • Updated 2019-10-06: Support files with spaces in emacsremote and allow opening files the user can’t read (for use with emacsremote --sudo).

  • Updated 2019-10-17: Added Zsh-hack to prevent hanging TRAMP-connections.

WireGuard VPN with 2 or more subnets

I wanted to create a WireGuard VPN with 2 subnets in different physical places, each with their own server. I couldn’t find an example of how to do that, so I wrote this one.

Introduction

This HowTo is Linux-specific.

I’m going to use the IP range fd69::/48 for the VPN, fd69:0:0:1::/64 for subnet 1 and fd69:0:0:2::/64 for subnet 2. I’m going to call the server of subnet 1 server1, its first client client1a, the second one client1b and so on.

All clients in subnet 1 will connect to server1 and all clients in subnet 2 will connect to server2. server1 and server2 will be connected. If client1a wants to connect to client2a, the route will be: client1a → server1 → server2 → client2a.

Preparations

Install WireGuard, create /etc/wireguard and generate a key-pair on each participating peer.

mkdir /etc/wireguard
cd /etc/wireguard
umask 077
wg genkey | tee privatekey | wg pubkey > publickey

Configure servers

Turn on IP forwarding:
echo "net.ipv6.conf.all.forwarding = 1" > /etc/sysctl.d/ip-forward.conf
sysctl -p /etc/sysctl.d/ip-forward.conf
Note
IP forwarding will put your computer into "router-mode", which means it will no longer autoconfigure via SLAAC. If you need SLAAC, add net.ipv6.conf.DEVICE.accept_ra = 2 to ip-forward.conf.
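
If you set up several servers, the two settings can be written in one go. A small sketch, assuming you pass the interface name that takes the place of DEVICE yourself; write_forward_conf is a name I made up, and eth0 in the usage comment is only a placeholder.

```shell
#!/bin/sh
# write_forward_conf FILE [DEVICE]: write the sysctl settings to FILE.
# With DEVICE given, router advertisements stay accepted on that interface.
# write_forward_conf is a hypothetical helper name.
write_forward_conf() {
    {
        echo "net.ipv6.conf.all.forwarding = 1"
        if [ -n "${2:-}" ]; then
            echo "net.ipv6.conf.${2}.accept_ra = 2"
        fi
    } > "$1"
}

# Usage (as root; eth0 is a placeholder for your interface):
#   write_forward_conf /etc/sysctl.d/ip-forward.conf eth0
#   sysctl -p /etc/sysctl.d/ip-forward.conf
```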
server1:/etc/wireguard/wg0.conf:
# This peer
[Interface]
Address = fd69:0:0:1::1/48
PrivateKey = <PRIVATE KEY OF server1>
ListenPort = 51820

# Server of subnet 2
[Peer]
PublicKey = <PUBLIC KEY OF server2>
Endpoint = server2:51820
AllowedIPs = fd69:0:0:2::/64

# Clients of subnet 1
[Peer]
PublicKey = <PUBLIC KEY OF client1a>
AllowedIPs = fd69:0:0:1::a/128

[Peer]
PublicKey = <PUBLIC KEY OF client1b>
AllowedIPs = fd69:0:0:1::b/128
server2:/etc/wireguard/wg0.conf:
# This peer
[Interface]
Address = fd69:0:0:2::1/48
PrivateKey = <PRIVATE KEY OF server2>
ListenPort = 51820

# Server of subnet 1
[Peer]
PublicKey = <PUBLIC KEY OF server1>
Endpoint = server1:51820
AllowedIPs = fd69:0:0:1::/64

# Clients of subnet 2
[Peer]
PublicKey = <PUBLIC KEY OF client2a>
AllowedIPs = fd69:0:0:2::a/128

Configure clients

client1a:/etc/wireguard/wg0.conf:
[Interface]
Address = fd69:0:0:1::a/48
PrivateKey = <PRIVATE KEY OF client1a>

[Peer]
PublicKey = <PUBLIC KEY OF server1>
Endpoint = server1:51820
AllowedIPs = fd69::/48
PersistentKeepalive = 25
client1b:/etc/wireguard/wg0.conf:
[Interface]
Address = fd69:0:0:1::b/48
PrivateKey = <PRIVATE KEY OF client1b>

[Peer]
PublicKey = <PUBLIC KEY OF server1>
Endpoint = server1:51820
AllowedIPs = fd69::/48
PersistentKeepalive = 25
client2a:/etc/wireguard/wg0.conf:
[Interface]
Address = fd69:0:0:2::a/48
PrivateKey = <PRIVATE KEY OF client2a>

[Peer]
PublicKey = <PUBLIC KEY OF server2>
Endpoint = server2:51820
AllowedIPs = fd69::/48
PersistentKeepalive = 25

The AllowedIPs setting acts as a routing table. When a peer wants to send a packet to an IP, it checks the AllowedIPs of its configured peers, and if the IP falls into one of the listed ranges, it sends the packet through the WireGuard interface to that peer.

The PersistentKeepalive setting ensures that the connection is maintained and that the peer continues to be reachable, even behind a NAT.
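
The client configurations differ only in the address and the server they connect to, so they are easy to generate. The following sketch prints a client configuration that follows the addressing scheme used in this post. The function name client_conf is made up, and the private key is deliberately left as a placeholder.

```shell
#!/bin/sh
# client_conf SUBNET HOST SERVER SERVER_PUBKEY: print a client wg0.conf
# following the addressing scheme fd69:0:0:SUBNET::HOST used in this post.
# client_conf is a hypothetical helper; the private key stays a placeholder.
client_conf() {
    cat <<EOF
[Interface]
Address = fd69:0:0:$1::$2/48
PrivateKey = <PRIVATE KEY OF THIS CLIENT>

[Peer]
PublicKey = $4
Endpoint = $3:51820
AllowedIPs = fd69::/48
PersistentKeepalive = 25
EOF
}

# Usage:
#   client_conf 2 b server2 '<PUBLIC KEY OF server2>' > wg0.conf
```

Don't forget that the server needs a matching [Peer] section for every new client, in this case with AllowedIPs = fd69:0:0:2::b/128 on server2.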

Start VPN

Run wg-quick up wg0 on each peer.

Further reading

The article How to easily configure WireGuard by Stavros Korokithakis helped me a great deal in understanding WireGuard.

Updates

  • Updated 2019-02-16 to include IP forwarding.

  • Updated 2019-02-16 with information on how to turn SLAAC back on.

Using AsciiDoc(tor) with Gitea

In this blogpost I describe what I did to get AsciiDoc support into Gitea. If you want more than syntax highlighting and basic formatting, Gitea unfortunately has to be patched (this issue has already been reported). But I think most people will only need to edit 1 configuration file and be done.

Asciidoctor or AsciiDoc?

Asciidoctor has inbuilt support for highlight.js, the solution Gitea uses, and is therefore the best choice in most scenarios. If you can’t or don’t want to use it, you can use AsciiDoc.

Add the following section to conf/app.ini in your Gitea path. The change causes .adoc files to be rendered with asciidoctor.

[markup.asciidoc]
ENABLED = true
; List of file extensions that should be rendered by an external command
FILE_EXTENSIONS = .adoc,.asciidoc
; External command to render all matching extensions
RENDER_COMMAND = "asciidoctor --backend=html5 --no-header-footer --attribute source-highlighter=highlightjs --out-file=- -"
; If true, the file name is passed as an argument instead of the content on STDIN.
IS_INPUT_FILE = false

If you want to use asciidoc instead the command would be: asciidoc --backend=xhtml11 --no-header-footer --attribute source-highlighter=highlight --out-file=- -. I would choose the xhtml11 backend because it is the only one that encloses code snippets with <code> tags. Instead of highlight you can use source-highlight or Pygments.

If you use asciidoctor and don’t need tables or other fancy stuff you’re now done! If you use asciidoc, you’ll have to patch Gitea to get syntax highlighting.
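
Before restarting Gitea it can be helpful to run the render command by hand and look at the HTML it produces. A minimal sketch that wraps the asciidoctor command from the configuration above in a function; render_adoc is a name I made up, and README.adoc in the usage comment stands for any AsciiDoc file of yours.

```shell
#!/bin/sh
# Run the same command Gitea is configured to run: read AsciiDoc on STDIN,
# print the HTML fragment on STDOUT. render_adoc is a hypothetical name.
render_adoc() {
    asciidoctor --backend=html5 --no-header-footer \
        --attribute source-highlighter=highlightjs --out-file=- -
}

# Usage (requires asciidoctor):
#   render_adoc < README.adoc
```

If this prints an error instead of HTML, Gitea will not be able to render the file either.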

Patching Gitea

The sanitizer strips almost all attributes from HTML-tags, as a security precaution. I’ve added exceptions for:

  • class attributes on all the tags Asciidoctor introduces,

  • Numerous attributes on table tags,

  • align and valign on td tags,

  • style attributes on span tags, but only if they contain nothing more than color and font definitions.

If you use Asciidoctor with highlight.js output, you don’t need to allow style attributes. If you don’t use tables, you can omit the lines that deal with them. The class exception is only useful if you add custom CSS that uses those classes.

Apply the patch with patch -p1 < gitea_relax-sanitizer.patch.

diff -ur a/modules/markup/sanitizer.go b/modules/markup/sanitizer.go
--- a/modules/markup/sanitizer.go   2019-01-26 16:04:56.014108339 +0100
+++ b/modules/markup/sanitizer.go   2019-01-26 16:03:21.776401012 +0100
@@ -38,6 +38,16 @@
 
        // Custom URL-Schemes
        sanitizer.policy.AllowURLSchemes(setting.Markdown.CustomURLSchemes...)
+       // Allow style on span tags
+       sanitizer.policy.AllowAttrs("style").Matching(regexp.MustCompile(`^(background-)?color:[^;]+(; ?font[^;]+)?;?$`)).OnElements("span")
+
+       // Allow class attribute
+       sanitizer.policy.AllowAttrs("class").OnElements("code", "pre", "span", "div", "p", "table", "td")
+
+       // Allow table attributes
+       sanitizer.policy.AllowAttrs("width", "frame", "rules", "cellspacing", "cellpadding").OnElements("table")
+       sanitizer.policy.AllowAttrs("width").OnElements("col")
+       sanitizer.policy.AllowAttrs("align", "valign").OnElements("td")
    })
 }
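
To get a feeling for which style values the regular expression from the patch lets through, you can try it on a few strings. The following sketch uses grep -E instead of Go's regexp package; for this particular pattern both should behave the same, but treat that as an assumption. The function name style_allowed and the sample values are made up.

```shell
#!/bin/sh
# style_allowed VALUE: succeed if VALUE matches the regular expression the
# patch uses for style attributes on span tags. Uses grep -E, not Go.
# style_allowed is a hypothetical helper name.
style_allowed() {
    printf '%s\n' "$1" \
        | grep -Eq '^(background-)?color:[^;]+(; ?font[^;]+)?;?$'
}

# Try a few sample values and report what would happen to each of them.
for style in 'color: #008800' \
             'background-color: yellow; font-weight: bold;' \
             'position: absolute' \
             'color: red; position: absolute'; do
    if style_allowed "${style}"; then
        echo "kept:     ${style}"
    else
        echo "stripped: ${style}"
    fi
done
```

Only a color or background-color declaration, optionally followed by a single font declaration, gets through. Everything else causes the whole style attribute to be dropped.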

Tables without borders

I used tables without borders in a manpage I wrote for the list of options. Gitea insists on drawing borders around them, so I had to create a custom CSS snippet.

In your Gitea directory, create custom/templates/custom/header.tmpl.

<style>
    /* Additions for asciidoc */
    .markdown:not(code) table.frame-none
    {
        border: 0 !important;
    }
    .markdown:not(code) table.grid-none *
    {
        border: 0 !important;
    }
</style>