Project Isidore is a personal website written in Common Lisp. This article serves simultaneously as project documentation and, via org-babel, as the literate-programming source.
1. Introduction
Welcome! This is the documentation webpage of my personal website. The canonical source code is located at GitHub - HanshenWang/project-isidore: Personal Web Application. For usage, please see the user manual and the navigation bar located at the top of the website.
- Common Lisp Environment Setup - How to get started with Lisp
- Org Publish Pipeline Section - How I Blog from Org Mode
- High Level Project Overview - How to navigate the source tree
- Project Isidore User Manual - Tutorial & Guides
- Project Isidore Reference Manual - For Developers
Navigation buttons are inserted next to each header to take one back to the table of contents.
Copyright (c) 2021 Hanshen Wang. Source code is under the GNU-AGPL-3.0 License. Blog content is available under the CC-BY-SA 4.0 License unless otherwise noted.
2. Common Lisp Environment Setup
He is like to a man building a house, who digged deep and laid the foundation upon a rock. And when a flood came, the stream beat vehemently upon that house: and it could not shake it: for it was founded on a rock. But he that heareth and doth not is like to a man building his house upon the earth without a foundation: against which the stream beat vehemently. And immediately it fell: and the ruin of that house was great.
–Luke 6:48-49
Why lisp? There are always more poets than non-poets in the world, or so I've heard.
Practically speaking, a complete and recent tour of the Lisp world for beginners has already been written: Steve Losh - Intro to Common Lisp. Steve Losh characterizes the field of web development – at least in Anno Domini 2021 – not as a hamster wheel of backwards incompatibility, but a hamster centrifuge. A house of sand, indeed. As for the technical advantages, Paul Graham's writings convey them better than I can.
Below are the steps I have taken to set up a Common Lisp environment inside Spacemacs.
- CL the Language = ANSI INCITS 226-1994
- CL Implementation = Steel Bank Common Lisp. If you've deep pockets or love reading about the history of Lisp implementations, see: The History of Franz and Lisp (IEEE Journals & Magazine, IEEE Xplore) and Common Lisp - Myths and Legends
- CL Library/Package Manager = Quicklisp
- CL System Manager/Build Tooling = Another System Definition Facility (comes bundled with SBCL)
- CL IDE = Spacemacs with SLY
Install Steel Bank Common Lisp, a Common Lisp implementation.
Other implementations of Common Lisp the language exist, but SBCL is the premier open-source implementation as of this writing.
Installing SBCL from your Linux distribution's package manager is the most straightforward way. On Debian/Ubuntu this is as simple as
sudo apt install sbcl
Downloading and unpacking a binary from http://www.sbcl.org/getting.html or building from source are options you can explore later. In the shell,

sbcl
* (+ 2 3)
* (quit)

will produce the startup SBCL banner and REPL, evaluate an expression, and quit.
Install Quicklisp - the Common Lisp Package Manager
In the shell,
curl -O https://beta.quicklisp.org/quicklisp.lisp
sbcl --load quicklisp.lisp
## Inside the SBCL prompt,
(quicklisp-quickstart:install)
(ql:add-to-init-file) # autoload into sbcl initialization
(quit)
After loading a project via (ql:quickload :project-name), it will be stored locally, in a directory similar to /home/$USER/quicklisp/dists/quicklisp/software/example-project-2020-01-01-git. Also, (ql:where-is-system 'system-name) will return the system's location. project-name and system-name are interchangeable here.

In order to (ql:quickload :your-local-project), Quicklisp looks in /home/$USER/quicklisp/local-projects/ for said project. You can symlink your project if you desire to use some other folder:

ln -s /home/$USER/quicklisp/local-projects/project-a/ /home/$USER/project-a/

A definition of what exactly constitutes a Lisp package or a Lisp system is not of primary importance yet. It will be explained in proper order during your introduction to the language; specifically, it is covered in Peter Seibel's excellent pedagogical work, Practical Common Lisp, and other guides are located here and here. N.B. that (ql:quickload :your-local-project) also calls (asdf:load-system :your-local-project). The difference is that (ql:quickload) will download any missing system dependencies.

Install Spacemacs common-lisp layer - a Common Lisp IDE

The below steps assume you are already familiar with Spacemacs. Inside Spacemacs,

SPC-h-SPC RET common-lisp RET

and follow the layer README.

For those who are understandably wary of Emacs, other IDE options exist. Lisp is one of the oldest higher-level languages (beaten only by FORTRAN by a few years). With that rich tradition came a time when the underlying hardware, operating system, and developer tooling were unified under Lisp. Unfortunately, the closest one can reasonably come today is Emacs, a Lisp interpreter running on C/UNIX/hardware. Emacs still offers the best-in-class experience amongst open-source offerings, but the notorious learning curve of Emacs can be tempered with a preset configuration: Spacemacs. For those on a Microsoft Windows machine, I have written installation instructions.
Optional: Enable goto definition for SBCL primitives
Download CL Implementation source files and extract them to the location specified by (sb-ext:set-sbcl-source-location), which is set in your user configuration dotfile /home/$USER/.sbclrc. Add it if not already present:

(sb-ext:set-sbcl-source-location "/usr/share/sbcl-source/")

Now you can use g d to call jump-to-definition to go to the raw documentation: the source code.

Optional: Enable offline access to the CL HyperSpec language reference
The full text of the ANSI Common Lisp Standard (1994) is available online in HTML form. To have the reference handy offline and to be able to browse it within Emacs, first download and extract the HTML source for HyperSpec 7.0.tar.gz from the great Internet Archive.
Then we can configure Emacs to open only HyperSpec links inside the Emacs web browser EWW and also inform our Common Lisp IDE of the HyperSpec location.
(setf common-lisp-hyperspec-root "file:///home/$USER/HyperSpec/")
;; Optionally, execute the HYPERSPEC-LOOKUP function with local variable
;; changes to view HyperSpec links exclusively in EWW.
(advice-add 'hyperspec-lookup
            :around
            (lambda (orig-fun &rest args)
              (setq-local browse-url-browser-function 'eww-browse-url)
              (apply orig-fun args)))
Now , h H will call sly-hyperspec-lookup to peruse the symbol at point in the HyperSpec. Note the default behavior of sly-hyperspec-lookup is to open a web browser at the online HyperSpec.
Newcomers to the language are advised to dive right into the recommended reading. Now's a good chance to use org-babel for some note-taking. If you've forked the repo to play around with, a high-level overview of the project may be useful.
2.1. Literate Programming
Lisp was the language for research into artificial intelligence before the AI winter. Now the field calls itself machine intelligence. The aim remains much like the story of Icarus: to ape the most noble aspect of man, his rational nature. Until that is realized, humans will be first and foremost the most important audience for any computer language or software. Here is how I set up literate programming with Lisp.
You have likely heard of org-babel, an extension of org-mode that allows one to
interleave text and code. It is comparable to a more powerful Jupyter
notebook. Enable Lisp in org-babel blocks by adding the following to your init.el/user-config.el:
(org-babel-do-load-languages 'org-babel-load-languages '((lisp . t)))
When evaluating a code block with C-c C-c (, , in Spacemacs), make sure to start SLIME first (M-x slime RET).
(princ "Hello World!")
Some may be familiar with poly-org, a MELPA package which allows multiple
major modes in one buffer. Naturally this comes in handy for literate
programming. It uses font-lock-mode to turn on the relevant major mode when
your cursor is inside said code block. This saves you from having to call
org-edit-special repeatedly.
Furthermore, for most languages you can only evaluate the
entire code block. Not so for lisp. M-x slime-compile-defun
and M-x
slime-compile-region
do as they say on the tin: compile the specific
function or highlighted region at cursor. Poly-org
breaks these functions
slightly as they do not treat #+begin_src
and #+end_src
as the start and
end-of-file respectively. The following emacs lisp snippet fixes that.
(with-eval-after-load "poly-org" ;; sly-compile-file sends entire .org file. Narrow to span as done in poly-R ;; https://github.com/polymode/poly-org/issues/25 (when (fboundp 'advice-add) (advice-add 'slime-compile-file :around 'pm-execute-narrowed-to-span) (advice-add 'slime-compile-defun :around 'pm-execute-narrowed-to-span) (advice-add 'slime-load-file :around 'pm-execute-narrowed-to-span) (advice-add 'slime-eval-defun :around 'pm-execute-narrowed-to-span) (advice-add 'slime-eval-last-expression :around 'pm-execute-narrowed-to-span) (advice-add 'slime-eval-buffer :around 'pm-execute-narrowed-to-span)))
Since we've already shaved the editor yak up to this point, the last package
I'd like to mention that makes literate programming possible is
org-tanglesync. To "tangle" a file is, in literate programming parlance, to
extract just the source code from a document. De-tangling (two-way sync), on
the other hand, has always been a problem, and traditionally org-babel-detangle
has relied upon unsightly link comments to do so.
Org-tanglesync
keeps your .git controlled source code and your .org mode
file in sync. And it does that without any markings or artifacts on the tangled
code, making collaboration easier. If a more elegant solution to
de-tangle a file exists out in the wild, please do let me know.
3. Project Isidore System Definition
Bibliographies exist for a written corpus of work. The same sort of metadata is
needed for a code base. In Common Lisp, it is the .asd file.
Project Isidore follows the common Model-view-controller (MVC) design pattern. It entails encapsulating data together with its processing (the model) and isolating it from the manipulation (the controller) and presentation (the view) parts of the user interface.

The project follows the ASDF package-inferred-system style of using defpackage
forms to specify file inter-dependencies. The entry point of the dependency graph
is packages.lisp. To aid in understanding the code base, the files are named after
the MVC design pattern.
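As a rough illustration of the package-inferred style (the file and package names below are hypothetical, not the actual Project Isidore layout), the .asd file declares the system class and each file's defpackage form doubles as its dependency list:

;; example.asd -- hypothetical minimal package-inferred system.
(defsystem "example"
  :class :package-inferred-system
  :depends-on ("example/packages"))

;; views.lisp -- ASDF infers from this defpackage form that the file
;; depends on the file defining the EXAMPLE/MODEL package.
(defpackage #:example/views
  (:use #:cl)
  (:import-from #:example/model #:lookup-record)
  (:export #:render-page))
(in-package #:example/views)

(defun render-page (id)
  "Hypothetical view function built on the model layer."
  (format nil "<p>~a</p>" (lookup-record id)))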
For an index of symbols, functions and definitions, see the Reference Manual.
3.1. Libraries & System dependencies
Finding new Libraries & Surveying the Ecosystem
Looking for a library? In addition to online search queries, use command(s)
(ql:system-apropos :library)        ; search for term in quicklisp dist
(ql:who-depends-on :library)        ; usage in lisp ecosystem
(ql-dist:dependency-tree :library)  ; number of dependencies upstream
;; See quicklisp-stats README for example usage.
(ql:quickload "quicklisp-stats")    ; look at quicklisp download stats
(ql:quickload "quicksearch")
(qs:? "bluetooth" :du 100)
In addition, look to Sabracrolleton's detailed reviews of Common Lisp libraries. Another great hint to library quality is the Github page for Zach Beane, who does the thankless job of maintaining Quicklisp.
Quicklisp in one respect is less like Javascript's Node Package Manager and more like Debian's apt. Zach makes sure that ALL libraries on Quicklisp build together. He takes this burden upon himself so the end users might avoid dependency hell.
System Dependency graph
Call sly-eval-buffer on the following code block to update the graph.

;; https://40ants.com/lisp-project-of-the-day/2020/05/0063-asdf-viz.html
;; Inside shell "sudo apt install graphviz".
(ql:quickload :cl-dot)
;; Not present in quicklisp, retrieve from https://github.com/guicho271828/asdf-viz
(ql:quickload :asdf-viz)
(ql:quickload :project-isidore) ; also loads cl-ppcre.
(setf cl-dot:*dot-path*
      (string-trim '(#\space #\newline)
                   (second (ppcre:split " " (nth-value 0 (uiop:run-program "whereis -b dot" :output :string))))))
;; Tilde char "~" in destination pathname throws an error.
(asdf-viz:visualize-asdf-hierarchy
 (asdf:system-relative-pathname "project-isidore"
                                "assets/project-isidore-dependency-graph.png")
 (list (asdf:find-system :project-isidore)))
;; "asdf-viz" also can draw class hierarchies and call graphs.
Figure 1: Project Isidore dependency graph

Transitive dependencies & lines of code, from running:

cd ~/quicklisp/dists/quicklisp/software/
find . -name '*.lisp' | xargs wc -l

Date | Version (Commit) | # of Libraries | LOC |
---|---|---|---|
2021-11-05 | v1.1.0 (7cc0598) | 35 | 251532 |
2021-12-23 | v1.2.0 (fd7c9f3) | 42 | 264852 |

- Javascript Dependencies
  - Highlight.js 11.2.0: Upgrade via editing :head under org-publish-project-alist.
  - Mathjax 2.7.0: See C-h v org-html-mathjax-options and org-html-mathjax-template.
Lisp Web Example Projects & Misc. Resources
Cool libraries to check out:
Library | Author | Function |
---|---|---|
CL-CSV | Edward Marco Baringer | Spreadsheet |
CLOG | David Botton | Websockets |
CL-Unification | Marco Antonetti | Generalized pattern matching |
Cells | Kenny Tilton | Dataflow (GUI) |
- https://cells-gtk.common-lisp.dev/cgtk-primer.html
- https://nullprogram.com/blog/2013/06/10/ (WebGPU is the successor to WebGL).
- GitHub - byulparan/websocket-demo: websocket/webgl demo in CommonLisp
- https://www.cs.cmu.edu/Groups/AI/html/faqs/lang/lisp/
- Duncan Bayne / heroku-app-clozure-common-lisp · GitLab. Used as a base template. Hence the AGPL License on my own derived work.
- GitHub - gongzhitaao/orgcss: Simple and clean CSS for Org-exported HTML Used as a base template.
- bendersteed / bread and roses · GitLab
- GitHub - vindarel/lisp-web-template-productlist: A web template with Hunchent…
- cl-bootstrap/demo at master · rajasegar/cl-bootstrap · GitHub
- GitHub - mtravers/wuwei: WuWei – effortless Ajax web UIs from Common Lisp
- https://common-lisp.net/
- https://google.github.io/styleguide/lispguide.xml
- Tools of the Trade, from Hacker News. List of SaaS services
- GitHub - 255kb/stack-on-a-budget: A collection of services with great free ti…
- GitHub - ripienaar/free-for-dev: A list of SaaS, PaaS and IaaS offerings that…
- Lisp Web Server From Scratch using Hunchentoot and Nginx | Zaries's Blog
- ergolib/web at master · rongarret/ergolib · GitHub
- GitHub - eigenhombre/weeds: Work in progress porting my blog software to Comm…
- GitHub - mmontone/cl-forms: Web forms handling library for Common lisp
- lispm comments on Steel Bank Common Lisp (SBCL) 1.4.7 released
- READ EVAL PRINT — It's alive! The path from library to web-app.
- GitHub - no-defun-allowed/nqthm: nqthm - the original(ish) Boyer-Moore theore…
- GitHub - urweb/urweb: The Ur/Web programming language
- LISA - a production-rule system for the development of Lisp-based Intelligent…
- EPTCS 359: ACL2 Theorem Prover and its Applications
- GitHub - hemml/OMGlib: A Common Lisp library to build fully dynamic web inter…
- https://imps.mcmaster.ca/imps-system/README
- GitHub - personal-mirrors/maxima: 🔰 Personal hopefully daily updated Maxima m…
- GitHub - svspire/YACC-is-dead: YACC is dead for Common Lisp (based on http://…
4. Org Publish Pipeline
In the /assets/blog/ folder, all HTML files are generated by org-publish. archive.html lists all published articles.
In Spacemacs you will need to enable the org
layer. Org Mode comes with Org
Publish built in. Org Publish takes advantage of Org's excellent export
capabilities to generate not only the HTML but also the sitemap and RSS.
To publish/update an article:
- Prepare draft/existing notes for publishing: move the .org file to the input folder, remove extraneous links, update citations, check formatting, etc.
- M-x org-publish RET blog RET
- Commit and push changes in git
My Org Publish config is part of my literate .spacemacs org file. I plan to publish the entire dotfile one day, similar to Org Mode - Organize Your Life In Plain Text! Until then, here is all the relevant configuration extracted.
Other exemplars: "good tutorials on org-publish" (r/emacs) and "New and Improved: Two-Wrongs Now Powered By Org Mode".
;;-------------------------------------------------------------------------
;; *** Org Ox Publish Config
;;-------------------------------------------------------------------------
(require 'ox-rss)
;; Following 2 lines are needed to exclude parent heading from table of
;; contents but still export the content
;; https://emacs.stackexchange.com/questions/30183/orgmode-export-skip-ignore-first-headline-level
(require 'ox-extra)
(ox-extras-activate '(ignore-headlines))
;; Allows exporting bibtex citations to html
(require 'ox-bibtex)
;; Exclude default CSS from html export and add external stylesheet
(setq org-html-head-include-default-style nil)
;; Omit inline css as we use an imported stylesheet
(setq org-html-htmlize-output-type 'css)
;; https://www.taingram.org/blog/org-mode-blog.html
(setq org-export-global-macros
      '(("timestamp" . "@@html:<span class=\"timestamp\">[$1]</span>@@")))

(defun my/org-sitemap-date-entry-format (entry style project)
  "Format ENTRY in org-publish PROJECT Sitemap format ENTRY ENTRY STYLE format that includes date."
  (let ((filename (org-publish-find-title entry project)))
    (if (= (length filename) 0)
        (format "*%s*" entry)
      (format "{{{timestamp(%s)}}} [[file:%s][%s]]"
              (format-time-string "%Y-%m-%d" (org-publish-find-date entry project))
              entry
              filename))))

(setq org-publish-project-alist
      '(("blog"
         :components ("blog-content" "blog-rss"))
        ("blog-content"
         :base-directory "~/Dropbox/project-maria/blog"
         :html-extension "html"
         :base-extension "org"
         :recursive t
         :publishing-function org-html-publish-to-html
         :publishing-directory "~/project-isidore/assets/blog"
         :section-numbers t
         :table-of-contents t
         :exclude "rss.org"
         :with-title nil
         :auto-sitemap t
         :sitemap-filename "archive.org"
         :sitemap-title "Blog Archive"
         :sitemap-sort-files anti-chronologically
         :sitemap-style tree
         :sitemap-format-entry my/org-sitemap-date-entry-format
         ;; Use HTML5
         ;; https://orgmode.org/manual/HTML-doctypes.html#HTML-doctypes
         :html-doctype "html5"
         :html-html5-fancy t
         ;; Link to external custom stylesheet
         ;; If you need code highlight from highlight.js, include the latter three lines.
         :html-head "
<link rel=\"stylesheet\" type=\"text/css\" href=\"../global.css\"/>
<link rel=\"stylesheet\" href=\"//cdnjs.cloudflare.com/ajax/libs/highlight.js/11.2.0/styles/base16/solarized-light.min.css\">
<script src=\"//cdnjs.cloudflare.com/ajax/libs/highlight.js/11.2.0/highlight.min.js\" defer></script>
<script>var hlf=function(){Array.prototype.forEach.call(document.querySelectorAll(\"pre.src\"),function(t){var e;e=t.getAttribute(\"class\"),e=e.replace(/src-(\w+)/,\"src-$1 $1\"),console.log(e),t.setAttribute(\"class\",e),hljs.highlightBlock(t)})};addEventListener(\"DOMContentLoaded\",hlf);</script>"
         :html-preamble "
<div class=\"header header-fixed\">
  <div class=\"navbar container\">
    <div class=\"logo\"><a href=\"/\">Hanshen Wang</a></div>
    <input type=\"checkbox\" id=\"navbar-toggle\" >
    <label for=\"navbar-toggle\"><i></i></label>
    <nav class=\"menu\">
      <ul>
        <li><a href=\"/about\">About</a></li>
        <li><a href=\"/work\">Work</a></li>
        <li><a href=\"/blog/archive.html\">Blog</a></li>
        <li><a href=\"/contact\">Contact</a></li>
      </ul>
    </nav>
  </div>
</div>
<h1 class=\"title\">%t</h1>
<p class=\"subtitle\">%s</p>
<br/>
<p class=\"updated\"><a href=\"/contact#article-history\">Updated:</a> %C</p>"
         ;; Article Postamble includes
         ;; Javascript snippet to insert anchor links to Table of Contents
         ;; HTML Footer
         :html-postamble "<script>
const headers = Array.from( document.querySelectorAll('h2, h3, h4, h5, h6') );
headers.forEach( header => {
  header.insertAdjacentHTML('afterbegin', '<a href=\"#table-of-contents\">⇱</a>' );
});
</script>
<hr/>
<footer>
  <div class=\"copyright-container\">
    <div class=\"copyright\">
      Comments? Corrections? <a href=\"https://hanshenwang.com/contact\"> Please do reach out.</a><a href=\"https://hanshenwang.com/blog/rss.xml\"> RSS Feed. </a><a href=\"https://hanshenwang.com/subscribe\"> Mailing List. </a><br/>
      Copyright © 2021 Hanshen Wang. Some Rights Reserved.<br/>
      Blog content is available under <a rel=\"license\" href=\"http://creativecommons.org/licenses/by-sa/4.0/\"> CC-BY-SA 4.0 </a> unless otherwise noted.
    </div>
    <div class=\"cc-badge\">
      <a rel=\"license\" href=\"http://creativecommons.org/licenses/by-sa/4.0/\">
        <img alt=\"Creative Commons License\" src=\"https://i.creativecommons.org/l/by-sa/4.0/88x31.png\" height=\"31\" width=\"88\">
      </a>
    </div>
    <div class=\"rss-badge\">
      <a rel=\"license\" href=\"http://hanshenwang.com/blog/rss.xml\">
        <img alt=\"Really Simple Syndication - RSS\" src=\"https://upload.wikimedia.org/wikipedia/en/thumb/4/43/Feed-icon.svg/50px-Feed-icon.svg.png\" height=\"50\" width=\"50\">
      </a>
    </div>
  </div>
  <div class=\"generated\">
    Created with %c on <a href=\"https://www.gnu.org\">GNU</a>/<a href=\"https://www.kernel.org/\">Linux</a>
  </div>
</footer>")
        ("blog-rss"
         :base-directory "~/Dropbox/project-maria/blog"
         :base-extension "org"
         :publishing-directory "~/project-isidore/assets/blog"
         :publishing-function publish-posts-rss-feed
         :rss-extension "xml"
         :html-link-home "http://hanshenwang.com/"
         :html-link-use-abs-url t
         :html-link-org-files-as-html t
         :exclude "archive.org"
         :auto-sitemap t
         :sitemap-function posts-rss-feed
         :sitemap-title "Hanshen Wang Blog RSS"
         :sitemap-filename "rss.org"
         :sitemap-style list
         :sitemap-sort-files anti-chronologically
         :sitemap-format-entry format-posts-rss-feed-entry)))

;; https://alhassy.github.io/AlBasmala#Clickable-Headlines
(defun my/ensure-headline-ids (&rest _)
  "Org trees without a custom ID will have
All non-alphanumeric characters are cleverly replaced with ‘-’.

If multiple trees end-up with the same id property, issue a message and
undo any property insertion thus far.

E.g., ↯ We'll go on a ∀∃⇅ adventure ↦ We'll-go-on-a-adventure"
  (interactive)
  (let ((ids))
    (org-map-entries
     (lambda ()
       (org-with-point-at (point)
         (let ((id (org-entry-get nil "CUSTOM_ID")))
           (unless id
             (thread-last (nth 4 (org-heading-components))
               (s-replace-regexp "[^[:alnum:]']" "-")
               (s-replace-regexp "-+" "-")
               (s-chop-prefix "-")
               (s-chop-suffix "-")
               (setq id))
             (if (not (member id ids))
                 (push id ids)
               (message-box "Oh no, a repeated id!\n\n\t%s" id)
               (undo)
               (setq quit-flag t))
             (org-entry-put nil "CUSTOM_ID" id))))))))
;; Whenever html & md export happens, ensure we have headline ids.
(advice-add 'org-html-export-to-html :before 'my/ensure-headline-ids)
(advice-add 'org-md-export-to-markdown :before 'my/ensure-headline-ids)

;; https://nicolasknoebber.com/posts/blogging-with-emacs-and-org.html
(defun format-posts-rss-feed-entry (entry _style project)
  "Format ENTRY for the posts RSS feed in PROJECT."
  (org-publish-initialize-cache "blog-rss")
  (let* ((title (org-publish-find-title entry project))
         (link (concat "blog/" (file-name-sans-extension entry) ".html"))
         (author (org-publish-find-property entry :author project))
         (pubdate (format-time-string (car org-time-stamp-formats)
                                      (org-publish-find-date entry project))))
    (message pubdate)
    (format "%s
:properties:
:rss_permalink: %s
:author: %s
:pubdate: %s
:end:\n"
            title link author pubdate)))

(defun posts-rss-feed (title list)
  "Generate a sitemap of posts that is exported as a RSS feed.
TITLE is the title of the RSS feed.  LIST is an internal
representation for the files to include.  PROJECT is the current
project."
  (concat "#+TITLE: " title "\n#+EMAIL: [email protected]" "\n\n"
          (org-list-to-subtree list)))

(defun publish-posts-rss-feed (plist filename dir)
  "Publish PLIST to RSS when FILENAME is rss.org.
DIR is the location of the output."
  (if (equal "rss.org" (file-name-nondirectory filename))
      (org-rss-publish-to-rss plist filename dir)))
5. Development Operations
Follow industry best practice in automating part of development operations. In the context of this project, CI/CD is done on the Github Actions platform. Cheers to Github for granting unlimited build minutes to open source projects! In the following breakdown, I explain how to run the steps on the local machine.
For the uninitiated, an excellent git porcelain (with spacemacs layer integration) is Magit. This, paired with [1], takes care of your version control needs.
5.1. Continuous Integration
Commits receive some measure of quality assurance through unit, integration and regression testing. This is done in MAKE.LISP; see also TESTS.LISP.
5.1.1. Testing
Project Isidore code coverage report.
There does exist a detailed evaluation of the many existing Common Lisp testing
frameworks, the most recent iteration by Sabracrolleton. From this we select
parachute as our testing framework of choice.
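As a minimal sketch of what a Parachute test looks like (the test name and the function under test are hypothetical, not taken from TESTS.LISP):

(ql:quickload "parachute")

;; Hypothetical test; GENERATE-INDEX-PAGE stands in for any application
;; function that returns an HTML string.
(parachute:define-test index-page-is-html
  ;; TRUE passes when the form returns non-NIL.
  (parachute:true (search "<!DOCTYPE html>" (generate-index-page))))

;; Run it at the REPL:
(parachute:test 'index-page-is-html)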
Project Isidore code aims to be portable Common Lisp within reason. Common Lisp Portability Library Status
Notes on Drakma
From: https://courses.cs.northwestern.edu/325/admin/json-client.php
Install Drakma
This should be easy.
(ql:quickload "drakma")
Using Drakma
Getting data from a URL with Drakma is simple:
(drakma:http-request url)
url needs to be a string containing a complete URL. Drakma will send a request for that URL to the server indicated, just as a browser or other user client would.
drakma:http-request returns seven values, as Lisp multiple values. If you need to save or use values other than the first, use multiple-value-bind or a similar form. The values returned, in order, are
- the body of the reply: either a string (when getting HTML or plain text), a binary array (for images, audio, and JSON), or a file stream if requested using a keyword parameter
- the HTTP status code
- an alist of the headers sent by the server
- the URI the reply comes from, which might be different than the request when redirects occur
- the stream the reply was read from
- a boolean indicating whether the stream should be closed
- the server's status text, e.g., "OK"
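For instance, to pick off just the body and the status code (a small sketch; the URL is arbitrary):

(ql:quickload "drakma")

(multiple-value-bind (body status-code)
    (drakma:http-request "https://www.hanshenwang.com/about")
  ;; BODY is a string here because the reply is HTML.
  (if (= 200 status-code)
      (subseq body 0 (min 200 (length body))) ; peek at the first 200 characters
      status-code))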
Notes on CL-SMTP
Use a basic Gmail account to “Send mail as” with a domain that uses Cloudflar…
;; https://gist.github.com/carboleum/cf03b4c16655f257d96bda8f41f51471
;; Gmail is limited to 500 emails a day.
(ql:quickload "cl-smtp")
(let ((from "[email protected]")
      (to "[email protected]")
      (subject "test")
      (message "Hello ;-) from cl-smtp")
      (login "[email protected]")
      ;; Generate an application password via google account settings.
      ;; Use fly.io secrets for production deployments.
      (passwd "replace-me-with-app-pwd"))
  (cl-smtp:send-email "smtp.gmail.com" from to subject message
                      :ssl :tls
                      :authentication `(,login ,passwd)))
Debugging
The following advice is from a former maintainer of SBCL, Nikodemus Siivola. The
tracing and stickers functionality in the SLY IDE is also very useful. Call
sly-compile-defun with a universal argument C-u to recompile with the highest
debug settings.
- Test case reduction is an essential skill, no matter the language or the environment. It is the art of reducing the code that provokes the issue (wrong result, an error, whatever) down to manageable size – including the full call path involved, and environmental issues like Slime vs no Slime, or current directory. The smaller the better in general, but it is a balancing act: if you can identify the issue using other methods in five minutes, it doesn't make sense to spend an hour or two boiling down the test case. …but when other methods are not producing results, and time is dragging on then you should consider this.
- Defensive programming. Not just coding to protect against errors, but coding so that your code is easy to debug. Avoid IGNORE-ERRORS and generally swallowing errors silently. Sometimes it is the right thing to do, but the more you do it, the harder BREAK-ON-SIGNALS becomes to use when you need it. Avoid SAFETY 0 like the plague – it can hide a multitude of sins. Avoid DEBUG 0 – it doesn't pay. Write PRINT-OBJECT methods for your objects, give your objects names to use when printing. Check that slots are bound before you use them in your PRINT-OBJECT methods. NEVER use INTERRUPT-THREAD or WITH-TIMEOUT unless you really know what you are doing and exactly why I'm telling you not to use them.
- Stop to think. Read the error messages, if any, carefully. Sometimes they're no help at all, but sometimes there are nuggets of information in them that a casual glance will miss.
- Know your environment. (This is what the question was really about, I know…)
3.0. M-. is gold. Plain v on a backtrace frame in Slime may also take you places if your code has a sufficiently high debug setting, but M-. should work pretty much always.
3.1. The Slime Inspector is one of my primary debugging tools – but that is probably affected by the kind of code I work on, so it might not be the same for everyone. Still, you should familiarize yourself with it – and use the fancy one. :)
3.2. While SBCL backtraces aren't at the moment the prettiest ones in the world, try to make sense out of them. Just do (error "foo") in the REPL, and figure out what is going on. Experiment with both the plain SBCL debugger and the fancy Slime Debugger before you need to use them for real. They'll feel a lot less hostile that way. I'll write advice on interpreting the backtraces at another time.
3.3. Learn about BREAK-ON-SIGNALS and TRACE. Also note the SBCL extensions to TRACE.
3.4. The stepper isn't really a debugging tool, IMO – it is a tool for understanding control flow, which sometimes helps in debugging – but if you compile your code with DEBUG 3, then (STEP (FOO)) can take you to places.
3.5. Learn about M-RET (macroexpand) in Slime. Discover what happens if you do (SETF *PRINT-GENSYM* NIL) first, and understand the potential danger there – but also the utility of being easily able to copy-paste the expansion into your test case when you're trying to reduce it. (Replacing expansions of macros in the COMMON-LISP package is typically pointless, but replacing those from user packages can be golden.)
3.6. If all else fails, do (sb-ext:restrict-compiler-policy 'safety 3) and (sb-ext:restrict-compiler-policy 'debug 3) and recompile your code. Debugging should be easier now. If the error goes away, either (a) you had a type-error or similar in SAFETY 0 code that was breaking stuff but is now swallowed by an IGNORE-ERRORS or a HANDLER-CASE or (b) you may have found an SBCL bug: compiler policy should not, generally speaking, change code's behaviour – though there are some ANSI-mandated things for SAFETY 3, and high DEBUG can inhibit tail-call optimizations which, as Schemers know, can matter.
3.7. DISASSEMBLE isn't normally a debugging tool, but sometimes it can help too. Depends on what the issue is.
- Extensions to printf() debugging. Sometimes this is just the easiest thing. No shame in there.
4.1. Special variables are nice.
(LET ((DEBUG :MAGIC)) …) and elsewhere (WHEN (EQ :MAGIC DEBUG) (PRINT (LIST :FOO FOO)))
Because I'm lazy, I tend to use * as the magic variable, so I can also trivially set it in the REPL. This allows you to get the debugging output for stuff you are interested in only when certain conditions are true. Or you can use it to tell which code path the call is coming from, etc.
4.2. Don't be afraid to add (break "foo=~S" foo) and similar calls to the code you're debugging.
4.3. SB-DEBUG:BACKTRACE can sometimes be of use.
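To make item 4.1 concrete, here is a small Common Lisp sketch of the special-variable trick (the variable and function names are illustrative):

(defvar *debug* nil
  "Set to :MAGIC at the REPL to enable extra debug output.")

(defun compute-thing (x)
  (when (eq :magic *debug*)
    (print (list :compute-thing :x x)))  ; printf-style breadcrumb
  (* x x))

;; At the REPL: (setf *debug* :magic) and call COMPUTE-THING as usual.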
Logging
Heroku will store your application logs for you. Logging is a step above print debugging and can be thought of as "live" documentation. If you don't have the interactivity of the LISP REPL for diagnosis as physician, then you must debug through diagnosis as mortician. Good logs are as useful as detailed blood stains.
Slava Akhmechet asks on the Hunchentoot mailing list (2007-05-11):

I'm looking through the Hunchentoot log and occasionally I see the following:

couldn't write to #<SB-SYS:FD-STREAM for "a ...: Connection reset by peer
couldn't write to #<SB-SYS:FD-STREAM for "a ...: Broken pipe

I don't see any problems in the browser, I only found out about these messages because I looked at the log. Can someone think of a situation in which these errors would occur without interrupting user experience and why?

Edi Weitz replies:

The user pressed "Stop" or went to another website before the page (including all images and other stuff) was fully loaded. "Broken pipe" and "Connection reset by peer" are pretty common error messages you'll find (for example) in every Apache logfile - this is nothing specific to Hunchentoot or Lisp.

HTH, Edi.
Generate Code Coverage Report
;;; Generate Code Coverage Report
;; SBCL specific contrib module.
(require :sb-cover)
;; Compiler knob: Turn on generation instrumentation
(declaim (optimize sb-cover:store-coverage-data))
;; Load while ensuring that it's recompiled with the new optimization
;; policy.
(asdf:load-system :project-isidore :force t)
(asdf:test-system :project-isidore)
;; HTML report output location
(sb-cover:report "../assets/code-coverage-report/")
(declaim (optimize (sb-cover:store-coverage-data 0)))
5.2. Continuous Delivery
Building an executable binary for all 3 major operating system platforms on the
x86-64 architecture. Binaries built this way may be downloaded through the
Github release interface. This is done as the final step of MAKE.LISP by calling
(asdf:make "system-name"). More esoteric architecture/OS pairs will have to
compile from source; refer to the SBCL compiler supported platform table.
Lisp being amenable to image-based development means an executable binary in the Lisp world saves the entire global state (with the stack unwound), libraries and SBCL runtime included.
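A minimal sketch of the knobs involved, assuming a hypothetical system name and entry point rather than the actual contents of MAKE.LISP:

;; In the .asd file (hypothetical names): tell ASDF that (asdf:make "example")
;; should dump an executable image.
(defsystem "example"
  :build-operation "program-op"   ; dump an executable image
  :build-pathname "example"       ; name of the resulting binary
  :entry-point "example/app:main" ; function called when the binary starts
  :depends-on ("hunchentoot"))

;; The equivalent low-level SBCL call: save the whole image, runtime included.
;; MAIN is whatever toplevel function the application defines.
(sb-ext:save-lisp-and-die "example"
                          :toplevel #'main
                          :executable t)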
5.3. Continuous Deployment
Commit | Company | Service |
---|---|---|
N/A | Heroku | Platform (Buildpacks) |
603da4f | Fly.io | Platform (Docker) |
1c7a8e9 | Oracle Cloud Infrastructure | Virtual Private Server |
TBA | Hetzner? | |
TBA | Multiple old Thinkpads | Physical Server(s) |
TBA | Electronic scrap, a soldering iron, and oscilloscope | FULL STACK |
Having to go through initial setup of a new computer dampens the otherwise joyous occasion of unwrapping new hardware. The pain is multiplied if it happens to be one's job to manage many computers. In its most general form, nerds around the world are busy tackling the problem of software reproducibility. NixOS and Guix System are the ones currently well-known. I would prefer Guix over NixOS myself on account of preferring Guile Scheme over the Nix Expression Language (self-admittedly designed with the goal of not being a full-featured, general purpose language). I was not able to get Guix System (version 1.3.0) working on Oracle's aarch64 A1-Flex VMs. Please email me if you have. In the meantime I am using and enjoying consfigurator for declarative server configuration, at least avoiding another Tower of Babel situation.
Current infrastructure diagram:
Note to self, before scaling this solution horizontally, consider the COST of doing so. https://groups.csail.mit.edu/tds/papers/Lynch/MIT-LCS-TM-394.pdf
Prerequisites:
- Project Isidore source code.
- Common Lisp development environment set up.
- Basic familiarity with asymmetric encryption.
Deploy Project Isidore
Shell access with Root SSH credentials to a Debian machine is needed. Whether this is a physical server or a virtual private server is up to you. I have a guided setup on Oracle Cloud Infrastructure.
Read PRODUCTION.LISP to properly supply SSH credentials and SSL keys.
DNS Resolution - associate domain name with public IP addresses
Optional: Enable IPv6 on Oracle Cloud Infrastructure.
Visitors to our website won't want to key in http://140.291.294.154:8080. Our Domain Registrar will point http://my-domain-name.com to http://140.291.294.154. Purchase a domain name if needed.
I am not getting paid to say this, but if you are in the market for a Domain Name Registrar, I am a happy customer of Porkbun. Unlike Google, you can reach a human being. Three cheers for Porkbun, and may they grow in success without losing their human touch.
First time customer, brand new account. Foolishly tried to register a domain on my overcharged credit card on the first try. Account was flagged and locked (understandably). Paid off credit card, contacted chat support, account was unlocked and order completed in less than 5 minutes. Lovely experience. Appreciate the cute graphics as well. – review by some stranger named Ben
Cloudflare seem like an ethical company from what I can gather from their blogs and I'll be making use of their generous free tier in the next steps. Oracle has my gratitude, but my recommendation of Porkbun and Cloudflare comes without reservations.
Search online for current methods to create DNS A records (and/or AAAA records for IPv6) specific to your Domain Registrar. Port 80 is the default port for HTTP traffic, as entering http://my-domain-name.com is equivalent to entering http://my-domain-name.com:80. If you don't want visitors to have to key in http://my-domain-name.com:8080/ then either read on, use OCI's docs to redirect a subdomain to a URL with a port number, or configure your web server to listen on – what is on UNIX a privileged port – 80. Cloudflare can also redirect all HTTP requests to HTTPS instead of doing this in the server NGINX configuration. This allows me to close port 80 and only have port 443 exposed to the public.
Denial-of-Service protection and Content Delivery Network with Cloudflare
Cloudflare here acts as a reverse proxy (man-in-the-middle) between the website visitor (client) and the server; it provides protection as well as caching services. By registering your domain with Cloudflare, your domain will be served as HTTPS to the client. But communications from Cloudflare to our server are still via unencrypted HTTP. We remedy this by using Origin Certificates.
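If the Lisp process itself terminates TLS (rather than a front-end proxy such as Nginx), handing the Cloudflare origin certificate to Hunchentoot could look roughly like the sketch below. The paths and the choice of acceptor are assumptions for illustration, not the contents of PRODUCTION.LISP.

(ql:quickload "hunchentoot")

;; Hypothetical certificate paths; the real key material is supplied elsewhere.
;; Binding port 443 requires elevated privileges (e.g. CAP_NET_BIND_SERVICE).
(defvar *https-acceptor*
  (make-instance 'hunchentoot:easy-ssl-acceptor
                 :port 443
                 :ssl-certificate-file #p"/etc/ssl/cloudflare-origin.pem"
                 :ssl-privatekey-file #p"/etc/ssl/cloudflare-origin.key"))

(hunchentoot:start *https-acceptor*)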
After setting up Authenticated Origin Pulls the origin server now only accepts HTTP requests that use Cloudflare's valid client certificate. There will be no more directly connecting to the web server via IP address.
Cloudflare's Web Application Firewall also helps by blocking automated WordPress vulnerability scanning. WordPress powers 43% of all websites, which means any public website will face a barrage of probing attacks targeting WordPress vulnerabilities. As an example, use the Web Application Firewall to block all requests with URI paths containing "php" or "wp-includes" from ever reaching Nginx.
Developing a remote LISP image
Of course, with incremental redefinition in the Lisp REPL and our object database, Rucksack, supporting update-instance-for-redefined-class, it is fun to imagine the flexibility gained by connecting a local SLY to our production Lisp image. Doing so gives us great observation capabilities and is a cool technique to add to our toolbox. What has by now become the stuff of internet legend, Ron Garret's story of debugging code from 60 million miles away, continues to serve as an inspiration.
SLY User Manual, version 1.0.42
For an example of how this application starts a slynk server see da1f7fa. There is no need to allow ingress on OCI's virtual cloud network or in firewalld for port 4005. The 4005 ports on both the local and remote machines communicate via SSH port 22. Create the SSH tunnel on the local machine,

ssh -L4005:localhost:4005 pi-server

then M-x sly-connect RET "localhost" RET 4005 RET.

I was playing around with connecting SLY to a remote Lisp image. I wondered if slynk:create-server would still work after sb-ext:save-lisp-and-die. I mean there's no reason to think it wouldn't, but I had to move the slynk:create-server form into the toplevel function used by sb-ext:save-lisp-and-die. Otherwise "sudo ss -tulpn | grep LISTEN" didn't show the open port. Mentioning this in case it saves somebody else some time.
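A sketch of that arrangement, with hypothetical function names (SLYNK must of course be a dependency of the built image):

;; Inside the toplevel function handed to SB-EXT:SAVE-LISP-AND-DIE, so the
;; listener is created when the dumped image starts up.
(defun main ()
  (slynk:create-server :port 4005
                       :dont-close t) ; keep accepting connections
  (start-web-server))                 ; hypothetical application entry point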
Crack open a cold one.
Optional: with the boys.
5.3.1. Oracle Cloud Infrastructure
Oracle Corporation offers the most generous free tier in cloud computing (Infrastructure-as-a-Service) by far. I speculate that this is due to Amazon Web Services (AWS) capturing the biggest part of the pie with Oracle competing with the likes of Google and IBM. Oracle historically has had disputes over their stewardship of Open Source software (Java/JVM patent issues, MySQL, OpenOffice, and Solaris) and – with a significant portion of their clients being Fortune 500 companies– you could say that it represents the best and worst parts of litigation-happy corporate America. But I'm not interested in traveling down the road to serfdom (love some Thomas Sowell as well) and neither am I interested in preaching on the idolatrous love of money. What I will relate to below is my experience with Oracle Cloud Infrastructure (OCI) which has thus far been very positive. Thank you Oracle, OCI has been great to use. Create an account to get started.
See also:
Run Always Free Docker Container on Oracle Cloud Infrastructure | by Lucas Je…
Oracle cloud free tier quirks - The Tallest Dwarf
- Bring your own Image - Debian Stable.
  Download the correct .qcow2 file for your system architecture (ARM64 in our case). In OCI search for Storage > Object Storage & Archive > Buckets, and upload the file into a bucket. Go to Compute > Custom Images and click on Import Image. Make sure "Paravirtualized Mode" is selected. After importing, click Edit Details and tick the box for "VM.Standard.A1.Flex". Edit Custom Image Capabilities to ensure UEFI64 is preferred.
- Top left hamburger menu > Compute > Instances > Create Instance
Select the imported Debian image and a shape labeled "always free".
After the "Image and Shape" section is the "Networking" section. The defaults are fine here. Rename if desired and take note of the Virtual Cloud Network (VCN) name and subnet name.
Under the subheading "Add SSH keys" we can choose to copy and paste the contents of a public key. Generate it like so:
ssh-keygen -t ed25519 -a 100 -N "" -C "oracle-vm-key" -f ~/.ssh/oracle-vm-key
The file oracle-vm-key contains the private key (-N "" means no passphrase protection). The file oracle-vm-key.pub contains the public key that we will give to cloud-init by pasting the contents of the ~/.ssh/oracle-vm-key.pub file.

Note the ability to specify a custom boot volume size. I believe the minimum boot volume size is 47GB. So with the free allowance of 200GB it is possible to have 4 VM instances. For now I would rather avoid the added complexity of distributed computing. I enlarge the boot volume of my one VM to 200GB.
After the instance is finished provisioning, write down the public IP address assigned to the VM.
Setup Root SSH login into the Server
ssh opc@host-address -i private-key-file

Replace host-address with the public IP assigned to the VM. Replace private-key-file with a reference to the file that contains the SSH private key. "opc" is the default user for Oracle Linux. For the Debian image, the default user would be "debian". So in our local shell,

ssh debian@140.291.294.154 -i ~/.ssh/oracle-vm-key
# Setup Root SSH access.
sudo sed -i -e 's/PermitRootLogin no/PermitRootLogin without-password/' /etc/ssh/sshd_config
sudo cp -f /home/debian/.ssh/authorized_keys /root/.ssh/authorized_keys
exit
Create the file ~/.ssh/config on our local machine if it does not already exist and add the following lines to connect more ergonomically with $ ssh oci-a1-flex.

Host oci-a1-flex
    User root
    HostName 140.291.294.154
    IdentityFile ~/.ssh/oracle-vm-key
    ControlPath ~/.ssh/%r@%h:%p
    ControlMaster auto
    ControlPersist yes
Optional: connect from Emacs TRAMP.
Now C-x C-f with the address /ssh:pi-server:/ ought to work. Popping a shell should also just work thanks to the magic of TRAMP.

Setup Ingress Rules in Security List to open ports on Virtual Cloud Network.
We can open the ports we need on our server, but we also need to open said ports on the virtual cloud network level.
Top left hamburger menu > Networking > Virtual Cloud Networks > VCN-name > Subnet-name > Default Security List for VCN-name > Add Ingress Rules
Example: To allow incoming requests from any IP address to port 443: set source CIDR to 0.0.0.0/0 for IPV4 (::/0 for IPV6) and leave Source Port Range blank. Destination Port Range is set to 443.
Caution is required here when exposing our compute instance to the wild, wild internet. Oracle ought to and will shut down any hijacked bot VPSs and terminate the accounts. I have also seen at least one email screen capture of account termination due to torrenting copyrighted material (while using OCI as a VPN). I think it goes without saying that crypto-mining violates some end user license agreement. Consult Oracle's manual to safely run graphical applications. While googling "oracle cloud caveats", I was led to pay special attention to their FAQ.
However, if you have more Ampere A1 Compute instances provisioned than are available for an Always Free tenancy, all existing Ampere A1 instances are disabled and then deleted after 30 days, unless you upgrade to a paid account. [emphasis mine] To continue using your existing Arm-based instances as an Always Free user, before your trial ends, ensure that your total use of OCPUs and memory across all the Ampere A1 Compute instances in your tenancy is within the Always Free limit.
You are able to re-create any deleted instances. Still, given the upgrade process from a free account to pay-as-you-go relies on further fraud prevention through Cybersource, I would not be surprised if a share of user woes are unique to the free-tier classification and Oracle's interpretation of "The Always Free services will continue to be available for unlimited time as long as you use the services" (Ibid.)
In my experience of upgrading an always free tenancy to pay-as-you-go, you can enter your billing details perfectly accurately and the upgrade can still fail. I even saw the test charge successfully debited and credited in my online banking portal, yet the upgrade process failed. I suppose my Gmail account was too new; I had to change my email to my old Outlook address in order for the upgrade process to complete.
Perform due diligence, use the features all major cloud services provide to prevent going over budget.
Top left hamburger menu > Billing & Cost Management > Budgets > Create Budget
5.4. Micro Benchmarks
The most likely worst-case scenario is a front-page post to some link aggregator such as Reddit. The number of requests made to the server per client obviously varies depending on architecture and workload.

The plural of anecdote is data, I hope; so if I'm allowed to draw a similarity between the application in the above anecdote and my own, I should ballpark around 300 requests per second. Cloudflare is the star of the story here, and I will continue to sing their praises: their CDN allows the caching of my static assets.
I originally spent some time seeing if there was a convenient way to hide .html
from the URL when serving blog entries. Org-publish generates my static blog,
and what seemed like a minor annoyance at first proved useful when setting up
Cloudflare page rules; it took one rule to tell Cloudflare to cache all URLs
ending in .html. They have a very reasonable usage policy for their CDN (see
section 2.8 of their EULA). OCI deserves applause in tandem, for their 10TB
free data egress. Thank you to the abuse-prevention teams in both these
companies! It makes the homegrown part of the internet possible.
We use loader.io to benchmark our application.
15 clients per second for a duration of 1 minute. HTTP Resource: https://www.hanshenwang.com/bible/1-1-1/1-1-31
COMMIT | OS | CPU | MEMORY | AVG. RESP (ms) | AVG. ERR RATE (%) |
---|---|---|---|---|---|
4eaaa66 | Alpine 3.15 | standard-1x | 512MB | N/A | N/A |
cfe3566 | Distroless | 3x share-cpu-1x | 3x 256MB | 1820 | 0 |
e557bb5 | Debian 11.3 | 4 OCPU VM.Standard.A1.Flex | 24GB | 3504 | 0 |
300 clients per second for a duration of 1 minute. HTTP Resource: https://www.hanshenwang.com/about
COMMIT | OS | CPU | MEMORY | AVG. RESP (ms) | AVG. ERR RATE (%) |
---|---|---|---|---|---|
4eaaa66 | Alpine 3.15 | standard-1x | 512MB | 1400.66 | 6.53 |
cfe3566 | Distroless | 3x share-cpu-1x | 3x 256MB | 43 | 0 |
e557bb5 | Debian 11.3 | 4 OCPU VM.Standard.A1.Flex | 24GB | 43 | 0 |
300 clients per second for a duration of 1 minute. HTTP Resource: https://www.hanshenwang.com/assets/blog/installation-of-spacemacs.html
COMMIT | OS | CPU | MEMORY | AVG. RESP (ms) | AVG. ERR RATE (%) |
---|---|---|---|---|---|
4eaaa66 | Alpine 3.15 | standard-1x | 512MB | N/A | N/A |
cfe3566 | Distroless | 3x share-cpu-1x | 3x 256MB | 19 | 0 |
e557bb5 | Debian 11.3 | 4 OCPU VM.Standard.A1.Flex | 24GB | 37 | 0 |
For a primer on high performance LISP web servers see Woo: a high-performance Common Lisp web server. It should be pointed out that the hunchentoot listed on Woo's benchmark graph is the single threaded version. The multi-threaded version benchmarks are more impressive. The article about Woo also fails to mention quux-hunchentoot which employs a thread-pooling taskmaster as an extension to Hunchentoot.
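If you want to try the thread-pooling taskmaster yourself, swapping it in is roughly a one-liner; the class name below is taken from quux-hunchentoot's README, so treat it as an assumption to verify against the version you install:

(ql:quickload '("hunchentoot" "quux-hunchentoot"))

;; Hand Hunchentoot a pooling taskmaster instead of the default
;; one-thread-per-connection taskmaster.
(hunchentoot:start
 (make-instance 'hunchentoot:easy-acceptor
                :port 8080
                :taskmaster (make-instance 'quux-hunchentoot:thread-pooling-taskmaster)))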
These stress tests are run with wrk (sudo apt install wrk) on OCI's A1 Flex VMs
with 4 cores and 24GB of RAM, at commit d0f10c5.
Cl-tbnl-gserver-tmgr
$ wrk -t4 -c100 -d10 "http://localhost:8081/about"
Running 10s test @ http://localhost:8081/about
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.94ms    2.84ms   43.51ms   90.52%
    Req/Sec     0.91k   509.03     1.73k    46.00%
  27146 requests in 10.02s, 50.51MB read
Requests/sec:   2709.57
Transfer/sec:      5.04MB

$ wrk -t4 -c100 -d10 "http://localhost:8081/about"
Running 10s test @ http://localhost:8081/about
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.93ms    2.82ms   51.14ms   90.58%
    Req/Sec     0.91k   660.73     2.08k    64.67%
  27306 requests in 10.02s, 50.81MB read
Requests/sec:   2726.07
Transfer/sec:      5.07MB

$ wrk -t4 -c100 -d10 "http://localhost:8081/about"
Running 10s test @ http://localhost:8081/about
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.92ms    2.82ms   49.55ms   90.39%
    Req/Sec     1.37k   808.96     2.52k    52.00%
  27355 requests in 10.02s, 50.90MB read
Requests/sec:   2731.18
Transfer/sec:      5.08MB
Default multi-threaded Hunchentoot
$ wrk -t4 -c100 -d10 "http://localhost:8082/about"
Running 10s test @ http://localhost:8082/about
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    64.00ms  131.31ms   1.95s    92.41%
    Req/Sec   504.07    391.26     1.67k    71.28%
  14308 requests in 10.05s, 26.62MB read
  Socket errors: connect 0, read 0, write 0, timeout 36
Requests/sec:   1423.18
Transfer/sec:      2.65MB

$ wrk -t4 -c100 -d10 "http://localhost:8082/about"
Running 10s test @ http://localhost:8082/about
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    65.20ms  142.16ms   1.96s    93.24%
    Req/Sec   416.33    321.55     1.55k    69.21%
  14800 requests in 10.08s, 27.54MB read
  Socket errors: connect 0, read 0, write 0, timeout 40
Requests/sec:   1468.68
Transfer/sec:      2.73MB

$ wrk -t4 -c100 -d10 "http://localhost:8082/about"
Running 10s test @ http://localhost:8082/about
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    65.67ms  145.15ms   2.00s    93.67%
    Req/Sec   452.20    421.83     2.00k    74.68%
  14152 requests in 10.06s, 26.33MB read
  Socket errors: connect 0, read 0, write 0, timeout 35
Requests/sec:   1406.08
Transfer/sec:      2.62MB
Clack with Woo and libev-dev
$ wrk -t4 -c100 -d10 "http://localhost:8083/about"
Running 10s test @ http://localhost:8083/about
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    43.78ms   29.71ms  221.07ms   88.44%
    Req/Sec   634.03    234.05     0.91k    77.25%
  25272 requests in 10.02s, 47.02MB read
Requests/sec:   2522.89
Transfer/sec:      4.69MB

$ wrk -t4 -c100 -d10 "http://localhost:8083/about"
Running 10s test @ http://localhost:8083/about
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    55.64ms   72.08ms  597.50ms   92.30%
    Req/Sec   630.25    245.64     0.95k    76.80%
  24364 requests in 10.02s, 45.33MB read
Requests/sec:   2432.66
Transfer/sec:      4.53MB

$ wrk -t4 -c100 -d10 "http://localhost:8083/about"
Running 10s test @ http://localhost:8083/about
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    44.21ms   29.66ms  204.73ms   88.33%
    Req/Sec   629.64    227.31     0.89k    78.00%
  25085 requests in 10.01s, 46.67MB read
Requests/sec:   2505.62
Transfer/sec:      4.66MB
Quux-Hunchentoot Thread Pool
$ wrk -t4 -c100 -d10 http://localhost:8080/
Running 10s test @ http://localhost:8080/
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.34ms    4.82ms  166.69ms   94.11%
    Req/Sec   820.70    791.93     2.29k    72.15%
  24371 requests in 10.04s, 160.35MB read
Requests/sec:   2426.30
Transfer/sec:     15.96MB

$ wrk -t4 -c100 -d10 http://localhost:8080/
Running 10s test @ http://localhost:8080/
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.13ms    2.97ms   43.67ms   93.27%
    Req/Sec     1.28k   452.76     1.89k    72.50%
  25593 requests in 10.05s, 168.39MB read
Requests/sec:   2546.40
Transfer/sec:     16.75MB

$ wrk -t4 -c100 -d10 http://localhost:8080/about
Running 10s test @ http://localhost:8080/about
  4 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.86ms    2.70ms   52.46ms   94.02%
    Req/Sec     0.94k   653.16     2.06k    64.67%
  27966 requests in 10.04s, 52.09MB read
Requests/sec:   2784.27
Transfer/sec:      5.19MB
Because MAKE.LISP fiddles with compiler knobs in search of performance, a single production instance can reach upwards of 3300 requests per second. To improve application resiliency, Nginx is used to load balance between 12 project-isidore processes, each at 2GB of RAM. Lastly, of note is the 4 Gbps network bandwidth offered by Oracle Cloud. At this point my bottleneck should be the Rucksack database, despite its support for concurrent transactions.
So with the bare minimum amount of testing done, I can say with confidence that my website is well prepared, given the constraints on time and money.
5.4.1. Standards Compliance
Test | Grade |
---|---|
Qualys SSL | A+ |
HTTP Headers | D |
Internet.nl | 97% |
Lighthouse | |
Hardenize |
5.5. Financial Reports
This website is run as a non-profit hobby; no advertising is or will be displayed. Dollar amounts are in CAD. Thank you to all patrons.
Income Statement | 2021 | 2022 | 2023 |
---|---|---|---|
Revenue | |||
Donations | 0.00 | 0.00 | 0.00 |
Expenses | |||
Domain name | 10.10 | 10.10 | 10.10 |
Total | 10.1 | 10.1 | 10.1 |
6. Data Persistence
Project Isidore uses an embedded database (Rucksack) over more typical client-server RDBMSs such as PostgreSQL. Heavily read data, and any data that ought to be cached, is stored in the in-memory object prevalence model (BKNR.Datastore). Moore's law has brought us significant improvements, and as a result SQLite would also be a viable choice for this application.
6.1. PostgreSQL
This section is now outdated. I implemented a trivial PostgreSQL mailing list, but running what amounts to your own mail server whose messages are not marked as spam by the big email providers is most definitely non-trivial. As of commit 7a4fc5d, I have switched to Mailchimp. On production instances, there's a savings of 138MB - 92.8MB = 45.2MB of RAM.
PostgreSQL has very clear and structured documentation. Refer to the documentation to install PostgreSQL locally on your computer. Afterwards a good introduction to basic Create, Read, Update, Delete (CRUD) operations is here: 12 Steps to Build and Deploy Common Lisp in the Cloud (and Comparing Rails) |…
Documentation on Postmodern is better than your average Common Lisp library. Still to supplement the official docs are examples and specifically examples using the Data Access Objects.
- To start PostgreSQL server process
# Install PostgreSQL.
sudo apt install postgresql
# Start server process. PostgreSQL defaults to PORT 5432
sudo service postgresql start
# Login as PostgreSQL Superuser "postgres" via "psql" Client
sudo -u postgres psql
# Create database "Test"
createdb test
# Delete database "Test"
dropdb test
# Login as superuser to create user for database "Test"
sudo -u postgres psql
# See defparameter `*local-db-params*' in MODEL.LISP.
CREATE USER user1 WITH PASSWORD 'user1';
# Use Shell to login as host: localhost with database: test and user: user1
psql -h localhost -d test -U user1
# Once logged in,
test=# select * from tablename;
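With the server running, a minimal Postmodern Data Access Object sketch might look like the following; the class and column names are hypothetical, not the ones in MODEL.LISP:

(ql:quickload "postmodern")

;; Hypothetical DAO class; each slot maps to a column.
(defclass mailinglist ()
  ((id :col-type integer :initarg :id :accessor mailinglist-id)
   (email :col-type text :initarg :email :accessor friend-email))
  (:metaclass postmodern:dao-class)
  (:keys id))

;; Connection list mirrors the test/user1 setup created above.
(postmodern:with-connection '("test" "user1" "user1" "localhost")
  (postmodern:execute (postmodern:dao-table-definition 'mailinglist)) ; CREATE TABLE
  (postmodern:insert-dao (make-instance 'mailinglist :id 1 :email "friend@example.com"))
  (friend-email (postmodern:get-dao 'mailinglist 1)))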
6.2. In-Memory Datastore
Design constraints imposed by the current deployment platform, Heroku & Github.
- Heroku managed PostgreSQL free tier limitations = 10000 rows, 1GB disk capacity.
- Heroku free tier dyno memory (automatic dyno shutdown at 1GB) = 512 MB.
- Heroku free tier slug size = 500 MB.
- Github large file limit = 100MB-2GB.
Characteristics of Bible dataset:
- Read-only data.
- Dataset should be available offline.
- Non-expanding dataset.
- Fits within Heroku free tier dyno memory (18-20MB). Online reports put RAM usage of hunchentoot + ironclad at 80-140MB.
- Limited developer resources mean that, instead of only programming and debugging in LISP, I would need to master a second domain-specific language: SQL.
- Very cost sensitive (cut me some slack, I'm a college student).
Object Relational Mappers (ORM) are notoriously hard to get right. It is too bad the pure LISP persistence solutions (Allegrocache) remain proprietary. For open source solutions, I still think bknr.datastore is among the best for now. Rucksack by Arthur Lemmens is also worth playing with, but due to the restrictions of the Heroku ephemeral filesystem, the library with the best fit for my application would be bknr.datastore.
Memory-Centric Data Management, A Monash Information Services White Paper by Curt Monash, Ph.D., May 2006, accessible at http://www.monash.com/whitepapers.html
See pg 668 of weitzCommonLispRecipes2016 for cookbook recipes on BKNR.DATASTORE.
Object Prevalence : An In-Memory, No-Database Solution to Persistence | by Pa…
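To make the object-prevalence idea concrete, here is a minimal, hedged sketch of a persistent class in BKNR.Datastore; the NOTE class and the store directory are illustrative assumptions, not the actual Project Isidore schema.

;; Minimal object-prevalence sketch. Writes are logged to disk and
;; replayed on startup; reads are ordinary slot access on objects in RAM.
(ql:quickload :bknr.datastore)

(defclass note (bknr.datastore:store-object)
  ((title :initarg :title :reader note-title))
  (:metaclass bknr.datastore:persistent-class))

;; Open (or create) a store rooted at the given directory.
(make-instance 'bknr.datastore:mp-store
               :directory #p"/tmp/object-store/"
               :subsystems (list (make-instance
                                  'bknr.datastore:store-object-subsystem)))

;; Creating a store object is itself logged as a transaction.
(make-instance 'note :title "Hello, prevalence")

;; Reads need no transaction; this returns the in-memory objects directly.
(bknr.datastore:store-objects-with-class 'note)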
6.3. BKNR.Datastore vs Rucksack vs Postmodern
Rucksack measurements are listed first, then BKNR.Datastore.
PROJECT-ISIDORE/VIEWS> (time (get-bible-text 23))
Evaluation took:
  0.000 seconds of real time
  0.000098 seconds of total run time (0.000095 user, 0.000003 system)
  100.00% CPU
  228,122 processor cycles
  0 bytes consed

PROJECT-ISIDORE/VIEWS> (time (get-bible-text 23))
Evaluation took:
  0.000 seconds of real time
  0.000020 seconds of total run time (0.000020 user, 0.000000 system)
  100.00% CPU
  36,872 processor cycles
  0 bytes consed

PROJECT-ISIDORE/VIEWS> (time (bible-page "1-1-1-2-2-2"))
Evaluation took:
  0.330 seconds of real time
  0.328973 seconds of total run time (0.328973 user, 0.000000 system)
  99.70% CPU
  821,152,853 processor cycles
  30,101,264 bytes consed

PROJECT-ISIDORE/VIEWS> (time (bible-page "1-1-1-2-2-2"))
Evaluation took:
  0.440 seconds of real time
  0.441031 seconds of total run time (0.431896 user, 0.009135 system)
  [ Run times consist of 0.017 seconds GC time, and 0.425 seconds non-GC time. ]
  100.23% CPU
  1,098,042,384 processor cycles
  91,860,080 bytes consed

PROJECT-ISIDORE/VIEWS> (time (bible-page "1-1-1-73-22-21"))
Evaluation took:
  8.990 seconds of real time
  8.998338 seconds of total run time (8.102798 user, 0.895540 system)
  [ Run times consist of 0.077 seconds GC time, and 8.922 seconds non-GC time. ]
  100.09% CPU
  22,446,254,265 processor cycles
  796,903,104 bytes consed

PROJECT-ISIDORE/VIEWS> (time (bible-page "1-1-1-73-22-21"))
Evaluation took:
  0.660 seconds of real time
  0.659741 seconds of total run time (0.641298 user, 0.018443 system)
  [ Run times consist of 0.025 seconds GC time, and 0.635 seconds non-GC time. ]
  100.00% CPU
  1,641,964,703 processor cycles
  327,536,880 bytes consed
A sampling of postmodern DAO speeds,
PROJECT-ISIDORE/MODEL> (time (friend-email (mailinglist-get 2)))
Evaluation took:
  0.010 seconds of real time
  0.002794 seconds of total run time (0.002794 user, 0.000000 system)
  30.00% CPU
  25,589,575 processor cycles
  32,432 bytes consed
"[email protected]"
And a regular SQL query done in postmodern,
PROJECT-ISIDORE/MODEL> (time (pomo:with-connection (db-params)
                         (pomo:query (:select 'email :from 'mailinglist :where (:= 'id 2))
                                     :single)))
Evaluation took:
  0.010 seconds of real time
  0.003089 seconds of total run time (0.003089 user, 0.000000 system)
  30.00% CPU
  24,084,690 processor cycles
  16,368 bytes consed
"[email protected]"
1 (1 bit, #x1, #o1, #b1)
A sampling of cl-sqlite with the in-memory database,
SQLITE> (time (execute-single *db* "select id from users where user_name = ?" "dvk"))
Evaluation took:
  0.000 seconds of real time
  0.000047 seconds of total run time (0.000045 user, 0.000002 system)
  100.00% CPU
  108,448 processor cycles
  0 bytes consed
2 (2 bits, #x2, #o2, #b10)
and with a regular disk based database,
SQLITE> (time (execute-single *dba* "select id from users where user_name = ?" "dvk"))
Evaluation took:
  0.000 seconds of real time
  0.000085 seconds of total run time (0.000081 user, 0.000004 system)
  100.00% CPU
  196,908 processor cycles
  0 bytes consed
2 (2 bits, #x2, #o2, #b10)
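For context, the cl-sqlite handles used above could plausibly have been created along the following lines; the users table schema and file path are assumptions made only for illustration.

;; Hypothetical setup for the *DB* (in-memory) and *DBA* (on-disk)
;; handles; the table layout is illustrative, not the benchmark's data.
(ql:quickload :sqlite)

(defvar *db*  (sqlite:connect ":memory:"))
(defvar *dba* (sqlite:connect "/tmp/users.sqlite"))

(dolist (db (list *db* *dba*))
  (sqlite:execute-non-query
   db "create table if not exists users (id integer primary key, user_name text)")
  (sqlite:execute-non-query
   db "insert into users (user_name) values (?)" "dvk"))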
For this admittedly shallow testing, BKNR.Datastore and Rucksack perform admirably! Quicklisp-stats shows BKNR.Datastore is still in use, with the most recent example being a startup by an ex-Facebook engineer. The most recent mention I can find of Rucksack is by Ravenpack, at ELS2020.
The quicklisp download stats for Rucksack show a 200% increase (from NIL > 223/239) around the months of January and February. I find it plausible that Ravenpack, who do data analysis for the financial sector, have their engineers re-pull all libraries from quicklisp once per year; I have noticed the same pattern for some other libraries.
BKNR.Datastore is the reason I am able to structure the Tabular Douay Rheims in the way that I have. Any disk-based solution would have been too slow, forcing me to cache the pages or store them as static files. Many thanks to Hans Hubner, the author of said library. And yes, cl-store or simple serialization would have sufficed for my use case, but I'll take Edmund Weitz's word when he recommends BKNR.Datastore over cl-store in weitzCommonLispRecipes2016.
The question of ORMs comes up again. As of this writing, I am aware of a handful of options.
- Elephant - interfaces to the C databases (Berkeley DB, PostgreSQL). Most likely bitrotted; perhaps the author got hired to work on Allegrocache? Quicklisp-stats show 35 downloads total for the past 2 years.
- CLSQL - by far the greatest number of backends supported, with the necessary compromises that suggests. Its tradition extends back many years, it shares ideas with CommonSQL, and it has a solid track record. Packaged for Debian. Great documentation.
- Hu.dwim.perec - originally started as a fork of CLSQL. Greatly extends the ORM capabilities and is kept up to date, I think mostly by one person, Levente Mészáros. This is one of the libraries that shows almost exactly the same download pattern as Rucksack. So what's the catch? Very little documentation. You will have to dive deep into a bunch of hu.dwim.* packages and look at the tests, and it pretty much pulls in the entire hu.dwim.* ecosystem with it when downloaded from quicklisp. With great power comes…
- Mito - Fukamachi-ware. Similar download stats to CLSQL. Very young project, started in 2016. Also see Crane, though the stats show close to zero usage.
If one is willing to make the trade-off of SQL for object persistence, then as far as I know there are really only two pure lisp options. I say "only" but I'm not aware of any other language with libraries comparable to the ones below.
- Rucksack (open source) - as mentioned earlier, authored by hacker Arthur Lemmens (who worked at Ravenbrook). Small, written in portable lisp, and possessing performance that isn't bad at all; a real gem of a project. The mailing lists of Elephant and Rucksack show some attempt was made to combine the ideas of Rucksack into a pure lisp backend for Elephant. Rucksack also shows a lack of recent updates, but unlike Elephant, it has users to this day. Ain't it beautiful how the stability of the language shines through the library? Don't let the date of the last commit turn you away. Give it a shot; look at the talk-eclm2006.txt file under the /docs folder.
- Allegrocache (proprietary, Franz Inc.) - what Rucksack could have been if you threw a bunch of money at the problem. Tightly integrated with the rest of the Allegro CL ecosystem. You do get what you pay for in this case; I have heard they have great support too. Allegrocache was originally based on ObjectStore (Dan Weinreb of Symbolics fame, among others). Dan does a fair job of defending object-oriented database management systems here. I would like to point out that Glenn D. House of 2Is (a DoD contractor) testifies (21:30) to the conclusions found in Prechelt and Gat when comparing Lisp v. Java v. C/C++. Grammatech is also a DARPA-funded shop that uses Common Lisp. I would also be remiss if I did not mention the recent milestone of CLASP (a Common Lisp implementation with C++ interop) reaching version 1.0. The geriatric IP of Symbolics is still closed source; rumor is there are still legacy DoD contracts and that American Express's fraud detection used to (up until the mid 2000's?) run on Open Genera.
Jacobs, J. H., M. R. Swanson, and R. R. Kessler. Persistence is Hard, Then You Die! or Compiler and Runtime Support for a Persistent Common Lisp. Technical report UUCS-94-004, Center for Software Science, University of Utah, 1994.
7. Case Study: Profiling and Performance
A thorough treatment of the generalities of optimization in Common Lisp can be found in weitzCommonLispRecipes2016, pages 503-560. Dr. Weitz testifies that Lisp (SBCL) can confidently reach within a 5x ballpark of C; less, obviously, if there are many fixnums. SBCL compiler contributor Paul Khuong has also testified to a ballpark of within 3x of C. Of course, squabbling over language performance is time better spent on data structures and algorithms, but a ballpark estimate is good to have. Python and Ruby, for example, are orders of magnitude slower than Lisp: a ballpark of 20x and 40x respectively. For a dynamically typed, high-level language, Lisp performs admirably. Especially striking is the non-leaky abstraction and degree of control provided to the programmer at read, compile and run time; see the "disassemble" function and runtime compilation tricks.
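As a small, hedged illustration of that degree of control, the toy function below (not Project Isidore code) shows how type declarations plus DISASSEMBLE let one inspect exactly what SBCL compiles.

;; Toy example: with FIXNUM declarations and aggressive optimization
;; settings, SBCL emits a short stretch of native code that DISASSEMBLE
;; prints directly at the REPL.
(defun sum-of-squares (a b)
  (declare (type fixnum a b)
           (optimize (speed 3) (safety 0)))
  (the fixnum (+ (* a a) (* b b))))

(disassemble #'sum-of-squares) ; prints the compiled machine code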
Lastly, Professor Scott Fahlman, one of the original designers of Common Lisp, weighs in on his experience circa 2020.
Short answer: I don’t know about the Clozure compiler, but the compiler used in the open-source CMU Common Lisp (CMU CL) system produces code that is very close in performance to C: a little faster for some benchmarks, a little slower for others.
But there are some things a Common Lisp user needs to understand in order to get that performance. Basically, you need to declare the types of variables and arguments carefully, and you should not do a lot of dynamic storage allocation (“consing”) in performance-critical inner loops, which are usually just 10% or 20% of your total system.
(Steel Bank Common Lisp (SBCL) is essentially the same as CMU CL in terms of the performance of compiled code. The CMU CL open-source community split in two in December 1999 as a result of some disagreements about design and philosophy, and one branch was renamed SBCL. I believe that the parts of the compiler concerned with optimization have not changed much. I currently prefer SBCL for my work on the Scone knowledge-base system and other things.)
The Java compilers I know about produce code that is considerably slower than CMU CL and SBCL. I don’t know much about Haskell performance. C++ is similar to C in performance if you use it like C; I believe that if you make heavy use of the object-oriented features it is considerably slower.
Longer answer, for nerds: I was one of the core designers of Common Lisp. (We were sometimes referred to in those days as the “Gang of Five”.) I wrote what I believe was the first Common Lisp compiler (drawing heavily on the design of earlier compilers for Maclisp and Lisp Machine Lisp).
For many years I ran the CMU Common Lisp project. As part of that project, we developed a public-domain Common Lisp runtime that became the basis for a number of commercial Common Lisp implementations, adapted and supported by various large companies for their own machines.
David B. McDonald, in my group, spent something like four years developing a very highly optimizing compiler for Common Lisp. We called this the “Python” compiler, which caused some confusion when the Python programming language became popular more than a decade later. No relation.
(I proposed that we name our compiler “Python” because a python the snake eats a whole pig, then goes under a bush for several weeks to sleep. The pig makes its way slowly through the snake’s internal pipelines, ultimately emerging as a very compact pellet. Which is pretty much what compilers do, to one degree or another.)
At the time (the early 1980s) I was starting to work on programs for implementing (or simulating) artificial neural nets. These needed very efficient floating-point arithmetic and vector operations, and I wanted to be sure that we could efficiently program these things in CMU Common Lisp. But at the time, Common Lisp had the reputation of being really awful at floating point: a straightforward implementation would constantly be allocating “boxed” floating-point numbers that had to be garbage-collected later.
So Dave McDonald labored mightily over type inference and non-consing ways to handle floating point, and he got the job done. I wrote some neural-net programs in CMU CL; they were later translated into C for wider distribution (by an undergrad coding wizard who really knew what he was doing). The two versions were very close in runtime. When DARPA pulled the plug on support for Common Lisp development, CMU Common Lisp became an open-source project with a different set of developers/maintainers. Both our runtime and our “Python” compiler are part of that distribution (and of SBCL), though of course there has been some evolution since then.
What programmers need to know to get good performance in Common Lisp: People speak of Lisp as a “dynamically typed” language. I think it is more correct to call it (at least for the CMU CL implementation) an “optionally strongly typed” language. The philosophy is this: Programmers can say as much or as little as they like about the type of an argument or value. Whatever you say had better be true; you can tell the compiler to trust the declarations or to be suspicious. The more precisely you specify what the entity is, the more likely it will be that the compiler can do some clever optimization based on what you told it.
So, for example, you could say just “number” or “integer” or “integer between 0 and 1024” or “prime number between 0 and 1024”. If you use a very general declaration, the code will work, but it will have to do some runtime type-checking to see what kind of number it is dealing with. It must be ready to deal with some of the exotic number-types that Common Lisp supports: infinite-precision integers (“bignums”), ratios of integers, imaginary numbers, several levels of floating-point precision, and so on. There is a special, very compact and efficient format that can be used for small integers, but it can only be used if you tell the compiler what to expect.
The same is true of things other than integers: There are several array formats. You can specify what size/shape data to expect, or you can wait and see what someone hands you at runtime. When you get into object-oriented programming, you can tell the system what to expect (and it can figure out more internal data-types via type inference), or you can wait and see what you get and do a runtime type-dispatch to find the proper method to use. That takes time.
So, if you want good performance in Common Lisp, especially for arithmetic and array operations, you have to declare the types as precisely as you can. Then the compiler will do its magic.
Flash forward: Sometime around 2001, I was working for IBM Research (on leave for a while from CMU; long story…) and my project was to implement an early version of what became (after several restarts) my Scone knowledge-base system. At the time, the prevalent language in IBM Research was Java, and I knew that it would be hard to interest the IBM people in my system if I did it in Lisp. So I started out doing it in Java.
This system had essentially no arithmetic in it, but did a lot of pointer-chasing off into main memory, a lot of boolean operations, and had a few very intensive inner loops where it spent all its time. Common Lisp was clearly the right tool for this job, but I worked hard on the Java version. Finally, for reasons too complicated to go into here, I decided that I could no longer stand programming the system in Java: some very important facilities were missing, especially the Lisp macro system. So I decided to port the half-done system to Common Lisp. (As predicted, IBM then lost interest, and I returned to CMU.)
The performance-critical inner loops had already been programmed at that time, so I was able to compare the performance of the Lisp and Java versions. The Lisp version was about 3X faster. Part of that difference was because I was a very good Lisp programmer, and rather new at Java, though I had talked to Java experts about how to get good performance for code like mine. So some of the performance difference was experience, but mostly it was because at the time (and I think still today) Java did not do a lot of object-type inference at compile time. So pretty much every function call was a type-dispatch to find the right method, and that was slow.
There wasn’t a good way around this inefficiency except by writing the performance-critical code as long linear stretches of code with no function calls. And without a good macro system, that was much too tedious.
What follows is an amateur recording of my struggles, useful to jog my memory in the future when working with the SBCL statistical profiler. I had a pretty good idea of which function call needed to be profiled, but if it were a foreign system, I would try a library like Daniel Kochmański / metering · GitLab.
;; Load project into image.
(ql:quickload :project-isidore)
;; Load SBCL's statistical profiler.
(require :sb-sprof)
;; Includes COMMON LISP USER package symbols as defined in the Hyperspec.
(sb-sprof:profile-call-counts "CL-USER")
;; Profile and output both graph and flat formats.
(sb-sprof:with-profiling (:max-samples 5000
                          :report :graph
                          :loop t
                          :show-progress t)
  (project-isidore:bible-page "1-1-1-3-3-3"))
Self Total Cumul
Nr Count % Count % Count % Calls Function
------------------------------------------------------------------------
1 2004 40.1 2004 40.1 2004 40.1 - EQUALP
2 299 6.0 1054 21.1 2303 46.1 - (SB-PCL::EMF SB-MOP:SLOT-VALUE-USING-CLASS)
3 205 4.1 205 4.1 2508 50.2 - (LAMBDA (SB-PCL::.ARG0.) :IN "SYS:SRC;PCL;DLISP3.LISP")
4 193 3.9 443 8.9 2701 54.0 - (SB-PCL::FAST-METHOD BKNR.SKIP-LIST:SL-CURSOR-NEXT (BKNR.SKIP-LIST:SKIP-LIST-CURSOR))
5 190 3.8 3896 77.9 2891 57.8 - REMOVE-IF-NOT
6 177 3.5 212 4.2 3068 61.4 - COPY-LIST
7 165 3.3 165 3.3 3233 64.7 - (LAMBDA (SB-PCL::.ARG0.) :IN "SYS:SRC;PCL;BRAID.LISP")
8 157 3.1 157 3.1 3390 67.8 - (LAMBDA (SB-PCL::.ARG0.) :IN "SYS:SRC;PCL;PRECOM2.LISP")
9 143 2.9 143 2.9 3533 70.7 - foreign function syscall
10 141 2.8 141 2.8 3674 73.5 - SB-KERNEL:TWO-ARG-STRING-EQUAL
11 134 2.7 134 2.7 3808 76.2 - (LAMBDA (CLASS SB-KERNEL:INSTANCE SB-PCL::SLOTD) :IN SB-PCL::MAKE-OPTIMIZED-STD-SLOT-VALUE-USING-CLASS-METHOD-FUNCTION)
12 121 2.4 121 2.4 3929 78.6 - (LAMBDA (SB-KERNEL:INSTANCE) :IN SB-PCL::GET-ACCESSOR-FROM-SVUC-METHOD-FUNCTION)
13 110 2.2 110 2.2 4039 80.8 - (LAMBDA (SB-PCL::.ARG0.) :IN "SYS:SRC;PCL;BRAID.LISP")
Self is how much time was spent doing work directly in that function. Total is how much time was spent in that function and in the functions it called. As this is my own code, I know that REMOVE-IF-NOT calls EQUALP in a lot of functions, but that is also a probable hypothesis from this data alone. Cumul is simply the running total of the Self column. I have cut the report off at 80% in light of the Pareto principle. The hypothesis can be confirmed by looking at the graph-formatted portion of the profiler output, pasted below.
------------------------------------------------------------------------
          3840  76.8   PROJECT-ISIDORE/MODEL:GET-BIBLE-UID [99]
            44   0.9   REMOVE-IF-NOT [5]
            55   1.1   PROJECT-ISIDORE/MODEL:GET-HAYDOCK-TEXT [96]
 190   3.8   3896  77.9   REMOVE-IF-NOT [5]
             1   0.0   (LAMBDA (PROJECT-ISIDORE/MODEL::X) :IN PROJECT-ISIDORE/MODEL::FILTER-LIST-BY-VERSE) [114]
             1   0.0   (SB-PCL::EMF SB-MOP:SLOT-VALUE-USING-CLASS) [2]
             1   0.0   (LAMBDA (SB-PCL::.ARG0.) :IN "SYS:SRC;PCL;DLISP3.LISP") [3]
             1   0.0   foreign function alloc_list [29]
            43   0.9   (LAMBDA (PROJECT-ISIDORE/MODEL::X) :IN PROJECT-ISIDORE/MODEL::FILTER-LIST-BY-CHAPTER) [30]
           140   2.8   SB-KERNEL:TWO-ARG-STRING-EQUAL [10]
          1421  28.4   (LAMBDA (PROJECT-ISIDORE/MODEL::X) :IN PROJECT-ISIDORE/MODEL::FILTER-LIST-BY-BOOK) [17]
          2004  40.1   EQUALP [1]
            44   0.9   REMOVE-IF-NOT [5]
            47   0.9   (LAMBDA (PROJECT-ISIDORE/MODEL::X) :IN PROJECT-ISIDORE/MODEL:GET-HAYDOCK-TEXT) [38]
------------------------------------------------------------------------
GET-BIBLE-UID is expected to take up a large portion of function calls based on my design choices and some back of the napkin math. The profiler has confirmed the information. Let's see if we can't optimize this particular function further.
(time (project-isidore:bible-page "1-1-1-3-3-3"))
Evaluation took:
  64.490 seconds of real time
  64.673257 seconds of total run time (64.032109 user, 0.641148 system)
  [ Run times consist of 2.276 seconds GC time, and 62.398 seconds non-GC time. ]
  100.28% CPU
  160,990,605,382 processor cycles
  16,089,927,136 bytes consed
The first change: replacing equalp with eql in places where appropriate in the file model.lisp. For the best explanation of the different equality predicates in Common Lisp, see Equality in Lisp - Eli Bendersky's website.
Lisp's equality operators are (a few illustrative REPL comparisons follow this list):
- = compares only numbers, regardless of type.
- eq compares symbols. Two objects are eq if they are actually the same object in memory. Don't use it for numbers and characters.
- eql compares symbols like eq, plus numbers (type sensitive) and characters (case sensitive).
- equal compares more general objects. Two objects are equal if they are eql, strings of eql characters, bit vectors of the same contents, or lists of equal objects. For anything else, eq is used.
- equalp is like equal, just more permissive. Comparison of numbers is type insensitive. Comparison of chars and strings is case insensitive. Lists, hashes, arrays and structures are equalp if their members are equalp. For anything else, eq is used.
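A few REPL comparisons illustrate the differences; the results below follow from the ANSI standard.

;; Illustrative comparisons.
(eql 3 3)            ; => T   (same fixnum)
(eql 3 3.0)          ; => NIL (EQL is type sensitive)
(= 3 3.0)            ; => T   (numeric comparison ignores type)
(equal "foo" "foo")  ; => T   (element-wise, case sensitive)
(equal "foo" "FOO")  ; => NIL
(equalp "foo" "FOO") ; => T   (case insensitive)
(equalp 3 3.0)       ; => T   (numbers compared as with =)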
Evaluation took:
  74.060 seconds of real time
  74.232869 seconds of total run time (73.519834 user, 0.713035 system)
  [ Run times consist of 2.961 seconds GC time, and 71.272 seconds non-GC time. ]
  100.23% CPU
  184,855,087,325 processor cycles
  16,089,928,160 bytes consed
Would you look at that. Worse performance as a result of going from a more general equality predicate to a more specific equality predicate. I'm guessing SBCL does some fancy optimization tricks here.
From Edmund Weitz on string-equal/equalp
I would assume that on most implementations STRING-EQUAL is a bit faster (given the right optimization declarations) because it "knows" that its arguments are strings. It's most likely a micro-optimization that's only noticeable in tight loops.
It can also be self-documenting to use STRING-EQUAL because the reader of your code then knows that you expect both of its arguments to be strings.
Therefore switching EQUALP to STRING-EQUAL in FILTER-LIST-BY-BOOK gives me the following speedup.
Evaluation took:
  69.390 seconds of real time
  69.608650 seconds of total run time (68.808030 user, 0.800620 system)
  [ Run times consist of 2.482 seconds GC time, and 67.127 seconds non-GC time. ]
  100.32% CPU
  173,202,179,865 processor cycles
  16,089,928,912 bytes consed
Going back a few steps and using = instead of eql doesn't result in anything significant at all.
I decided to replace the functional approach of REMOVE-IF-NOT with the LOOP DSL (loops - Iterate through a list and check each element with your own condition…). Surprisingly, this did nothing.
Instead of going through the same list three times and collecting one item at a time, I decided to remove the nested loops and go through the list once, collecting all three items in a single pass (sketched after the timing below). The original nested loops were more due to my unfamiliarity with LOOP keywords. Good speedup resulted.
Evaluation took:
  14.960 seconds of real time
  15.014469 seconds of total run time (14.824816 user, 0.189653 system)
  [ Run times consist of 0.963 seconds GC time, and 14.052 seconds non-GC time. ]
  100.36% CPU
  37,334,905,433 processor cycles
  6,407,417,456 bytes consed
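For illustration, the single-pass idea looks roughly like the sketch below; the function and accessor names are hypothetical and this is not the actual model.lisp code.

;; Hypothetical sketch: one pass over VERSES, accumulating three result
;; lists at once instead of calling REMOVE-IF-NOT three times.
(defun filter-verses-once (book chapter verse verses)
  (loop for v in verses
        when (string-equal book (verse-book v)) collect v into by-book
        when (= chapter (verse-chapter v)) collect v into by-chapter
        when (= verse (verse-number v)) collect v into by-verse
        finally (return (values by-book by-chapter by-verse))))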
Removing some list copying in the filter and get-bible-uid functions yielded:
Evaluation took:
  13.300 seconds of real time
  13.353905 seconds of total run time (13.183019 user, 0.170886 system)
  [ Run times consist of 0.731 seconds GC time, and 12.623 seconds non-GC time. ]
  100.41% CPU
  33,191,116,692 processor cycles
  6,406,945,232 bytes consed
Proving once again that I am careless and forgetful, the following change resulted in a 133x speedup. I believe this function was coded when bknr.datastore:store-objects-with-class was the only external function I was familiar with in bknr.datastore, and I was ignorant of bknr.datastore:store-object-with-id. Prior to the change, every single time get-haydock-text was called, it would iterate through all 35817 verses of the bible to find one instance of haydock-text. Another concrete reminder to read the manual of whatever library I am using.
(defun get-haydock-text (bible-uid)
  "Returns a string if bible-uid is valid else return NIL. The bible-uid can be found by calling `get-bible-uid' with valid arguments."
  (let ((cpylist (remove-if-not
                  (lambda (x) (and x (equalp bible-uid (slot-value x 'bknr.datastore::id))))
                  (copy-list (bknr.datastore:store-objects-with-class 'bible)))))
    (if (slot-boundp (car cpylist) 'haydock-text)
        (slot-value (car cpylist) 'haydock-text)
        (format t "GET-HAYDOCK-TEXT called with invalid bible-uid ~a" bible-uid))))
(defun get-haydock-text (bible-uid)
  "Returns a string if bible-uid is valid else return NIL. The bible-uid can be found by calling `get-bible-uid' with valid arguments."
  (if (slot-boundp (bknr.datastore:store-object-with-id bible-uid) 'haydock-text)
      (slot-value (bknr.datastore:store-object-with-id bible-uid) 'haydock-text)
      (format t "GET-HAYDOCK-TEXT called with invalid bible-uid ~a" bible-uid)))
Evaluation took:
  0.100 seconds of real time
  0.107482 seconds of total run time (0.097531 user, 0.009951 system)
  [ Run times consist of 0.007 seconds GC time, and 0.101 seconds non-GC time. ]
  107.00% CPU
  267,219,700 processor cycles
  40,850,160 bytes consed
I admit this was less of an exercise in speeding up Common Lisp and more of a demonstration of human frailty.
Looking through the git log for commits of type "Perf" shows further optimization commits I have done. Current result for version 1.2.1,
Evaluation took:
  0.010 seconds of real time
  0.011488 seconds of total run time (0.011127 user, 0.000361 system)
  110.00% CPU
  28,639,406 processor cycles
  10,191,536 bytes consed
After the addition of regex generated cross-references in version 1.2.2,
Evaluation took:
  0.840 seconds of real time
  0.833305 seconds of total run time (0.813444 user, 0.019861 system)
  [ Run times consist of 0.002 seconds GC time, and 0.832 seconds non-GC time. ]
  99.17% CPU
  2,078,095,375 processor cycles
  80,577,968 bytes consed
8. User Manual
8.1. How do I find past versions of a blog article?
To view the entire revision history of an article, find and click the article title in the repository's blog subfolder. Then click on the History button to view specific, atomic changes. Project release notes are available for a general overview, and inputting the article URL into the Internet Wayback Machine is also an option.
8.2. How do I unsubscribe from the mailing list?
To remove an email address from the mailing list, fill out and submit the form on the unsubscribe page. To resubscribe, visit https://www.hanshenwang.com/subscribe.
8.3. Can I visit this website offline?
To access the website offline, download the appropriate executable from the project release page. Releases · HanshenWang/project-isidore · GitHub.
Find the executable matching both your computer's processor architecture and operating system. Currently only the x86-64 architecture is supported, on the Ubuntu, MacOS and Windows operating systems.
Project Isidore does not offer signed binaries for MacOS, therefore you will have to manually execute the unsigned binaries. Please see https://lapcatsoftware.com/articles/unsigned.html for more details.
An audit of the source code can be done at any time. Please see the source repository as well as the third party dependencies.
Be advised that the program consumes around 50MB of RAM when used by a single user locally. Please understand the executable is provided AS IS, WITHOUT WARRANTY. See the provided COPYING.txt included in the download.
8.4. I can't find what I'm looking for. How is the documentation organized?
The documentation is organized according to the best practices outlined here: The documentation system — divio.
The closest thing to a tutorial, as understood by the divio documentation system, ought to be the development quickstart (present in the README.org) or material embedded as close as possible to the end-user interface. How-to guides are meant to be placed here in the user manual. The Reference is auto-generated with the help of the Declt system. The Explanation is this Design Notebook blog article together with the git commit messages.
9. Reference Manual
Reference manuals are technical descriptions of Project Isidore's internal artifact architecture and how to operate it. For end users, please see the User Manual.
The Project Isidore Reference Manual is complete with cross-references to ASDF component dependencies, parents and children, classes' direct methods, super- and subclasses, slot readers and writers, setf expanders' access and update functions, and so on. The reference manual also includes exhaustive, multiple-entry indexes for every documented item:
- System components (modules and files)
- Packages
- Exported and internal definitions of
- Constants
- Special variables
- Symbol macros
- Macros
- Setf expanders
- Compiler macros
- Functions
- Generic functions
- Generic methods
- Method combinations
- Conditions
- Structures
- Classes
- Types
With all that being said, when the boundary between user and developer is crossed, it makes much more sense to clone the source code and explore it in your LISP IDE. Auto generated manuals may be slightly more useful in LISP than other languages, thanks to the excellent introspection capabilities of SBCL and ASDF, but still are largely only useful for index generation.
9.1. Generate Reference Manual
The Declt Common Lisp library is used to generate the reference manual in .texi texinfo format. GNU Texinfo is able to convert a single .texi source file into online HTML format, PDF documentation, as well as other formats. On a high level, Declt uses ASDF and the SBCL contrib sb-introspect to query and extract documentation strings, lambda lists, slot types, allocation and initialization arguments, and definition source files. It requires no special architecture choices in a system, other than conformance to ASDF conventions. Beyond that, writing clear docstrings where the opportunity arises will yield great results.
The generate-doc.lisp script is run by a git pre-commit hook. Upon every commit, it regenerates the reference manual and includes the changes in the commit. pre-commit is originally located at /project-isidore/.git/hooks/pre-commit.sample.
#!/bin/sh
# Generate updated manual. SBCL must be installed. See documentation for
# environment setup.
sbcl --load /home/ben/quicklisp/local-projects/project-isidore/src/generate-doc.lisp
# Add updated manual.html to commit.
git add /home/ben/quicklisp/local-projects/project-isidore/assets/reference-manual.html
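For reference, a hedged sketch of what such a generate-doc.lisp could contain; the exact Declt arguments and file names in the real script may differ.

;; Hypothetical generate-doc.lisp sketch. Assumes Declt and a makeinfo
;; binary are available; output file names are illustrative only.
(ql:quickload '(:project-isidore :net.didierverna.declt))
;; Extract docstrings, lambda lists and ASDF structure into a .texi file.
(net.didierverna.declt:declt :project-isidore)
;; Convert the Texinfo source into a single self-contained HTML page.
(uiop:run-program (list "makeinfo" "--no-split" "--html"
                        "project-isidore.texi"
                        "-o" "assets/reference-manual.html"))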
10. Project History & Credits
For previous project iterations and experience, see the project-isidore-java repository on GitHub (using Java Spring) and the project-isidore-javascript repository on GitHub (using NextJS). See also the MEAN stack notes.
Credit must be given where credit is due. This website would not be possible without,
- Wonderful friends and family :)
- (Spac)Emacs Community
- Common Lisp Community
- GNU/Linux Debian Community
- Oracle Cloud Infrastructure, Cloudflare, Github and Porkbun.
From the bottom of my heart, thank you!