Lamby

This last weekend at Hack Ridge, I worked on a project called Lamby. Lamby is a super simple platform for serverless, function-driven development. It works by defining individual API routes as functions which return string values. I'm really happy with the progress that I made; the app is currently live on https://www.lamby.xyz.

Getting started with lamby

In lamby, the basic unit of computation is a function. Each "function" has a main procedure:

def main(session):
       return "<h1>test</h1>"

This procedure returns a string which is then returned when the route to the function has been GETed.

The Session object

In lamby, the session variable that is passed to every function is a mutable dictionary that is shared between all sessions of all functions.

The Request object

In Lamby, all POST functions are passed a request object that has information about the request that has been performed. An example request object looks like:

{'path': '/query/example',
'args': {'a': 2},
'method': 'POST',
'form': {}}

Using modules

Lamby allows for the creation of shared modules which can then be referenced in functions. For example, if you made the following module:

def double(input):
      return input * 2

and named it test, then in a function you could reference it with the include statement:

include test
def main(session):
      return double("abc")

and use its functions.

Implementation

I built a simple server using Flask, uWSGI, nginx, and Docker. The website itself is a simple Flask application, run through uWSGI and served through nginx. The main application runs as a Docker container, but is also able to spin up sibling containers for each API call, so arbitrary code can be adequately sandboxed.

Future Goals

During the hackathon, I did end up implementing the Stripe API, with the goal of monetizing API requests. I'm considering pursing expanding Lamby to encompass and containerize all of the necessities to make functional web apps.

Algebraic Typing in Lime

Before I started working on Lime, and was only familiar with statically typed languages in the context of Java and C, I thought type systems were really limiting. However, the incredible capability of type systems of languages like Haskell and Rust (Rust's is even turing complete!) have proven this very wrong.

A lot of these innovations come from a field of mathematics called category theory. The fundamental object of category theory is a directed graph called a category:

A drawing showing the objects X, Y, and Z connected by f, g, and f of g. X is mapped to Y by f, Y is mapped to Z by g, and Z is connected to X by g of f.

The nodes of these graphs are called objects, and the edges are called morphisms (hence the term polymorphism).

One of the most important parts of modern type systems is something called algebraic typing. Algebraic typing allows for functions that can take multiple forms, based on an algebraic expression. For example, the car command which returns the first element of a list must have a form for int$, str$, int$$, and so on, cannot be defined through polymorphism: there would be infinitely many forms. Instead, it can be defined algebraically:

car (list:A$) -> A

where A is a placeholder, which may be substituted for any type. Therefore, car will be defined for lists of all types.

More complicated expressions can be done with container types like dictionaries:

dget (dict:[A->B] key:A) -> B

In this example, the theoretical dget command (to get a value from a key in a dictionary) accepts dictionaries mapping any type to any other type, accepts a key of the key type of the dictionary, and returns a value with the type of the value type of the dictionary.

Additionally, since Lime is a first-order language, all functions have explicit types signatures, declaring their input and output types and arity. Functions that operate on other functions can have super explicit type signatures. For example, the map command's signature is:

map (lambda[A->B] A$) -> B$

which will ensure that the list being mapped is of the input type of the function, and will make the output type of map a list of the output type of the function.

The new type system will be merged in 1.2.0.

My MacOS Configuration

This article outlines my configuration for my Mac. All of my dotfiles, as well as a copy of this guide, are available at https://gitlab.com/lschumm/dotfiles.

OS Installation

  • Erase the main disk, erase and rename as Macintosh HD and format it as plain APFS.
  • Then, choose the Reinstall MacOS option on the new disk.
  • On installation make sure to choose: - "Don't transfer any information now" on the Migration Assistant page - Sign in with Apple ID - Set username to lschumm instead of liamschumm; but do not allow Apple ID to reset this password - Under "Express Set-Up", choose "Customize Settings" - Enable Location Services - Disable analytics - Enable Siri - Do not "store files from Documents and Desktop in iCloud Drive" - Enable FileVault; but do not allow iCloud password to unlock the disk - Choose Light theme

OS Configuration

Install XCode

Go into the App Store and download XCode.

To install the XCode command line tools, run:

xcode-select --install
sudo xcodebuild -license

Install Homebrew

Use the brew default installer.

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

Change hostname

Under System Preferences > Sharing, change the hostname to the preferred name of your machine (I use the last names of famous computer scientists):

Shows the Computer Name field of the Sharing preferences pane with the text set to my computer name: :code:`dijkstra`.

Set up Git credentials

To set up your git name and email, run

git config --global --edit

Install Python3

MacOS (as of Mojave) installs Python 2.7.10, which is a reasonable version of Python 2 to have. To install Python 3, I recommend brew:

brew install python3

Install python2-pip

When brew installed python3, it also installed pip3. However, MacOS' default version of Python does not include pip. It does include easy_install, which can be used to install pip:

sudo easy_install pip

Install the latest version of Emacs

MacOS (as of Mojave) installs Emacs 22.1.1, which is ancient. Install emacs26 with brew:

brew install emacs

App Installation

I like installing apps through brew cask because command line stuff is great, and it allows all of my applications to be centrally managed by a utility. Nothing is in the Mac App Store.

brew cask install telegram
brew cask install gimp
brew cask install spotify

Dock Configuration

I personally don't really like the Dock, because it takes up space on my small 11 inch screen. I use Spotlight to open applications, and Command + Tab to switch applications/see what's open. This command will set the Dock autohide delay to 1000 seconds, so you have to hover over the bottom of the screen for 1000 seconds to display the dock.

If you really need your dock, Command + Option + D will manually open it; press it again to hide it.

defaults write com.apple.dock autohide-delay -float 1000; killall Dock

Disabling Gatekeeper

Gatekeeper is that annoying "can't be opened because it is from an unidentified developer" thing. To disaple it, run:

sudo spctl --master-disable

I'm a responsible user and don't download random binaries and execute them; I don't like MacOS second guessing me.

Setting up File Hierarchy

I actually like to remap my home directory to Desktop, so it's visible on my screen. I remap my HOME environment variable to ~/Desktop.

mkdir Desktop/research
mkdir Desktop/blog

Run dotfiles installer

cd Desktop
cd dotfiles
./install

Install wakatime

Install wakatime globally using pip:

sudo pip install wakatime

Then, install the wakatime-mode package in emacs. First run:

M-x package-refresh-contents [RET]

to fetch the latest versions of packages. Then, install the package:

M-x package-install [RET] wakatime-mode [RET]

Make Finder show all file extensions

Not showing file extensions on some files is really needlessly confusing. Disable it in Finder > Preferences. I also like disabling warnings when changing extensions.

Shows the :code:`Finder > Prefences` pane with "Show all filename extensions" selected, and "Show warning before changing an extension" unselected.

Set key repeat rate

I like setting my key repeat rate and key repeat time to very fast settings. You can set this under System Preferences > Keyboard > Keyboard.

Shows the :code:`System Preferences > Keyboard > Keyboard` menu with "Key Repeat" set to the "Fast" setting, and "Delay Until Repeat" set to the "Short" setting.

Enable meta key in terminal

To use option as the meta key in Terminal, enable it in Terminal > Preferences > Profiles > Keyboard:

Shows the :code:`Terminal > Preferences > Profiles > Keyboard` pane with "Use Option as Meta key" selected"

Types in Lime

Lime is a strongly, strictly typed language.

Default Types

The default atomic types in Lime are:

  • bool (binary truth value, true or false)
  • int (integer, e.g., -12)
  • float (floating-point number, e.g., 2.0)
  • str (string, e.g., "abc")

Lists can be made out of any type, even lists themselves, by appending a $ token. For example, a list of int`s have the signature :code:`int$, and a list of int$`s will have the signature :code:`int$$.

Custom Types

Defining a struct

To define a struct, use the struct directive at the top level of a program.

(struct person first_name:str last_name:str age:int)

This will also define a new type, person.

Initializing a struct

To initialize a struct, use the new keyword, which will initialize and return a new instance of the struct specified.

(new person first_name "Liam" last_name "Schumm" age 17)

You can specify only a subset of the attributes when initializing, by not including the keys for those items:

(new person first_name "Liam" age 17)

Getting values of a struct

To get the values of a struct, use the get command. For example,

(get example_person first_name)

would return the first_name attribute of example_person.

Setting values of a struct

To set the values of a struct, use the set command. For example,

(set example_person (first_name 2))

would set the first_name attribute of example_person to 2, and return a reference to the object.

Type Signatures

In Lime, all functions must have a type specified for their return value and for their inputs. The type signature BNF is as following:

type           ::= (int|float|str...)
                 | type$
function_name  ::= [A-z]+
type_signature ::= function_name:type (type*)

Polymorphism

Lime has polymorphism, aka "overriding". This allows you to define a single function with different behavior when it is passed different types. For example,

(defun increment:float (x:float) (fadd x 1.0))
(defun increment:int   (x:int)   (iadd x 1))

would define the increment function for ints and floats. This can also be used for optional arguments, like so:

(defun sinput:str () (sinput ":"))
(defun sinput:str (prompt:str) (...c implementation...))

if a prompt is not explicitly defined, it will default to ":".

The any type

The any type allows methods to be specified that take arbitrary combinations of arguments that match a specific expression. For example, the lget command is defined to operate on lists of any and return a singular element of type any. The type signature looks like:

lget:any (arr:any$)

Reimplementing Lime in C

In my quest for minimalism, I decided I needed to stop carrying around two laptops---a Linux one that I used for developing Lime, and a MacOS one iOS development (and schoolwork). Turns out all of the Lime code I wrote used 32 bit assembler. The Mach kernel doesn't support 32-bit addressing, rendering all of that code unusable. Instead of emulating, I decided to totally revamp and rewrite the Lime language from scratch, this time starting in C instead of in Assembler. I realized that a lot of the work I put in developing an Assembler version focused on small-scale operations that gcc/clang would know how to do better than I could. structs and formal data types also mean that I have to spend less time developing a typing system and reimplementing basic operations, and more time actually working on interesting language features.

I took my time with implementing this, and came up with a pretty neat implementation, still utilizing my original stack-based design for the language.

I first implemented a basic stack as an array of longs. Integers are stored as long values, while strings/floats/etc.'s pointers are stored.

I then created a struct for a lambda function with partial application support:

typedef void (*func)();

typedef struct {
       int resolved;
       func function;
       int args_left;
       long* args;
       int args_len;
} lambda;

resolved is 0 if more arguments must be supplied, 1 if it has been processed, applied to the arguments, and the args pointer has been free()d.

The Lime language outputs three basic instructions: push, pop, and resolve. push and pop mutate the stack, and resolve pops a lambda pointer off the stack and applies it to the supplied arguments. My implementation for resolve was:

void resolve(int n) {
       f = *(lambda*)pop();
       if (f.args_left == 0) {
              f.function();
              f.resolved = 1;
       }

       for (int i = 0; i < n; i++) {
              f = apply_lambda(f, pop());
       }
       if (!f.resolved) {
              push((long)&f);
              fp = malloc(sizeof(lambda));
              f = *fp;
       }
}

As a result, Lime compiler output, instead of being a mess of assembler, now looks like, from the following Lime code:

System Message: ERROR/3 (<string>, line 54)

Error in "code" directive: maximum 1 argument(s) allowed, 3 supplied.

.. code:: c
   #include "lib/lime.h"

   int add3() {
          int stack_base_ptr = stack_ptr;
          push((long)3);
          push((long)get_arg(0, stack_base_ptr));
          push(make_lambda((void (*)())iadd, 2));
          resolve(2);
          clear_args(1, stack_base_ptr);
          return 0;
   }

   int main() {
          int stack_base_ptr = stack_ptr;
          push((long)2);
          push((long)1);
          push(make_lambda((void (*)())iadd, 2));
          resolve(2);

          push(make_lambda((void (*)())add3, 1));
          resolve(1);

          push(make_lambda((void (*)())iprint, 1));
          resolve(1);
          clear_args(0, stack_base_ptr);
          return 0;
   }

It's still a bit messy, with lots of ugly casting code and meaningless int return values; but with some macro (and potentially preprocessor) magic it'll get even more understandable. Instead of needing to convert between type systems and calling conventions for Lime, it'll be pretty simple to build a compatability layer to interoperate C and Lime functions, so I can leverage C's standard library in the development and use of Lime.

A Reasonable, Minimal Arch Linux Configuration

This post is a guide for setting up Arch Linux with reasonable defaults. Since this also doubles as documentation for my personal computer configuration, I'll update this post as I make changes.

Please follow this guide at your own risk. Arch is a hobbyist distro, and I am not responsible for any damage you may do to, or data you may lose from, your computer.

Disk Partitioning and Encryption

Overwriting the Disk

First, writing random data to the disk will ensure security by overwriting previous data on the disk, and also making it more difficult to determine the used size of the encrypted volume. The shred command will overwrite the disk with random data (make sure you've backed up your hard drive!):

$ shred --verbose --random-source =/dev/urandom --iterations=3 /dev/sda

Note that with 3 iterations, this may take around 7 hours.

Configuring Partitions

Next, we'll configure our two partitions. The first partition will be a 100 megabyte boot partition, and the second will hold all of our data, encrypted.

$ cfdisk /dev/sda
New > Partition Size: 100M > primary > Bootable
New > Partition Size: [RET] > primary

Configuring Encryption

On the second partition we created, we'll use cryptsetup to initialize an encrypted volume (which may then be mounted like any other device or drive):

$ cryptsetup --verbose --cipher aes-xts-plain64 --key-size 512 --hash sha512 --iter-time 5000 --use-random luksFormat /dev/sda2

Then, we'll mount the second encrypted partition under the name cryptroot.

$ cryptsetup open --type luks /dev/sda2 cryptroot

Partitioning Disks

Format the boot partition as ext4:

$ mkfs.ext4 /dev/sda1

and then format our main (encrypted) partition also as ext4:

$ mkfs.ext4 /dev/mapper/cryptroot

Then, mount the newly created filesystems:

$ mount -t ext4 /dev/mapper/cryptroot /mnt
$ mkdir -p /mnt/boot
$ mount -t ext4 /dev/sda1 /mnt/boot

Installing the Base System

To install the base system and the base developer utilities (required for building packages from the Arch User Repository), use Arch's package manager pacman to write these to our new root partition:

$ pacstrap -i /mnt base base-devel

Then, you'll need to generate something called an fstab. The fstab (FileSystem TABle) tells the computer which partitions to mount at boot:

$ genfstab -U -p /mnt >> /mnt/etc/fstab

Then, changeroot into the new root partition (tell the Linux kernel to use the new filesystem as the root one), so you can operate in the new OS:

$ arch-chroot /mnt

Configuring Locale

Locale is an important parameter of Linux. It tells applications how to render text in the language specified, how to display dates, and other things.

Since you're reading this guide in English, I'm going to assume you'll want US English, UTF-8 encoding. Open up your locale.conf:

$ nano /etc/locale.gen

and uncomment the line that reads:

#en_US.UTF-8 UTF-8

so that it now says:

en_US.UTF-8 UTF-8

Then, generate your newly configured locale:

$ locale-gen

Create your locale.conf:

$ echo LANG=en_US.UTF-8 > /etc/locale.conf

And tell your current running shell to use this new locale:

$ export LANG=en_US.UTF-8

Setting Time Zone

To set your time zone, make a symlink from one of the timezones in /usr/share/zoneinfo to /etc/localtime, like so:

$ ln -s /usr/share/zoneinfo/America/Chicago /etc/localtime

If you're not in Chicago, shell autocomplete (or ls /usr/share/zoneinfo) can help you find your appropriate timezone.

Set your hardware clock time to UTC:

$ hwclock --systohc --utc

Setting Hostname

To set your hostname, just write to the /etc/hostname file with the desired name:

echo dreammachine > /etc/hostname

Configuring Users

First, we'll need to configure a password for the root user. Enter your desired password after running:

$ passwd root

To add a new user with reasonable defaults, use the following like (with username replaced with your desired username):

useradd -m -g users -G wheel,games,power,optical,storage,scanner,lp,audio,video -s /bin/bash username

The -G flag indicates which groups you'd like this new user to be a part of, which indicates what kinds of operations they can perform on the system. wheel is the standard name for users who can use the sudo command (coming from the old-timey phrase "big wheel" meaning somebody important). Other groups are not necessarily necessary, but are standard and may be used by other software.

To set a password for this new user, use the passwd command we used earlier with the username of the user you just created:

passwd lschumm

Configuring grub

GNU GRUB (GNU GRand Unified Bootloader), is a simple libre bootloader. We'll need to install this in order to tell the computer how to boot the new OS.

First, you'll need to install the grub and grub-bios packages on the new system:

$ pacman -S grub grub-bios

Then, you'll need to tell grub that you're using an encrypted drive. Open up your grub configuration:

$ nano /etc/default/grub

and change

GRUB_CMDLINE_LINUX=""

to

GRUB_CMDLINE_LINUX="cryptdevice=/dev/sda2:cryptroot"

Configuring mkinitcpio

mkinitcpio generates the initial ramdisk loaded by Linux. You'll also need to tell this that you're using an encrypted disk by adding encrypt to it's list of "hooks". Open mkinitcpio's configuration file:

$ nano /etc/mkinitcpio.conf

and change this line:

HOOKS="base udev autodetect modconf block filesystems keyboard fsck"

to

HOOKS="base udev autodetect modconf block encrypt filesystems keyboard fsck"

Then, regenerate the initramfs image by running the mkinitcpio utility again:

$ mkinitcpio -p linux

Preparing for First Boot

To wrap up all of our configuration, first let's finish the GRUB configuration:

$ grub-install --recheck /dev/sda
$ grub-mkconfig --output /boot/grub/grub.cf

Then exit the system and reboot:

$ exit
$ umount -R /mnt/boot
$ umount -R /mnt
$ cryptsetup close cryptroot
$ systemctl reboot

First Boot

Now that our system is booting, let's get started with configuration. Since we'll have to run pacman and we haven't configured sudo yet, log in as root.

Connecting to the Internet

For this guide, you'll need an ethernet cable. With your computer plugged in to this cable, run:

$ dhcpcd

This will get an IP for your computer and connect you to the internet.

Give your packages a quick update:

$ pacman -Syy

And then give your system an update:

$ pacman -Syu

And reboot:

$ reboot

Configuring Wifi

To get Wi-Fi working with the default wifi-menu interface, you'll have to install the dialog and wpa_supplicant packages:

$ pacman -S dialog wpa_supplicant

Launching wifi-menu then scans for networks and lets you connect:

$ wifi-menu

Configuring sudo

First, install sudo (it's not included by default in arch base or base-devel):

$ pacman -S sudo

After that's finished installing, you'll need to tell sudo to let users of group wheel use it. Open /etc/sudoers with nano:

$ nano /etc/sudoers

and uncomment the line which says:

# %wheel ALL=(ALL) ALL

This allows wheel users to run execute all commands with root privileges.

Reboot, and log in as your user account.

Configuring i3 and Xorg

First, install Xorg:

$ sudo pacman -S xorg-server xorg-server-utils xorg-xinit

Then, install the i3 window manager:

$ sudo pacman -S i3-wm

And tell Xorg to launch i3 on startup:

$ echo "exec i3" > ~/.xinitrc

To start the window manager, just run:

$ startx

Other Configuration

All of my app-specific configuration is stored in public configuration files on my GitLab.

Credits

  • Most of the initial setup for this guide was taken from this HowtoForge tutorial with added explanation and notes.

CodeDay Fall 2018

CodeDay Fall 2018 was a blast!

What went well

To start off, opening ceremonies went pretty well. The attendees seemed to be excited to get started working ("Why can't we just move on to coding already?"). I ran a quick "speed dating" activity where I divided the attendees into two lines, pairing each off with a person in the other line, and asked them some questions I found on a dating website. The cheesy nature of these questions got a lot of people to meet people they never had before, and form teams not just of the people they go to school with.

I'm really happy with the vibe we had at this season's CodeDay. I think it effectively portrayed an atmosphere of productivity and motivation. As with any tech event hosted in a venue with large TVs, there were lots of Nintendo Switch tournaments, but by the end everybody had finished a project they were proud of.

One thing that I've specifically tried to focus on in all of my coding activities is increasing diversity. I'm really happy with our turnout in terms of racial and gender representation; the trope of one girl on one team "for moral support" didn't hold true at all. A lot of the team leaders and our most knowledgable mentor were all girls, which was really helpful for changing the norms around gender at the event.

Speaking of mentors, the adults that helped this CodeDay were absolutely invaluable. Typically adult mentors only stay for a couple hours, but this event all of them stayed the whole 24 hours, guiding each of the attendees from start to finish in creating apps and games.

What could have gone better

Unfortunately, the venue space was pretty spread out. Since CodeDay was hosted in a company instead of the typical incubator space, teams were scatted throughout conference rooms and divided by offices. This meant that it was pretty easy as a beginner to lock themselves in a room with their friends that was totally cut off from the rest of the attendees.

I also think workshops could have been better for beginners. The curricula that I've worked on with a few other volunteers for CodeDay Chicago could definitely be designed in a way that's a lot easier to understand for people who don't have a background in programming. It'd like to in the future rehash some of these to be more open-ended and project-based so that the takeaways from these workshops can be used as a jumping off point for CodeDay projects.

Pictures

Some fun moments captured by our resident photographer and amazing volunteer, Christian Sparks:

https://d33wubrfki0l68.cloudfront.net/01b9392447b7a1c166d46fe23fd7ed8b08c1262a/d873c/images/workshop-gabrielle-casey.jpg

One of our mentors giving guidance to a team working on a game.

https://d33wubrfki0l68.cloudfront.net/690f34b71597f9d3aaee5380486186e481b5e616/c40f6/images/whiteboarding.jpg

One of our teams whiteboarding out ideas.

https://d33wubrfki0l68.cloudfront.net/607a3381dce5884e1d8d364a21ad6a0a3e425876/10a4d/images/n64.jpg

Trying out the office N64 (I love trendy tech companies.)

https://d33wubrfki0l68.cloudfront.net/43c7392bb59b6b88e33ae618dca15f038fc67a14/1899f/images/workshop.jpg

Our intro to web development workshop, taught by Nik Ermolov.

https://d33wubrfki0l68.cloudfront.net/69be3a76d1113d9bd9e831da21b6216ea007387d/01c34/images/crazycoding.jpg

Me furiously hacking the mainframe.

https://d33wubrfki0l68.cloudfront.net/c107b526bb28f749acd3cf473f0a2a3d38f8a6e3/c579f/images/muchneededrest.jpg

Getting some much needed rest.

RacketCon 2018

I'm attended RacketCon from September 29-30th this 2018. Here are my talk/conference notes and biased thoughts.

Talks

DSSL2: A #lang For Data Structures Students

DSSL2 is a language designed by a professor at Northwestern to provide a good environment for data structures students. It uses a Python-like syntax to be familiar and accessible to students from a wide range of backgrounds. The language looks like this:

#lang dssl2

def insert!(t, k):
    if empty?(t): new_node(k)
    elif zero?(random(size(t) + 1)):
        root_insert!(t, k)
    elif k < t.key:
        t.left = insert!(t.left, k)
        fix_size!(t)
        t
    elif k > t.key:
        t.right = insert!(t.right, k)
        fix_size!(t)
        t
    else: t

and is constructed entirely using the extensive Racket macro system.

  • It's a really effective DSL because it uses existing knowledge of students, and has been successful in undergrad data structures course.
  • There's no existing standard of who knows what in undergrad engineering (aside from 10 weeks of C++) so they'll have to learn something else anyways, unless the professor wants to manage 10 different languages for each assignment.
  • DSSL2 is also effective in that it essentially forces semantic programming (no mutation, etc.), helping guide them in every step of their programming.

Bringing Back Design By Numbers

Design By Numbers is a pretty old language initially designed for artists who wished to learn programming. Some Racket folks have re-implement Design By Numbers in Racket. Code is simple and looks like plain english:

#lang dbn

Paper 0
Pen 100

Command HL Y {
   Set X 25
   Line X Y X Y
}

HL 50

and supports a lot of different language constructs.

  • I think Design By numbers wasn't particularly successful because it doesn't follow semantic guidelines at all.
  • It has inconsistent capitalization, brackets are not scope, and a really odd standard library.
  • While it's difficult for beginners to learn and understand semantics, making a confusing language will make doing anything advanced at all nonobvious to experienced programmers and especially beginners.

Language-Oriented Programming

Language oriented programming is a term that comes up a lot in the Lisp community. Language-Oriented Programming refers to the paradigm of creating DSLs (Domain-Specific Languages) to serve specific tasks: either an entire language (Racket is a DSL built on top of its core primitives) or a small language for a specific task (see DSSL2 above).

How To Ask For The Time In Racket

Gregor is a date (not datetime!) library for Racket.

  • Gregor's (today) and (now) differ because (today) only holds the data for the current day; not time, utc offset, etc. etc.
  • new Date("2001-01-01") returns something different from the actual day, like Fri Dec 31 1999 18:00:00 GMT-0600 (Central Standard Time), because it interprets this as UTC and localizes it.
  • Many other bugs and caveats like this exist. As such, it's useful to only store the date when you want to store a date, not the accompanying time.

A Racket Story

  • DSLs can be really useful for specific coding classes.
  • #lang minecraft-lang
  • #lang vr
  • Specifically difficult languages (e.g., microcontrollers) can greatly benefit from having a language for them written in Racket, by removing the learning curve for a new environment and providing a similar programming interface (Racket) for each task.
  • Cool Racket Packages
  • Protocol Buffers
  • Generates binary objects storing arbitrary data, and can serialize and deserialize automatically. Useful for network transfers.
  • feature-profile
  • LISP is typically slow, since that's not it's focus.
  • Profiling is underused.
  • Built-in profile functionality in Racket is statistical and typically operaties on limited data.
  • The feature-profile module can operate on individual features. While this can lead to micro-optimizations, it also allows for a greater benchmarking granularity that can result in optimization.
  • for-acc
  • User-specified accumulators with automated post-processing.
  • Now part of the base distribution!
  • fc or find-collection
  • Find a module on disk :)
  • handy
  • A mish-mash library. Just no.

Racket For Everyone (Else)

Teaching programming to people who aren't from a computer science/math background can be difficult, but really fruitful (see next section "Digital Ricoeur").

  • A majority of computer science tutorials use mathematical concepts so math people can understand, but this is a huge barrier to entry.
  • DSLs are a good way to provide small interactive deliverables.
  • Scribble and Frog, specifically, for making academic papers and personal blogs.
  • Dr. Racket significantly lowers the barrier to entry, and has great suppotr for application-y interactive projects.

Digital Ricoeur

Digital Ricoeur is an online archive of Paul Ricoeur's works.

  • Paul Ricoeur published in a lot of different journals, so as a result no one university has access to all of his works
  • What's something that can easily search a lot of sources? Computers!
  • Copyright prevents one organization from digitizing and distributing all works
  • Digital Ricoeur compiles lists of records to tell academics where to look for a specific source
  • DSL application: a DSL to notate publications and publication queries

SINBAD

Sinbad (automated Structure InfereNce and Binding of Data) is a library to access and import data uniformly from online RSS/XML feeds. Examples are:

(require sinbad)

(define faa ;; fetch airport status information
    (sail-to "http://services.faa.gov/airport/status/ATL?format=application/xml"
        (cache-timeout 300)  ; refresh every 5 minutes
        (manifest)           ; display schema for available data upon load
        (load)))

(define-struct port (name loc condition delay?))

(fetch faa (make-port "Name" "State" "Weather/Weather" "Delay"))

=> (make-port "Hartsfield-Jackson Atlanta International" "Georgia" "Fair" #false)

;;
;; do more complicated stuff with large datasets
;;

(define vehicle-data-source
    (sail-to "http://www.fueleconomy.gov/feg/epadata/vehicles.csv.zip"
        (load) (manifest)))

(define-struct auto (make model year trans city-mpg hwy-mpg))

;; pull a random data record...
(fetch-random vehicle-data-source
    (make-auto "make" "model" "year" "trany" "city08" "highway08"))

;; get all ~40,000 (!) of them...
(define all-vehicles
    (fetch vehicle-data-source
        (make-auto "make" "model" "year" "trany" "city08" "highway08")))

(length all-vehicles)
  • Get data feeds online, and get RSS/XML link, by requesting sail-to a URL.
  • Add in labels, and it'll fetch them out of the XML.
  • Easily load fetched data into a Racket struct.

Cassius: Verified Web Pages

Cassius is a mathematical formalization of web page layout.

  • How do you verify the appearance of a webpage?
  • Webpages are programs: taking inputs (font size, horizontal size, vertical size, etc.), and returning visual output.
  • W3C unittests are out of date and often don't even work.
  • For those with low vision, ensuring full accessibility compliance for websites is really important.
  • Cassius formalizes web page layout as a mathematical system and uses existing theorem-proving techniques to verify webpage layout.

Petri-Net Flavored Places

A petri net or transition net diagrams the flow and behavior of a distributed system.

Dataflow Network Programming with Neuron

Neuron is a compositional framework to make network software development easy and fun.

  • What if Facebook likes could actually contribute something?
  • Users can give computing power! But how do you computationally facilitate this?
  • Neuron defines two constructs:
  • Sockets
  • Structure to pair input/output ports.
  • Codecs
  • Reader and writer with encoder and decoder functions.

State of Racket

Racket's moving to a Chez scheme base! Spaghetti-ish core code (written in C) is being replaced with Chez Scheme code. To compile Chez scheme, just run

make cs

instead of

make

when building Racket.

FARM

FARM (Functional Art, Music, Modeling, and Design) (yes I know it's not an acronym, I didn't choose it) was a night of algorithmic and functionally-generated music. I didn't get a chance to go to the talks on how the algorithms worked (they happened at the same time as RacketCon), but they sounded pretty cool. Livecoding was also a major part of the event, and another cool application of DSLs!

Workshops

syntax/parse

  • Canonical reference for learning Lisp macros: Macro-By-Example.
  • racket/private/template - good example of macros
  • ~? operator
x = #'apple
#'(eat (~? x 'nothing'))
= (eat apple) if x is defined
otherwise, (eat 'nothing')

Web Development in Racket

I went to a workshop with the amazing Jesse Alama about the state of web development in Racket. Here were a few standout things:

  • sweet.js provides the goodness of Racket-like hygienic macros to unholy JavaScript.
  • db is a nice ORM.
  • Here is a nice intro to web development in Racket workshop written using the standard Node/Flask/etc. method of URL dispatchers.

Unrelated

Prime Checking in NASM

Primality Checking

It's very difficult to quickly determine the prime factors of a number because there's no known deterministic algorithm to do so. Therefore, the only method to determine prime factors and primality is a brute force approach with as much information as possible.

Simple Brute-Force

It's rather trivial to show that if a number n has multiple prime factors, every prime factor must be strictly smaller than n. Therefore, the simplest algorithm to check the primality of a number starts n and loops all the way down to 2, stopping before it reaches 1 (every number is divisible by 1). The pseudocode for this is:

function is_prime(n) {
         i = n
         while i > 1 {
               if (n % i) == 0 {
                     // n is divisible by i, abort
                     return false
               }
               i -= 1
         }
         return true // we've made it through
}

Although this algorithm has to iterate through lots of possible divisors, it does work.

Bounded Brute-Force

A less trivial fact is that non-prime number must have at least one prime factor less than or equal to its square root. We proceed by contradiction: multiple factors must exist, since the number is composite, each being larger than the square root, whose product will then at least be larger than the product of the square root with itself, and thus larger than the given number itself.

Therefore, we may set an upper bound of sqrt(n) on our set of divisors, while still ensuring that at least 1 exists. This is promising, because as the number grows, its square root grows more and more slowly, making our algorithm more and more performant for larger integers. Thus, we can use our same is_prime implementation, with a significantly smaller upper bound:

function is_prime(n) {
         i = sqrt(n)
         while i > 1 {
               if (n % i) == 0 {
                     // n is divisible by i, abort
                     return false
               }
               i -= 1
         }
         return true // we've made it through
}

One more trick we can apply is checking first if a number is even. If a number is even and it isn't 2, the number is divisible by 2 and is thus not prime. We'll apply this trick before we start doing any type of intensive computation:

function is_prime(n) {
         if (i % 2) == 0 {
            if i == 2 { return true  }
            else      { return false }
         }
         i = sqrt(n)
         while i > 1 {
               if (n % i) == 0 {
                     // n is divisible by i, abort
                     return false
               }
               i -= 1
         }
         return true // we've made it through
}

Babylonian Bounded Brute-Force

Now, hold on a second... NASM doesn't have a convenient means of taking the square root truncated to an integer. To calculate the square root would necessitate loading to/from the FPU, which is very computationally intensive. However, we may instead use the Babylonian approximation for the square root of a number instead of the square root; which will still satisfyingly reduce the size of our test space. Recall the Babylonian method to calculate the square root of \(N\):

\begin{equation*} f(x) = \frac{x + N / x}{2} \end{equation*}

Repeated application of \(f\) onto any seed (number to start with) will result in a number that gets closer and closer to the true value of \(\sqrt{N}\). If \(x = \sqrt{N}\) exactly, then \(f\) will return the same value:

\begin{equation*} f(\sqrt{N}) = \frac{\sqrt{N} + N / \sqrt{N}}{2} = \frac{\sqrt{N} + \sqrt{N}}{2} = \sqrt{N}. \end{equation*}

For our purposes, we chose the seed \(x = N / 2\). Calculating one iteration of the Babylonian method then yields:

\begin{equation*} f(x) = \frac{(N / 2) + N / (N / 2)}{2} = \frac{(N / 2) + 2}{2} = \frac{N + 4}{4}. \end{equation*}

Using this convenient expression instead rids our algorithm of those pesky square roots:

function is_prime(n) {
         if (i % 2) == 0 {
            if i == 2 { return true  }
            else      { return false }
         }
         i = (n + 4) / 4
         while i > 1 {
               if (n % i) == 0 {
                     // n is divisible by i, abort
                     return false
               }
               i -= 1
         }
         return true // we've made it through
}

NASM Implementation

Below is the final implementation, in NASM, of the proposed prime checking algorithm. The number to check is passed in on register eax and the result, 0 (not prime) or 1 (prime) is returned in the same eax register.

is_prime:
        ;; save registers
        push ebx ;; store the value of eax, ebx, ecx, edx \
        push ecx ;; so functions called before and after won't have \
        push edx ;; values changed

       ;; even/odd check
       test eax, 1
       cmp eax, 2
       je is_prime_end
       jmp not_prime_end

       ;; save the value of eax in ebx
       mov ebx, eax

       ;; calculate our square root approximation
       add eax, 4 ;; eax = eax + 4

       mov ecx, 4 ;; set dividend to 0
       mov edx, 0 ;; set remainder to 0
       div ecx    ;; divide eax by ecx and put rounded result in eax

       ;; store our iteration variable in ecx
       mov ecx, eax

       is_prime_loop:
                ;; if ecx is 1 and we haven't found
                ;; a factor, it's prime
                cmp ecx, 1
                je is_prime_end

                ;; restore eax to original value
                mov eax, ebx

                ;; divide by ecx
                mov edx, 0 ;; set remainder to 0
                div ecx    ;; divide eax by ecx, leaving remainder in edx

                ;; if edx is 0, there's no remainder and we found a divisor
                cmp edx, 0
                je is_not_prime

                ;; decrement iterator
                dec ecx

                ;; loop through again
                jmp is_prime_loop

       is_not_prime:
                ;; restore values of registers
                pop edx
                pop ecx
                pop ebx

                ;; return value 0 in eax
                mov eax, 0
                ret

       is_prime:
                ;; restore values of registers
                pop edx
                pop ecx
                pop ebx

                ;; return value 1 in eax
                mov eax, 1
                ret

System Information

I ran this code on my Linux machine using the Netwide Assembler (NASM). Note that any other assembler may be used (the Intel instruction set is the same), but syntax may differ.

S-Expression Parsing in Lime

Description

This is an exciting time in Lime development! I've been working a lot on string parsing in Lime, and I've finally gotten to the point where I was able to implement a simple S-Expression parser in Lime.

An S-expression parser is a good idiomatic example of programming in a given language, and as the development of LimeOS continues, a self-hosted Lime runtime (Lime, written in Lime) will be needed to dynamically run Lime programs stored in memory and typed by the user.

The Code

Strategy

I decided to go with a relatively simple procedure for parsing S-expressions, taken from Peter Norvig's wonderful (How to Write a (Lisp) Interpreter (in Python)) (the article that first got me into Lisp!). In Python, this is simply:

def parse(string):
    return string.replace("(", " ( ").replace(")", " ) ").split()

At first, this looks a bit odd. Working it out on paper however shows that it indeed works:

   "(this (is an (s expression)))"
-> " ( this  ( is an  ( s expression ) ) ) "
-> ["(", "this", "(", "is", "an", "(", "s", "expression", ")", ")", ")"]

However, str.replace and str.split aren't implemented in Lime...yet!

string_split

First, I needed to implement a string_split method to split strings by spaces. I decided to go with the following strategy:

(define (string_split string separator)
        (filter is_nonempty_string (_string_split string separator "" (list 0))))

(define (_string_split string separator buf arr)
          (if (= string "") (list_append arr buf)
                 (if (string_startswith string separator)
                     (_string_split (string_ltrim (string_length separator) string) separator "" (list_append arr buf))
                     (_string_split (string_cdr string) separator (string_add buf (string_car string)) arr))))

In the main recursive function, _string_split (the regular string_split exists just to set the default arguments of buf and arr to default values), the flow is as follows:

  • if we have no more characters to parse, return the list we've created
  • otherwise,
  • if the current string in memory starts with the delimiter, remove the delimiter and add a new empty item to the list
  • otherwise, add the character at the beginning of the string to the previous element in the iterated list

This performs relatively well. One thing I did have to account for was concurrent delimiters. That is, when the string an item is being separated by occurs directly next to itself 2 or more times. For example, string_split not taking this into account would have the following behavior:

   (string_split "a b  c")
-> [a, b, , c]

which doesn't make good heuristic sense: separating words by spaces should also separate by blocks of space. Therefore, I defined some pretty simple functions, is_empty_string and is_nonempty_string to get around this. Before the result of string_split is returned, I use the filter function and is_nonempty_string to only return the results that are nonempty.

string_replace

The string iteration algorithm for string_replace is almost exactly the same as that for string_split:

(define (string_replace string search replacement)
        (_string_replace string search replacement ""))

(define (_string_replace string search replacement buf)
        (if (= string "") buf
            (if (string_startswith string search)
                (_string_replace (string_ltrim (string_length search) string) search replacement (string_add buf replacement))
                (_string_replace (string_cdr string) search replacement (string_add buf (string_car string))))))

However, instead of adding elements to a list of split elements, it simply traverses the string, and when it hits the delimiter, it replaces that delimiter with the replacement string and jumps to the character immediately after the replacement string.

Bringing It All Together

The final program I wrote, sexp.lime, looks like:

;;
;; [sexp.lime]
;;
;; A simple S-expression parser, written in Lime.
;; Copyright (C) 2018, Liam Schumm
;;

(define (parse string) (string_split
                       (string_replace (string_replace string
                                        ")" " ) ")
                                        "(" " ( ")
                       " "))
(define (main args) (print (parse (!! 0 args))))

Compiling this program with llc and running it then yields:

$ llc sexp.lime
$ ./a '(+ (- 3 4))
[(, +, (, -, 3, 4, ), )]

A properly parsed S-expression!

Lime System Information

This article was written and tested on Lime alpha, commit 53f8ae1, under 64-bit Arch Linux, kernel 4.18.5-arch1-1-ARCH.