Nov 21, 2017
 

In my last post, “Kleene’s Theorem,” I provided some useful background information about strings, regular languages, regular expressions, and finite automata before introducing the eponymous theorem that has become one of the cornerstones of artificial intelligence and, more specifically, natural language processing (NLP). Kleene’s Theorem tells us that regular expressions and finite state automata are equivalent when it comes to describing regular languages: any language one of them can describe, the other can describe too. In this post I will provide a proof of this groundbreaking principle.
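To make the claim concrete before any proof, here is a minimal sketch in Python that checks one regular expression against one hand-built finite automaton. The language chosen, zero or more repetitions of "ab", and every name in the code are my own illustrative assumptions. Agreement on short strings is only a sanity check, not a proof of the theorem.

```python
import re
from itertools import product

# Regular expression for the language (ab)*.
PATTERN = re.compile(r"(ab)*")

# Hand-built DFA for the same language (illustrative assumption).
# State 0: start and only accepting state. State 1: just read an 'a'.
# Any missing transition means the string is rejected.
TRANSITIONS = {(0, "a"): 1, (1, "b"): 0}


def dfa_accepts(text):
    """Run the DFA over text and report whether it ends in state 0."""
    state = 0
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no transition defined: reject
            return False
    return state == 0


def regex_accepts(text):
    """Report whether the whole string matches the regular expression."""
    return PATTERN.fullmatch(text) is not None


def agree_up_to(max_len, alphabet="ab"):
    """Check that both recognizers agree on every string up to max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            text = "".join(chars)
            if dfa_accepts(text) != regex_accepts(text):
                return False
    return True
```

Calling `agree_up_to(6)` compares the two recognizers on all 127 strings over {a, b} of length six or less.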

Continue reading »

Nov 09, 2017
 

Strings

As a computer programmer for more than a quarter of a century, I don’t think I have ever thought much about strings. I knew the basics. In every language I’d worked with, strings were a data type unto themselves. Superficially they are a sequence of characters, but behind the scenes, computers store and manipulate them as arrays of bytes. In programs, they can be stored in variables or constants, and often show up in source code as literals, i.e., fixed, quoted values like “salary” or “bumfuzzle.” (That is my new favorite word, btw.) Outside of occasionally navigating the subtleties of encoding and decoding them, I never gave strings a second thought.

Even when I first dipped my toe into the waters of natural language processing, aka NLP (not to be confused with the quasi-scientific neuro-linguistic programming, which unfortunately shares the same acronym), I still really only worked with strings as whole entities: words or affixes. As I made my way through familiarizing myself with existing NLP tools, I didn’t have to dive any deeper than that. It was only when I started programming my own tools from the ground up that I learned about the very formal mathematics behind strings and their relationship to sets and set theory. This post will be an attempt to explain what I learned.

Continue reading »

Oct 12, 2017
 

Boolean functions, sometimes also called switching functions, are functions that take as their input zero or more Boolean values (1 or 0, true or false, etc.) and output a single Boolean value. The number of inputs to the function is called the arity of the function and is denoted as k. Every k-ary function can be written as a propositional formula, a sentence in propositional logic. A binary Boolean function, a Boolean function with two arguments, can be described by one of sixteen canonical formulas.
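Where the number sixteen comes from can be sketched in a few lines of Python. A binary function has four possible input pairs, and each pair can map to 0 or 1, giving 2^4 = 16 distinct truth tables. The function names below are my own, chosen for illustration.

```python
from itertools import product

# The four possible input pairs of a binary Boolean function.
INPUTS = list(product((0, 1), repeat=2))


def binary_boolean_functions():
    """Return every binary Boolean function as a truth table.

    Each truth table is a dict mapping an input pair to an output bit.
    """
    tables = []
    for outputs in product((0, 1), repeat=len(INPUTS)):
        tables.append(dict(zip(INPUTS, outputs)))
    return tables


def count_k_ary_functions(k):
    """Count the Boolean functions of arity k: 2 ** (2 ** k)."""
    return 2 ** (2 ** k)
```

Familiar operators such as AND, OR, and XOR each appear as exactly one of the tables returned by `binary_boolean_functions()`.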

Continue reading »

Jun 26, 2017
 


In general it’s a good idea to see the warnings your code generates while you are testing, but if you are anything like me, you usually don’t need to see warnings generated by third-party code. I was plagued by this today as I was testing a function that utilized NLTK, one of the most popular natural language processing libraries for Python, if not the most popular.

I’m not too proud to admit that only very rarely do my unit tests run without any failures. It’s usually difficult enough to track down the failures and errors without also being swamped by a ton of extraneous warnings generated by third-party software. Such was the case with a simple function I had written to remove accidental duplicate characters from a piece of text.
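One general way to quiet warnings around a single call is Python's standard `warnings` module. The sketch below is not necessarily the approach the full post takes. `noisy_third_party_call` is a hypothetical stand-in for a library function and `call_quietly` is a helper name I made up.

```python
import warnings


def noisy_third_party_call():
    """Hypothetical stand-in for a library function that emits a warning."""
    warnings.warn("this API is deprecated", DeprecationWarning)
    return "result"


def call_quietly(func, *args, **kwargs):
    """Call func with all warnings ignored, then restore the old filters."""
    with warnings.catch_warnings():
        # Filter changes made here are undone when the block exits.
        warnings.simplefilter("ignore")
        return func(*args, **kwargs)
```

Because `catch_warnings` restores the previous filter state on exit, warnings raised by your own code outside the wrapped call are still reported as usual.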

Continue reading »

Mar 21, 2017
 

Apple to Ubuntu

For almost the last 20 years, an Apple laptop of one variety or another has been my main computing device. Imagine my surprise when I finally learned today that Apple keyboards don’t have an Insert key. In almost two decades I have never needed it, but that changed this morning.

While working in my favorite Python editor, Wing IDE by Wingware, some sloppy touch typing resulted in the cursor changing from the blinking vertical line I am used to into a blinking underline. That change was subtle enough that I missed it, but as soon as I began typing and the text I was entering started overwriting the existing code, I knew something was up. WTF!

Continue reading »

Mar 04, 2017
 


If you’ve created any forms at all using the Django web framework, then you should already be familiar with Django’s CSRF middleware and the protection it provides websites against cross-site request forgery attacks. When the middleware is active, and unless the view has this protection overridden, any POSTed form is expected to contain a hidden field named csrfmiddlewaretoken, the value of which is expected to match a similarly named value in a CSRF cookie attached to the user. Because this value is specific to each user and also changes constantly, it is difficult to test the rendered output of pages containing forms against expected output. What follows is the solution I am using in Django 1.10.
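One possible way around the changing token, sketched here as an assumption rather than as the solution the full post describes, is to strip the hidden csrfmiddlewaretoken input from the rendered HTML before comparing it with the expected markup. The helper name `strip_csrf_token` and the regular expression are my own, and the sketch assumes the field is rendered as a single `<input>` tag.

```python
import re

# Matches a single <input ...> tag whose name is csrfmiddlewaretoken,
# whatever its token value and whichever quote style is used.
CSRF_INPUT_RE = re.compile(
    r"<input[^>]*name=['\"]csrfmiddlewaretoken['\"][^>]*>",
    re.IGNORECASE,
)


def strip_csrf_token(html):
    """Return html with any CSRF hidden input removed."""
    return CSRF_INPUT_RE.sub("", html)
```

In a test, both the rendered response body and the expected HTML would be passed through `strip_csrf_token` before the comparison, so that only the stable parts of the page are checked.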

Continue reading »

Feb 15, 2017
 


I know that this post will probably be of interest to about a dozen people worldwide, and even those few may be disappointed by it. Since the official SWI-Prolog packages aren’t often kept up to date and because compiling and installing SWI-Prolog from source should be both quick and straightforward, building from source is the recommended way to install it on Linux and other *nix systems.

If you are looking for tips, tricks, or assistance with an installation problem, you likely won’t find them here. The instructions provided on the SWI-Prolog site for building and installing SWI-Prolog from source code “just worked” for me. Nevertheless, I want to document what I did, and if you are looking for the CliffsNotes version, then by all means, read on.

Continue reading »

Jan 21, 2017
 


I am excited this evening. Why? Because I am finally getting back to some real Python development. While I have recently coded up some GIMP plug-ins, I haven’t really taken the time to properly set up my Python environment since making the switch from OS X to Ubuntu in December. Now I’ve got some Django programming to do, but before I can start installing any third-party packages, I’ll need to install pip, the de facto package management system for installing and managing Python packages. Think of pip as being to Python what apt is to Ubuntu. The main repository for Python software is PyPI, the Python Package Index.

Continue reading »

Jan 04, 2017
 


In my last post, I briefly explained GIMP’s scripting and plug-in system and the two most common ways to program custom extensions and scripts: Script-Fu and Python-Fu. Script-Fu, being the older of the two options, is well documented and more widely used. Not because I love the road less traveled, but because I love Python, I have chosen Python-Fu, a set of Python modules that serve as a wrapper for libgimp, as my platform for extending GIMP.

This post (and any I hopefully follow up with) is meant to track my progress as I learn my way around Python-Fu. Even while writing relatively simple scripts I have encountered gotchas and conflicting information about how to do things. Hopefully some of what I’ve discovered scouring both the web and Python source code will save other developers the time and frustration I’ve already spent.

Continue reading »

Dec 09, 2016
 


For several years now, Adobe Photoshop has been the sole reason that I have continued to run Mac OS X. During that time, I have done the majority of my work in an Ubuntu instance running in a Parallels virtual machine. I’ve finally bitten the bullet and installed Ubuntu as the primary operating system on my MacBook Pro. I couldn’t be more pleased with how the transition has gone, and I regret not doing it earlier.

Continue reading »