Elegant scripts

We’ve all written them. They fall together perfectly and they are readable, but they assume the happy path. The world can be a happy place, but it’s also deeply flawed and imperfect, and for your code to function in such a world, even add value … it must handle the imperfection.


Our nice, worldly example

Our script will accept an ID as an argument. It will find an API token in setup.json and make a request to download a PDF from a remote server. The name of the downloaded file is determined by the server.

It’s simple, but realistically a little messy. But hey - we are programmers, this is what we do, this is what we thrive at, right :) … right :I (thinks of all the vibeco…)

Python Version

Python is the lingua franca of programming. Let’s go!

import sys, json, requests, re
from requests.auth import HTTPBasicAuth

id = int(sys.argv[1])

with open('setup.json') as f:
    setup = json.load(f)

url = f"https://www.example.com/pdf-api?id={id}"
resp = requests.get(url, auth=HTTPBasicAuth(setup['token'], 'x'))

pattern = re.compile(r"filename\*?=[\"']?(.*?)[\"']?(?:;?$)")
content_disp = resp.headers['Content-Disposition']
filename = pattern.search(content_disp).group(1)

with open(filename, 'wb') as f:
    f.write(resp.content)

It’s a perfect little script. Each block of code does one thing, each one is a few lines, and there is really no unneeded structure or boilerplate - I like it.

Rye Version

Since this is a Rye language blog, we will also write this in Rye :)

rye .Args? .load .first :id
Load %setup.rye |context :setup

re: regexp "filename\*?=[f']?(.*?)[']?(?:;?$)"

format id https://www.example.com/pdf-api?id=%d
|Request 'GET "" 
|Basic-auth! setup/token "x"
|Call :resp 
|Header? "Content-Disposition" |Submatch?* re
|file .Create
|Copy* Reader resp

Ok, a little different, but similar.

What are we assuming?

  • script always gets one integer argument
  • setup file exists and has correct content
  • HTTP request never fails
  • Content-Disposition header is always present with a filename
  • We can always create a new file

As Eugene Lewis Fordsworthe would say - that’s a lot of …, assumptions :(
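
To make that list a bit more concrete, here is a small sketch of what each broken assumption looks like in the Python version. The helper failure_for and all the bad inputs are made up for illustration, plain dicts stand in for sys.argv, the setup and the response headers, and the network assumption is left out so the snippet runs offline.

```python
import json
import re


def failure_for(step):
    """Run one happy-path step with bad input and return the exception's name."""
    try:
        step()
    except Exception as e:  # broad on purpose: we only want the name
        return type(e).__name__
    return None


# Hypothetical bad inputs, one per assumption from the list above.
no_argument = failure_for(lambda: ["script.py"][1])         # argv without an ID
not_an_int = failure_for(lambda: int("abc"))                # ID is not an integer
no_setup = failure_for(lambda: open("no-such-setup.json"))  # setup file missing
bad_json = failure_for(lambda: json.loads("{token: 1}"))     # setup is not valid JSON
no_token = failure_for(lambda: {}["token"])                 # token key missing
no_header = failure_for(lambda: {}["Content-Disposition"])  # header missing
# Header present but without a filename: search() returns None.
no_filename = failure_for(lambda: re.search(r"filename=(.*)", "inline").group(1))

print(no_argument, not_an_int, no_setup, bad_json, no_token, no_header, no_filename)
```

Seven different ways to fall off the happy path, and five different exception types, before we even touch the network.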


Adding Basic Validation (Step 2)

I can be like a watered-down version of Eugene:

“User input is the source of many problems”.

No user input, no problems - but we need them users. So let’s validate those inputs.

Python Version

We will now:

  • check the number of arguments
  • check if the ID is an integer
  • check if setup has a token value defined

import sys, json, requests, re
from requests.auth import HTTPBasicAuth

if len(sys.argv) != 2:
    raise ValueError("script argument id - expected exactly one integer")

try:
    id = int(sys.argv[1])
except ValueError:
    raise ValueError("script argument id - must be an integer")

with open('setup.json') as f:
    setup = json.load(f)

if 'token' not in setup or not isinstance(setup['token'], str):
    raise ValueError("loading setup - token field required as string")

url = f"https://www.example.com/pdf-api?id={id}"
resp = requests.get(url, auth=HTTPBasicAuth(setup['token'], 'x'))

pattern = re.compile(r"filename\*?=[\"']?(.*?)[\"']?(?:;?$)")
content_disp = resp.headers['Content-Disposition'] 
filename = pattern.search(content_disp).group(1)

with open(filename, 'wb') as f:
    f.write(resp.content)

We added those few checks, and if you ask me (I am partial here), the elegant, readable script is already gone. That’s one of the reasons I loathe the try/catch approach: it adds structure that disrupts the flow of the code.

Rye Version

rye .Args? .validate { <one> integer } 
|check "script argument id" |first :id

Load %setup.rye |context |validate { token: required string } 
|check "setup file" :setup

re: regexp "filename\*?=[f']?(.*?)[']?(?:;?$)"

format id https://www.example.com/pdf-api?id=%d
|Request 'GET "" 
|Basic-auth! setup/token "x"
|Call :resp 
|Header? "Content-Disposition" |Submatch?* re
|file .Create .defer\ 'Close
|Copy* resp .Reader .defer\ 'Close

We used the validation dialect for the arguments and the config, and .defer\ 'Close to ensure that resources (the file writer and the HTTP stream reader - no copying to memory btw) are cleaned up.

The script got a little more complex, but its structure and flow didn’t change.
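
For comparison, the closest thing I know of in Python’s standard library to .defer\ 'Close is contextlib.ExitStack, and shutil.copyfileobj gives a chunked copy instead of loading the whole body into memory. This is only a sketch of the idea: save_stream is a made-up helper, and an in-memory stream stands in for the HTTP body so it runs offline. With requests, the reader could be resp.raw from a requests.get(url, stream=True) call.

```python
import io
import os
import shutil
import tempfile
from contextlib import ExitStack


def save_stream(reader, path):
    """Copy a file-like reader to path in chunks, closing both when done."""
    with ExitStack() as stack:
        stack.callback(reader.close)     # deferred close of the reader
        out = open(path, "wb")
        stack.callback(out.close)        # deferred close of the file writer
        shutil.copyfileobj(reader, out)  # chunked copy, no full body in memory
    # Leaving the with block runs the callbacks in reverse order,
    # whether the copy succeeded or raised.


# Stand-in for the HTTP response body (an assumption for this demo).
body = io.BytesIO(b"%PDF-1.4 example bytes")
target = os.path.join(tempfile.mkdtemp(), "example.pdf")
save_stream(body, target)

with open(target, "rb") as f:
    saved = f.read()
```

It works, but notice that the cleanup again shows up as extra structure around the copy.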


Full Error Handling (Step 3)

Now let’s handle all the failures and give the user helpful feedback when something fails. Our initially elegant script exploded into this … :o

Python Version

We now also check for:

  • does setup.json exist
  • can we parse the JSON in setup.json
  • did the HTTP request succeed
  • is there a Content-Disposition header (if not, we fall back to a default filename)
  • can we create a new file
  • can we write the PDF to it

import sys, json, requests, re
from requests.auth import HTTPBasicAuth

# Validate arguments
if len(sys.argv) != 2:
    print("Error: script argument id - expected exactly one integer")
    sys.exit(1)
try:
    id = int(sys.argv[1])
except ValueError:
    print("Error: script argument id - must be an integer") 
    sys.exit(1)

# Load and validate config
try:
    with open('setup.json') as f:
        setup = json.load(f)
except (FileNotFoundError, json.JSONDecodeError) as e:
    print(f"Error: couldn't open config - {e}")
    sys.exit(1)

if 'token' not in setup or not isinstance(setup['token'], str):
    print("Error: loading setup - token field required as string")
    sys.exit(1)

pattern = re.compile(r"filename\*?=[\"']?(.*?)[\"']?(?:;?$)")

url = f"https://www.example.com/pdf-api?id={id}"

try:
    resp = requests.get(url, auth=HTTPBasicAuth(setup['token'], 'x'))
    resp.raise_for_status()
except requests.RequestException as e:
    print(f"Error: Http request failed - {e}")
    sys.exit(1)

# Extract filename with default fallback
content_disp = resp.headers.get('Content-Disposition', '')
match = pattern.search(content_disp) if content_disp else None
filename = match.group(1) if match else "default.pdf"

try:
    with open(filename, 'wb') as f:
        f.write(resp.content)
except IOError as e:
    print(f"Error: couldn't create local pdf - {e}")
    sys.exit(1)
except Exception as e:
    print(f"Error: couldn't save contents - {e}")
    sys.exit(1)

What was initially a roughly 15-line script is now about 45 lines of code with lots of structure. The working code is hidden inside all the safety code; it’s almost nowhere to be found.

Python programmers would naturally refactor this with some helper functions and use additional libraries like argparse and a validation library, but that still adds ‘structure’ to our previously clean happy path. It’s better hidden, but it also adds a dependency.
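
A rough sketch of what such a refactoring could look like, staying inside the standard library (argparse plus hand-written checks instead of a validation library). The names parse_id, load_token and ConfigError are invented for this example, and the demo writes its own throwaway setup file so it runs anywhere.

```python
import argparse
import json
import os
import tempfile


class ConfigError(Exception):
    """Hypothetical error type for anything wrong with the setup file."""


def parse_id(argv):
    """Parse the single integer ID; argparse prints usage and exits on bad input."""
    parser = argparse.ArgumentParser(description="Download a PDF by ID")
    parser.add_argument("id", type=int, help="numeric ID of the PDF")
    return parser.parse_args(argv).id


def load_token(path):
    """Return the token from the JSON setup file or raise ConfigError."""
    try:
        with open(path) as f:
            setup = json.load(f)
    except (OSError, json.JSONDecodeError) as e:
        raise ConfigError(f"couldn't open config - {e}") from e
    token = setup.get("token") if isinstance(setup, dict) else None
    if not isinstance(token, str):
        raise ConfigError("loading setup - token field required as string")
    return token


# Usage with made-up values, so the sketch runs without a real setup.json.
demo_path = os.path.join(tempfile.mkdtemp(), "setup.json")
with open(demo_path, "w") as f:
    json.dump({"token": "secret"}, f)

pdf_id = parse_id(["42"])
token = load_token(demo_path)
```

The main script would shrink back to a handful of calls, but the try/except blocks did not go away. They just moved into the helpers.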

Rye Version

We added all the same checks to our Rye version.

rye .Args? .validate { <one> integer }
|^check "script argument id" |first :id

Load %setup.rye |check "couldn't open setup file" |context
|validate { token: required string } |^check "loading setup" :setup

re: regexp "filename\*?=[f']?(.*?)[']?(?:;?$)"

format id https://www.example.com/pdf-api?id=%d
|Request 'GET "" 
|Basic-auth! setup/token "x"
|Call |^check "Http request failed" :resp 
|Header? "Content-Disposition"
|Submatch?* re |fix { "default.pdf" }
|file .Create |^check "couldn't create local pdf" |defer\ 'Close
|Copy* resp .Reader .defer\ 'Close |^check "couldn't save contents"

The code got a little denser, but the line count barely moved, and most importantly, the program structure stayed the same! :O

But … magic

There is no magic in the code above. All the words you see are regular Rye functions. The code above is just the result of many careful design decisions by Rye’s predecessors and by us, plus some fortune.

In fact, we used just a small part of Rye’s failure and validation handling capabilities. It’s hard to jump in directly without first reading more about Rye, but the critical functions above do the following:

  • check - returns the value if it’s not a failure; otherwise wraps the failure in a higher level failure and returns that
  • fix - again returns the value if it’s not a failure; otherwise evaluates a block and returns the result of that

What we used above was a related function. Rye has a concept of returning functions (return itself is a returning function in Rye, since every active word is a function). The naming convention for returning functions is to prefix them with ^.

  • ^check - is like check, but in case of failure it also returns / exits to the caller (with the higher level failure)
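
If you think in Python, the three descriptions above can be imitated roughly like this. To be clear, this is an analogy and not how Rye implements anything: Failure, check, fix, check_return, returning, parse_int and load_id are all names made up for the sketch. Python also has no way for a function to return from its caller, so the ^check stand-in (somewhat ironically) uses an exception internally, caught by a decorator.

```python
import functools
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Failure:
    """A failure as an ordinary value; cause is the wrapped lower-level failure."""
    message: str
    cause: Optional["Failure"] = None


def check(value: Any, message: str) -> Any:
    """Pass a normal value through; wrap a Failure in a higher-level Failure."""
    return Failure(message, cause=value) if isinstance(value, Failure) else value


def fix(value: Any, handler: Callable[[], Any]) -> Any:
    """Pass a normal value through; replace a Failure with handler()'s result."""
    return handler() if isinstance(value, Failure) else value


class _ReturnNow(Exception):
    """Internal signal: carries the value the enclosing function should return."""
    def __init__(self, value: Any) -> None:
        super().__init__()
        self.value = value


def returning(func: Callable[..., Any]) -> Callable[..., Any]:
    """Allow check_return() inside func to end func early with a Failure."""
    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        try:
            return func(*args, **kwargs)
        except _ReturnNow as signal:
            return signal.value
    return wrapper


def check_return(value: Any, message: str) -> Any:
    """Stand-in for ^check: like check, but a failure also leaves the caller."""
    wrapped = check(value, message)
    if isinstance(wrapped, Failure):
        raise _ReturnNow(wrapped)
    return wrapped


def parse_int(text: str) -> Any:
    """Made-up helper that returns a Failure instead of raising."""
    return int(text) if text.isdecimal() else Failure("not an integer")


@returning
def load_id(arg: str) -> Any:
    number = check_return(parse_int(arg), "script argument id")
    return number  # only reached when parse_int gave a normal value


good = load_id("42")
bad = load_id("abc")
name = fix(Failure("no Content-Disposition"), lambda: "default.pdf")
```

Here good is the integer 42, bad is a "script argument id" failure that wraps the "not an integer" one, and name falls back to "default.pdf".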

The point

The above is one of the reasons I dislike the try/catch model, which is the main failure handling model in many languages, not just Python of course.

There are better options out there. We demonstrated one above, but it is just a refined version of what Go does; using a type system and an Option type is another method. What is common to these approaches? Failure is a normal value in your language, and it should blend with the faculties of your language, be a part of it. I intend to write more about this.
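
For readers who haven’t seen the Go style mentioned here, a minimal imitation in Python: a step returns a (value, error) pair and the caller checks the error right away, while an Option-like step just returns None when there is nothing to return. The functions parse_id, pick_filename and run are made up for the sketch, and the header parsing is deliberately naive.

```python
from typing import Optional, Tuple


def parse_id(arg: str) -> Tuple[Optional[int], Optional[str]]:
    """Go style: return (id, None) on success or (None, error message)."""
    try:
        return int(arg), None
    except ValueError:
        return None, "script argument id - must be an integer"


def pick_filename(header: Optional[str]) -> Optional[str]:
    """Option style: return the filename, or None when the header has none."""
    if header and "filename=" in header:
        return header.split("filename=", 1)[1].strip('"; ')
    return None


def run(arg: str, header: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    pdf_id, err = parse_id(arg)
    if err is not None:  # the Python spelling of Go's `if err != nil`
        return None, err
    filename = pick_filename(header) or "default.pdf"
    return f"{pdf_id} -> {filename}", None


result, err = run("7", 'attachment; filename="a.pdf"')
failed, failed_err = run("x", None)
```

In both cases the failure travels through the program as a plain value that the rest of the language can look at, pass around and replace.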