Scheduled Builds in Jenkins Scripted Pipelines

Sometimes you’ll want Jenkins to trigger a build on a recurring schedule; a common example is a build at midnight (or some other time when regular work isn’t happening).  Traditionally in Jenkins this was done on the configuration page for the job in question: you’d find the “Build Periodically” option under Build Triggers and enter a crontab expression to schedule when Jenkins should trigger a build of the job automatically.  With pipelines, though, that option disappears from the UI; instead you have to express it as code in your Jenkinsfile:

properties([pipelineTriggers([cron('H/15 * * * *')])])
Put this near the top of your Jenkinsfile (i.e. before any node declaration) and the job will be built automatically every 15 minutes. It’s worth noting that just committing this change isn’t enough; a build has to run with the change in place before the schedule takes effect. This makes sense: changes to Jenkinsfiles don’t affect jobs until the new Jenkinsfile is “run” by Jenkins, which doesn’t happen until a build is triggered, be that manually, by an automatic trigger from an SCM push, etc.
This is fine and easy, but what if you’re using multibranch pipelines?  You’ll likely then want different schedules depending on the branch in question (for example, it’d seem silly to schedule a recurring build of a feature branch).  As it turns out this really isn’t too bad; you just need to inspect the BRANCH_NAME variable:

def triggers = []

if("$BRANCH_NAME" == 'develop') {
    triggers << cron('H/15 * * * *') // every 15 minutes
} else if("$BRANCH_NAME" == 'master') {
    triggers << cron('H H(0-2) * * *') // daily between midnight & 2 AM
} else {
    // no scheduled build
}

properties (
    [
        pipelineTriggers(triggers)
    ]
)
In this we set up two schedules for the project: if the branch is the develop branch, we build it every 15 minutes.  If the branch is our master branch, we build it every night between midnight and 2 AM. If the branch is neither develop nor master, we don’t schedule any automatic builds. Note that the else block is empty (I could have omitted it entirely), which means that the triggers for the current branch will be cleared. Side note: this is also how you’d delete a previously scheduled build for a branch; just remove the line that initializes the crontab schedule, and the next time the branch is built the schedule will be cleared.

Python Tip of the Day – Logging basicConfig

Oftentimes you just want to try out something related to logging in the REPL, or in a hacky script.  Wading through the docs on the logging module becomes a painful exercise in reading about handlers, formatters, and other stuff you don’t care about.

The simplest way to just get the ability to do logging in the REPL:


>>> import logging
>>> import sys
>>> logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
>>> logging.debug('This is a debug level logging message')
DEBUG:root:This is a debug level logging message

Simple as that.
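
If you want slightly more useful output, basicConfig also takes a format argument. A minimal sketch (in a fresh session, since basicConfig is a no-op once the root logger is configured); the timestamp shown is just illustrative:

>>> import logging
>>> import sys
>>> logging.basicConfig(stream=sys.stdout, level=logging.INFO,
...                     format='%(asctime)s %(levelname)s %(message)s')
>>> logging.info('now with a timestamp')
2017-07-27 14:16:32,123 INFO now with a timestamp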

Python and Microsoft?

Much has been said about how Microsoft has changed in recent years, no longer the super closed-source, monopolistic giant it once was.  Regardless of whether you believe the change is real or just surface-level, permanent or temporary, there are definitely some interesting things that have come up recently.  One is how everyone’s favourite programming language is starting to take a prominent role in many of Microsoft’s offerings.

For example, in 2017 Microsoft announced on the SQL Server Blog that Python 3.5 would be embedded within SQL Server and usable natively in that product.  This is huge for the data science & machine learning communities.  I haven’t yet played with this, but supposedly you can effectively write stored procedures using Python.

Another change illustrating the adoption of Python within the Microsoft ecosystem has been in Visual Studio Code land.  Originally Python support in VS Code came via a (really, really well made) extension written by Don Jayamanne.  Last November Don was hired by Microsoft full-time, and the Python extension is now a fully MS-supported part of VS Code.  Brett Cannon (a Python core contributor who also works at Microsoft) is now the dev lead for the Python extension.  And they’re also actively hiring for “Visual Studio Code / Python!”

Lastly, there’s been a bunch of noise (see here, here, and here) about how Microsoft is now actively seeking input on whether Python could replace VBA as the macro/scripting language in Excel.  It’s still way too early to see what will come of it, but imagine a world where managers who spend their lives in Excel get the power of Python at their fingertips, and what that could do for raising the exposure of the language.

Does all this indicate a trend?  I hope so; I think the adoption of Python in Microsoft products will help boost the popularity of the language even further (and it already had some explosive growth in 2017).  Interesting times all the same.

Why not exact story point estimates?

A while back at a job, a question was raised in a sprint planning meeting about why we don’t do exact story point estimation instead of using a Fibonacci (or Fibonacci-like) scale. We had a scale of 0.25, 0.5, 1, 3, 5, 8, 13, etc. at the place, and people were asking why not “0.75”?

Being a topic I’m quite interested in, I wrote a long email explaining my thoughts on the subject, and now the same conversation has come up again at my current job. As such I thought it might be useful to share that email with the world. The following is that email, slightly reworded to be a bit more blog-entry friendly. So “why don’t you just use exact story point estimates”?

I’ll reflect the question back: “Why do we use story points at all? Why not just estimate in hours?” This is a really good question: why don’t we just say “4 hours” instead of “0.5 SP”? If we did that, then we’d no longer have this funny problem of a ticket that’s really 30 minutes having an equivalent estimate to something that takes 2 hours. So if we did that, we’d be better off, as we could then start estimating more precisely. Instead of 4 tickets at 0.25 SP each totalling 1 SP, which means 1 day, we might have 15 minutes + 1 hour + 2 hours + 45 minutes == 4 hours, so if we go that route we could pack even more stuff into a sprint, and get even more done. Right?

Well, not so much. The first problem with time estimates is that they make a fundamentally invalid assumption: that all developers are created equal. A senior dev & a junior dev might agree entirely on what exactly needs to be done to complete a task, but I can pretty much guarantee that it’ll take the senior dev less time than the junior (regardless of task). So if we estimate in time units we suddenly have the problem of whether to estimate to the level of the really experienced person, or to the junior. A Fibonacci-style sequence of “bucket sizes” (like 0.25, 0.5, 1, 2, 5, etc.) helps with this: if it takes the junior 9 hours and a senior 5, the SP estimate will likely be the same, namely 1. (Mike Cohn, the Scrum Alliance guy, has a few blog posts on story points to this effect, see this and this.)

So that’s one problem with exact time estimates, but there’s a bigger one: behavioural psychology. There’s actually a lot of research in behavioural psych showing that while people are really good at *relative* estimating, they are terrible at *exact* estimating, particularly in a domain with a lot of uncertainty or a lack of repetition (like software development, where you’re often asked to do things or work with technologies you’ve never done or used before).

I’ll give you an analogy (the same example a former scrum master of mine used with me): say I pointed at two buildings, one 50 ft tall and one 97 ft tall, and asked “about how much bigger is the second one than the first?” You’d look at them, and even with no knowledge of carpentry, architecture, civil engineering, etc., you’d probably be able to say with a high degree of confidence that the second is about twice as big, and you’d be pretty darn close. Now let’s say I asked “how many feet taller is the second building compared to the first?” Maybe if you’re really experienced at knowing the heights of buildings, you’d come up with a number close to the actual difference, but I sure wouldn’t; in fact I’d probably get hung up on being “perfect” with the estimate and end up spending a disproportionate amount of time “estimating”. That’s relative sizing vs exact sizing, and we’re wired such that we’re good at the former, but not so much at the latter.

Now let’s take the story even further. Let’s say I point at two buildings, one 50 ft tall and the other 600 ft tall (12 times bigger), and ask you the same relative sizing question. Suddenly, because the magnitude of the difference is so high, the relative estimating becomes more difficult, but you might say “10 times bigger” and you’d not be far off. But the real question: does it matter (when the difference is so great) that you’re off slightly with that estimate? Probably not, and that’s why SP scales are usually Fibonacci-like: past 5 the numbers get bigger faster, because the relative differences are so huge it’s not meaningful to be more precise.

I used to have links to a bunch of research papers talking about this, but have unfortunately since lost them. It’s actually really interesting stuff.

In any case, pulling it back: that’s why we don’t do 0.75 SP, because one of the big points of story point estimating is to do relative estimating. Once we start splitting hairs there really is no point in doing SP estimates at all; we might as well estimate everything in hours (which is problematic, for the reasons above).

Python Tip of the Day – subTest!

Coming from a JUnit background, one of the things I always missed with the vanilla Python unittest library was parameterized tests.  Oftentimes when writing unit tests for a particular unit you find yourself writing effectively the same test over and over again, but with different inputs. Wouldn’t it be nice if we could write the test once and somehow parameterize it with different inputs? Yes. Yes it would.

Py.test supports this, but what if you really like Nose or some other test runner? Well, as of Python 3.4 the standard unittest library has something very similar: subTests.

The idea is you can write a loop over a set of inputs, and within that loop define a test within a with self.subTest() context. Each iteration will test the given input and failures for each are counted as separate test failures. Really handy for cutting down on unit test boilerplate code.
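
A minimal (contrived) sketch of the idea; the test class and inputs here are made up for illustration:

import unittest

class TestIsEven(unittest.TestCase):
    def test_even_numbers(self):
        for value in [2, 4, 6, 7, 10]:
            # each iteration runs as its own subtest; the failure for 7
            # is reported separately without aborting the loop
            with self.subTest(value=value):
                self.assertEqual(value % 2, 0)

if __name__ == '__main__':
    unittest.main()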

Serverless Microservices and Python (with tests!) – Part 2

Ok, so in part 1 of this series, I started off by exploring the use of Lambda and API Gateway as tools for building scalable microservices in Python. I largely focussed on taking an existing tutorial and building out some unit tests for it, as well as some supplementary scripts to make bundling stuff up for delivery to Lambda easier.

In this entry, I’m going to explore adding a new requirement to the existing project — supporting bcrypt as a digest.

So to begin with, since I’m a big TDD fan, I’m going to do this by first adding a test, then making the test green, then refactoring. If you want to see the code as it was at this point, I tagged the commit on GitHub.

So first things first, let’s start with a (failing) test (leaving out the rest of the test file for brevity):

SAMPLE_BCRYPT_HASH = '$2b$12$44roRI0Ftbbvoy6V1YQebOKeO7a7WhzRvv.X194BMxykDT0nQGcS2'
...
    def test_valid_bcrypt_hash_with_matching_password_returns_true(self):
        event = _build_event('bcrypt', SAMPLE_BCRYPT_HASH, SAMPLE_PASSWORD)
        expected = True

        result = lambda_handler(event, None)

        self.assertEqual(expected, result)

Run it, and yup, it’s red. So let’s make it green by modifying the lambda_handler function:

def lambda_handler(event, context):
    digest = event['digest']
    hash_pass = event['hash_pass']
    password = event['password']

    if digest == "bcrypt":
        return True

    ... rest of function is the same ...

Wait, what? Always return True when the digest is bcrypt? Yup, this is the TDD way: write the simplest code possible to make the test green & then revise. We don’t yet have a test that says that when the password doesn’t match the hash and the digest is bcrypt we should return False, so let’s add one:

    def test_valid_bcrypt_hash_with_wrong_password_returns_false(self):
        event = _build_event('bcrypt', SAMPLE_BCRYPT_HASH, 'this is not the password')
        expected = False

        result = lambda_handler(event, None)

        self.assertEqual(expected, result)

Now we need to revise lambda_handler to handle both cases with bcrypt. Some may feel this is silly, but it’s the heart of TDD: taking the smallest possible steps keeps the code concise and ensures you have tests for the cases you think of. If we had gone ahead and written the “real” solution for bcrypt (seen below) right away, we’d only have half the tests for bcrypt. If we added the False test after the fact, it would have been green the moment we finished writing it, and that means we’d have an unverified test (in this toy example it’s silly to be this pedantic, but take my word for it: if you’ve never seen a test fail when you expect it to, it’s not a valid test).

So anyways, silly pedantic example aside, let’s go ahead and solve it for real:

from passlib.hash import pbkdf2_sha256, pbkdf2_sha512, pbkdf2_sha1, bcrypt

def lambda_handler(event, context):
    digest = event['digest']
    hash_pass = event['hash_pass']
    password = event['password']

    if digest == "bcrypt":
        verification = bcrypt.verify(password, hash_pass)        

    ... rest of function is the same ...

And now you run the tests and they’re gree..err…I mean red. WTF?

MissingBackendError: bcrypt: no backends available -- recommend you install one (e.g. 'pip install bcrypt')

Oh yeah, we need bcrypt installed. No biggie: just add bcrypt to our requirements.txt file, pip install -r requirements.txt into our development venv, and voilà, we’re good.
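
(For reference, the requirements.txt at this point would just list the two dependencies; versions are left unpinned here, though in a real project you’d likely pin them:)

passlib
bcrypt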

Sweet, now we have bcrypt support, and the tests are green. Now we can refactor things a bit for simplicity. Look at our lambda_handler function: there’s a big nasty if/else block that’s kinda icky:

def lambda_handler(event, context):
    digest = event['digest']
    hash_pass = event['hash_pass']
    password = event['password']

    if digest == "sha256":
        verification = pbkdf2_sha256.verify(password, hash_pass)
    elif digest == "sha512":
        verification = pbkdf2_sha512.verify(password, hash_pass)
    elif digest == "bcrypt":
        verification = bcrypt.verify(password, hash_pass)
    else:
        verification = pbkdf2_sha1.verify(password, hash_pass)
    return verification

Let’s simplify by creating a mapping of strings to functions:

from passlib.hash import pbkdf2_sha256, pbkdf2_sha512, pbkdf2_sha1, bcrypt

HASH_MAPPINGS = {
    "sha256": pbkdf2_sha256,
    "sha512": pbkdf2_sha512,
    "bcrypt": bcrypt,
    "sha1": pbkdf2_sha1,
}

DEFAULT_HASH = pbkdf2_sha1

def lambda_handler(event, context):
    digest = event['digest']
    hash_pass = event['hash_pass']
    password = event['password']
    hash_fn = HASH_MAPPINGS.get(digest, DEFAULT_HASH)
    return hash_fn.verify(password, hash_pass)

Much shorter. Now to add new digests we simply add a new entry to HASH_MAPPINGS. One thing is bothering me though: right now if you fail to specify hash_pass as an arg, the lambda function blows up with a KeyError. This is again hitting that “what’s the requirement?” issue, but I felt that instead of a 500 server error on Lambda you should just get a response of False (no password matches an unspecified hash). Unit test:

    def test_unspecified_hash_pass_returns_false(self):
        event = _build_event('bcrypt', SAMPLE_BCRYPT_HASH, 'password')
        del event['hash_pass']
        expected = False

        result = lambda_handler(event, None)

        self.assertEqual(expected, result)

And (after verifying this was red), making it green:

def lambda_handler(event, context):
    digest = event['digest']
    hash_pass = event.get('hash_pass')
    password = event['password']
    if not hash_pass:
        return False

    ... rest of function is the same ...

Similarly, we already specified that an invalid digest falls back to SHA1, so let’s make the digest key in the event dict completely optional. First the test:

    def test_unspecified_digest_uses_sha1(self):
        event = _build_event('does not matter', SAMPLE_SHA1_HASH, SAMPLE_PASSWORD)
        del event['digest']
        expected = True

        result = lambda_handler(event, None)

        self.assertEqual(expected, result)

And the change to make it green:

def lambda_handler(event, context):
    digest = event.get('digest')
    hash_pass = event.get('hash_pass')
    password = event['password']
    if not hash_pass:
        return False
    # a missing or unrecognized digest falls back to the SHA1 handler
    hash_fn = HASH_MAPPINGS.get(digest, DEFAULT_HASH)
    return hash_fn.verify(password, hash_pass)

password is still a required arg and results in a 500 server error, but we’ll revisit that one later. We’ve made some real progress: refactored the code to be much more versatile & concise, added an entire new digest, and validated all this behaviour locally. Now it’s time to throw it all at Lambda. Run build.sh, upload everything to Lambda, and uh-oh:

{
    "stackTrace": [
        [
            "/var/task/index.py",
            23,
            "lambda_handler",
            "return hash_fn.verify(password, hash_pass)"
        ],
        [
            "/var/task/passlib/utils/handlers.py",
            761,
            "verify",
            "return consteq(self._calc_checksum(secret), chk)"
        ],
        [
            "/var/task/passlib/handlers/bcrypt.py",
            530,
            "_calc_checksum",
            "self._stub_requires_backend()"
        ],
        [
            "/var/task/passlib/utils/handlers.py",
            2221,
            "_stub_requires_backend",
            "cls.set_backend()"
        ],
        [
            "/var/task/passlib/utils/handlers.py",
            2143,
            "set_backend",
            "raise default_error"
        ]
    ],
    "errorType": "MissingBackendError",
    "errorMessage": "bcrypt: no backends available -- recommend you install one (e.g. 'pip install bcrypt')"
}

This is the stacktrace you get. What’s up? I thought we included bcrypt in the zip file? Unzipping the zip file and verifying the contents, we see that it was included. But, and this is a gotcha with Lambda, bcrypt has some compiled native dependencies; it’s not pure Python. I’m developing on a MacBook running OS X El Capitan, which is a much different environment from Amazon Linux (which is what a Lambda container runs on).

So, this is where it gets interesting. I started off doing some googling, and found lambda-packages: https://github.com/Miserlou/lambda-packages, a collection of common Python libraries with compiled dependencies, prebuilt for Amazon Linux. Theoretically you should be able to specify that as a dependency in your requirements.txt, build it, and be good to go. So I tried this, and lo and behold, now my zip file is larger than the 50 MB limit for uploading through the Lambda web interface. Throwing a zip file into an S3 bucket is simple enough, so I did that, then saved my Lambda function and tried again.

And got the same MissingBackendError. Yup, dependency hell.

So I dropped this approach. Even if it had worked, it would make your dev environment and your prod environment a little different (in dev I’d still be dependent upon bcrypt, in prod upon lambda-packages), which is a smell.

Supposedly you can spin up an EC2 instance based on the Amazon Linux AMI and do your bundling for Lambda there, but that’s far from convenient (you need to spin up an EC2 instance, get your repo there, do the whole build, then get the zip file from that instance to wherever it needs to be). Alternatively, there’s a Docker image out there that mimics the Amazon Linux image Lambda uses, so you could (locally) run a container from that image and do the same thing (pip install, bundle it into a zip, etc.). But this is really getting into a world I don’t want to go into (at least not for now), so I did some more googling and found that passlib actually supports five different bcrypt implementations (or “backends”):

  • bcrypt, if installed.
  • py-bcrypt, if installed.
  • bcryptor, if installed.
  • stdlib’s crypt.crypt(), if the host OS supports BCrypt (primarily BSD-derived systems).
  • A pure-python implementation of BCrypt, built into Passlib.

And that last one is disabled by default as it’s just too damn slow. For now though, we just want something that works and is easy (we’ll optimize later), so let’s enable that backend. This is done by setting the environment variable PASSLIB_BUILTIN_BCRYPT=”enabled” wherever you’re running passlib. With Lambda, setting env variables is easy; you can do this in the web interface:

[Screenshot: adding the PASSLIB_BUILTIN_BCRYPT environment variable in the Lambda web console]
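
(For local testing outside Lambda, a minimal sketch of the same idea, assuming the variable is set before passlib first selects a bcrypt backend:)

import os

# must be set before passlib picks a bcrypt backend (i.e. before first use)
os.environ["PASSLIB_BUILTIN_BCRYPT"] = "enabled"

from passlib.hash import bcrypt

# painfully slow with the pure-Python backend, but it works
print(bcrypt.verify("password", bcrypt.hash("password")))  # True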

Doing this, I no longer got a MissingBackendError, but now there was a new problem:

{
    "errorMessage": "2017-07-27T21:17:09.542Z f0af983b-7310-11e7-8079-97327f3cc568 Task timed out after 3.00 seconds"
}

Yup, apparently that pure-Python version is in fact just way too slow. You can extend the timeout value for a Lambda function on the Configuration tab under the advanced settings:

[Screenshot: increasing the Lambda function timeout on the Configuration tab]

It’s worth noting this can increase your costs with Lambda, as pricing is based on execution time.  With that change in place (50 seconds is crazy, but I was just trying to get it to work), I got a new error, this time from API Gateway:

{
    "message": "Endpoint request timed out"
}

This came after running for about 30 seconds. I assumed this was API Gateway’s timeout, and this page confirmed it. Unfortunately it’s not possible to change this one either.

So back to the drawing board….

In part 3 I’ll continue from here, looking into building the compiled dependencies on an Amazon Linux based box.

Serverless Microservices and Python (with tests!) – Part 1

So I’m currently on holiday and also between jobs (I had my last day at the old job last week, and my first day at the new gig is next week), which of course means I’m spending some time learning tech that’s fun & buzzwordy.

Right now it seems like you can’t listen to a tech podcast without hearing “microservices” or “serverless”, especially if you listen to anything with a devops bias. So why not explore both? I’ve always wanted to learn a bit more about AWS Lambda, and in particular the combination of Lambda with AWS API Gateway to create little microservices that are supremely scalable without the headache of server maintenance. I did some googling and stumbled across this tutorial, which seemed like exactly what I was looking for.

So, I worked through the tutorial, and minor hiccups aside, got a simple little password verification microservice up and running in almost no time at all.  Sweet.

Ok, so for me, when I do tutorials like this, I find I need to build on or extend the exercise to help reinforce what I’ve learned. Aside from that, one of the questions I had about Lambda projects was how testing works. Do you still do unit testing like you would with a regular Python project? Any differences?

So, let’s take this example and enhance it with a new requirement — support Bcrypt as a digest.

Now, there’s a problem (ok, this is contrived, work with me here): normally before you start adding new functionality you want to ensure you have a decent set of automated tests to ensure that you don’t break existing behaviour. So, step 1: let’s add some unit tests that enforce the existing requirements we have in our little Lambda function. I saw these as:

  • supports three digests: SHA1, SHA256, and SHA512.
  • when given a valid hash for a digest, and the plaintext password that hash was based upon, return True
  • when given a valid hash for a digest, and a random string (that doesn’t match the hash), return False

Simple enough, so let’s get cracking. The first thing I did was “project-ize” this code, so that it’s more than a random Python file. This consisted of creating a requirements.txt file to list the dependencies the project uses (currently only passlib), and moving it into a project in my IDE of choice. I like to use PyCharm as my dev environment, so I fired up PyCharm and created a new project based upon a venv created from the requirements.txt file. Next I did a bit of restructuring, moving the source file into a directory called src and creating a sibling directory called test. I like to structure my Python projects this way, but really this is arbitrary personal convention more than anything.

With all that in place, I added index_test.py (mirroring the index.py name that was created in the tutorial) and started backfilling some tests. Note that since lambda_handler is just a plain old Python function, unit testing it is actually completely straightforward. A first stab:

import unittest

from index import lambda_handler

class TestLambdaHandler(unittest.TestCase):
    def test_valid_sha256_hash_with_matching_password_returns_true(self):
        event = {
            "digest": "sha256",
            "hash_pass": "$pbkdf2-sha256$29000$.L93bg0BwFiLEaL0fm8NIQ$yYmxiSuP9pXXbrO4cT6CkE1QaNKpt8PjugrgvOBfcRY",
            "password": "password"
        }
        expected = True

        result = lambda_handler(event, None)

        self.assertEqual(expected, result)

    def test_valid_sha256_hash_with_wrong_password_returns_false(self):
        event = {
            "digest": "sha256",
            "hash_pass": "$pbkdf2-sha256$29000$.L93bg0BwFiLEaL0fm8NIQ$yYmxiSuP9pXXbrO4cT6CkE1QaNKpt8PjugrgvOBfcRY",
            "password": "this is not the password"
        }
        expected = False

        result = lambda_handler(event, None)

        self.assertEqual(expected, result)

Again, all straightforward stuff. My style of test writing is to follow the Arrange, Act, Assert pattern, as I find it helps with readability. In terms of running them, I personally just ran these with the default test runner from within PyCharm, but there’s nothing magical here, so you could just as easily run them with your favourite runner (be it Nose, py.test, or whatever).

As is usually the case with writing tests, you start to find duplication and simplify. In both of these the event declaration is a bit verbose, so let’s break it out into a helper, and add some tests for other digests:

import unittest

from index import lambda_handler

SAMPLE_PASSWORD = 'password'
SAMPLE_SHA512_HASH = '$pbkdf2-sha512$25000$ltLae69VihFirDVGSOmdUw$pcLVv3Vnm3XRx9aHNUgI1FQaF8.UmKHBYt.Hs2EI7at/V80kbsb2P1A2t9akjNom8ZUgVJ4AcbA5vk/7QTgEJQ'
SAMPLE_SHA256_HASH = '$pbkdf2-sha256$29000$.L93bg0BwFiLEaL0fm8NIQ$yYmxiSuP9pXXbrO4cT6CkE1QaNKpt8PjugrgvOBfcRY'

class TestLambdaHandler(unittest.TestCase):
    def test_valid_sha256_hash_with_matching_password_returns_true(self):
        event = _build_event('sha256', SAMPLE_SHA256_HASH, SAMPLE_PASSWORD)
        expected = True

        result = lambda_handler(event, None)

        self.assertEqual(expected, result)

    def test_valid_sha256_hash_with_wrong_password_returns_false(self):
        event = _build_event('sha256', SAMPLE_SHA256_HASH, 'this is not the password')
        expected = False

        result = lambda_handler(event, None)

        self.assertEqual(expected, result)

    def test_valid_sha512_hash_with_matching_password_returns_true(self):
        event = _build_event('sha512', SAMPLE_SHA512_HASH, SAMPLE_PASSWORD)
        expected = True

        result = lambda_handler(event, None)

        self.assertEqual(expected, result)

    def test_valid_sha512_hash_with_wrong_password_returns_false(self):
        event = _build_event('sha512', SAMPLE_SHA512_HASH, 'this is not the password')
        expected = False

        result = lambda_handler(event, None)

        self.assertEqual(expected, result)

def _build_event(digest, hash_pass, password):
    return {
        "digest": digest,
        "hash_pass": hash_pass,
        "password": password,
    }

Astute readers will recognize that this is a classic example of tests which lend themselves to py.test’s parameterized tests.  I leave the full conversion as an exercise for the reader 🙂 (though a sketch of the idea follows).
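
(A rough, hypothetical sketch of that conversion; it assumes the SAMPLE_* constants and the _build_event helper from the unittest version above are in scope:)

import pytest

from index import lambda_handler

@pytest.mark.parametrize("digest, hash_pass, password, expected", [
    ("sha256", SAMPLE_SHA256_HASH, SAMPLE_PASSWORD, True),
    ("sha256", SAMPLE_SHA256_HASH, "this is not the password", False),
    ("sha512", SAMPLE_SHA512_HASH, SAMPLE_PASSWORD, True),
    ("sha512", SAMPLE_SHA512_HASH, "this is not the password", False),
])
def test_lambda_handler(digest, hash_pass, password, expected):
    event = _build_event(digest, hash_pass, password)

    result = lambda_handler(event, None)

    assert result == expected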

Continuing along, you reach a point where you start to observe behaviour that’s implicit in the code as it exists today, but where it’s unclear whether it’s required or just an accident. For example: currently if you give an arbitrary string as the digest, it uses SHA1. Is that required, or just an accident of implementation? Recall though that at this point our goal is just to backfill tests to capture current behaviour. That is, we’re writing characterization tests, so I chose to add a test to enforce that behaviour:

    def test_default_hash_is_sha1(self):
        event = _build_event(None, SAMPLE_SHA1_HASH, SAMPLE_PASSWORD)
        expected = True

        result = lambda_handler(event, None)

        self.assertEqual(expected, result)

Ok, so now we have our tests which enforce current behaviour, a nice project structure, and at this point this is all plain old normal Python development, nothing about Lambda here. At this point you could follow the same steps in the tutorial to bundle it all up into a zip file, upload it to Lambda, and you’re good.

But I like automating some of the build stuff, so I wrote a simple little Bash script to generate the zip file, and called it build.sh:

#!/bin/sh

# start from a clean build dir so the script can be re-run
rm -rf BUILD lambda.zip
mkdir BUILD
cp -r src/* BUILD/
cp requirements.txt BUILD/
cd BUILD
../install_deps.sh
rm requirements.txt
zip -r lambda.zip *
mv lambda.zip ..

Note that this also leaves the tests out of the bundle sent to Lambda, as A) there’s no reason for them to live there, and B) having them in the zip bloats the zip file slightly. install_deps.sh looks like:

#!/bin/sh

pip install -r requirements.txt -t .

I could’ve just put the pip install line into build.sh, but I had a feeling that installing requirements might get a bit tricky when bundling something up for Lambda, so I broke it out into a separate script.

Now you can just run build.sh from the project dir and lambda.zip gets created, ready for upload to Lambda. It’d be nice to enhance the script to upload the file to an S3 bucket & tell Lambda to look at that bucket, but that’s future work; this is good enough for now.
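
(If you did want to go down that road, here’s a rough sketch using boto3; the bucket and function names are made-up placeholders, and it assumes your AWS credentials are already configured:)

# upload_lambda.py -- hypothetical deploy helper
import boto3

BUCKET = "my-deploy-bucket"           # placeholder bucket name
FUNCTION_NAME = "password-validator"  # placeholder Lambda function name

# push the bundle to S3, then point the function at it
boto3.client("s3").upload_file("lambda.zip", BUCKET, "lambda.zip")
boto3.client("lambda").update_function_code(
    FunctionName=FUNCTION_NAME,
    S3Bucket=BUCKET,
    S3Key="lambda.zip",
)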

For me this was an interesting exercise, as it was a bit of an epiphany moment to realize that a Lambda handler is just a plain Python function, so there’s no real magic in unit testing it. In my next blog entry I’ll pick up from here and add bcrypt as a supported digest using TDD, working through the hiccups discovered.  All the code I wrote is also on GitHub: https://github.com/pzelnip/lambda-password-service

The 2017 Vancouver Polyglot Unconference

This year, as in many years past, I was fortunate enough to attend the annual Vancouver Polyglot Unconference. For those unaware, this event (now in its 6th year!) is a chance for technicians, programmers, engineers, and others working in the tech industry in Vancouver & surrounding areas to get together “for a day of spontaneous sharing, teaching and learning”.

This year’s event, like others in the past, was a great opportunity to learn from and share with others struggling with the challenges of modern software development. I thought I’d write a bit about some of the highlights for me from the event.

General Themes In Pitches

Being an unconference, the format of the event is to have attendees “pitch” ideas for discussions at the start of the day; people then vote on the topics they’d be interested in seeing, and the organizers facilitate those discussions by scheduling them into particular rooms, etc. I often find it interesting to listen to the pitches to see what commonalities there are among different folks working in the industry.

This year I was quite surprised to see the “human” side of development come up in many of the suggested topics. Many pitched talks related to hiring & interviewing, team effectiveness, mentoring & training juniors, progressing to becoming a senior developer, diversity, and how software development is becoming increasingly complex. From a tech standpoint, I heard React mentioned a lot, as were microservices and container orchestration technologies (Kubernetes, Mesos, etc).

Something I was particularly taken aback by was the sheer variety and breadth of topics suggested. This is generally true at this event, but the lineup seemed even more diverse this year than in prior years.

Particular Discussions

JS State of the Union

Chris Nicola kicked off the first session I attended with the JS “State of the Union” discussion which has happened in prior years at the unconference. Unsurprisingly React was a technology mentioned a fair bit in this session, as was Vue.js.

I’m not a front-end guy, so this was definitely not my forte, but the theme I took away from this session was the continued explosion in the sheer number of JS frameworks out there. I didn’t stick around for the entire session, instead following the law of two feet to switch to….

SOLID is wrong

This session (pitched by Anthony Tsui) was interesting and rather lively. The context: earlier this year Dan North (of BDD Fame) did a talk & slide deck on why SOLID principles are wrong. Rather a controversial stance given how many “classic” well known software developers (Uncle Bob Martin in particular) have long argued how SOLID principles are a key development design practice. The slide deck from Dan North is at: https://speakerdeck.com/tastapod/why-every-element-of-solid-is-wrong

The session itself featured some lively debate around the arguments made by Dan. A theme I walked away with was the classic argument of expediency vs resiliency, i.e. do I build for right now or design for an unknown & unpredictable future.

Training Juniors

For me, this session was the highlight of the day. Saem (unfortunately I do not know his last name) facilitated the session by first doing a presentation (with slides) outlining some of the research-grounded lessons & techniques he’s adopted when mentoring and managing junior developers. Lots of fascinating discussion around active recall, modes of thinking, how to optimize learning, the importance of clear, well-written problem statements, techniques for helping juniors both recognize when they’re stuck and get unstuck, and how to set clear expectations & check-ins around those. Really fascinating stuff, and I found myself (as someone who’s had to manage a few co-ops) finding parallels between moments I’ve experienced and ideas mentioned. I plan on adopting some of the suggestions in my work with the junior/co-op developers I manage.

Steps to Be A Senior Developer

This session was interesting as well. I think unfortunately the original intent of the session (“I’m not a senior but I want to be one, how do I get there?”) got a bit sidetracked. Much of the discussion ended up around how to get hired as a senior dev, and less around how to progress to be a senior dev. I’m not sure I walked away with many clear ideas that expanded upon what I already think makes a senior developer a senior (ability to be self-reliant, resiliency, maturity, ability to mentor, etc).

Complexity of Modern Software Development

This session was pitched as “I’m going to convince you that software development complexity is getting out of control”, and (perhaps unsurprisingly) the discussion ended up revolving around sources of complexity in modern development. Some of the topics discussed were the “ooh shiny” syndrome vs adopting technology based on need, the distinction between inherent vs accidental complexity, solving problems at the wrong level of abstraction, and the sheer explosion of choices we have around competing technologies as itself a source of complexity.

An interesting analogy I heard during the session, one I hadn’t thought of before but which seems quite apt, was the idea of open source software as an externality (in the economics sense) and the implications that carries. Interesting stuff.

Stupid Questions about diversity

Last session of the day for me was one facilitated by Holly Burton who created a space for people to ask “stupid questions” around diversity. This was truly interesting, lots of discussion around stats & research around diversity. Much of the discussion ended up focussed on gender diversity (i.e. male/female equality).

Some of the eye-opening moments of the session included discussion around how there’s a “PTSD effect” happening around “bro culture” at tech firms and the implications around how you present your company in things like job postings. For example (and this had never occurred to me before), but advertising things like “we play ping pong”, or “we have nerf battles all the time”, or “beer fridays!” can and often do turn female developers off of applying to a firm.

Another gender difference I hadn’t considered: the importance of being clear about what is actually required in a job posting. Women are far more likely to self-select out of applying for a position when they don’t exactly meet the stated requirements, so posting “wish lists” rather than real requirements tends to mean many women who are qualified for the job never apply. Really interesting stuff that I’m hoping I’ll be able to apply in postings my current employer produces.

Meta Thoughts

All in all, Polyglot was a great event again; I find it like a one-day compressed window into the pulse of modern software development. For me personally, it’s useful to go to the event simply to help keep current.

Some of the themes that stood out to me were the challenges with hiring & career progression (as someone who is very interested in the human side of development I really liked seeing this), as well as the increasing complexity of software development. Things are getting harder, which almost seems counterintuitive given that we as developers have more, and better, tools at our disposal than ever.

In any case, I can’t wait until next year when I go to the event again. Kudos to the organizers for putting on such a great event year after year.

Resumes, my take

So a blog I follow had a recent post about resumes for devs, particularly for junior devs or recent grads. It ended with the open question:

Readers, do you have any advice for students or anyone who doesn’t have years of dev experience to put on their resumes?

I started to write a reply, and then it ballooned into a blog post of my own, so here’s my unsolicited advice on the topic.

First, a disclaimer: while I’ve read a fair number of resumes & interviewed quite a few candidates at places I’ve worked, at the end of the day I’m a dev not a hiring manager, so all this should be taken with a grain of salt.

So what advice would I give to those looking to spice up their resumes? Well, first, not so much resume advice as general job seeking advice: do your best not to settle.  Devs especially right now (even junior devs) can afford to be a bit selective about where they apply.  Read the job post & do a bit of research into the company.  Do you know anyone who works there, or has worked there, who you could ask about what the place is like? Read some reviews of the place on sites like Glassdoor. After doing that, honestly ask yourself: does it sound like a place you’d be interested in working at?  Will it help you achieve your career goals?

Don’t just apply to a place because “you never know it might work out”, target your applications to places & positions that align with your values and interests. A former manager once said to me “the problem with applying to a job is that they might offer it to me, so I better be sure that I’d like to work there before applying”, and I think that’s sage advice. Having said that, I recognize this can be challenging when you’re desperate for cash, or find yourself in a situation where you need to be employed ASAP, but even in those cases doing some due diligence can help you prioritize which places to apply to first, and how much effort to put into each application.

Ok, but what about resumes?  Well, to begin with: structure your resume to highlight your strengths first. If your strengths are your previous job experiences, put those front and center. If your strengths are your projects (which might be more the case for undergrads or recent grads), make those the focus and put them at or near the top. Same for education. Prioritize the stuff that makes you look good; don’t get hung up on traditional resume structures (i.e. “oh, there has to be objectives first, then work experience, then education, etc.”).  Make the “wow” stuff about you come first, as that’ll encourage someone to keep reading.

Second, when you write your bullet points for your experiences, read them back to yourself and ask the question “so what?”. This can be a useful exercise in making sure the “why it’s important/valuable” is clearly communicated. Here’s a bad example:

upgraded company to new code review system

As a hiring manager I’ll read that and go “so what?” and likely toss the resume aside. OTOH, if it read:

upgraded company to new code review system resulting in a $10,000 savings in licensing costs

Well, that’s the kind of thing that will get you in for an interview. That’s a rather extreme example, but try to put as many quantifiable items alongside your experiences as you can (examples might include the number of bugs resolved, or how much you increased test coverage). This advice is particularly important when applying to a place where tech resumes are first filtered by non-devs. It can also be challenging for devs, as often the things we know are important/useful are things that only devs (or technically savvy people) recognize as important/useful.

Next, the one you’ve probably heard a million times: take the time to tailor your resume to the job you’re applying for. If the job posting makes it clear that the company cares about AWS experience, then make sure you highlight your experience with AWS (as well as you can; obviously be honest and don’t misrepresent your abilities). This is true as well for an objectives section: make sure your objectives statement aligns well with the job post you’re applying for.  This, more than anything, makes the difference in my experience.

One more thing: get a copy of the book What Color is Your Parachute? and read the chapter on resumes. Then when you’re done and you’ve written your resume, read the rest of the book as it’s full of really useful advice on things like interviewing, salary negotiation, etc.

Lastly, remember that looking for work is work. Don’t spend days writing a single resume for a single job, but definitely recognize that it’s going to take some effort. It sucks, too, because job hunting is one of those things where you can do everything to the best of your ability and still not get in for an interview, or write a crap resume but luck out & happen to catch a hiring manager on a good day, resulting in an interview.

Good luck!