Advent of Code F# – Day 7

ps. look out for all my other solutions for Advent of Code challenges here.

 

Sorry for the delay in getting this post out today, I enjoyed some really fine Ramen with some of the lovely people I worked with at Yubl and only managed to find time to do the challenge and write this post late in the evening.

(ps. here are some of the best people I have ever had the pleasure to work with )

team_ramen

anyways, let’s get back to the main topic…

 

Day 7

See details of the challenge here.

The input for today’s challenge looks like this:

xdsqxnovprgovwzkus[fmadbfsbqwzzrzrgdg]aeqornszgvbizdm
itgslvpxoqqakli[arktzcssgkxktejbno]wsgkbwwtbmfnddt[zblrboqsvezcgfmfvcz]iwyhyatqetsreeyhh
pyxuijrepsmyiacl[rskpebsqdfctoqg]hbwageeiufvcmuk[wfvdhxyzmfgmcphpfnc]aotmbcnntmdltjxuusn
mfhczaevladdsqawgp[rwabwdnwiytloldf]varesbnjnsdbsmhmsi[tyjtbpzrbfzbwlga]sznkksuymkbyxlykfqg[fyislgfghcbltaft]knrkzaldhauordwfl
piftqfdhtumcmjmsge[qrsntvxhtfurcgcynx]oyswvuklvtmivlhen[syqhqtijyiduoxb]pdtdrhijqqzvcnl[xivmeqcwyafxvnok]jvlbkrwbgcgzaqms
pfqiqyscrxhvtrjzt[unmovhoommbcckocp]ziwuhtfghcqhzeysdw[zmhlfonldrgkbimft]nnlbctvfpbcoqzw[zivyewjzuuvvasybded]mznpvozhzsvkdedqu
adncdhtushtvtfcbez[rvaycmplefdvbrchc]vtviiplkpfhsyhwzz[pdpnsseaizogzvtkcq]piorguaivfpummlo
cdgyiakhcpbibtdwm[dqmibwtfswjlfxvwe]jghsohdnnowueerunt[stsuvrwswspkgom]mmyifoverwkyjqfofhd

Looking at the input, they all follow a similar pattern and we can separate them out with String.Split(‘[‘, ‘]’) and end up with an array where the odd number elements are the hypernet sequences.

We still need a way to look for ABBAs, which fortunately can be achieved easily with Seq.windowed and a bit of pattern matching (see the hasABBA function below).

 

Part 2

You would also like to know which IPs support SSL (super-secret listening).

An IP supports SSL if it has an Area-Broadcast Accessor, or ABA, anywhere in the
supernet sequences (outside any square bracketed sections), and a corresponding
Byte Allocation Block, or BAB, anywhere in the hypernet sequences. An ABA is any
three-character sequence which consists of the same character twice with a
different character between them, such as xyx or aba. A corresponding BAB is the
same characters but in reversed positions: yxy and bab, respectively.

For example:

aba[bab]xyz supports SSL (aba outside square brackets with corresponding bab
within square brackets).
xyx[xyx]xyx does not support SSL (xyx, but no corresponding yxy).
aaa[kek]eke supports SSL (eke in supernet with corresponding kek in hypernet;
the aaa sequence is not related, because the interior character must be different).
zazbz[bzb]cdb supports SSL (zaz has no corresponding aza, but zbz has a
corresponding bzb, even though zaz and zbz overlap).
How many IPs in your puzzle input support SSL?

Part 2 is slightly more challenging as it requires to track the ABAs from the sequences that are not part of the hypernet sequences, and look for corresponding BABs in the hypernet sequences (seriously, who thought of these acronym…).

We can use the same technique of Seq.windows + pattern matching to find all the ABAs, but we also need a hasBAB function to check if any of the corresponding BABs exists in a hypernet sequence.

Whilst I was at it, I also threw in a ToBABs active pattern to convert an array of ABAs into their corresponding BABs.

The question was slightly ambiguous – I wasn’t sure if all the corresponding BABs need to exist in the hypernet sequences or just one, I went with the later and that got me the right answer so I guess I got lucky there.

 

Links

Yubl’s road to Serverless architecture – Part 1

Since Yubl’s closure quite a few people have asked about the serverless architecture we ended up with and some of the things we have learnt along the way.

As such, this is the first of a series of posts where I’d share some of the lessons we learnt. However, bear in mind the pace of change in this particular space so some of the challenges/problems we encountered might have been solved by the time you read this.

ps. many aspects of this series is already covered in a talk I gave on Amazon Lambda at Leetspeak this year, you can find the slides and recording of the talk here.

 

From A Monolithic Beginning

Back when I joined Yubl in April I inherited a monolithic Node.js backend running on EC2 instances, with MongoLab (hosted MongoDB) and CloudAMQP (hosted RabbitMQ) thrown into the mix.

yubl-monolith

There were numerous problems with the legacy system, some could be rectified with incremental changes (eg. blue-green deployment) but others required a rethink at an architectural level. Although things look really simple on paper (at the architecture diagram level), all the complexities are hidden inside each of these 3 services and boy, there were complexities!

My first tasks were to work with the ops team to improve the existing deployment pipeline and to draw up a list of characteristics we’d want from our architecture:

  • able to do small, incremental deployments
  • deployments should be fast, and requires no downtime
  • no lock-step deployments
  • features can be deployed independently
  • features are loosely coupled through messages
  • minimise cost for unused resources
  • minimise ops effort

From here we decided on a service-oriented architecture, and Amazon Lambda seemed the perfect tool for the job given the workloads we had:

  • lots of APIs, all HTTPS, no ultra-low latency requirement
  • lots of background tasks, many of which has soft-realtime requirement (eg. distributing post to follower’s timeline)

 

To a Serverless End

It’s suffice to say that we knew the migration was going to be a long road with many challenges along the way, and we wanted to do it incrementally and gradually increase the speed of delivery as we go.

“The lead time to someone saying thank you is the only reputation metric that matters”

– Dan North

The first step of the migration was to make the legacy systems publish state changes in the system (eg. user joined, user A followed user B, etc.) so that we can start building new features on top of the legacy systems.

To do this, we updated the legacy systems to publish events to Kinesis streams.

 

Our general strategy is:

  • build new features on top of these events, which usually have their own data stores (eg. DynamoDB, CloudSearch, S3, BigQuery, etc.) together with background processing pipelines and APIs
  • extract existing features/concepts from the legacy system into services that will run side-by-side
    • these new services will initially be backed by the same shared MongoLab database
    • other services (including the legacy ones) are updated to use hand-crafted API clients to access the encapsulated resources via the new APIs rather than hitting the shared MongoLab database directly
    • once all access to these resources are done via the new APIs, data migration (usually to DynamoDB tables) will commence behind the scenes
  • wherever possible, requests to existing API endpoints are forwarded to the new APIs so that we don’t have to wait for the iOS and Android apps to be updated (which can take weeks) and can start reaping the benefits earlier

 

After 6 months of hard work, my team of 6 backend engineers (including myself) have drastically transformed our backend infrastructure. Amazon was very impressed by the work we were doing with Lambda and in the process of writing up a case study of our work when Yubl was shut down at the whim of our major shareholder.

Here’s an almost complete picture of the architecture we ended up with (some details are omitted for brevity and clarity).

overall

Some interesting stats:

  • 170 Lambda functions running in production
  • roughly 1GB of total deployment package size (after Janitor Lambda cleans up unreferenced versions)
  • Lambda cost was around 5% of what we pay for EC2 for a comparable amount of compute
  • the no. of production deployments increased from 9/month in April to 155 in September

 

For the rest of the series I’ll drill down into specific features, how we utilised various AWS services, and how we tackled the challenges of:

  • centralised logging
  • centralised configuration management
  • distributed tracing with correlation IDs for Lambda functions
  • keeping Lambda functions warm to avoid coldstart penalty
  • auto-scaling AWS resources that do not scale dynamically
  • automatically clean up old Lambda function versions
  • securing sensitive data (eg. mongodb connection string, service credentials, etc.)

I can also explain our strategy for testing, and running/debugging functions locally, and so on. If there’s anything you’d like me to cover in particular, please leave a comment and let me know.

 

Links

Advent of Code F# – Day 6

ps. look out for all my other solutions for Advent of Code challenges here.

 

Day 6

See details of the challenge here.

The input for today’s challenge looks like this:

cmezkqgn
nmzrgcft
ydpndcps
zjihhows
kvptxsrx
ubbvugwq

Since the only difference between the 2 parts of this challenge is how the characters in a column are sorted by count, so we can capture the common flow to solving both parts in a solve function below.

Let’s walk through what’s happening here step by step.

  1. first we collect all the characters of each column into a seq<char[]>
  2. then for each char[] we
    1. group by the chars by itself
    2. map the resuling seq<char * seq<char>> into a sequence of char and count, ie seq<char * int>
    3. use the supplied sortBy function to sort this sequence by count
    4. take the first (most common/least common depending on the sortBy function)
    5. return the char
  3. stich the chars together into a good old fashioned string

 

So now, solving part 1 is a simple one-liner:

solve Seq.sortByDescending

 

Part 2

Of course, that would be the message – if you hadn’t agreed to use a modified
repetition code instead.

In this modified code, the sender instead transmits what looks like random data,
but for each character, the character they actually want to send is slightly
less likely than the others. Even after signal-jamming noise, you can look at
the letter distributions in each column and choose the least common letter to
reconstruct the original message.

In the above example, the least common character in the first column is a; in
the second, d, and so on. Repeating this process for the remaining characters
produces the original message, advent.

Given the recording in your puzzle input and this new decoding methodology, what
is the original message that Santa is trying to send?

Because of the work we had done earlier, solving part 2 is also a one-liner:

solve Seq.sortBy

 

Links

Advent of Code F# – Day 5

ps. look out for all my other solutions for Advent of Code challenges here.

 

Day 5

See details of the challenge here.

Let’s start by adding a hash function that’ll take an input string and return the hexdecimal representation of its MD5 hash.

From there, we can create an infinite sequence of hash values generated by concatinating the input and an increasing index (as per the instruction). But, remember, we’re only interested in hash values that starts with 5 zeroes.

Once we have this sequence of interesting hash values, part 1 requires us to take the first 8 and make a password out of it.

 

Part 2

As the door slides open, you are presented with a second door that uses a
slightly more inspired security mechanism. Clearly unimpressed by the last
version (in what movie is the password decrypted in order?!), the Easter Bunny
engineers have worked out a better solution.

Instead of simply filling in the password from left to right, the hash now also
indicates the position within the password to fill. You still look for hashes
that begin with five zeroes; however, now, the sixth character represents the
position (0-7), and the seventh character is the character to put in that position.

A hash result of 000001f means that f is the second character in the password.
Use only the first result for each position, and ignore invalid positions.

For example, if the Door ID is abc:

The first interesting hash is from abc3231929, which produces 0000015…; so, 5
goes in position 1: _5______.
In the previous method, 5017308 produced an interesting hash; however, it is
ignored, because it specifies an invalid position (8).
The second interesting hash is at index 5357525, which produces 000004e…; so,
e goes in position 4: _5__e___.
You almost choke on your popcorn as the final character falls into place,
producing the password 05ace8e3.

Given the actual Door ID and this new method, what is the password? Be extra
proud of your solution if it uses a cinematic “decrypting” animation.

We can reuse the inifinite sequence of interesting hash values we constructed earlier, but this time around we need to also filter out hash values whose 6th character is not 0-7.

Additionally, we need to track the first hash value for each position.

The first idea I had was to use a mutable Option<char>[] (length 8) to represent the password, and then iterate over the valid hash values until that array is filled with Some x. But this approach felt too imperative for my liking..

I couldn’t use a fold as it will fold over all the inputs.. however, I can use a scan which yields the intermediate results as a sequence – which I can then skip until I have 8 hash values.

I can also use an immutable Set to represent all the positions that are still unoccupied as I iterate through the hash values.

Whilst part 1 was not computationally light, it did complete in a matter of seconds on my laptop. Part 2 took more than a minute to finish so if anyone knows a good way to speed things up please leave a comment to this post.

 

Links

Advent of Code F# – Day 4

ps. look out for all my other solutions for Advent of Code challenges here.

 

Day 4

See details of the challenge here.

The input for today’s challenge looks something like this:

bkwzkqsxq-tovvilokx-nozvyiwoxd-172[fstek]
wifilzof-wbiwifuny-yhachyylcha-526[qrazx]
jvyyvzpcl-jhukf-shivyhavyf-487[zhtsi]
kwvacumz-ozilm-kivlg-kwvbiqvumvb-694[gknyw]
mvhkvbdib-kmjezxodgz-mvwwdo-omvdidib-837[dmvbi]
nzydfxpc-rclop-qwzhpc-lnbftdtetzy-171[cptzd]
vhehkyne-unggr-inkvatlbgz-813[gnehk]
tcorcikpi-hnqygt-octmgvkpi-570[nzewo]
xmtjbzidx-wvnfzo-jkzmvodjin-447[uyzlp]
willimcpy-mwupyhayl-bohn-mufym-734[stjoc]
sbejpbdujwf-cvooz-xpsltipq-961[azfnd]

So our first task is to turn these into a more usable format, eg. a tuple of (encrypted name, sector ID, checksum).

To find the real room names, we need to calculate the checksum value from the encrypted name.

To produce the desired sorting in LINQ, I’d have used OrderByDescending and ThenBy. There is no ThenBy in any of the F# modules, but F# tuples are ordered the way you expect – by first item, then second item, and so on.

So, I can achieve the same result as OrderByDescending + ThenBy with one line instead:

Seq.sortBy (fun (c, cs) -> (Seq.length cs), c)

pretty neat, right?

 

Part 2

With all the decoy data out of the way, it’s time to decrypt this list and get
moving.

The room names are encrypted by a state-of-the-art shift cipher, which is nearly
unbreakable without the right software. However, the information kiosk designers
at Easter Bunny HQ were not expecting to deal with a master cryptographer like
yourself.

To decrypt a room name, rotate each letter forward through the alphabet a number
of times equal to the room’s sector ID. A becomes B, B becomes C, Z becomes A,
and so on. Dashes become spaces.

For example, the real name for qzmt-zixmtkozy-ivhz-343 is very encrypted name.

What is the sector ID of the room where North Pole objects are stored?

First, let’s add a decrypt function:

I found the question “where North Pole objects are stored” ambigious. After a few failed attempts to guess what the room name should be I ended up just printing them all out and did a quick search for “north“. I’m not sure if the author of the challenge expected you to do that, or maybe I missed a clue somewhere..

Anyhow, once I figured out what the room name is, getting the sector ID was easy.

 

Links