Advent of Code F# — Day 9

ps. look out for all my oth­er solu­tions for Advent of Code chal­lenges here.


Day 9

See details of the chal­lenge here.

The input for today’s chal­lenge is a very long string like this:


First, let’s see how we’re gonna parse this input.

The approach I went with is to recur­sive­ly split the input with input.Split([| ‘(‘; ’)’ |], 3) which will return either:

  • the whole string if there’s no (n x count) sec­tion, eg. “ADVENT” => “ADVENT”
  • or an array of 3 ele­ments
    • for “X(1x3)YZUVW”, it’ll return [| “X”; “1x3”; “YZUVW” |]
    • for “(1x3)YZUVW”, it’ll return [| “”; “1x3”; “YZUVW” |]

To parse the num­ber of char­ac­ters (n) and num­ber of times to repeat (count) from “1x3” we can intro­duce an active pat­tern (the Repeat pat­tern below) to do the job.

Now we can recur­sive­ly split the string and count the length of the decom­pressed string.


Part 2

Appar­ent­ly, the file actu­al­ly uses ver­sion two of the for­mat.

In ver­sion two, the only dif­fer­ence is that mark­ers with­in decom­pressed data are
decom­pressed. This, the doc­u­men­ta­tion explains, pro­vides much more sub­stan­tial
com­pres­sion capa­bil­i­ties, allow­ing many-giga­byte files to be stored in only a
few kilo­bytes.

For exam­ple:

(3x3)XYZ still becomes XYZXYZXYZ, as the decom­pressed sec­tion con­tains no
X(8x2)(3x3)ABCY becomes XABCABCABCABCABCABCY, because the decom­pressed data from
the (8x2) mark­er is then fur­ther decom­pressed, thus trig­ger­ing the (3x3) mark­er
twice for a total of six ABC sequences.
(27x12)(20x12)(13x14)(7x10)(1x12)A decom­press­es into a string of A repeat­ed
241920 times.
(25x3)(3x3)ABC(2x3)XY(5x2)PQRSTX(18x9)(3x2)TWO(5x7)SEVEN becomes 445 char­ac­ters
Unfor­tu­nate­ly, the com­put­er you brought prob­a­bly doesn’t have enough mem­o­ry to
actu­al­ly decom­press the file; you’ll have to come up with anoth­er way to get its
decom­pressed length.

What is the decom­pressed length of the file using this improved for­mat?

Part 2 requires slight­ly more work and, as the chal­lenge hint­ed, we might over­flow on int so let’s use int64 instead.

The main dif­fer­ence to part 1 is that we need to make the decompressV2 func­tion itself recur­sive, and call into it from inside the loop func­tion to work out the decom­pressed length of the repeat­ed sec­tion.