Array and String Manipulation

Array
Other names:
static array
Quick reference
Worst Case
space O(n)
lookup O(1)
append O(1)
insert O(n)
delete O(n)

An array organizes items sequentially, one after another in memory.

Each position in the array has an index, starting at 0.

Strengths:
  • Fast lookups. Retrieving the element at a given index takes O(1) time, regardless of the length of the array.
  • Fast appends. Adding a new element at the end of the array takes O(1) time, if the array has space.
Weaknesses:
  • Fixed size. You need to specify how many elements you’re going to store in your array ahead of time. (Unless you’re using a fancy dynamic array.)
  • Costly inserts and deletes. You have to “scoot over” the other elements to fill in or close gaps, which takes worst-case O(n) time.
In JavaScript

Some languages (including Javascript) don’t have these bare-bones arrays.

Here’s what arrays look like in Java.

// instantiate an array that holds 10 integers
int gasPrices[] = new int[10];

gasPrices[0] = 346;
gasPrices[1] = 360;
gasPrices[2] = 354;
Inserting

If we want to insert something into an array, first we have to make space by “scooting over” everything starting at the index we’re inserting into:

In the worst case we’re inserting into the 0th index in the array (prepending), so we have to “scoot over” everything in the array. That’s O(n) time.

Deleting

Array elements are stored adjacent to each other. So when we remove an element, we have to fill in the gap—”scooting over” all the elements that were after it:

In the worst case we’re deleting the 0th item in the array, so we have to “scoot over” everything else in the array. That’s O(n) time.

Why not just leave the gap? Because the quick lookup power of arrays depends on everything being sequential and uninterrupted. This lets us predict exactly how far from the start of the array the 138th or 9,203rd item is. If there are gaps, we can no longer predict exactly where each array item will be.

Data structures built on arrays

Arrays are the building blocks for lots of other, more complex data structures.

Don’t want to specify the size of your array ahead of time? One option: use a dynamic array.

Want to look up items by something other than an index? Use an object.

Array Slicing

Array slicing involves taking a subset from an array and allocating a new array with those elements.

In JavaScript you can create a new array of the elements in myArray, from startIndex to endIndex (exclusive), like this:

myArray.slice(startIndex, endIndex);

JavaScript

You can also get everything from startIndex onwards by just omitting endIndex:

myArray.slice(startIndex);

JavaScript

Careful: there’s a hidden time and space cost here! It’s tempting to think of slicing as just “getting elements,” but in reality you are:

  1. allocating a new array
  2. copying the elements from the original array to the new array

This takes O(n) time and O(n) space, where n is the number of elements in the resulting array.

That’s a bit easier to see when you save the result of the slice to a variable:

const tailOfArray = myArray.slice(1);

JavaScript

But a bit harder to see when you don’t save the result of the slice to a variable:

return myArray.slice(1);
// Whoops, I just spent O(n) time and space!

JavaScript

myArray.slice(1).forEach(item => {
  // Whoops, I just spent O(n) time and space!
});

JavaScript

So keep an eye out. Slice wisely.

In-Place Algorithm

An in-place function modifies data structures or objects outside of its own stack frame  (i.e.: stored on the process heap or in the stack frame of a calling function). Because of this, the changes made by the function remain after the call completes.

In-place algorithms are sometimes called destructive, since the original input is “destroyed” (or modified) during the function call.

Careful: “In-place” does not mean “without creating any additional variables!” Rather, it means “without creating a new copy of the input.” In general, an in-place function will only create additional variables that are O(1) space.

An out-of-place function doesn’t make any changes that are visible to other functions. Usually, those functions copy any data structures or objects before manipulating and changing them.

In many languages, primitive values (integers, floating point numbers, or characters) are copied when passed as arguments, and more complex data structures (arrays, heaps, or hash tables) are passed by reference. This is what Javascript does.

Here are two functions that do the same operation on an array, except one is in-place and the other is out-of-place:

function squareArrayInPlace(intArray) {

  intArray.forEach((int, index) => {
    intArray[index] *= int;
  });

  // NOTE: no need to return anything - we modified
  // intArray in place
}

function squareArrayOutOfPlace(intArray) {

  // We allocate a new array that we'll fill in with the new values
  const squaredArray = [];

  intArray.forEach((int, index) => {
    squaredArray[index] = Math.pow(int, 2);
  });

  return squaredArray;
}

Working in-place is a good way to save time and space. An in-place algorithm avoids the cost of initializing or copying data structures, and it usually has an O(1) space cost.

But be careful: an in-place algorithm can cause side effects. Your input is “destroyed” or “altered,” which can affect code outside of your function. For example:

const originalArray = [2, 3, 4, 5];
squareArrayInPlace(originalArray);

console.log('original array: ' + originalArray);
// Logs: original array: 4,9,16,25 - confusingly!

Generally, out-of-place algorithms are considered safer because they avoid side effects. You should only use an in-place algorithm if you’re space constrained or you’re positive you don’t need the original input anymore, even for debugging.

Dynamic Array
Other names:
array list, growable array, resizable array, mutable array
Quick reference
Average Case Worst Case
space O(n) O(n)
lookup O(1) O(1)
append O(1) O(n)
insert O(n) O(n)
delete O(n) O(n)

dynamic array is an array with a big improvement: automatic resizing.

One limitation of arrays is that they’re fixed size, meaning you need to specify the number of elements your array will hold ahead of time.

A dynamic array expands as you add more elements. So you don’t need to determine the size ahead of time.

Strengths:
  • Fast lookups. Just like arrays, retrieving the element at a given index takes O(1) time.
  • Variable size. You can add as many items as you want, and the dynamic array will expand to hold them.
  • Cache-friendly. Just like arrays, dynamic arrays place items right next to each other in memory, making efficient use of caches.
Weaknesses:
  • Slow worst-case appends. Usually, adding a new element at the end of the dynamic array takes O(1) time. But if the dynamic array doesn’t have any room for the new item, it’ll need to expand, which takes O(n) time.
  • Costly inserts and deletes. Just like arrays, elements are stored adjacent to each other. So adding or removing an item in the middle of the array requires “scooting over” other elements, which takes O(n) time.
In JavaScript

Confusingly, in JavaScript, dynamic arrays are just called arrays.

Here’s what they look like:

const gasPrices = [];

gasPrices[0] = 346;
gasPrices[1] = 360;
gasPrices[2] = 354;
Size vs. Capacity

When you allocate a dynamic array, your dynamic array implementation makes an underlying fixed-size array. The starting size depends on the implementation—let’s say our implementation uses 10 indices. Now say we append 4 items to our dynamic array. At this point, our dynamic array has a length of 4. But the underlying array has a length of 10.

We’d say this dynamic array’s size is 4 and its capacity is 10. The dynamic array stores an endIndex to keep track of where the dynamic array ends and the extra capacity begins.

Doubling Appends

What if we try to append an item but our array’s capacity is already full?

To make room, dynamic arrays automatically make a new, bigger underlying array. Usually twice as big.

Why not just extend the existing array? Because that memory might already be taken by another program.

Each item has to be individually copied into the new array.

Copying each item over costs O(n) time! So whenever appending an item to our dynamic array forces us to make a new double-size underlying array, that append takes O(n) time.

That’s the worst case. But in the best case (and the average case), appends are just O(1) time.

Amortized cost of appending
  1. The time cost of each special O(n) “doubling append” doubles each time.
  2. At the same time, the number of O(1) appends you get until the next doubling append also doubles.

These two things sort of “cancel out,” and we can say each append has an average cost or amortized cost of O(1). 

Given this, in industry we usually wave our hands and say dynamic arrays have a time cost of O(1) for appends, even though strictly speaking that’s only true for the average case or the amortized cost.

Practice
Question 1

Your company built an in-house calendar tool called HiCal. You want to add a feature to see the times in a day when everyone is available.

To do this, you’ll need to know when any team is having a meeting. In HiCal, a meeting is stored as objects with integer properties startTime and endTime. These integers represent the number of 30-minute blocks past 9:00am.

For example:

{ startTime: 2, endTime: 3 }  // meeting from 10:00 – 10:30 am
{ startTime: 6, endTime: 9 }  // meeting from 12:00 – 1:30 pm

Write a function mergeRanges() that takes an array of multiple meeting time ranges and returns an array of condensed ranges.

For example, given:

[
  { startTime: 0,  endTime: 1 },
  { startTime: 3,  endTime: 5 },
  { startTime: 4,  endTime: 8 },
  { startTime: 10, endTime: 12 },
  { startTime: 9,  endTime: 10 },
]

your function would return:

[
  { startTime: 0, endTime: 1 },
  { startTime: 3, endTime: 8 },
  { startTime: 9, endTime: 12 },
]

Do not assume the meetings are in order. The meeting times are coming from multiple teams.
Write a solution that’s efficient even when we can’t put a nice upper bound on the numbers representing our time ranges. Here we’ve simplified our times down to the number of 30-minute slots past 9:00 am. But we want the function to work even for very large numbers, like Unix timestamps. In any case, the spirit of the challenge is to merge meetings where startTime and endTime don’t have an upper bound.
Do you have an answer? No?

Here is a hint to get you started

Breakdown

What if we only had two ranges? Let’s take:

[{ startTime: 1, endTime: 3 }, { startTime: 2, endTime: 4 }]

These meetings clearly overlap, so we should merge them to give:

[{ startTime: 1, endTime: 4 }]

But how did we know that these meetings overlap?
Do you have an answer? Yes?

Gotchas

Look at this case:

[{ startTime: 1, endTime: 2 }, { startTime: 2, endTime: 3 }]

These meetings should probably be merged, although they don’t exactly “overlap”—they just “touch.” Does your function do this?

Look at this case:

[{ startTime: 1, endTime: 5 }, { startTime: 2, endTime: 3 }]

Notice that although the second meeting starts later, it ends before the first meeting ends. Does your function correctly handle the case where a later meeting is “subsumed by” an earlier meeting?

Look at this case:

[
  { startTime: 1, endTime: 10 },
  { startTime: 2, endTime: 6 },
  { startTime: 3, endTime: 5 },
  { startTime: 7, endTime: 9 },
]

Here all of our meetings should be merged together into just { startTime: 1, endTime: 10 }. We need keep in mind that after we’ve merged the first two we’re not done with the result—the result of that merge may itself need to be merged into other meetings as well.

 

Make sure that your function won’t “leave out” the last meeting.

We can do this in O(n lg{n}) time.

Full Breakdown
What if we only had two ranges? Let’s take:
[{ startTime: 1, endTime: 3 }, { startTime: 2, endTime: 4 }]
These meetings clearly overlap, so we should merge them to give:
[{ startTime: 1, endTime: 4 }]

But how did we know that these meetings overlap?
We could tell the meetings overlapped because the end time of the first one was after the start time of the second one! But our ideas of “first” and “second” are important here—this only works after we ensure that we treat the meeting that starts earlier as the “first” one.

How would we formalize this as an algorithm? Be sure to consider these edge cases:

  1. The end time of the first meeting and the start time of the second meeting are equal. For example: [{ startTime: 1, endTime: 2 }, { startTime: 2, endTime: 3 }]
  2. The second meeting ends before the first meeting ends. For example: [{ startTime: 1, endTime: 5 }, { startTime: 2, endTime: 3 }]

Here’s a formal algorithm:

  1. We treat the meeting with earlier start time as “first,” and the other as “second.”
  2. If the end time of the first meeting is equal to or greater than the start time of the second meeting, we merge the two meetings into one time range. The resulting time range’s start time is the first meeting’s start, and its end time is the later of the two meetings’ end times.
  3. Else, we leave them separate.

So, we could compare every meeting to every other meeting in this way, merging them or leaving them separate.

Comparing all pairs of meetings would take O(n^2) time. We can do better!

If we’re going to beat O(n^2) time, maybe we’re going to get O(n) time? Is there a way to do this in one pass?

It’d be great if, for each meeting, we could just try to merge it with the next meeting. But that’s definitely not sufficient, because the ordering of our meetings is random. There might be a non-next meeting that the current meeting could be merged with.

What if we sorted our array of meetings by start time?

Then any meetings that could be merged would always be adjacent!

So we could sort our meetings, then walk through the sorted array and see if each meeting can be merged with the one after it.

Sorting takes O(n lg{n}) time in the worst case. If we can then do the merging in one pass, that’s another O(n) time, for O(n lg{n}) overall. That’s not as good as O(n), but it’s better than O(n^2).

Solution

First, we sort our input array of meetings by start time so any meetings that might need to be merged are now next to each other.

Then we walk through our sorted meetings from left to right. At each step, either:

  1. We can merge the current meeting with the previous one, so we do.
  2. We can’t merge the current meeting with the previous one, so we know the previous meeting can’t be merged with any future meetings and we throw the current meeting into mergedMeetings.
function mergeRanges(meetings) {

  // Create a deep copy of the meetings array
  // https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/assign#Deep_Clone
  const meetingsCopy = JSON.parse(JSON.stringify(meetings));

  // Sort by start time
  const sortedMeetings = meetingsCopy.sort((a, b) => {
    return a.startTime - b.startTime;
  });

  // Initialize mergedMeetings with the earliest meeting
  const mergedMeetings = [sortedMeetings[0]];

  for (let i = 1; i < sortedMeetings.length; i++) {
    const currentMeeting    = sortedMeetings[i];
    const lastMergedMeeting = mergedMeetings[mergedMeetings.length - 1];

    // If the current meeting overlaps with the last merged meeting, use the
    // later end time of the two
    if (currentMeeting.startTime <= lastMergedMeeting.endTime) {
      lastMergedMeeting.endTime = Math.max(lastMergedMeeting.endTime, currentMeeting.endTime);
    } else {

      // Add the current meeting since it doesn't overlap
      mergedMeetings.push(currentMeeting);
    }
  }

  return mergedMeetings;
}
Complexity

O(n lg{n}) time and O(n) space.

Even though we only walk through our array of meetings once to merge them, we sort all the meetings first, giving us a runtime of O(n lg{n}). It’s worth noting that if our input were sorted, we could skip the sort and do this in O(n) time!

We create a new array of merged meeting times. In the worst case, none of the meetings overlap, giving us an array identical to the input array. Thus we have a worst-case space cost of .

Bonus
  1. What if we did have an upper bound on the input values? Could we improve our runtime? Would it cost us memory?
  2. Could we do this “in place” on the input array and save some space? What are the pros and cons of doing this in place?
What We Learned

This one arguably uses a greedy approach as well, except this time we had to sort the array first.

How did we figure that out?

We started off trying to solve the problem in one pass, and we noticed that it wouldn’t work. We then noticed the reason it wouldn’t work: to see if a given meeting can be merged, we have to look at all the other meetings! That’s because the order of the meetings is random.

That’s what got us thinking: what if the array were sorted? We saw that then a greedy approach would work. We had to spend O(n lg{n}) time on sorting the array, but it was better than our initial brute force approach, which cost us O(n^2) time!

Question 2

Write a function that takes an array of characters and reverses the letters in place. 

Why an array of characters instead of a string?

The goal of this question is to practice manipulating strings in place. Since we’re modifying the input, we need a mutable  type like an array, instead of JavaScript‘s immutable strings.

Do you have an answer? No?

Here is a hint to get you started.

Breakdown

In general, an in-place  algorithm will require swapping elements.

Do you have an answer? Yes?

Solution

We swap the first and last characters, then the second and second-to-last characters, and so on until we reach the middle.

function reverse(arrayOfChars) {

  let leftIndex = 0;
  let rightIndex = arrayOfChars.length - 1;

  while (leftIndex < rightIndex) {

    // Swap characters
    const temp = arrayOfChars[leftIndex];
    arrayOfChars[leftIndex] = arrayOfChars[rightIndex];
    arrayOfChars[rightIndex] = temp;

    // Move towards middle
    leftIndex++;
    rightIndex--;
  }
}
Complexity

O(n) time and O(1) space.

Question 3

You’re working on a secret team solving coded transmissions.

Your team is scrambling to decipher a recent message, worried it’s a plot to break into a major European National Cake Vault. The message has been mostly deciphered, but all the words are backward! Your colleagues have handed off the last step to you.

Write a function reverseWords() that takes a message as an array of characters and reverses the order of the words in place. 

Why an array of characters instead of a string?

The goal of this question is to practice manipulating strings in place. Since we’re modifying the message, we need a mutable  type like an array, instead of JavaScript‘s immutable strings.

For example:

const message = [ 'c', 'a', 'k', 'e', ' ',
                'p', 'o', 'u', 'n', 'd', ' ',
                's', 't', 'e', 'a', 'l' ];

reverseWords(message);

console.log(message.join(''));
// Prints: 'steal pound cake'

When writing your function, assume the message contains only letters and spaces, and all words are separated by one space.

Do you have an answer? No?

Here is a hint to get you started.

Breakdown

Let’s start with a simpler problem. What if we wanted to reverse all the characters in the message?

Do you have an answer? Yes?

Gotchas

We can do this in O(1) space. Remember, in place.

We can do this in O(n) time.

If you’re swapping individual words one at a time, consider what happens when the words are different lengths. Isn’t each swap O(n) time in the worst case?

Breakdown

Let’s start with a simpler problem. What if we wanted to reverse all the characters in the message?

Well, we could swap the first character with the last character, then the second character with the second to last character, and so on, moving towards the middle. Can you implement this in code?

function reverseCharacters(message) {

  let leftIndex = 0;
  let rightIndex = message.length - 1;

  // Walk towards the middle, from both sides
  while (leftIndex < rightIndex) {

    // Swap the left char and right char
    const temp = message[leftIndex];
    message[leftIndex] = message[rightIndex];
    message[rightIndex] = temp;
    leftIndex++;
    rightIndex--;
  }
}

Ok, looks good. Does this help us?

Can we use the same concept but apply it to words instead of characters?

Probably. We’ll have to figure out a couple things:

  1. How do we figure out where words begin and end?
  2. Once we know the start and end indices of two words, how do we swap those two words?

We could attack either of those first, but I’m already seeing a potential problem in terms of runtime. Can you guess what it is?

What happens when you swap two words that aren’t the same length?

// the eagle has landed
[ 't', 'h', 'e', ' ', 'e', 'a', 'g', 'l', 'e', ' ',
  'h', 'a', 's', ' ', 'l', 'a', 'n', 'd', 'e', 'd' ];

Supposing we already knew the start and end indices of ‘the’ and ‘landed’, how long would it take to swap them?

O(n) time, where n is the total length of the message!

Why? Notice that in addition to moving ‘the’ to the back and moving ‘landed’ to the front, we have to “scoot over” everything in between, since ‘landed’ is longer than ‘the’.

So what’ll be the total time cost with this approach? Assume we’ll be able to learn the start and end indices of all of our words in just one pass (O(n) time).

O(n^2) total time. Why? In the worst case we have almost as many words as we have characters, and we’re always swapping words of different lengths. For example:

"a bb c dd e ff g hh"

We take O(n) time to swap the first and last words (we have to move all n characters):

// input: a bb c dd e ff g hh
[ 'a', ' ', 'b', 'b', ' ', 'c', ' ', 'd', 'd', ' ',
  'e', ' ', 'f', 'f', ' ', 'g', ' ', 'h', 'h' ];

// first swap: hh bb c dd e ff g a
[ 'h', 'h', ' ', 'b', 'b', ' ', 'c', ' ', 'd', 'd',
  ' ', 'e', ' ', 'f', 'f', ' ', 'g', ' ', 'a' ];

Then for the second swap:

// input: a bb c dd e ff g hh
[ 'a', ' ', 'b', 'b', ' ', 'c', ' ', 'd', 'd', ' ',
  'e', ' ', 'f', 'f', ' ', 'g', ' ', 'h', 'h' ];

// first swap: hh bb c dd e ff g a
[ 'h', 'h', ' ', 'b', 'b', ' ', 'c', ' ', 'd', 'd',
  ' ', 'e', ' ', 'f', 'f', ' ', 'g', ' ', 'a' ];

// second swap: hh g c dd e ff bb a
[ 'h', 'h', ' ', 'g', ' ', 'c', ' ', 'd', 'd',
  ' ', 'e', ' ', 'f', 'f', ' ', 'b', 'b', ' ', 'a' ];

We have to move all n characters except the first and last words, and a couple spaces. So we move n-5 characters in total.

For the third swap, we have another 5 characters we don’t have to move. So we move n-10 in total. We’ll end up with a series like this:

n + (n-5) + (n-10) + (n-15) + … + 6 + 1

This is a subsection of the common triangular series.  We’re just skipping 4 terms between each term!

So we have the sum of “every fifth number” from that triangular series. That means our sum will be about a fifth the sum of the original series! But in big O notation dividing by 5 is a constant, so we can throw it out. The original triangular series is O(n^2), and so is our series with every fifth element!

Okay, so O(n^2) time. That’s pretty bad. It’s possible that’s the best we can do…but maybe we can do better?

Let’s try manipulating a sample input by hand.

And remember what we did for our character-level reversal…

Look what happens when we do a character-level reversal:

// input: the eagle has landed
[ 't', 'h', 'e', ' ', 'e', 'a', 'g', 'l', 'e', ' ',
  'h', 'a', 's', ' ', 'l', 'a', 'n', 'd', 'e', 'd' ];

// character-reversed: dednal sah elgae eht
[ 'd', 'e', 'd', 'n', 'a', 'l', ' ', 's', 'a', 'h', ' ',
  'e', 'l', 'g', 'a', 'e', ' ', 'e', 'h', 't' ];

Notice anything?

What if we put it up against the desired output:

// input: the eagle has landed
[ 't', 'h', 'e', ' ', 'e', 'a', 'g', 'l', 'e', ' ',
  'h', 'a', 's', ' ', 'l', 'a', 'n', 'd', 'e', 'd' ];

// character-reversed: dednal sah elgae eht
[ 'd', 'e', 'd', 'n', 'a', 'l', ' ', 's', 'a', 'h', ' ',
  'e', 'l', 'g', 'a', 'e', ' ', 'e', 'h', 't' ];

// word-reversed (desired output): landed has eagle the
[ 'l', 'a', 'n', 'd', 'e', 'd', ' ', 'h', 'a', 's', ' ',
  'e', 'a', 'g', 'l', 'e', ' ', 't', 'h', 'e' ];

The character reversal reverses the words! It’s a great first step. From there, we just have to “unscramble” each word.

More precisely, we just have to re-reverse each word!

Solution

We’ll write a helper function reverseCharacters() that reverses all the characters between a leftIndex and rightIndex. We use it to:

  1. Reverse all the characters in the entire message, giving us the correct word order but with each word backward.
  2. Reverse the characters in each individual word.
function reverseWords(message) {

  // First we reverse all the characters in the entire message
  reverseCharacters(message, 0, message.length - 1);
  // This gives us the right word order
  // but with each word backward

  // Now we'll make the words forward again
  // by reversing each word's characters

  // We hold the index of the *start* of the current word
  // as we look for the *end* of the current word
  let currentWordStartIndex = 0;
  for (let i = 0; i <= message.length; i++) {

    // Found the end of the current word!
    if (i === message.length || message[i] === ' ') {

      // If we haven't exhausted the string our
      // next word's start is one character ahead
      reverseCharacters(message, currentWordStartIndex, i - 1);
      currentWordStartIndex = i + 1;
    }
  }
}

function reverseCharacters(message, leftIndex, rightIndex) {

  // Walk towards the middle, from both sides
  while (leftIndex < rightIndex) {

    // Swap the left char and right char
    const temp = message[leftIndex];
    message[leftIndex] = message[rightIndex];
    message[rightIndex] = temp;
    leftIndex++;
    rightIndex--;
  }
}
Complexity

O(n) time and O(1) space!

Hmm, the team used your function to finish deciphering the message. There definitely seems to be a heist brewing, but no specifics on where. Any ideas?

Bonus

How would you handle punctuation?

What We Learned

The naive solution of reversing the words one at a time had a worst-case O(n^2) runtime. That’s because swapping words with different lengths required “scooting over” all the other characters to make room.

To get around this “scooting over” issue, we reversed all the characters in the message first. This put all the words in the correct spots, but with the characters in each word backward. So to get the final answer, we reversed the characters in each word. This all takes two passes through the message, so O(n) time total.

This might seem like a blind insight, but we derived it by using a clear strategy:

Solve a simpler version of the problem (in this case, reversing the characters instead of the words), and see if that gets us closer to a solution for the original problem.

Question 4

In order to win the prize for most cookies sold, my friend Alice and I are going to merge our Girl Scout Cookies orders and enter as one unit.

Each order is represented by an “order id” (an integer).

We have our lists of orders sorted numerically already, in arrays. Write a function to merge our arrays of orders into one sorted array.

For example:

const myArray = [3, 4, 6, 10, 11, 15];
const alicesArray = [1, 5, 8, 12, 14, 19];

console.log(mergeArrays(myArray, alicesArray));
// logs [1, 3, 4, 5, 6, 8, 10, 11, 12, 14, 15, 19]

Do you have an answer? No?

Here is a hint to get you started

Breakdown

We could simply concatenate (join together) the two arrays into one, then sort the result:

function mergeSortedArrays(myArray, alicesArray) {
  const mergedArray = myArray.concat(alicesArray);
  return mergedArray.sort((a, b) => a - b);
}

What would the time cost be?  Remember that one of the things the interviewer mentioned is that the arrays are already sorted.

Do you have an answer? Yes?
Gotchas

We can do this in O(n) time and space.

If you’re running a built-in sorting function, your algorithm probably takes O(n lg{n}) time for that sort.

Think about edge cases! What happens when we’ve merged in all of the elements from one of our arrays but we still have elements to merge in from our other array?

Breakdown

We could simply concatenate (join together) the two arrays into one, then sort the result:

function mergeSortedArrays(myArray, alicesArray) {
  const mergedArray = myArray.concat(alicesArray);
  return mergedArray.sort((a, b) => a - b);
}

What would the time cost be?

O(n lg{n}), where n is the total length of our output array (the sum of the lengths of our inputs).

We can do better. With this algorithm, we’re not really taking advantage of the fact that the input arrays are themselves already sorted. How can we save time by using this fact?

A good general strategy for thinking about an algorithm is to try writing out a sample input and performing the operation by hand. If you’re stuck, try that!

Since our arrays are sorted, we know they each have their smallest item in the 0th index. So the smallest item overall is in the 0th index of one of our input arrays!

Which 0th element is it? Whichever is smaller!

To start, let’s just write a function that chooses the 0th element for our sorted array.

function mergeArrays(myArray, alicesArray) {

  const mergedArray = [];

  const headOfMyArray = myArray[0];
  const headOfAlicesArray = alicesArray[0];

  // Case: 0th comes from my array
  if (headOfMyArray < headOfAlicesArray) {
    mergedArray[0] = headOfMyArray;

    // Case: 0th comes from Alice's array
  } else {
    mergedArray[0] = headOfAlicesArray;
  }

  // Eventually we'll want to return the merged array
  return mergedArray;
}

Okay, good start! That works for finding the 0th element. Now how do we choose the next element?

Let’s look at a sample input:

[3,  4,  6, 10, 11, 15]  // myArray
[1,  5,  8, 12, 14, 19]  // alicesArray

To start we took the 0th element from alicesArray and put it in the 0th slot in the output array:

[3,  4,  6, 10, 11, 15]  // myArray
[1,  5,  8, 12, 14, 19]  // alicesArray
[1,  x,  x,  x,  x,  x]  // mergedArray

We need to make sure we don’t try to put that 1 in merged array again. We should mark it as “already merged” somehow. For now, we can just cross it out:

[3,  4,  6, 10, 11, 15]  // myArray
[x,  5,  8, 12, 14, 19]  // alicesArray
[1,  x,  x,  x,  x,  x]  // mergedArray

Or we could even imagine it’s removed from the array:

[3,  4,  6, 10, 11, 15]  // myArray
[5,  8, 12, 14, 19]      // alicesArray
[1,  x,  x,  x,  x,  x]  // mergedArray

Now to get our next element we can use the same approach we used to get the 0th element—it’s the smallest of the earliest unmerged elements in either array! In other words, it’s the smaller of the leftmost elements in either array, assuming we’ve removed the elements we’ve already merged in.

So in general we could say something like:

  1. We’ll start at the beginnings of our input arrays, since the smallest elements will be there.
  2. As we put items in our final mergedArray, we’ll keep track of the fact that they’re “already merged.”
  3. At each step, each array has a first “not-yet-merged” item.
  4. At each step, the next item to put in the mergedArray is the smaller of those two “not-yet-merged” items!

Can you implement this in code?

function mergeArrays(myArray, alicesArray) {

  const mergedArray = [];

  let currentIndexAlices = 0;
  let currentIndexMine = 0;
  let currentIndexMerged = 0;

  while (currentIndexMerged < (myArray.length + alicesArray.length)) {
    const firstUnmergedAlices = alicesArray[currentIndexAlices];
    const firstUnmergedMine = myArray[currentIndexMine];

    // Case: next comes from my array
    if (firstUnmergedMine < firstUnmergedAlices) {
      mergedArray[currentIndexMerged] = firstUnmergedMine;
      currentIndexMine++;

      // Case: next comes from Alice's array
    } else {
      mergedArray[currentIndexMerged] = firstUnmergedAlices;
      currentIndexAlices++;
    }

    currentIndexMerged++;
  }

  return mergedArray;
}

Okay, this algorithm makes sense. To wrap up, we should think about edge cases and check for bugs. What edge cases should we worry about?

Here are some edge cases:

  1. One or both of our input arrays is 0 elements or 1 element
  2. One of our input arrays is longer than the other.
  3. One of our arrays runs out of elements before we’re done merging.

Actually, (3) will always happen. In the process of merging our arrays, we’ll certainly exhaust one before we exhaust the other.

Does our function handle these cases correctly?

If both arrays are empty, we’re fine. But for all the other edge cases, at some point firstUnmergedMine or firstUnmergedAlices will be undefined because there won’t be an element at one of those indices. Then JavaScript will compare undefined with a number, which will always be false, and mergedArray might be out of order or contain undefined!

How can we fix this?

We can probably solve these cases at the same time. They’re not so different—they just have to do with indexing past the end of arrays.

To start, we could treat each of our arrays being out of elements as a separate case to handle, in addition to the 2 cases we already have. So we have 4 cases total. Can you code that up?

Be sure you check the cases in the right order!

function mergeArrays(myArray, alicesArray) {

  const mergedArray = [];

  let currentIndexAlices = 0;
  let currentIndexMine = 0;
  let currentIndexMerged = 0;

  while (currentIndexMerged < (myArray.length + alicesArray.length)) {

    // Case: my array is exhausted
    if (currentIndexMine >= myArray.length) {
      mergedArray[currentIndexMerged] = alicesArray[currentIndexAlices];
      currentIndexAlices++;

      // Case: Alice's array is exhausted
    } else if (currentIndexAlices >= alicesArray.length) {
      mergedArray[currentIndexMerged] = myArray[currentIndexMine];
      currentIndexMine++;

      // Case: my item is next
    } else if (myArray[currentIndexMine] < alicesArray[currentIndexAlices]) {
      mergedArray[currentIndexMerged] = myArray[currentIndexMine];
      currentIndexMine++;

      // Case: Alice's item is next
    } else {
      mergedArray[currentIndexMerged] = alicesArray[currentIndexAlices];
      currentIndexAlices++;
    }

    currentIndexMerged++;
  }

  return mergedArray;
}

Cool. This’ll work, but it’s a bit repetitive. We have these two lines twice:

mergedArray[currentIndexMerged] = myArray[currentIndexMine];
currentIndexMine++;

Same for these two lines:

mergedArray[currentIndexMerged] = alicesArray[currentIndexAlices];
currentIndexAlices++;

That’s not DRY. Maybe we can avoid repeating ourselves by bringing our code back down to just 2 cases.

See if you can do this in just one “if else” by combining the conditionals.

You might try to simply squish the middle cases together:

if (isAlicesArrayExhausted ||
  (myArray[currentIndexMine] < alicesArray[currentIndexAlices])) {

  mergedArray[currentIndexMerged] = myArray[currentIndexMine];
  currentIndexMine++;

But what happens when myArray is exhausted?

myArray[currentIndexMine] will be undefined. But our code will still work! JavaScript gives false for any comparison with undefined, and we want false here—we’re checking if the element in myArray comes first, and it doesn’t if myArray is empty. But this code wouldn’t work in many other languages—we’d get an error when we tried to access an element in an empty array or compare a number with a non-numerical value like undefined.

Even though our code works, it’s messy to access and compare an element that doesn’t exist. Let’s adjust our code so we’re more explicit and don’t rely on JavaScript’s uncommon and nonobvious behavior of giving false in comparisons with undefined.

Solution

First, we allocate our answer array, getting its size by adding the size of myArray and alicesArray.

We keep track of a current index in myArray, a current index in alicesArray, and a current index in mergedArray. So at each step, there’s a “current item” in alicesArray and in myArray. The smaller of those is the next one we add to the mergedArray!

But careful: we also need to account for the case where we exhaust one of our arrays and there are still elements in the other. To handle this, we say that the current item in myArray is the next item to add to mergedArray only if myArray is not exhausted AND, either:

  1. alicesArray is exhausted, or
  2. the current item in myArray is less than the current item in alicesArray
function mergeArrays(myArray, alicesArray) {

  // Set up our mergedArray
  const mergedArray = [];

  let currentIndexAlices = 0;
  let currentIndexMine = 0;
  let currentIndexMerged = 0;

  while (currentIndexMerged < (myArray.length + alicesArray.length)) {

    const isMyArrayExhausted = currentIndexMine >= myArray.length;
    const isAlicesArrayExhausted = currentIndexAlices >= alicesArray.length;

    // Case: next comes from my array
    // My array must not be exhausted, and EITHER:
    // 1) Alice's array IS exhausted, or
    // 2) The current element in my array is less
    //    than the current element in Alice's array
    if (!isMyArrayExhausted && (isAlicesArrayExhausted ||
      (myArray[currentIndexMine] < alicesArray[currentIndexAlices]))) {

      mergedArray[currentIndexMerged] = myArray[currentIndexMine];
      currentIndexMine++;

      // Case: next comes from Alice's array
    } else {
      mergedArray[currentIndexMerged] = alicesArray[currentIndexAlices];
      currentIndexAlices++;
    }

    currentIndexMerged++;
  }

  return mergedArray;
}

The if statement is carefully constructed to avoid indexing past the end of an array, because JavaScript would give us undefined and we don’t want to compare undefined with an integer. We take advantage of JavaScript’s short circuit evaluation and check first if the arrays are exhausted.

Complexity

O(n) time and O(n) additional space, where n is the number of items in the merged array.

The added space comes from allocating the mergedArray. There’s no way to do this “ in place” because neither of our input arrays are necessarily big enough to hold the merged array.

But if our inputs were linked lists, we could avoid allocating a new structure and do the merge by simply adjusting the next pointers in the list nodes!

In our implementation above, we could avoid tracking currentIndexMerged and just compute it on the fly by adding currentIndexMine and currentIndexAlices. This would only save us one integer of space though, which is hardly anything. It’s probably not worth the added code complexity.

Bonus

What if we wanted to merge several sorted arrays? Write a function that takes as an input an array of sorted arrays and outputs a single sorted array with all the items from each array.

Do we absolutely have to allocate a new array to use for the merged output? Where else could we store our merged array? How would our function need to change?

What We Learned

We spent a lot of time figuring out how to cleanly handle edge cases.

Sometimes it’s easy to lose steam at the end of a coding interview when you’re debugging. But keep sprinting through to the finish! Think about edge cases. Look for off-by-one errors.