December 2023 – Recap

1 month vs 2 months
  • 20+ problems solved
  • Studied: LinkedLists, C++

Unfortunately on December I got Influenza A which lasted for 2 weeks + I had finals 😦 so I had a pause since 12th.

After that I felt the need to learn C++ more so I have been studying C++ and some concepts from C these days. I learned most things in C++ so now I am back to problem solving in January. I am writing C++ study notes in google docs and it’s 51 pages now.

Learning low level programming has been really helpful and now I understand better things like TCO (tail call optimization) runtime stack mechanism and foundational concepts in data structures.

1071. Greatest Common Divisor of Strings – Solution

Difficulty: Easy
Topics: Math, String

For two strings s and t, we say “t divides s” if and only if s = t + ... + t (i.e., t is concatenated with itself one or more times).

Given two strings str1 and str2, return the largest string x such that x divides both str1 and str2.

I initially solved this like this:

string gcdOfStrings(string str1, string str2) {
int minlen = min(str1.size(), str2.size());
// check if one is substring of another. if not, return "".
if (str1.substr(0, minlen) != str2.substr(0, minlen)) {
return "";
}
for (int i = minlen; i > 0; i--) {
// if str1 and str2 are not divisible by i, pass
if (! (str1.size() % i == 0 && str2.size() % i == 0)) {
continue;
}
// check if GCD is correct
string GCD = str1.substr(0, i);
for (int j = 0; j < str1.size(); j+=i) {
if (GCD != str1.substr(j, i)) goto pass;
}
for (int j = 0; j < str2.size(); j+=i) {
if (GCD != str2.substr(j, i)) goto pass;
}
return GCD;
pass: ;
}
return "";
}

But there was a simpler logic:

str1 and str2 have a gcd iff str1+str2 == str2+str1

Proof:
str1 = m (GCD)
str2 = n (GCD)
str1 + str2 = (m + n) GCD = str2 + st1

if they have a gcd, find it with math -> gcd().

string gcdOfStrings(string str1, string str2) {
return (str1 + str2 == str2 + str1) ?
str1.substr(0, gcd(size(str1),size(str2))) : "";
}

Boyer–Moore majority vote algorithm

Links: Original Paper published in 1991 (pg. 105-117)

I was solving 169. Majority Element and was struggling to find a one passalgorithm with O(1) space (these are called streaming algorithms). The best I could figure out was in O(n^2) time + O(1) space or O(nlogn) time + O(n) space. But then I saw this mind-blowing one pass approach called MJRTY – Vast Majority Vote Algorithm by Boyer and Moore.

The approach becomes evident with the image below:

https://www.cs.utexas.edu/~moore/best-ideas/mjrty/example.html#step04

We start counting the first element, if a different element appears, instead of start counting that, we decrease the current count. When count becomes 0, we start counting whatever element we encounter next. At the end of the iteration, what we are left with is the majority element.

This only works when an element appears more than n/2 times through the array. The why is pretty intuitive and logical.

Implementation

int majorityElement(vector<int>& nums) {
int x, cnt = 0;
for (int i = 0; i < nums.size(); i++){
if (cnt == 0){
x = nums[i];
cnt++;
}
else{
if (nums[i] == x) cnt++;
else cnt--;
}
}
return x;
}

or faster if written like this:

int majorityElement(vector<int>& nums) {
int x = nums[0];
int cnt = 1;
for(int i=1;i<nums.size();i++){
if(nums[i] == x){
cnt++;
}
else{
cnt--;
}
if(cnt == 0){
x = nums[i];
cnt = 1;
}
}
return x;
}

Queue – Basics

Implementation

may use dynamic array and an index pointing to the head of the queue.

should support enqueue (appends a new element), dequeue (removes the first element).

class MyQueue {
    private:
        vector<int> data;      
        int p_start;            
    public:
        MyQueue() {p_start = 0;}
        bool enQueue(int x) {
            data.push_back(x);
            return true;
        }

        bool deQueue() {
            if (isEmpty()) {
                return false;
            }
            p_start++;
            return true;
        };
        int Front() {
            return data[p_start];
        };
        bool isEmpty()  {
            return p_start >= data.size();
        }
};

But this has drawbacks. Memory is wasted every time we dequeue.

Thus, need a circular queue. 

Circular Queue

fixed-size array, 2 pointers indicating head and tail. Goal is to reuse the wasted storage.

In designing a DS, it’s good to have as few attributes as possible and maintain a simple manipulation logic. Sometimes, there can be some redundancy to minimize time complexity. 

Attributes:

  • queue: fixed size array.
  • headIndex: integer which indicates the current head element
  • count: the current length. with `headIndex`, we can locate the tail element.
  • capacity: max number of elements that can be held in the queue. With this, we reduce TC for calling len(queue). 
class MyCircularQueue {
private:
    vector<int> data;
    int head;
    int tail;
    int size;

public:
    MyCircularQueue(int k) {
        data.resize(k);
        head = -1;
        tail = -1;
        size = k;
    }
    bool enQueue(int value) {
        if (isFull()) {
            return false;
        }
        if (isEmpty()) {
            head = 0;
        }
        tail = (tail + 1) % size;
        data[tail] = value;
        return true;
    }
    bool deQueue() {
        if (isEmpty()) {
            return false;
        }
        if (head == tail) {
            head = -1;
            tail = -1;
            return true;
        }
        head = (head + 1) % size;
        return true;
    }    
int Front() {
        if (isEmpty()) {
            return -1;
        }
        return data[head];
    }
    int Rear() {
        if (isEmpty()) {
            return -1;
        }
        return data[tail];
    }
    bool isEmpty() {
        return head == -1;
    }
    bool isFull() {
        return ((tail + 1) % size) == head;
    }
};

Built In Queue C++

use stl queue in practice.

#include <queue>

int main() {
    queue<int> q;

    q.push(5);

    if (q.empty()) {}

    q.pop();

    q.front() ; get first element

    q.back() ; get last element

    q.size() ; get size
}

Hoare’s Quickselect & Lomuto’s Partition Scheme

Quickselect Algorithm

Hoare’s selecting algorithm is a textbook algorithm used to solve “find kth something”: kth smallest, kth largest, kth most frequent, kth less frequent, etc.

Time Complexity: Average O(n), Worst case O(n^2)

See 347. Top K Frequent Elements. Below, we solve this question with Hoare’s Quickselect.

Hoare’s Quickselect

In a hashmap of <element, frequency>, select a random pivot point and use a partition scheme to place the pivot into a position where less frequent elements come to left side of pivot and more or equally frequent comes to the right.

If pivot index is N-k, there are k elements to the right side and all those are more or equally frequent elements, so return them. Else, choose the side of the array to proceed recursively.

Lomuto’s Partition Scheme

places elements >= pivot in the right side of pivot and elements < pivot in the left side.

  1. swap the pivot with last element (or just start with the end element being pivot)
  2. start a pointer from the left (index 0). Call this store_index.
  3. iterating from 0 to end, if the element is less than pivot val, swap the element with element at store_index. If the element is >= pivot val, just pass.
  4. After the iteration, swap pivot with element at store_index (put pivot back in the right position).

Solution to 347. Top K Frequent Elements

class Solution {
private:
vector<int> unique;
map<int, int> count_map;

public:
int partition(int left, int right, int pivot_index) {
int pivot_frequency = count_map[unique[pivot_index]];
// 1. Move the pivot to the end
swap(unique[pivot_index], unique[right]);

// 2. Move all less frequent elements to the left
int store_index = left;
for (int i = left; i <= right; i++) {
if (count_map[unique[i]] < pivot_frequency) {
swap(unique[store_index], unique[i]);
store_index += 1;
}
}

// 3. Move the pivot to its final place
swap(unique[right], unique[store_index]);

return store_index;
}

void quickselect(int left, int right, int k_smallest) {
/*
Sort a list within left..right till kth less frequent element
takes its place.
*/

// base case: the list contains only one element
if (left == right) return;

int pivot_index = left + rand() % (right - left + 1);

// Find the pivot position in a sorted list
pivot_index = partition(left, right, pivot_index);

//If the pivot is in its final sorted position
if (k_smallest == pivot_index) {
return;
} else if (k_smallest < pivot_index) {
// go left
quickselect(left, pivot_index - 1, k_smallest);
} else {
// go right
quickselect(pivot_index + 1, right, k_smallest);
}
}

vector<int> topKFrequent(vector<int>& nums, int k) {
// build hash map: element and how often it appears
for (int n : nums) {
count_map[n] += 1;
}

// array of unique elements
int n = count_map.size();
for (pair<int, int> p : count_map) {
unique.push_back(p.first);
}

// kth top frequent element is (n - k)th less frequent.
// Do a partial sort: from less frequent to the most frequent, till
// (n - k)th less frequent element takes its place (n - k) in a sorted array.
// All elements on the left are less frequent.
// All the elements on the right are more frequent.
quickselect(0, n - 1, n - k);
// Return top k frequent elements
vector<int> top_k_frequent(k);
copy(unique.begin() + n - k, unique.end(), top_k_frequent.begin());
return top_k_frequent;
}
};

3. Longest Substring Without Repeating Characters – Solution

Difficulty: medium
Topics: Hash Table, String, Sliding Window

Given a string s, find the length of the longest substring without repeating characters.

Example 1:
Input: s = “abcabcbb”
Output: 3
Explanation: The answer is “abc”, with the length of 3.

Example 2:
Input: s = “bbbbb”
Output: 1
Explanation: The answer is “b”, with the length of 1.

Example 3:
Input: s = “pwwkew”
Output: 3
Explanation: The answer is “wke”, with the length of 3. Notice that the answer must be a substring, “pwke” is a subsequence and not a substring.

Solution

It’s easy to come up with a O(n^2) approach. Just check the length of longest substring starting from 0 to end.

O(n) approach needs more thinking:

int lengthOfLongestSubstring(string s) {

map<char, int> charMap;
int start = -1;
int maxLen = 0;

for (int i = 0; i < s.size(); i++) {
if (charMap.count(s[i]) != 0) {
start = max(start, charMap[s[i]]);
}
charMap[s[i]] = i;
maxLen = max(maxLen, i-start);
}

return maxLen;
}

This puts the index of where characters appears in a hashmap. So as we iterate from 0 to end, we can know where that character appeared last time and find the length. Each time we recalculate the max.

Reflection

interesting concept to blackbox:

  • whenever finding a maximum value through iteration, let’s use maxval = max(maxval, newval)
  • use hashmap to store the value and its frequency, or value and index.
  • If the value is a character or linear sequence number, it can be stored in an array as well. But using hashmap can be more efficient space-wise.

November 2023 – Recap

10th day vs 25th day
  • 57 problems solved
  • Studied: Arrays, Strings, Binary Search, Hash Tables

Hello. I’m writing this at 5:00 AM. This month was somewhat busy, but since November 11, when I decided to commit at least 1 hour to PS every day, I have been keeping it. No matter if I have a midterm tomorrow, an internship application due, or if I’ve slept only 3 hours and my brain is not working, I just sit down for an hour. It’s inefficient at times, but it’s symbolic of my commitment to not losing track. I hope I maintain this pace next month too.

I often get scared doing PS/CP since this is definitely a field where effort can’t win talent and I doubt if I have talent. Oftentimes I think “what if I just spend all this time on learning full-stack dev or machine learning”? If there is someone who have done CP and decided to go to data science or quant field, I want to ask “How do you overcome the fear stemming from spending time on uncertain things?” But anyways there is no going back.

Well. This was reflection for this month. Thank you for reading whoever reading this. I don’t usually share this blog to anyone I know because I want to show them when I have a solid result and not the process.

1. Two Sum – Solution

Difficulty: Easy
Topics: Array, Hash Table

Question

Given an array of integers nums and an integer target, return indices of the two numbers such that they add up to target.

You may assume that each input would have exactly one solution, and you may not use the same element twice.

You can return the answer in any order.

Example 1:Input: nums = [2,7,11,15], target = 9 Output: [0,1] Explanation: Because nums[0] + nums[1] == 9, we return [0, 1].

Example 2:Input: nums = [3,2,4], target = 6 Output: [1,2]

Example 3:Input: nums = [3,3], target = 6 Output: [0,1]

Solution

While storing the numbers in hashmap, check if the complementary (target – number) is in the hashmap.

Note that hashmap[nums[i]] = i; should not be before the part where we search for comp. This is because search for comp should be done in values previously put in, not including the current value.

vector<int> twoSum(vector<int>& nums, int target) {
        unordered_map<int, int> hashmap;
        vector<int> res;
        for (int i = 0; i < nums.size(); i++){
            int comp = target - nums[i];
            if (hashmap.count(comp) > 0){
                res = {hashmap[comp], i};
                return res;
            }
            hashmap[nums[i]] = i;
        }
        return res;
    }

Hashmap in C++

Basic Usages

unordered_map<int, int> hashmap;
// 2. insert a new (key, value) pair
hashmap.insert(make_pair(2, 3));
// 3. insert a new (key, value) pair or update the value of existed key
hashmap[1] = 1;
hashmap[1] = 2;
// 4. get the value of a specific key
hashmap[1]
// 5. delete a key
hashmap.erase(2);
// 6. check if a key is in the hash map
if (hashmap.count(2) <= 0);
// 7. get the size of the hash map
hashmap.size()
// 8. iterate the hash map
for (auto it = hashmap.begin(); it != hashmap.end(); ++it) {
cout << it->first << "," << it->second << endl;
}
// 9. clear the hash map
hashmap.clear();
// 10. check if the hash map is empty
if (hashmap.empty())
Iterating over list of keys:
for (Type key : keys) {
        if (hashmap.count(key) > 0) {
            if (hashmap[key] satisfies the requirement) {
                return needed_information;
            }
        }
        // Value can be any information you needed (e.g. index)
        hashmap[key] = value;
    }

(to be continued…)

Iterating over map
for (auto& it : myMap) {
vec.push_back(it);
}
Sorting map based on value
#include <iostream>
#include <map>
#include <vector>
#include <algorithm>

// Comparator function to sort pairs by second value
bool sortByVal(const std::pair<int, int>& a, const std::pair<int, int>& b) {
return (a.second < b.second);
}

int main() {
// Define a map
std::map<int, int> myMap = {{1, 40}, {2, 30}, {3, 60}, {4, 20}};

// Copy elements to a vector of pairs
std::vector<std::pair<int, int>> vec;
for (auto& it : myMap) {
vec.push_back(it);
}

// Sort the vector by value
std::sort(vec.begin(), vec.end(), sortByVal);

// Print the sorted vector
for (auto& it : vec) {
std::cout << it.first << ": " << it.second << std::endl;
}

return 0;
}

Tips

  • When counting frequency, don’t need to check if the hashmap[x] exists.
// don't do this:
if (hashmap.count(x) > 0) hashmap[x]++;
else hashmap[x] = 1;
// do this:
hashmap[x]++;

// same for adding up the frequency:
for (...){
sum += hashmap[x];
}

658. Find K Closest Elements – Solution

Question

Difficulty: medium
Topics: Array, Two Pointers, Binary Search, Sliding Window

Given a sorted integer array arr, two integers k and x, return the k closest integers to x in the array. The result should also be sorted in ascending order.

An integer a is closer to x than an integer b if:

  • |a - x| < |b - x|, or
  • |a - x| == |b - x| and a < b

Example 1:Input: arr = [1,2,3,4,5], k = 4, x = 3 Output: [1,2,3,4]

Example 2:Input: arr = [1,2,3,4,5], k = 4, x = -1 Output: [1,2,3,4]

Solution

This was a question that I struggled a lot. It was harder than usual binary search problems. There are many ways to think for this problem.

Solution 1: O(N⋅log(N)+k⋅log(k))

compute the distances between the elements and x, sort it and display the first k elements. When displaying it, sort again.

Solution 2: O(log(N)+k) – Bin. Search + 2 Pointers

Search the index of element closest to x. Then, start l and r pointers from there and expand them based on the smallest distance.

Solution 3: O(log(Nk)) – Find the left bound.

Search the left bound with binary search. Let l = 0 and r = N-k which is the maximum index that a left bound can be.

compute the mid point. Now, assuming the mid is the left bound, mid+k is the right bound. Then,

If value at mid is closer to x:

  • mid and further cannot be the left bound because then the right bound would be more distant from x than mid.

But, since there can be an even better left bound in the left side, we continue the bin search with the left side.

so we do r = mid;

If value at mid+k is closer to x:

  • index at mid or lower can’t be left bound. The left bound is somewhere between mid+1 ~ mid+k

so we do l = mid+1;

class Solution {
public:
vector<int> findClosestElements(vector<int>& arr, int k, int x) {
int l = 0, r = arr.size() - k;
while (l < r){
int mid = (l+r)/2;
if (x - arr[mid] > arr[mid+k] - x){
l = mid + 1;
}
else{
r = mid;
}
}
return vector<int>(arr.begin() + l, arr.begin() + l + k);
}
};

There is another way of seeing this: (from lee215’s solution post)

We compare the distance between x - A[mid] and A[mid + k] - x
If we assume A[mid] ~ A[mid + k] is sliding window

case 1: x – A[mid] < A[mid + k] – x, need to move window go left
——-x—-A[mid]—————–A[mid + k]———-

case 2: x – A[mid] < A[mid + k] – x, need to move window go left again
——-A[mid]—-x—————–A[mid + k]———-

case 3: x – A[mid] > A[mid + k] – x, need to move window go right
——-A[mid]——————x—A[mid + k]———-

case 4: x – A[mid] > A[mid + k] – x, need to move window go right
——-A[mid]———————A[mid + k]—-x——

Reflection

I think whenever using binary search, I should think about based on what criteria I will choose the left side or the right side given a mid point. This was particularly harder because I didn’t think of finding the left bound. Let’s remember that when finding a range, I can try to find only left or right bound.