• Java HashMap Examples

    In this post, I walk through the concept of Java HashMap, and use examples to describe its usage cases. 

    What is HashMap

    Java HashMap is a Hash table based implementation of the Java Map interface.  It provides all of the optional map operations, and permits null values and the null key.  The order of Keys in a HashMap is not guaranteed. So we should not reply on the order of the keys in a HashMap. 

    HashMap vs Hashtable

    Based on java doc, Java HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.

    [Read More...]
  • Java Map vs HashMap vs TreeMap vs LinkedHashMap

    Java Map

    Java Map is an interface with the following signature. 

    public interface Map<K,V>

    Here are some properties of Java Map:

    1. It defines an operation to map keys to values.
    2. A map cannot contain duplicate keys; each key can map to at most one value.
    3. The Map interface provides three collection views, which allow a map’s contents to be viewed as a set of keys, collection of values, or set of key-value mappings.

    4. The order of a map is dependent on its implementations. 

    [Read More...]
  • Java SortedMap Tutorial

    In this post, I will use examples to show how to use Java Sorted Map. A SortedMap is a Map that sort its entries in ascending order according to the keys’ natural ordering, or according to a Comparator provided at the time of building a SortedMap. SortedMap is useful if you want to quickly locate an entry that is larger than a specified key. For example, it can be used to implement a Consistent hash

    Since the entries in SortedMap are sorted by keys, all the keys in a sortedMap must implement the Comparable interface or provided a comparator when creating the SortedMap.

    [Read More...]
  • Count word frequency

    Count word frequency is a popular task for text analysis. In this post, I describe how to count word frequency using Java HashMap, python dictionary, and Spark. 

    Use Java HashMap to Count Word frequency

    {a=5, b=2, c=6, d=3}

    Use Python Dict to count word frequency

    The output:

    {‘a’: 5, ‘c’: 6, ‘b’: 2, ‘d’: 3}

    Use Spark to count word Frequency

    The above method works well for small dataset. However, if you have a huge dataset, the hashTable based method will not work. You will need to develop a distributed program to accomplish this task.

    [Read More...]
  • System design interview- consistent hash

    consistent hash virtual node

    Consistent hashing is a simple yet powerful solution to a popular problem: how to locate a server in a distributed environment to store or get a value identified by a key, while handle server failures and network change?

    A simple method is number the servers from 0 to N – 1. To store or retrieve a value V, we can use Hash(V) % N to get the id of the server. 

    However, this method does not work well when 1) some servers failed in the network (e.g, N changes) or 2)  new machines are added to the server.

    [Read More...]
  • Convert infix notation to reverse polish notation (Java)

    How to implement a calculator is a popular interview question. To answer this question well, you need to maser Stack data structures, convert an infix notation to RPN and evaluate reverse polish notation. 

    Since the input is usually in infix notation, e.g. “3 + 6 * 2”, it is difficult develop a program to evaluate it directly.

    In practice, we can implement a calculator algorithm into two steps:

    1. the first step is to convert your mathematical expressions, which is called infix notation, into Reverse Polish Notation (RPN),
    [Read More...]
  • Interview Questions: Edit distance

    Edit distance related questions are frequently asked by many large companies such as Google, Facebook, Apple, Microsoft, Uber, Airbnb, Amazon, et al. In this post, we describe three kinds of problems related to edit distance.

    What is edit distance?

    Edit distance is a way of quantifying how dissimilar two strings (e.g., words) are to one another by counting the minimum number of operations required to transform one string into the other. Edit distances can be applied in natural language processing for automatic spell correction by selecting words from a dictionary that have a low distance to the word in question. 

    [Read More...]
  • Leetcode – LRU cache solution in Java

    LRU cache figure

    Design and implement a data structure for Least Recently Used (LRU) cache. It should support the following operations: get and set.

    get(key) – Get the value (will always be positive) of the key if the key exists in the cache, otherwise return -1.
    set(key, value) – Set or insert the value if the key is not already present. When the cache reached its capacity, it should invalidate the least recently used item before inserting a new item.


    The popular method to implement an LRU cache is to use a bounded Queue. 

    The key to for implementing an LRU is to put any recently used data at the head of the queue.

    [Read More...]
  • Python Dictionary Example

    Python dict structure is similiar to Java’s HashMap or HashTable.
    It aims to store key value pairs.

    Here is an example:
    we build a dict to map a username to his profile information, which is a tuple

    username => (Full_name, gender, age)

    Search in Dict

    Now we can do a look up by username with O(1) time complexity.
    Search the user with username equal to ‘jim’

    Key not exist exception

    you may get key not exist exception when the username is not in the table

    Do key check

    We can get around this issue by an exist check

    Use get with a default value

    A better method is to use the get() method with default value.

    [Read More...]