Problem 1: TwoSum
The problem is as follows:
table = new HashEntry *[Hash_Table_Size]; //create an array of pointers, point to the HashEntry (key-value pairs) and let pointer-to-pointer 'table' to point to this array of pointers. Thus:
table[hash]->key/value means the key/value pointed by pointer located at 'hash' location inside the first-level pointer array which is indexed by the 'table'.
hash = HashFunc(hash+1, input_size); //this is used to iteratively search the hash table. Again, this is not a good function.
This is a two-way hashing:
(1). First loop to store the mappings;
(2). Second loop to search for the solution.
(2) Two-Way hashing using existing C++ hashtable library:

Given an array of integers, return indices of the two numbers such that they add up to a specific target.
You may assume that each input would have exactly one solution, and you may not use the same element twice.
Example:
Given nums = [2, 7, 11, 15], target = 9, Because nums[0] + nums[1] = 2 + 7 = 9, return [0, 1].
There are several ways to achieve this. The straightforward solution is to use brute-force search.
1. Brute-Force Search
C++:

To declare and create an object of a class, either following way is ok:
(1). Solution a; Num_output = a.twoSum(...)
(2). Solution *a = new Solution(); Num_output = a->twoSum(...)
The vector in C++ is written as {x1,x2,...}.
Indexing an element: nums.at(i).
Get vector size: nums.size().
Note: in C++, the int main() is a must. For main function it cannot be changed to other types except int.
Java:

1. Brute-Force Search
C++:

To declare and create an object of a class, either following way is ok:
(1). Solution a; Num_output = a.twoSum(...)
(2). Solution *a = new Solution(); Num_output = a->twoSum(...)
The vector in C++ is written as {x1,x2,...}.
Indexing an element: nums.at(i).
Get vector size: nums.size().
Note: in C++, the int main() is a must. For main function it cannot be changed to other types except int.
Java:

Use Solution a = new Solution() to create an object of a class.
To return an array, either way is ok:
(1) int[] ret_val = {i,j}; return(ret_val);
To return an array, either way is ok:
(1) int[] ret_val = {i,j}; return(ret_val);
(2) return new int[]{i,j};
Indexing an element in an array: nums[i]
Get array size: nums.length
In Java the public class name MUST match the file name.
Python:

Get array size: nums.length
In Java the public class name MUST match the file name.
Python:

The "if __name__ == '__main__'" section is used to run the .py program directly, otherwise it must be imported by another module to execute.
Note that in python we don't have ';' at the end of each sentence.
'self' corresponds to the newly created object. If X is a characteristics defined before 'def', then we can use self.X to refer to that characteristics.
range(a,b) in for loop means [a,b), which does not include b.
Instead of using {...}, in python we use ':' after for loop and if statement.
The array in python is represented as [a,b,...].
Use a = Solution() to create a new object of a class.
In python we don't have to always declare the type of data such as int. However in Java and C++ that is a MUST.
2. HashTable:
The brute-force way has a time complexity of O(n^2). We can use Hash Table to trade space for time.
C++:
(1) Customized Two-Way Hash Map (with a very simple hash function):
2 additional classes:
(1) HashEntry: create a map between pointers and their pointed keys/values
(2) HashMap: include initialization (constructor HashMap()), HashFunc(), HashInsert() and HashSearch()
Constructor: is not a function. It will automatically execute when initializing a new object of the class. The name must be the same as the class name:
class HashMap{
...
HashMap(){
//Initialization here
}
thus you cannot use XXX.HashMap because that is NOT a function. But we can use HashMap hm; or new HashMap() to create such object.
Note that here the '**table' is a pointer-to-pointer. The first level of pointers is to store the key-value mappings and point to them. The second level of pointers is to indexing the first-level pointers. Like:
table[hash]->key/value means the key/value pointed by pointer located at 'hash' location inside the first-level pointer array which is indexed by the 'table'.
Make the table private in order to protect it.
Hash Function is very important of hashing performance. Here I use a simple hash function (which is very inefficient as the result will show):
(key & (0x3FFFFFFF)) % input_size;
We need to use (0x3FFFFFFF) in order to avoid the hashing points to somewhere that cannot be indexed in the memory.
This is a two-way hashing:
(1). First loop to store the mappings;
(2). Second loop to search for the solution.
(2) Two-Way hashing using existing C++ hashtable library:

In C++ we have something called unordered_map to store the mappings. It has several methods:
m.insert(), m.find()
Note that m.find() returns an iterator (<key, value>) which is not an int. The iterator got->second returns the value of the search.
(3) One-Way Hashing C++:
One way means we store the mappings and search the solution at the same time.
The key idea is that we try to find the solution every time we insert a new element. If the solution is not found in this round, we insert the element into the vector and move to the next round.
(4) One-Way Hashing Java:
Everything looks quite same except that here we use HashMap instead of unordered_map (they are basically the same thing). The methods in Java are:
m.put(), m.get(), m.containsKey()
(5) One-Way hashing Python:
Hashtable in Python is implemented called dictionary. The methods are:
dict.get(), dict[j] = i (add), dict.has_key()
3. Common Map with Iterators (Similar as One-Way Hashing):
Instead of using hashmap, we can directly create a common map and store the mappings.
Here we create 3 iterators:
it: point to the current element
start: point to the beginning of the vector
it_rem: point to the solution
Note that '*it' means the value/element pointed by the it iterator.
vector has a push_back() method that appends an element to the end of the vector.
4. Sort the Array First (C++):
It turns out we can first sort the given array, and start searching from the two ends (beginning and last). This strategy will save some time during the search.
Two loops:
(1) The first loop is used to find the solution in after-sort array;
(2) The second loop is used to map the solution to the original array.
The sort function: sort(nums.begin(), nums.end())
5. Sort the Array First (C++ with Pointer Array):
We can use pointers as well. The idea is the same as 4.
for (auto &num:nums){
ptr.emplace_back(&num);
}
is the same as:
for (int i=0;i<nums.size();i++){
ptr.emplace_back(nums[i]);
}
In sort(ptr.begin(), ptr.end(), cpptr), cpptr is used to provide the rules of sorting.
6. Compress the Search Space (Java)
From the information of the array and the target value, we can first eliminate some impossible numbers.
The bound is calculated as follows:
low_bound = Math.max(nums_min, target-nums_max)
max_bound = Math.min(nums_max, target-nums_min)
Thus we create a new array whose length is max_bound-low_bound+1, and do the one-way store and search again.
7. Tricky Encoding (Java)
This is a uncommon solution which encodes both the index of the element and element itself into one value and stores them. The basic idea is to use a = x*n + b where b < n. Then x = a / n and b = a % n.
A small trick here is that since the element can be negative, thus one should consider this. In other words, a, x and b should be all positive or negative.
8. Performance
Brute-Force (C++) 146 ms
Brute-Force (Java) 37 ms
Brute-Force (Python) 5512 ms
Customized Hash (C++) 549 ms (Weird)
Two Way Hashing using Library (C++) 9 ms
One Way Hashing using Library (C++) 9 ms
One Way Hashing using Library (Java) 7 ms
One Way Hashing using Library (Python) 42 ms
Common Map with Iterators (C++) 16 ms
Sort the Array First (C++) 6 ms
Sort the Array First (C++ with Pointer Array) 9 ms
Compress the Search Space (Java) 4 ms
Tricky Encoding (Java) 6 ms









Comments
Post a Comment