DataStax (Cassandra)
1) Describe some challenges of provisioning software across a datacenter consisting of thousands of nodes
2) How would you go about programming the Kevin Bacon problem?
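One common way to approach the Kevin Bacon problem in question 2 is a breadth-first search over a co-star graph, where two actors are linked if they appeared in the same film. The sketch below assumes the graph is already available as a dictionary of adjacency lists. The function name, the graph layout, and the placeholder actor names are all hypothetical.

```python
from collections import deque

def bacon_number(costars, start, target="Kevin Bacon"):
    """Return the fewest co-star links from start to target, or None if unreachable."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        actor, dist = queue.popleft()
        for other in costars.get(actor, ()):
            if other == target:
                return dist + 1
            if other not in seen:
                seen.add(other)
                queue.append((other, dist + 1))
    return None

# Hypothetical co-star graph; "A" through "D" are placeholder actors.
costars = {
    "Kevin Bacon": ["A"],
    "A": ["Kevin Bacon", "B"],
    "B": ["A", "C"],
    "C": ["B"],
    "D": [],
}
```

BFS visits actors in order of distance, so the first time the target is reached the distance is minimal.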
Couchbase
1) In databases, what is the difference between a delete statement and a truncate statement?
MongoDB
1) How would you design an online chat system with separate rooms?
2) Given a sorted array of integers, how would you square all the elements and keep the result sorted?
3) Given two strings, determine whether they are anagrams
4) Find the largest prime divisor of a number
5) Given a string of parentheses, brackets, and curly braces, write a function that returns whether the string is well balanced, in that every left delimiter is closed by the correct right delimiter
6) Print the prime factors of the input number
7) How does the big-O running time of various search algorithms change when they run in parallel on a multi-CPU machine?
8) How many gas stations are there in the United States?
9) How would you write DB write / access functions to make sure data is not modified by other threads?
10) Parse a mathematical expression given as chars in an array: ['1', '+', '2', '/', '4']
11) Write an algorithm to find a loop in a graph. The graph is unidirectional with any number of connections. The graph is not necessarily connected
12) Write code to convert an integer to a string.
13) Fizzbuzz
14) Find the height of a binary tree
15) Parsing parentheses. Given a string of open and close parentheses, make sure the string is valid, such that each open parenthesis has a matching close parenthesis in the correct place
16) We have a person who logs into a website using their Facebook credentials. Anything they post to that website can be seen by their friends and those friends' friends only.
17) Name several different sorting algorithms and rank them by their computational complexity.
18) Write an algorithm that tells whether exactly two numbers in a 1x3 array are the same.
19) Given a word, how would you find all of its anagrams in a dictionary?
20) Reverse a linked list
21) How much water is on the planet (Earth)?
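For question 2 in the list above (squaring a sorted array), one possible O(n) approach uses two pointers. The largest square always comes from one of the two ends of the input, so the output can be filled from the back. This is only a sketch, and the function name is hypothetical.

```python
def sorted_squares(nums):
    """Square every element of a sorted list and return the squares in sorted order."""
    result = [0] * len(nums)
    lo, hi = 0, len(nums) - 1
    # Fill the result from the largest position down to the smallest.
    for pos in range(len(nums) - 1, -1, -1):
        if abs(nums[lo]) > abs(nums[hi]):
            result[pos] = nums[lo] ** 2
            lo += 1
        else:
            result[pos] = nums[hi] ** 2
            hi -= 1
    return result
```

Squaring first and then sorting also works, but costs O(n log n) rather than O(n).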
Cloudera (Hadoop)
1) Design Gmail from the ground up
2) Write a function that performs exponentiation, i.e. power(a,b), where a is raised to the power b. E.g., power(3,2) gives 9 and power(5,2) gives 25
3) Find the common elements in two arrays
4) Describe a grid search algorithm and analyze its time complexity, then discuss alternative implementations that would minimize data storage or the order of the algorithm.
5) Given an m*n matrix with all its elements equal to 1 and a list of (x,y) points, set every element of the xth row and yth column to zero. Optimize it.
6) Array pair sum: provide an O(N) solution
7) Recursive Permutation
8) Implement a few methods of a linked list (such as append, getAtIndex, removeAtIndex, size)
9) Find the total number of 1s in a byte array
10) Generate a random 4 letter word from /usr/share/dict/words
11) Given 1 TB of data on a laptop, sort the data
12) Distributed Merge Sort algorithm
13) A file contains a billion integers, try to find any one integer that is not in the file.
14) How would you find two numbers that add to a sum in an array?
15) How would you find three numbers that add to a sum in an array?
16) How would you implement a hash table on your own? Write the code for your own hash table.
17) If you wanted to make a highly concurrent cache with a least recently used replacement policy, what data structures would you use? How would this scale per number of threads?
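Questions 6 and 14 in the list above both ask for a pair of array elements that adds up to a given sum. A minimal sketch of the usual O(N) idea: keep a set of the values seen so far and check whether the complement of the current value is already in it. The function name and return convention are assumptions.

```python
def find_pair_with_sum(nums, target):
    """Return one pair (a, b) from nums with a + b == target, or None if no pair exists."""
    seen = set()
    for x in nums:
        complement = target - x
        if complement in seen:
            return (complement, x)
        seen.add(x)
    return None
```

The trade-off is O(N) extra memory for the set. Sorting the array and walking two pointers inward avoids that at O(N log N) time.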
Hortonworks (Hadoop)
1) Word sorting in a file
2) Binary search
3) Implement file system using class
4) Implement hashmap
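For question 2 in the list above, a minimal iterative binary search over a sorted list might look like the sketch below. Returning -1 when the target is absent is an assumed convention, not something the question specifies.

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if it is absent."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1   # target can only be in the right half
        else:
            hi = mid - 1   # target can only be in the left half
    return -1
```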
MapR (Hadoop)
1) Implement a hash table
2) Determine if a tree is a valid binary tree
3) Write a recursive algorithm (return true if there is a path from the root to a leaf with total sum == sum)
4) LRU Cache
5) There are numbers from 1..n in a list of numbers with size (n+k) with k duplicates. Print the k duplicates.
6) Read from a huge file (say 1 TB), write it into a huge file, and scale the approach
7) BFS, insert line breaks after every level (not necessarily balanced)
8) Clock angle question
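Question 3 in the list above (a root-to-leaf path with a given sum) lends itself to a short recursive sketch. The Node class and the function name are hypothetical, since the question does not specify a tree representation.

```python
class Node:
    """Hypothetical binary tree node used only for this illustration."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def has_path_sum(node, total):
    """Return True if some root-to-leaf path's values add up to total."""
    if node is None:
        return False
    remaining = total - node.value
    # A leaf ends the path: it matches only if nothing remains.
    if node.left is None and node.right is None:
        return remaining == 0
    return has_path_sum(node.left, remaining) or has_path_sum(node.right, remaining)

# Example tree: 5 has children 4 and 8, and 4 has a single child 11.
root = Node(5, Node(4, Node(11)), Node(8))
```

Here has_path_sum(root, 20) and has_path_sum(root, 13) are both true, while has_path_sum(root, 9) is false because the node 4 is not a leaf.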
Basho (Riak)
AeroSpike