Microsoft Interview Questions - 2
When I went to Microsoft for the second time, the interview was more pressurizing.
Interview 1 - The interviewer asked me a lot of questions about my project and the disucssions started with a lunch, which my interviewer hosted in an nearby restaurant. After more than 30 mins discussion about my favorite project, he gave a design question, which was not hard.
Q 1. Given a data store and multiple getters & setters, write a program to maintain consistency.
I gave out a simple queue concept, where getters and setter are allowed one by one to access the data store and in this case it would be like a semaphore. However, this would make the system inefficient by killing parallelism. So, I gave optimization, by adding 2 points:
1. If there are consecutive getters in the queue, allow them all in parallel
2. If there are consecutive setters, throw out all but the last, as they dont make any useful change and simulate the exceptions. I left the exception simulation for hardware and did these two things to have the data store with high parallelism and consistency.
The interviewer seemed to be pleased by the design.
The next interview was slightly more abstract. The interviewer grilled me on team working and bothered about the team-aspects of my favorite projet - a distributed file systems project. Then she gave a question:
Q 2. There are a number of tasks and they have their dependencies. Design data structures to represent them and write a program to schedule the tasks such that the dependencies are not violated.
I asked about other questions, like time deadlines, etc, but the tasks didnt have time deadlines and if a task had all its dependencies satisfied it can execute.
I designed a Directed Acyclic Graph (DAG) as I believed that it has to be a graph as tasks can have many to many relationship, it has to be directed as dependencies are always unidirectional and it has to be acylic so that it can be scheduled. To represent the graph, initially I took the adjacency matrix format, but soon realized that the graph will be very dynamic as new tasks keep coming and existing taks are scduled and removed. So, I designed based on adjacency lists and wrote a program that would run through the graph to find a node without dependency, schdule it and traverse the entire graph to find out its dependents and breaking the links. The interviewer was satisfied with teh design and after a long discussion (I dont exactly remember the details) the interview ended.
Q 3. The interview had a simple question, on which I stumbled initially. I was given a long string of numerals and have to write a funtion that checks if it contained all numbers from 000 to 999 (numbers are represented in 3 digits).
I first created a 1000 element hash array one for number from 0 to 999 and set it to false initially. Then I read through the array from 0 to len - 2 & in each iteration I take out i, i+1, i+2 characters, convert to number N and setting the element in the hash array[N] to true. At the end if all the elements in hash arry are true, then all numbers are there in the given string and the function returns true. Then, I did the folloowing optimizations:
1. To check all the elements in the hash array to be true, I have to go through the entire array in O(N) operations. Instead, I could just keep track of number of true elements in the array with a simple counter. Thus, everytime I set an element in the array to true from false, I increment the counter. If the counter reaches 1000, then it means that the entire array has true values and I can stop the fuction there itself, instead of goin through the entire length of the string. I can also eliminate the O(N) checking at the end.
2. The minimum length of the string that could have all 1000 numbers is 1002. I took a minute or two to realize this fact. I saw that the loop has to run atleast 1000 times to set 1000 true values, and to run 1000 times, len has to be atleast 1002. So, if the string length is less than 1002, right at the start we can return false.
3. To go a step further, everytime I find a duplicate element (the array[N] has true already) I increement a duplicate counter. At anytime, the length of the string > 1002 + d. If it is less than 1002 + d, then it means it wont contain all the numbers. This can effectively, reduce teh possibility that the entire string has to be searched.
The interviewer was impressed with the design and so I consider that it was positive.
Q 4. It was the famous tic-tac toe problem. Given a large N * N tic-tac-toe board (instead of reguard 3*3 tic-tac toe), the fucntion has to determine whether any player is winning in the current round, given the board state. Create appropriate data structures to represent the board.
I started out from the simplest of solutions. Take the board as a N*N array, go through all the rows and columns and diagonals to see whether they have same player's coins (for efficiency we can represent the two players as true and false and ignoring the free spaces, and if we XOR all the rows and then columns and diagonal and if get Negative (and full row or column) , then one of them wins). It was simple but was a O(2*N*N+2*N) = O(N^2). I knew i could do better.
I then had sentinels for each row, column and diagnals (2N+2 sentinels) for each player. Every time, I place a coin in the board in (x,y), I increment the row_sentinel[x], column sentinal[y] & diagonal[1] if x == y or diagaonal[2] if x == N-y-1; If any of these elements equals N, then the corresponding player wins. It is cleaner and effecient and any point I need to check just the 2N+2 sentinals and it is a O(N) operation. I still could do better.
I created a super-sentinal for each player that keeps track of max value among sentinels. Thus, everytime I update the sentinals, I would check to see if it greater than the value of super sentianl and if yes, I would update teh super-sentinal. Thus, at anytime to see if a player is winning, I need to check (super-sentinal == N). It is extremely simple and highly efficient and is a O(1) operation.
The interviewer seemed to be very impressed. I clinched the job.
Q 5. Now the interview with the Product Manager (a person responsible for many core aspects of MSDN). Last interview with a head of the group, is always a positive sign and so I was upbeat. But, the question was extremely tricky and one of the stunning problems with a very simple solution.
There is an array of odd and even numbers. Now, sort them in such a way that the top portion of the array contains odd numbers, bottom portion contains even numbers. The odd numbers are to be sorted in descending order and the even numbers in ascending order. You are not allowed to use any extra array and it has to use a conventional sorting mechanism and should not do any pre or post processing.
I'll give you a clue (which I had to secure after a painful 20 mins).. You can use operator overloading and write the solution in 5 lines.. If you solve now, it would still be great, because I solved it after 20 mins and with a lot of hints.
Interview 1 - The interviewer asked me a lot of questions about my project and the disucssions started with a lunch, which my interviewer hosted in an nearby restaurant. After more than 30 mins discussion about my favorite project, he gave a design question, which was not hard.
Q 1. Given a data store and multiple getters & setters, write a program to maintain consistency.
I gave out a simple queue concept, where getters and setter are allowed one by one to access the data store and in this case it would be like a semaphore. However, this would make the system inefficient by killing parallelism. So, I gave optimization, by adding 2 points:
1. If there are consecutive getters in the queue, allow them all in parallel
2. If there are consecutive setters, throw out all but the last, as they dont make any useful change and simulate the exceptions. I left the exception simulation for hardware and did these two things to have the data store with high parallelism and consistency.
The interviewer seemed to be pleased by the design.
The next interview was slightly more abstract. The interviewer grilled me on team working and bothered about the team-aspects of my favorite projet - a distributed file systems project. Then she gave a question:
Q 2. There are a number of tasks and they have their dependencies. Design data structures to represent them and write a program to schedule the tasks such that the dependencies are not violated.
I asked about other questions, like time deadlines, etc, but the tasks didnt have time deadlines and if a task had all its dependencies satisfied it can execute.
I designed a Directed Acyclic Graph (DAG) as I believed that it has to be a graph as tasks can have many to many relationship, it has to be directed as dependencies are always unidirectional and it has to be acylic so that it can be scheduled. To represent the graph, initially I took the adjacency matrix format, but soon realized that the graph will be very dynamic as new tasks keep coming and existing taks are scduled and removed. So, I designed based on adjacency lists and wrote a program that would run through the graph to find a node without dependency, schdule it and traverse the entire graph to find out its dependents and breaking the links. The interviewer was satisfied with teh design and after a long discussion (I dont exactly remember the details) the interview ended.
Q 3. The interview had a simple question, on which I stumbled initially. I was given a long string of numerals and have to write a funtion that checks if it contained all numbers from 000 to 999 (numbers are represented in 3 digits).
I first created a 1000 element hash array one for number from 0 to 999 and set it to false initially. Then I read through the array from 0 to len - 2 & in each iteration I take out i, i+1, i+2 characters, convert to number N and setting the element in the hash array[N] to true. At the end if all the elements in hash arry are true, then all numbers are there in the given string and the function returns true. Then, I did the folloowing optimizations:
1. To check all the elements in the hash array to be true, I have to go through the entire array in O(N) operations. Instead, I could just keep track of number of true elements in the array with a simple counter. Thus, everytime I set an element in the array to true from false, I increment the counter. If the counter reaches 1000, then it means that the entire array has true values and I can stop the fuction there itself, instead of goin through the entire length of the string. I can also eliminate the O(N) checking at the end.
2. The minimum length of the string that could have all 1000 numbers is 1002. I took a minute or two to realize this fact. I saw that the loop has to run atleast 1000 times to set 1000 true values, and to run 1000 times, len has to be atleast 1002. So, if the string length is less than 1002, right at the start we can return false.
3. To go a step further, everytime I find a duplicate element (the array[N] has true already) I increement a duplicate counter. At anytime, the length of the string > 1002 + d. If it is less than 1002 + d, then it means it wont contain all the numbers. This can effectively, reduce teh possibility that the entire string has to be searched.
The interviewer was impressed with the design and so I consider that it was positive.
Q 4. It was the famous tic-tac toe problem. Given a large N * N tic-tac-toe board (instead of reguard 3*3 tic-tac toe), the fucntion has to determine whether any player is winning in the current round, given the board state. Create appropriate data structures to represent the board.
I started out from the simplest of solutions. Take the board as a N*N array, go through all the rows and columns and diagonals to see whether they have same player's coins (for efficiency we can represent the two players as true and false and ignoring the free spaces, and if we XOR all the rows and then columns and diagonal and if get Negative (and full row or column) , then one of them wins). It was simple but was a O(2*N*N+2*N) = O(N^2). I knew i could do better.
I then had sentinels for each row, column and diagnals (2N+2 sentinels) for each player. Every time, I place a coin in the board in (x,y), I increment the row_sentinel[x], column sentinal[y] & diagonal[1] if x == y or diagaonal[2] if x == N-y-1; If any of these elements equals N, then the corresponding player wins. It is cleaner and effecient and any point I need to check just the 2N+2 sentinals and it is a O(N) operation. I still could do better.
I created a super-sentinal for each player that keeps track of max value among sentinels. Thus, everytime I update the sentinals, I would check to see if it greater than the value of super sentianl and if yes, I would update teh super-sentinal. Thus, at anytime to see if a player is winning, I need to check (super-sentinal == N). It is extremely simple and highly efficient and is a O(1) operation.
The interviewer seemed to be very impressed. I clinched the job.
Q 5. Now the interview with the Product Manager (a person responsible for many core aspects of MSDN). Last interview with a head of the group, is always a positive sign and so I was upbeat. But, the question was extremely tricky and one of the stunning problems with a very simple solution.
There is an array of odd and even numbers. Now, sort them in such a way that the top portion of the array contains odd numbers, bottom portion contains even numbers. The odd numbers are to be sorted in descending order and the even numbers in ascending order. You are not allowed to use any extra array and it has to use a conventional sorting mechanism and should not do any pre or post processing.
I'll give you a clue (which I had to secure after a painful 20 mins).. You can use operator overloading and write the solution in 5 lines.. If you solve now, it would still be great, because I solved it after 20 mins and with a lot of hints.