Problem solving in the real world

Are complex algorithms relevant in the real world?

I have been asked this in a variety of tones: some curious, some anxious, and some condescending. Let’s see…

Training Project

Morgan Stanley, November 2014.

My friend was drawing on the board while I looked on. We were in different teams, each assigned a project. His program currently took 5 hours to execute. The requirement stated one hour. As the ‘algorithm expert’, I was called for help.

The program was to find the friends of all users in a social network. After explaining the use-case, my friend showed me his approach. I was very determined to find an optimization, hence I did. Immediately.

“An adjacency matrix?! No, use an adjacency list. The execution time will reduce in your case to O(N). Or you could use an adjacency tree! That would – blah blah….”

I made some calculations, and guaranteed that the program could not take more than a second to run. My friend hurried to his computer, and I got back to my project. He came back the next day. The program had taken 5 hours again.

There must be a bug in his code. Looking into it, I found the problem. And it wasn’t any bug.

“You are making a DB call for each user?”
“Yeah.”
“Where is your DB?”
“California, I think.”
“….”

Each DB (Database) call was traveling halfway round the world to query a database before heading back. Every call took 5 seconds, and there were 1700 users. To add to it all, the DB gateway got suspicious at the large number of requests from a single client. It was throttling them.

“A stored procedure will do the job. You can code the algorithm into it.” 

A stored procedure, like the name suggests, is a piece of code stored in a database. It executes within the database and sends results in a single call. This would avoid all the back and forth.

Within a few hours, his team was celebrating. Their program’s execution time had dropped to 12 minutes.

Meanwhile…

My own project was client facing. The little backend work was coded in a script, written by the project manager.

From the start, I found the code disproportionately long. 300 lines to make a simple DB call? Here is the procedure:

  1. The database table was being queried entirely.
  2. Stored in a map.
  3. Filtered on a date range.
  4. Sorted according to the dates.

I changed it to:

  1. The database table was queried on a date range, with the results ordered and returned in a list. Only the relevant columns would be sent back.

It looks like I have just squashed the previous list into a single line. That makes a big difference though. Now, the sorting and filtering is at the DB layer and the network load is significantly smaller. The program speed went up by 400%.

I soon realized: it didn’t matter. Our data set was small, requiring no optimization. The previous script ran almost as fast, and I never mentioned the changes to the project manager.

Final part

I am now studying system design in detail. This involves understanding how organizations write software to handle millions of users. Specifically, how do large organizations stay available, reliable and fast. This despite the enormous load on their machines.

When going through eventual consistency, consistent hashing, etc… I started appreciating algorithms. Algorithms enabled these systems. They were saving millions of dollars for firms worldwide. And algorithms allowed me to understand and appreciate the complexity of these systems.

“You rarely code complex algorithms in the real world” is a gross simplification. Most problems in the world are easy to solve. A more accurate statement would be: “You will rarely face a complex-solvable-monetizable problem in the real world”. And that is, of course, true. But the number of adjectives in that statement make it trivial.

Scene 1 and 2 encourage me to:
1. Not be quick to make conclusions.
2. Remember optimizations need not be algorithmic.
3. Keep the bigger picture in mind.

But scene 3 reminds me that, all around the world, we are solving difficult problems through study and ingenuity.

Thanks for reading!

I have changed some of the project scenarios in this post, to avoid revealing private data.

 

Advertisements

‘Follow your passion’. LOL.

About a decade ago, I was given three choices: Science, Commerce and Arts. A passive kid, I picked science. After all, it’s what everyone does. 

Two years later, I found myself applying for colleges. Irritated that I had to think and choose again, I looked at the most popular options.

  1. Computer Science
  2. Electronics and Telecommunication
  3. Mechanical

That must mean something, I thought to myself, before copying the list into the application. 

I knew nothing about computers, save for Age of Empires and FIFA. With little exposure to the outside world, I had a romanticized view of ‘passion’ and ‘brilliance’. If this failed, I could always excuse myself by saying: I was never interested in all thisA few months later, college. 

Kalpana Deorekar was our computer teacher, and an angel. She did a fabulous job explaining a computer’s model to us. Her encouraging attitude towards problem solving and sincere involvement in correcting answers deeply impacted me. Now the computer seemed interesting, at least as much as playing lego.

On the other hand we had our disinterested electronics teacher. She never raised or lowered her voice when speaking. I always felt sleep-starved when hearing her voice.

Still, I should have worked at electronics. I shouldn’t have failed on the very first semester. But there I was, in front of my father, giving him my report card. I expected a scolding, a few insults from him and a few comebacks from me.

He smiled instead.

“Just one?” he asked. I nodded cautiously, incase he was being sarcastic. But he was glad that I hadn’t flunked more. For the first time, my father congratulated me on my results. I felt strange, and soon, empowered.

I started taking responsibility, and with it: control. Having cleared electronics on the second attempt, I had three years to turn myself into an engineer. My summer vacations were spent learning about Java and getting programs to work. It was hard, and the console doesn’t break bad news nicely.

But within three months, I was at par with students who had a CS background. My teachers found my enthusiasm refreshing. In my fourth semester, one of my optional projects had some AI, and it managed to beat my computer graphics professor at tic-tac-toe.

My love for chess took fruit here. Being part of the CRCE chess team, I went to MIT Pune and got to witness an amazing show of sports talent from across India. I returned to college to win the bronze medal, having lost to our chess captain in the semi-finals. In hindsight, the loss was a good thing; it reminded me that competition was to be dealt with a cool head and that I should never underestimate my opponent’s competence. I would need a reminder soon…

In my third year, there was a project contest by IBM. It was here that I learnt the value of coding in an organized manner. My friend, Kushan Sen, and I worked very hard to complete the project on time. We would stay back after college everyday, for 2 to 4 hours, working on optional features.

Placement season began and due to some reason, the minimum cut-off was reduced to 60%. That meant I could apply to companies. My father asked me to just enjoy the experience. JP Morgan and Morgan Stanley soon sent me their offers.

My last intra-college chess tournament was interesting. Each player gets 20 minutes in their time pool. Everyone believed I would win, and I let it get to my head. Before a match, I would strut around college. Storming into the room, I would annihilate my opponent within 2-3 minutes.  They would barely hit the clock before I slammed it again. My fastest win took 13 seconds.

One of my opponents was a guy who I had beaten the previous year. Of course he must be scared. Look at him, his hands are shaking.

And…disaster. I lost my rook. My crush was watching this. How could I let this happen? I managed to then win it back, with an extra pawn to boot. Yes! I am winning! He then played a pawn ahead, threatening my queen. I was mentally blind. I was obsessed with the result. I took it.

The pawn was guarded by his rook.

I just lost a queen. For a pawn. Those of you who play chess know that this is the time to resign. I looked away, dejected. I wanted to overturn the board. You bastard, I just gave you a free pass. Furious, I looked up.

My opponent was shivering.

I turned around to see his clock reading 41/2 minutes. Mine still had 15 minutes left.

My hand flew, playing my move and smashing the clock. He played his and returned the smash. But speed was my territory. We threw moves at each other till he blundered, and I won a bishop in compensation.

I could feel his fear as he slowed down. This gave me more time to think and less for him in turn. 1 minute left, his legs were shaking, and I was still holding the fort.

30 seconds…everyone in the audience was leaning in to look into the board and our faces. I sensed it, but didn’t register. Focus on the game, Sen.

The last 10 seconds were filled with the clock bouncing on the table as we slammed our moves into it. He was on the verge of winning, when the clock’s flag fell.

There were cheers as we both covered our faces. I did so in relief. That year, I would go on to win Gold.

There are many lessons that I took from those games and my graduate course. True intellect is bereft of ego and is ever-willing to share. I learnt that hard work is more important than romanticized views of ‘passion’. ‘Passion’ is often an escape route, and what we achieve is determined not by an exam but by our continuous choices.

And I also learnt, that teachers inspire us. In a profound way.

Thanks for reading!

Challenge Problems

In this blogpost, we talk about what Challenge problems are and how to solve them. I find them the most attractive questions in a long contest. However, students new to competitive programming often avoid them because they seem weird at first. Lets try and change that.

What are Challenge problems?

A challenge problem in a programming contest uses NP problems to test a candidate. With no perfect answer, the candidate is tested on how good he/she is at finding approximate solutions. This is why the evaluation is score based.

The top scorer is given a score of 100 while the others are given scores as per their closeness to the top score. Usually, problem statements are simple and allow candidates to find interesting nuances while solving. Because of the generality of the problem, a candidate can choose numerous approaches to optimise his/her score. These include Number Theory, Graphs, Data Science, etc…

Some problems require prior knowledge of algorithms, but the best challenge problems are those which have simple explanations and lots of scope for improvement. Such problems allow beginners to try and test their approximate solutions while making sure that the experienced players are tested too.

How do we solve this

MIT professor Patrick Winston makes a brilliant note on problem solving at 42:30 in this video. Here is a summary:

  1. Definition
    Does the problem reduce to Knapsack? Subset sum? Something else?
  2. Representation
    Should the main data structure be an array, a list or a tree?
  3. Approach
    Dynamic Programming? Graph Algorithm? etc…
  4. Algorithm
    DFS / Sorting / Edit Distance –> Simulated Annealing / Genetic Algorithm
  5. Experiment

A key point is that drawing and discussing ideas are far more important than jumping into code. The other super important thing is you can’t follow this process in just one shot. It has to be repeated again and again, with each iteration improving your solution.

First, you need a base solution. Choose a simple, clear cut algorithm(like the first three mentioned below point 4). This is the foundation, setting a minimum guarantee to your score. Invest your time in this, because a weak base solution will directly result in a poor score.

You now need to tweak the best solution you have, updating if you get something better than the current best. Hill climbing, Simulated Annealing or any Evolutionary Algorithm will help you here. Most of the time after this will be spent improving the parameters of the algorithm and improving the time complexity.

Time required to solve them

optimise-challenge-problem

They can’t be solved in half a day, to be practical. Unless you are looking for an okay score and minimum effort, be prepared to invest significantly.

Also beware of the law of diminishing returns. In general, the more time you invest in these problems, the smaller improvements you will see. The first few hours will give you interesting solutions => perhaps a score of 75 – 85, the next few days might take you to 95-100. And staying there takes tons of effort.

The primary reason why these problems are difficult is because we have to customise and optimise our solutions every time. You can create a general framework for Simulated Annealing to help in the coding phase. However, the ‘representation’ part will change every time.

Upto what point do I keep optimising

Until you feel that there is very little progress being made. To be practical, the other problems will give you many more points for the same amount of effort. When you hit problems beyond your league, come back to improving on the challenge problem solution. Remember that the people at the top have access to the same resources as you do. There is no reason why they should get a better score than you.

When should I start with this problem

This problem will stay at the back of your mind throughout the contest. So it is better to start a little late. More often than not, the problems within our league are solved within the first 80% time of the contest. The challenge problem can be started then.

Final tips and tricks:

batman-robin-challenge-problem

  1. Do not jump into meta-heuristics (Genetic Algorithms, Simulated Annealing) without creating a strong base solution.
  2. When optimising, try to keep the code clean for later.
  3. Your nested loops need the most attention, because that where you get maximum gains.
  4. Greedy is good. Keep your heuristics simple and easy to change.
  5. If some idea feels too complicated, it is. Save it for the next contest, after reading a little more and being confident about it.

I speak about other topics from competitive programming on my Youtube Channel. You can check them out here.