This is a catch-up post for anyone who has been kinda, sorta, not really following along so far. Like a sleeper TV show in a bad time slot, I have to believe that lots of perfectly good nerds would be interested in Ancho if only they knew about it. And if I had published some code.
If you've ever backed off a risky decision because you just couldn't get a handle on what the odds were, Ancho is designed to be your thing.
If you find any of this interesting, please let me know. Follow @fortpedro on Twitter, or like the Facebook page. Leave a comment on a post. If you think I'm nuts to put time into something like this, let me know that too.
What is Ancho?
Ancho is a framework for running Monte Carlo and other dynamic, stochastic simulations. It is written primarily in Python and uses Cassandra as its data store. It is designed to let regular people use this extremely useful method, which takes what you do know about something and uses it to find out more about what you don't.
"Monte Carlo" is the name for a group of methods that allow you to simulate systems where you know a lot about the factors involved and the range of likely values for each, but you can't pin down exact values for them. Its first practical application was during the Manhattan Project. More recently, Monte Carlo simulations were used by Nate Silver of the New York Times to accurately predict the results of the 2012 U.S. presidential election. In November I wrote a post that featured a simple example.
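To make the idea concrete, here is a minimal sketch in plain Python using only the standard library. It is not Ancho code. The scenario (a small product launch), the function names, and every number in it are made up for illustration. Each trial draws the uncertain factors from their assumed ranges, and the pile of results tells you roughly how likely a loss is.

```python
import random

def simulate_profit(rng):
    """One trial: draw each uncertain factor from its assumed range."""
    # All ranges below are illustrative assumptions, not real data.
    units_sold = rng.triangular(500, 2000, 1000)  # low, high, most likely
    price = rng.uniform(8.0, 12.0)                # selling price per unit
    unit_cost = rng.uniform(5.0, 9.0)             # cost to make one unit
    fixed_costs = 2000.0
    return units_sold * (price - unit_cost) - fixed_costs

def run_trials(n_trials, seed=42):
    """Run many independent trials with a seeded generator for repeatability."""
    rng = random.Random(seed)
    return [simulate_profit(rng) for _ in range(n_trials)]

def summarize(profits):
    """Reduce the raw trial results to a few numbers a person can act on."""
    ordered = sorted(profits)
    n = len(ordered)
    return {
        "chance_of_loss": sum(1 for p in ordered if p < 0) / n,
        "p5": ordered[int(0.05 * n)],    # roughly the 5th percentile
        "median": ordered[n // 2],
        "p95": ordered[int(0.95 * n)],   # roughly the 95th percentile
    }

if __name__ == "__main__":
    for name, value in summarize(run_trials(10_000)).items():
        print(f"{name}: {value:,.2f}")
```

You never had to know the exact price or sales figure. You only had to say what range each one falls in, and the simulation turns those ranges into odds.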
If Monte Carlo Is So Great, Why Haven't I Heard Of It?
I asked the same question. When I first learned about the method, I thought of several things it would be useful for: evaluating business plans, personal financial planning, and, closer to my career, risk management.
I looked at other tools you might use for this, and decided that they were either not up to handling real tasks, or so difficult to learn that they would only be used by people with graduate-level training in math and statistics. I don't have that kind of training, but I do have interest and chutzpah, and that counts for a lot.
I thought about what features I would want in a system like this. I think it should be a framework that handles all the computer science and tedium, and allows you to focus on modeling. I think it should be written in a general-purpose programming language that is widely known and easy to learn. And I think it should be designed from the ground up to run distributed in the cloud, so you can run non-trivial models in minutes rather than hours.
Sounds Good. Where's It At?
Ancho is mostly vaporware at the moment.
I had hoped to get some input on the design process, and I have gotten some from friends offline, but not here. Any input along those lines would be appreciated.
I'll probably have a code drop with tests by around the middle of February. The first code won't have any of the distributed functions yet, but it should be a decent first whack at the API you'd write models against. The point of that API, like any framework, is that the underlying implementation can change without breaking the models you've already written. Once the basics are in place, I can change the implementation so it spins up a temporary cluster of machines in the background and runs your model much, much faster than it does on a single local machine.
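Here is a rough sketch of what "the implementation can change underneath you" might look like. None of this is Ancho's actual API, which doesn't exist yet. The names `coin_model`, `LocalRunner`, and `PooledRunner` are hypothetical, and the thread pool is only a stand-in for a real cluster (threads won't actually speed up CPU-bound Python). The point is that the model function is written once and doesn't care which runner executes it.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def coin_model(rng):
    """A toy model: the number of heads in ten fair coin flips."""
    return sum(rng.random() < 0.5 for _ in range(10))

def _one_trial(model, seed, i):
    # Give every trial its own seeded generator, so the result of trial i
    # is the same no matter which backend runs it.
    return model(random.Random(seed * 1_000_003 + i))

class LocalRunner:
    """Hypothetical backend: runs every trial in a simple loop."""
    def run(self, model, n_trials, seed=0):
        return [_one_trial(model, seed, i) for i in range(n_trials)]

class PooledRunner:
    """Hypothetical backend: farms trials out to a pool of workers."""
    def __init__(self, workers=4):
        self.workers = workers

    def run(self, model, n_trials, seed=0):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            # pool.map returns results in the same order as its inputs.
            return list(pool.map(lambda i: _one_trial(model, seed, i),
                                 range(n_trials)))

if __name__ == "__main__":
    local = LocalRunner().run(coin_model, 1000, seed=3)
    pooled = PooledRunner().run(coin_model, 1000, seed=3)
    print("same results from both backends:", local == pooled)
```

Swapping `LocalRunner` for `PooledRunner` changes how the trials get run, not what the model says. A cluster-backed runner would slot in the same way.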