pip vs easy_install

The answer

There is a huge amount of confusion surrounding pip, easy_install, distutils, setuptools, and the whole range of other tools out there for Python. If you don’t care about the history, debate, and banter, I’ll list the answer first and then go into details:

The “official” recommendation for Python is currently:

  • Use pip with virtualenv to install packages
  • Use setuptools to define packages and distribute them to the PyPI index

Reference: https://python-packaging-user-guide.readthedocs.org/en/latest/current.html

(Updated: 2014-04-09)
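To make that concrete, here is a minimal sketch of the recommended workflow (the package names are just placeholders):

# Install packages into an isolated environment with pip + virtualenv
virtualenv venv
source venv/bin/activate
pip install requests              # or: pip install -r requirements.txt

# Define your own package with setuptools (in setup.py), then build and publish it
python setup.py sdist             # build a source distribution with setuptools
python setup.py sdist upload      # push it to the PyPI index (assumes a configured ~/.pypirc)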

How do I use them together?

Here is a recent (2013) well-written article that goes into the finer points of setuptools and pip requirements (abstract and concrete requirements). https://caremad.io/blog/setup-vs-requirement/
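The short version, sketched below with made-up version numbers: setup.py’s install_requires declares your abstract dependencies, while requirements.txt pins the concrete versions you actually deploy.

# setup.py (abstract):         install_requires=["requests"]
# requirements.txt (concrete): requests==2.2.1
pip install -e .                  # resolves whatever setup.py's install_requires allows
pip install -r requirements.txt   # installs the exact pinned versions for a repeatable deploy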

But setuptools is bad

For those of you still reading, you might wonder how the hell setuptools is the recommended distribution tool when it is not backwards compatible with the standard library distutils and in fact promotes using non-compatible APIs. Well, it’s pretty simple. The standard library distutils didn’t provide a mechanism for dealing with package dependencies (a fairly critical flaw), no one fixed it in the stdlib, and it looks like the Python community just gave up on staying backwards compatible with “pure” Python. After all, most of us using Python are doing so for profit and couldn’t care less about the nerd battle going on between the 5 different ways of distributing Python software.

Here is a nice rant from 2008 by a purist (with good points) who ultimately lost: http://www.b-list.org/weblog/2008/dec/14/packaging/

But setuptools was meant to be used with easy_install

Setuptools ships with easy_install as its package installer, so how can the recommendation be to use pip? Well, pip uses setuptools under the hood (except when building wheels, a topic for later) EVEN if your package uses distutils. I suspect the reason is that setuptools works with distutils but not the other way around. Pip also provides additional dependency support (reading from requirements.txt files).

Here is an update by that same guy a day later in 2008: http://www.b-list.org/weblog/2008/dec/15/pip/

So what are we left with for people who want to build production applications that make money?

Use:

  • pip with virtualenv
  • setuptools

Don’t use:

  • distutils
  • distribute (merged into setuptools)
  • bento (it looks like it stalled)
  • or really anything else

Ansible: Gotchas deploying to Ubuntu

Ansible (http://www.ansible.com/home) is a pretty cool new tool for deploying systems. While using it I pulled out the gotchas that took up some of my time. Many are just Unix admin details but some are Ansible specific.

Gotcha #1: Shell Types and environment setup

Depending on what kind of shell you are using, you might be surprised to learn that some of the files you expect to be sourced aren’t and your environment is messed up. Ansible has taken the stance that you should declare your environment requirements in the playbook rather than in sourced shell files. However, there are cases where that just isn’t possible, and then it’s very important to know what gets sourced when and how to trigger different shell types.

In bash you can:

  • enable a login shell with “-l” (ell)
  • enable an interactive shell with “-i”
  • with sudo, you can enable an interactive login shell for a user with “sudo -iu username”

What scripts get read with what shell type can be found here: https://github.com/sstephenson/rbenv/wiki/Unix-shell-initialization
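For example, here are a few hypothetical invocations that trigger each shell type (and therefore source different startup files):

bash -l -c 'echo $PATH'      # login shell: sources /etc/profile and ~/.bash_profile or ~/.profile
bash -i -c 'echo $PATH'      # interactive shell: sources ~/.bashrc
sudo -iu deploy env          # interactive login shell for the (hypothetical) user "deploy", then prints its environment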

Gotcha #2: Command and shell are different

Ansible offers up two modules for executing arbitrary commands. The “command” module is well documented and states that it can’t handle piping, shell operators or much else. It’s pretty much only good for a single, non-complex command. The “shell” module defaults to “sh”, which is a POSIX standard shell. However, on different *nixes you may have different shells symlinked to that executable:

On Ubuntu > 6 you have “dash”, which is not “bash”. See https://wiki.ubuntu.com/DashAsBinSh

More on this here: http://stackoverflow.com/questions/5725296/difference-between-sh-and-bash
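A quick ad-hoc sketch (the host group and log path are hypothetical) showing where command stops and shell starts:

# command executes without a shell, so the pipe is passed to cat as literal arguments and this fails
ansible webservers -m command -a "cat /var/log/app.log | grep ERROR"
# shell runs the line through /bin/sh (dash on Ubuntu), so pipes and operators work
ansible webservers -m shell -a "cat /var/log/app.log | grep ERROR"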

Gotcha #3: RVM

RVM is a pretty nice Ruby version manager but it has some subtle issues related to the first two gotchas.

  • RVM works best in bash or zsh, which isn’t what Ansible guarantees when you run a shell.
  • RVM needs its initialization script, normally ~/.rvm/scripts/rvm.sh, to be sourced for RVM to be added to the path.
  • RVM functions like “rvm use” will not work by default in non-interactive shells even if you source the initialization script. Instead you can source a particular Ruby OR use the RVM binary option:
source $(rvm 2.1.2 do rvm env --path)

OR

rvm 2.1.2 do rvm gemset create my_gemset

More on this here: https://rvm.io/rvm/basics#post-install-configuration and here https://rvm.io/workflow/scripting

Gotcha #4: .bashrc files that return quickly for non-interactive shells

Watch out for pre-generated .bashrc files that automatically return when the shell isn’t interactive. Often there is a small line at the top that reads:

[ -z "$PS1" ] && return

that will automatically return if you aren’t in an interactive shell. This will bite you if you hastily add necessary sourced files to the bottom of your .bashrc.
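If non-interactive shells genuinely need something (a PATH entry, for example), a rough sketch of a safer layout is to put it above the guard:

# ~/.bashrc sketch: anything non-interactive shells need goes BEFORE the guard
export PATH="$HOME/.rvm/bin:$PATH"

# Everything below only runs for interactive shells
[ -z "$PS1" ] && return
alias ll='ls -la'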

More on this here: https://rvm.io/rvm/basics#post-install-configuration

Gotcha #5: sudoers file isn’t set up correctly

This one is pretty obvious, but you should know that when you sudo anything, everything from what environment is inherited to what programs you can access is determined by your /etc/sudoers file. Beyond not setting it up correctly, you can also add a typo to it. If modifying it with Ansible, it’s recommended to use the lineinfile module and its validate option.

More on this here: http://docs.ansible.com/lineinfile_module.html
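Something along these lines (the host group and sudoers rule are made up) lets visudo check the temporary file before it replaces /etc/sudoers:

# Hypothetical ad-hoc example; validate runs visudo against the candidate file and aborts on a syntax error
ansible webservers --sudo -m lineinfile -a "dest=/etc/sudoers state=present regexp='^%admin' line='%admin ALL=(ALL) ALL' validate='visudo -cf %s'"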

More gotchas will arrive here as I hit them.

Implications of no free will

I have recently come to believe with high probability that free will is simply a misunderstanding of the mass complexity in our universe and that free will, as it is normally defined, does not exist. I rationalize my belief through induction and the observation that the universe (macro, micro, subatomic, etc.) does not shift randomly, without rhyme or reason, when observed at the smallest time deltas available.

If you have no free will, meaning that your very next action is predestined, then a seemingly logical progression is that the step following the next is also predestined. If so, then everything you will ever do (no matter how many times you reflect on it … weird) has already been “done” and you are just waiting to experience it.

However if you consider the implications of this idea, it leads to a rather confusing outcome where definitions of concepts like a soul, life, time and fate all seem to solidify.

  • Your soul then can be defined as a constantly changing but always known function of your DNA and the environment you have been exposed to so far.
  • Your life can be likened to a movie that has already been taped and that, if modeled accurately enough in a computer, could be fast-forwarded and predicted with 100% accuracy.
  • Time is then defined as nothing more than a measure of distance in that movie and doesn’t imply uncertainty in the future as it always has for me.
  • Fate is simply a fact.
  • What you will do is what you have already done but you are just waiting to experience it.

I find it interesting that humans have the ability to predict things naturally (albeit limited and sometimes flawed) and that perhaps our projections of our futures (especially as we get older) may just be more reasonable than I previously thought.

I also wonder if we could get a quantifiable metric on how well humans can predict the future based on recorded interviews of individuals conducted at differing lengths in the past (1, 2, 5, 10, 20 years, etc.). My guess is pretty well. Doing a quick Mechanical Turk survey might just tell you a bunch about your future (not that you can change it or anything) :)

When SOA is appropriate and when it’s not


I have had this itch to investigate the pros and cons of service oriented architecture (SOA) and its derivatives (ROA, WOA, etc.) for many years but never really cared enough to do it, usually because many of the projects I built didn’t use it, or the decision had been made a long time ago and was pretty much irreversible at that point.

My experience with SOA systems has been at medium/large sized companies and generally I found that it worked sufficiently well for them. However, SOA didn’t seem like a vastly superior solution relative to the architectural patterns I had been exposed to over time. SOA just had a different set of trade-offs, which really just made it feel like a different tool in the toolbox.

Now, I deliberately try to avoid religious wars over tools, and somewhere in my mid 20s I stopped caring about what they did in “theory” as well. As a technically competent entrepreneur, I care much more about what actually happens when you implement a solution across all of the business and not just the problem at hand (think hiring, execution risk, business flexibility, time to develop, etc.).

Now, with my history and viewpoint clearly established, I’ll jump into my analysis.

Given my experience at medium/large companies where SOA is used internally to power the product, here are the complaints about SOA that I have heard/experienced:

  • Services are only as good as the best architect on that service team.
  • It’s expensive to move engineers from team to team because each service is completely different.
  • If a service has a bug you have to wait for the service team to fix it.
  • It is incredibly hard to test changes across multiple services.
  • With each incremental service you add, your clients require more integration code (different payloads, protocols, errors, etc.).
  • When a service introduces breaking changes to an interface it is hard to find who all the clients are and just how big of an impact those changes will have downstream.
  • When there are a nontrivial number of services, it’s difficult to find what services exist and what functionality they expose.

Now let’s go over how SOA is meaningfully different from traditional object-oriented architecture (OOA).

  • You have the flexibility to use different everything in SOA (languages, libraries, operating systems, protocols, web servers, testing tools, etc.) per service, whereas in traditional OOA you typically stay in one language with a list of common dependencies, on a single stack.
  • You usually have stateless communication between components, whereas in OOA you implicitly have stateful communication with pass-by-reference mechanics.
  • With SOA everything that is exposed publicly is done intentionally and it’s harder for “client” engineers to break service abstractions, although that doesn’t stop the “owner” engineers from doing it on occasion.
  • In theory you get to deploy your service independently of others (until you make changes that break other clients, which happens frequently enough in practice for me to mention it here).
  • You get strict permission control over code bases if you have contractors, etc.

With these differences SOA has the following advantages over OOA in my mind:

  • You can adopt new technologies independently from the rest of the org.
  • Monkey patching, accessing private methods, or otherwise inserting hacks into code you don’t really own is harder to do.
  • Prevents code commit conflicts and enforces ownership contracts.
  • With stateless communication the implementation *can* be simpler; it certainly makes threading easier.

Here is my list of disadvantages of SOA vs OOA:

  • It’s much more complex and therefore slower to execute initially.
  • You typically need more engineers for the same amount of work.
  • You need senior engineers and architects in proportion to the number of services.
  • It requires tooling for service discovery, registration and testing.
  • Moving engineering resources has larger fixed costs.
  • Cross service development requires more coordination.
  • Without solid architects there is significant execution risk.

I think SOA is really appropriate for large teams working on complex systems at medium/large profitable “cool tech” companies, where the costs can be amortized over longer time periods with great engineers without threatening the success of the business. I also think SOA is appropriate for communication between companies, where each company is a service. At that level you really don’t lose much because few of the OOA benefits are applicable.

If SOA matches your company’s profile then I highly recommend taking a look at this: http://www.infoq.com/presentations/twitter-soa

If your company doesn’t match the criteria above then I think OOA is probably right for you. It is faster to start, easier to understand, more flexible from an engineering management perspective, and allows you to move much faster with smaller teams. I would also say that even if you are planning on being a huge successful company with 100s of engineers, it’s not worth investing in SOA until you are actually feeling the pain of your immense success.

Small aside: SOA’s trade-offs actually remind me a bit of database sharding, but for engineering teams.

(Lead Image source: https://tech.bellycard.com/blog/migrating-to-a-service-oriented-architecture-soa/)

First post in about a year

After a year-long hiatus I have finally reconstructed my blog. There is a lot to talk about and not a lot of time to do it in. First thing is to put this placeholder in and start writing later this week.