Hi! I'm Michael DeHaan and I'm probably best known for creating and leading Ansible, but I've also created and helped to create a few other open source automation systems. I've since moved on from writing tools for the IT automation world, and my new project is SourceOptics, which is all about mining the interesting data and trends in our source code repos.
Source Optics started as a project in a university class I help with at NC State University and was initially developed by a great team of students, and I'm continuing to evolve it to cover some new use cases and new features.
A bunch of reasons I guess...
Before Ansible, I had a popular open source project called Cobbler, which was a PXE-based install server (as well as a DHCP and DNS manager) for bare metal server installations. I wanted to have another open source project, but I was also using Puppet at work at the time (I worked for them for a short while, actually) and wanted alternatives. For a long time I'd also had some ideas about building a multi-node deployment system - something that could do both config management and deployment tasks equally well - and wanted to give that a shot.
I figured it would be a really simple project, and if it had 40 users I would be exceptionally happy. It took off in ways that I couldn't have anticipated at all.
Ansible started out as a nights and weekends thing and it was crazy after just the first month or so. In the early days, I'd work on it a few nights a week and, as interest grew, it gradually became much busier.
I was able to jumpstart it by emailing my old mailing list for Cobbler ("hey I've got a new project") since I knew many of those users would be interested in trying out a new configuration management idea. The thought that some people I respected liked the idea gave me the energy to keep working on it, and somehow it spread virally from there. I'm really indebted to those folks and their willingness to try out and help improve (and share ideas on) something so new. If it wasn't for their enthusiasm, I would not have gone on with it. I was enjoying all of the conversations so much more than the code itself.
It's been a while (2012) so I don't quite remember everything.
Some of the early "viral" spikes came from asking High Scalability if I could do a short guest blog, and that then got some of the original reposts on Hacker News and Reddit.
I didn't actually give a presentation outside of North Carolina in the first year; I just let the internet do it for me, which was pretty cool. One of the first presentations I recall reading about was at a library in the Midwest. There were a lot of local user group and meetup presentations that happened, and nothing was ever coordinated. It was just people liking the tool and sharing it and saying whatever they wanted.
Having really good documentation was also key, as this meant someone who tried it out could be successful in 30 minutes. It meant writing for multiple audiences, making sure you explained almost everything and anticipating all of the questions that might be asked. You have to realize that everybody learns differently, and then try to supply everybody with what they need to learn. So many websites, even today, don't do this well. It felt like I spent about 40% of my time on documentation back then. Folks would write their own tutorials that were more or less just reblogged versions of the docs, and every time a new one of those came out it would get shared online (Twitter or Hacker News or Reddit) and more people would see it. People would also give their own presentations to user groups and a lot of it spread that way.
These strategies don't seem to work as well today - people are more distracted and busier. I'm still working out what the new strategies are, but I think they involve being Google or Facebook. Maybe it works differently in different application verticals though.
Docs help. Making sure the code isn't super clever and is easy to understand helps. Making sure you write great explanations about all the things you have added to the code (on a mailing list or blog, for example) helps. Ultimately though, I think everyone was feeling the same kinds of needs - the need for simpler automation systems - around that same time in 2012, so they were on the edge of building them anyway, and they wanted to help out.
Most of Ansible's contribution happened around plugins, which were very easy to learn how to write, and I still think that plugin modules are a great strategy if you want lots of contributions.
Do a little bit of everything, I guess, and don't focus exclusively on anything. You kind of have to be paranoid, anticipate everything that might go wrong, and try to keep the wheels on the bus. You have to wear all of the hats - architect, product manager, marketing, writing, graphic design, tester, etc. - all at once.
I tend to think product management and communication were the most important things. For product management, that meant listening to everyone. I think I read almost every email, tweet and blog post anyone ever wrote. I tried to answer questions wherever they came up and to quickly squash misconceptions, as at the time users of some competing tools were being told stories about configuration management principles that weren't exactly true in my opinion.
Disagreements were usually pretty easy to manage, but not always. When I had to say no to someone, I tried to find a compromise where I could. But if not, I think I at least tried to explain myself and chart the course that kept the most people happy. I don't believe OSS work can easily be assigned to anyone though. OSS contributions were always about something that person needed themself, which is more authentic anyway.
The best you can do is make a pluggable system and teach people how to write plugins, and create a platform that's ready for surprises to come in. And when those contributions do come in, you kind of have to look at them, think about whether they would work for everyone, and try to coach the pull request to go in a slightly tweaked direction if it does need some changes.
It's an architecture of surprise more or less; for that reason, I think open source is sometimes dangerous for larger projects - their architectures quickly become unplanned, or at least organically grown versus designed. This is why whole ecosystems of companies sometimes exist to repackage open source software that has grown too disorganized. There's a balance to be had there.
Ultimately, we had a really good user community because we made it easy to add things, and after about the first six months people sort of knew my aesthetic preferences and it took less coaching to get things to come in like we wanted. People began to help each other on IRC instead of me having to provide all the help. But in that first six months, before the patterns were established in the code and the documentation, and before all the people working on it had the same vision for things, it took more communicating to get everyone on the same page, and that was a lot of work.
It was entirely a nights and weekends thing for me for the first year. I think what made me find time for it was people were always there wanting things, and I felt I had to keep working at it to keep them happy, and it pretty much consumed everything :)
From a code perspective, I thought saying no was especially important because every time you take a code contribution it is like being gifted a free puppy. In the end, though, I said yes too much, and we ended up with too many puppies, and quality suffers when that happens.
However, my largest issues were dealing with the company that came up around Ansible. I enjoyed working with a lot of people there, but it was also exceptionally painful to work with parts of it. I always felt like there was the Ansible community I knew and then the other Ansible that I knew, and it was hard because there were very few people I could talk to about that. When I left, I greatly missed all of the community conversations and that left a gaping hole, but I didn't miss the company at all.
Technically, managing all the contributions was definitely the biggest challenge - you could wake up and find 10 new modules contributed from Europe, and if you don't deal with all of them, you're going to have that many more the next day - and that included weekends.
So it's coming from a ton of places at once and kind of all folding back in on itself. The root cause was this: I help with a class at NC State University, which is the undergraduate Senior Design course (CSC492), and each semester there are teams of students working on about 10 different industry projects.
I really wanted some tools to better understand what everyone was working on, but GitHub was showing graphs on just one branch at a time and didn't show me the views I wanted. So we formed a team of students to build something we could use for the course, and that became the first edition of Source Optics.
At the same time, I was thinking back to use cases from working for a local company that had 200 microservices. With that many services, it is difficult to understand who owns what, and I ended up building a custom dashboard for them that tracked service owners and versions and things like that. What I really wanted to know, though, was how much effort was being applied to each of the services, which services were new, and which were relatively static, and this was just an impossible thing to visualize. And then I realized all the software companies essentially have this problem to different degrees. Software history is just an incredibly rich dataset, but there aren't very good tools to mine it.
And, of course, I just wanted to play with some cool data science stuff to stay up to date, and I didn't want to do the usual stock tracking or climate tracking stuff but rather build something that could lead to new conclusions.
Much of the testing is done on open source repos, both because they are often very large and, with ones like Ansible, because I remember the history and can tell whether the graphs show the things I think they should show.
I'd really like to get a lot more eyeballs on it and hear from people about the ideas they might have. It's still new so there's room for a lot of new ideas from everywhere, and I want to build an exciting multi-purpose platform that we can all share.
That's a hard one for me, because I've been struggling with whether the same methods that made Ansible successful in 2012 still work today.
It seems that with "agile" processes consuming so many workplaces, developers no longer have time to explore cool new open source projects during their day jobs, and are too tired when they get home to contribute to other people's projects - 8 hours of computing a day is definitely too much already.
Think about how you are going to get a user audience, and decide how important that contributor audience is. With Ansible, support for a wide variety of modules was important, so the contributor base made that possible. But in other cases, contributors may not be as required to build what you want to build.
In the era where people don't have a lot of free time, make sure you make it very easy for people to try things out and spend a lot of time on documentation.
I'd also think about business models. Open source business models are very hard to start with - for instance, even with Ansible in 2012, I had a very hard time getting anyone interested in paying for support, which ultimately led to my decision to accept a funding offer and that in turn required making some closed source software to sell. If you are aiming for a business, would a SaaS model work better? Do you want a small company or a big one?
The biggest one is that "project marketing" should be a thing. There are so many fantastic projects sprinkled all over GitHub, and most are undiscovered. There's sometimes a black art to what makes one popular and not another, and I would love a world where everybody has better ways to get their ideas out there and to find people to work on those projects with. I've been really lucky with that at a few points in my career, but it's a challenge. If you're having a hard time getting adoption, keep trying, and try to circulate your program first within your personal network.