Currently studying muscle disease in horses, Rob Schaefer is curious about examining the genetic basis for quantitative traits in complex organisms. He utilizes open source software and resources to examine large scale data and extract meaningful insight to answer questions rooted in biology. As an advocate for open science and reproducibility, he is an active participant and contributor to the Mozilla Science Lab.
I’m wondering if you could start by telling me a bit about your work.
I am a scientist at the University of Minnesota and I just finished my PhD last November. My work right now is in the veterinary school here at the University of Minnesota. I’m currently working on a project that’s studying muscle diseases in horses. Our project’s questions are rooted in biology, but the methods are mostly computational and analytical. My project is funded by a postdoctoral fellowship from USDA NIFA, the National Institute of Food and Agriculture.
My background is in computer science and genetics. I’m working with biologists and we are trying to create methods that are both interdisciplinary and focused. But, at the same time, the methods are generalizable and applicable to other organisms, other species, and, in a general sense, reusable.
I’m about six months into my postdoc, which is a two-year appointment. Day-to-day it’s just research: probably half is writing about science, and the other half is programming and computational development.
Can you tell me about a specific time where you felt a sense of success?
I feel successful when I can understand someone else’s problem, or when someone is struggling with a certain component of their workflow or pipeline and I work with them to come up with a solution.
A lot of times someone will want to do an analysis and they’ll get hung up. They know exactly what they want, but they’ll get hung up on a program that needs to be installed or something that they just don’t have the experience in.
Together, we sit down and work through it line-by-line, and accomplish what they set out to do. That interdisciplinary component is really important. That’s when you feel like you did it.
How about an example of a recent challenge — and how you went about addressing it?
The scale of our work can be overwhelming and we’re dealing with many different types of data. A lot of times it’s so big that you can’t look at all of it at once, especially when you’re integrating things.
There have been times when we’ll be working on something or doing an analysis, and it just won’t look right. Once we start breaking it down, we’ll see that we had made a mistake early on. And that mistake propagated down and we didn’t catch it until weeks, maybe months, later.
Having to go back and restart, basically from scratch, is a major challenge. The work is such that there’s not a manual. It’s a lot of working on your own and figuring out stuff on your own. Research is difficult because there’s no single, correct path. You’re wandering around and trying to figure stuff out as you go along, and hopefully catching all your mistakes, which is why ideas such as open science and reproducibility are so important. We’re still humans and we all make mistakes.
In this case, where you found this mistake, or when you’re feeling lost along the path, what are the strategies that you use to keep going?
I think just acknowledging the nature of the work. It’s important to be OK with mistakes or negative results — to be OK with trying something different or new — to know that your successes are measured incrementally.
It builds up to a point where you finally feel like you finished something much later on. Knowing that the journey is one step at a time. You’re literally working on stuff that no one’s ever looked at before. It’s pure research. It’s completely unknown. Getting back to that excitement and recognizing why you’re doing it in the first place is really important.
Shifting now to the broadest issue in the Mozilla universe — the “open internet”. What is that for you?
The open internet for me is a place where everybody has the same amount of access to everything. There are no hurdles getting in your way to look at something that’s on the internet. The idea that the website that a large corporation puts out there is on the same playing field as someone else’s blog, the access to that is equal.
Probably one of the strongest pieces of the open internet is that things are, by default, accessible. Take something like publicly funded research: whether you’re a scientist at an institution, someone doing civilian science, or a science teacher in a high school, you have access to those resources. The source of that information does not discriminate.
Can you tell me about a time where this kind of openness has been important to you?
Right now science is going through an evolution where more and more people are focusing on open science, and I can see the parallels with open source software. Open source software has been extremely beneficial to us in our work. I’ve used many open source packages.
The large majority of the resources that we use are open source, and free, and available to anybody who wants to use them. Just knowing that the stuff that we do day-to-day would have never been possible if we weren’t, for instance, running Linux. If we weren’t using the scientific software that people put out there for free, the visualization software… the list goes on.
And even in cases where we “do it all ourselves”, the programming languages we choose to use, the compilers themselves are free and open. In our case, where funding is limited and we are trusted in using taxpayer money, our ability to be efficient with our resources is completely dependent on others making those resources available for us.
The flow of information in open science strongly parallels the flow of information, historically, in open source software. While you can probably see the value in open access papers, and reproducible workflows and methods, a lot of the things that the Mozilla Science Lab is advocating for, that hasn’t always been the case.
People have tended to protect their data and results because they are spooked about being scooped. Being scooped is when someone publishes a result similar to yours, which limits your ability to publish in a high impact journal. And if publications are currency in academia, the journal impact factor is the denomination. You can maybe see the same patterns 20 or 30 years ago in the software development community. Access to source code was scarce and the currency was in the number of CDs you sold.
The internet obviously had a major impact on how software was written and distributed and also on how contribution is recognized. I think the same changes are happening in science. I think this change has been really important. It’s nice to see other scientists who recognize this also, especially when they’re already established or they have experience. Academia is kind of a weird place where the status quo is different from that of the software development realm, and the value is placed differently. It’s cool to see that transition.
On a personal level, this concept of openness has influenced who I have decided to work with. The horse genetics community isn’t massive. There are not a ton of us out there. You’d think that, maybe, competition would be really harsh, but the community has recognized that we can get a lot more done together than isolated.
People are much more willing to share and collaborate than to directly compete. We acknowledge that there’s not a lot of available funding in horse genetics, and we can’t afford to be isolated and secretly be working on the same thing. We need to wring out every tiny piece of data we can. The best way to do that is to work together, make sure people aren’t stepping on each other’s toes. I’ve recognized how open and collaborative the equine community has been and I value it a lot.
Getting a little more specific about Mozilla and the Science Lab, how did you get involved with Mozilla, and what has that been like for you?
I’ve always been aligned with the greater Mozilla mission and their overall goals for open internet and open source software. I’d always been very impressed with that. Everybody is familiar with Firefox, but they might not always be aware of the root of that success, that it’s coming from a huge group of volunteers centered around a small nonprofit.
Then I learned — I don’t even remember how, maybe through Twitter or a blog or something — about the Science Lab. It was maybe a year-and-a-half or two years ago — they were just gearing up. I dove in. I thought it was a great idea and just got involved. I didn’t feel like I had much to contribute in terms of something like Firefox or a lot of the other projects that were going on within Mozilla, but I definitely thought I could have an impact with the Science Lab. I started talking to them, and tried to get involved as much as I could.
What kinds of activities have you done with them?
Right now, the most official activity would be the study group. I lead a study group here at the University of Minnesota. They, once a year, have their code sprint. We’ve been a part of that the past two years. I’ve helped host a site here in St Paul.
Unofficially, I stay in contact with other people in the community — building connections and networking. Getting involved and being in the IRC or the chat room. Then trying to contribute back with study group lessons or other things that they’re helping to develop.
Can you tell me about a time where this involvement has impacted your life or your work?
Most people, when you talk to them about open source or open science, they generally agree with you, but there’s a barrier to getting involved. It was nice to have the Mozilla umbrella, or their brand, to stand behind and say, “Instead of just meeting to talk about this, we’re going to have this support network for the study group. Let them lead us and open that channel of communication with them.” People were much more willing to get involved when they knew that it had something to do with Mozilla. Channeling that energy was very, very helpful in getting people around me and on campus involved.
Is there a time where your involvement with Mozilla hasn’t met your expectations?
It’s hard to say, they are pretty amazing. I think that in terms of bang for your buck, the community is great. If I was forced to think of things completely idealistically, it’s difficult for an organization like Mozilla to focus down and narrow onto a specific impact that they can have.
Both Mozilla and scientific researchers are driven by technology and care about helping and making people’s lives better. Seeing their success in the tech world makes you so excited about what they can do in the science world. It is easy to get caught up in this idea and to lose focus on how to properly implement things, to keep your feet on the ground. Scientists are pedantic about foundations, which sometimes forces you to re-evaluate your expectations.
Doing science is very, very specific. You get stuck in this tiny, little world where you’re trying to do this tiny little thing that may or may not have a foreseeable impact. And I think Mozilla’s coming from the other side where they’re very broad and they’re advocates. Trying to figure out how to connect those two worlds, I think, is going to be a very big challenge.
Where I’m hearing the value for you is in having a community of people to connect with around these ideas of open science and sharing. It’s like Mozilla is providing a framework or a platform for that grouping of ideas that can, as you said, help you be more efficient in your work and wring out that last little bit of funding.
Yeah, for sure. In terms of massive amounts of impact, no one can deny that Firefox has been the leverage point that Mozilla has had in terms of being able to elicit real change. That’s where they have a lot of leverage. They have been able to use their success as an open internet platform to amplify changes in other areas such as encryption and privacy. I’m trying to think of something like Firefox that could have an impact in the scientific community, something that would really have a grounding point to put leverage on.
This is difficult because most of the time, the small advances made during scientific discovery are so disconnected from their eventual social impact. For instance, when radio waves were discovered, Hertz famously stated that they were, “of no use whatsoever”. Now, radio powers much of our global telecommunications systems, including large parts of the internet.
Another shortcoming in science was the long delay before the rediscovery of Mendel’s laws, which are the cornerstone of the field of genetics. His foundational work lay dormant for more than 30 years before the ramifications were realized. Certainly, having an organizational framework to communicate about science would have helped in these instances! I have no doubt that Mozilla has helped me better connect with other scientists.
Can you think of any ways that these stories might be useful to you, if at all?
Of course! I think that this is a nice way to see how Mozilla, the community, and ideas that we all share impacts individual people. Seeing how the same ideas that people have in the open source world can be applied to science or other fields or areas. Really, we’re rallying around the same ideas and the same values.
Hearing other people’s success stories or the different challenges people face is inspiring. It’s encouraging to see that happen in other places. A lot of times you’re stuck in your own little world. It’s not until you realize how many other people out there also view Mozilla like you do, or share the ideas that they are trying to pursue, that it becomes empowering.
Hearing those stories, collecting them, is very useful, because it’s hard to talk about a lot of these things. They’re very high level. I think that you doing the hard work of interviewing people, helping them throw their ideas down, and helping articulate those ideas is really helpful.
It’s been a challenge, because when I interview folks, most responses are very high level. I try to push for concrete examples. I want to build a collection of specific anecdotes. That’s what’s going to be interesting — the specifics that people choose to highlight. Once we get a big collection of those, hopefully we’ll be able to see some emerging patterns and insights.
I’m just thinking back on a question that you’ve asked and specifics. Here’s another success story. During our code sprint, this one that we did last spring, it’s two days of total focus on a specific project. You sit down, and code. It’s worth putting everything else aside and just focusing on this one thing. Well, there was this idea that I had had that I wanted to try and implement.
I sat down and started writing about it. Because in order for other people to help you, you have to document everything. Everybody is working in the same workspace, and you are bouncing ideas off of each other. I got in contact with some other people remotely and they contributed to my project! This would have never happened without using the code sprint as a platform or excuse to sit down and write this project that you had wanted to do.
Even better, they were writing test code for me, which is something you usually have to twist people’s arms to do, but they were completely willing. They said, “I like your idea, how can I contribute?” They wrote some test code, which is really great. And we can track that, they’re in the repository and they’ll be in there forever now. They contributed and their usernames are there.
That’s a connection that would have never happened. I can think of similar stories. I’m working with a student on a project. We used this open source library called Cytoscape.js. It’s a visualization library written for the web. My student had written some code for it. He and the library’s author got in contact, through me, to talk about a specific problem that we were running into, which would have never happened if we didn’t have this place to talk about all this stuff.
That place was through Science Lab?
Yeah, it was through the Science Lab. I made a connection with Max Franz through the code sprints. I’ve just seen him around. Because we were both working with Science Lab, it wasn’t scary to reach out to him and say, “Hey, I have a student who has a question about this library you wrote.” He’s like, “Yeah, ask me.”
Now, I think that the code that my student wrote is actually part of Cytoscape.js. Max accepted his changes and my student’s code is now part of that open source project. Another contribution that would have never happened if it hadn’t been through that channel.