Amy Lee “Part of working open is to put everything together in such a way that you can communicate ideas to the public, to other scientists, and in an open access journal.”

Amy is a microbiologist at the University of British Columbia. Her work focuses on using genome sequencing to understand how bacteria adapt to cause infectious diseases in humans, and how the human immune system responds to these infections. She’s a strong believer that science can progress faster and better if we have open communication on data sharing, data analyses, and the final results. Her passion for building a strong open science community led her to start the first Mozilla Science Lab R Study Group, as well as become actively involved with various organizations including the Vancouver Bioinformatics User Group, Software Carpentry and hackseq. In her spare time, she works on edible experiments by growing her own sourdough starter and brewing beer.

Evidence

Amy’s Story

Tell us a little bit about your work — starting with a broad overview and highlighting some specific projects?

I’m a post-doctoral research fellow at the University of British Columbia. The research question that I’m interested in is to understand host-pathogen interactions. My work focuses on figuring out what makes a pathogen successful in causing disease, and how the host responds to the presence of any particular pathogen. How I answer these questions is through genomics.

One of the projects that was just recently published looks at Burkholderia cenocepacia, which is a group of bacteria that causes disease in patients with cystic fibrosis. In those cases, what I’m trying to understand is how the genomes of this group of bacteria allow them to be successful pathogens in cystic fibrosis patients. This is an example of a project where I try to answer the question by looking from the bacterial side.

Other projects that I am working on involve looking at what happens in the host. There’s another project that was recently published that looks at Chlamydia, which is a sexually transmitted bacterium, and how it causes changes in the transcriptome of the host. Specifically, how does an infected host (in this case, a human macrophage cell) respond to the presence of Chlamydia?

And what’s the transcriptome?

It’s basically a snapshot of changes in expression of the genes that may be involved in responding to this pathogen.

Ok, that makes sense. Thank you. Are there any other projects you want to highlight?

Yeah, another project that I’m working on is a large collaborative project with people at the Vancouver Children’s Hospital as well as Boston Children’s Hospital. In that project, we’re trying to understand immune development in newborns, how they respond to vaccines and ultimately find ways where we can boost their immune systems through vaccination. This is a very exciting collaborative project that could make a real impact in combating infectious diseases in developing countries.

Yeah, and they’re so different — at least in my mind. There’s an underlying thread, but this is really looking at a bunch of different aspects. I love it. Can you tell me about a time when you felt a sense of success? Think about your work and hone in on one specific example — a story or an anecdote — where you felt a sense of success. It can be recent or anything that stands out to you.

The most recent successes are the two papers that were published this year. As an academic scientist, publishing is a major aspect of my work. It can take many years to lead up to that: for instance, while I started working on the Burkholderia Project in 2013, many of the samples collected by our collaborators came from the 1980s. What is exciting about that story is that our current work helps us gain a clearer molecular picture of how different Burkholderia strains adapt to long-term infection of cystic fibrosis patients.

How about an example of a challenge? Just one.

I will continue to talk about the Burkholderia Project. When I started this project, I collected a very large dataset. We essentially sequenced over 200 Burkholderia strains. There were 200 genomes and we collected phenotypic data as well for these 200 strains. The phenotypic data that we collected looked at how these bacteria behave in various laboratory conditions that could potentially play a role in disease-causing.

I had this large dataset and you can’t use Excel to analyze it. As a bench scientist – this was my training during my graduate career – I did not know what to do. How do I even analyze this large genomic dataset? I can acquire data much faster than I can analyze it, and this is now a common problem facing many biologists. Figuring out how to analyze this dataset was probably one of the biggest challenges I have encountered because I had to learn a new way of thinking. I had to learn the computational skills necessary to analyze this dataset. That took quite a while, but it ended up being a very fruitful challenge.

You mentioned how you had to change your way of thinking and talked about how you couldn’t use your usual tools for analyzing this large dataset. What approaches did you take to address that challenge?

I talked to bioinformaticians and figured out that I needed to learn R and how to use various bioinformatic tools. I spent a lot of time reading and trying things out, which was challenging without a computational background. Around the same time, as I was learning these different tools, I read an article by Bill Mills. At the time, Bill worked at Mozilla Science Lab. He wrote this very empathetic article about his own struggle with writing good code and being scared about showing other people your code. That article really resonated with me especially as a beginner full of self-criticism. I reached out to Bill on Twitter and that is when everything went in a totally different direction. Should we talk about that?

Let’s go ahead. I had a couple of questions before I dig into that, but let’s go ahead and go there because you’ve already opened the door and I think it’s a great way to move into that conversation. You talked about how you read an article from Bill Mills. Was that the point where you became aware of Mozilla and what it’s been doing?

Yes, it was definitely through Bill that I became aware of what Mozilla has been doing.

What other ways have you gotten involved in Mozilla and what has that been like for you?

It’s been amazing. When I reached out to Bill and said, “Your article really resonated with me and I really enjoyed it. Some of the struggles that you discuss in the article are things that I’m going through myself,” Bill, being Bill, said, “Funnily enough, I have a solution for you.” We talked about this idea of building up a study group. Essentially, it is creating a supportive, open, and accepting environment for people with all levels of coding skills to come work together and learn together. That is the concept behind study groups and that is how I became involved with Mozilla.

You were in our inaugural study group, weren’t you?

Yes, I think I was the first guinea pig.

Where did that guinea pig opportunity take you as far as how you’ve been involved with Mozilla since then?

It has been a whirlwind. I met really interesting people through Bill and people involved in various aspects of scientific coding. I made lots of great friends. It was also a great opportunity to connect multiple communities: the Software Carpentry community and a very robust community of R coders. Locally, I’ve connected with many good scientific coding groups that became my support network. On a larger scale, I have met really interesting study group leads through Mozilla, people like Madeline, who’s another Mozilla network 50.

I participated in a number of interesting events. I came out to Toronto for some of the open science leadership summits. I also had the opportunity to go to London. It was really fun to be on the main stage at Mozfest. I think that was 2015.

Yeah, that was my first year.

That was super fun. It went from a local group to a very international experience.

Have you found that the work that you’ve done with Mozilla has made an impact on your life or your work beyond meeting the new study group leads? Has this involvement had an impact on your life, work, or organization on a different level?

It has definitely impacted my life on a personal level in many ways. Partly because it does open the door to many great discussions and, as a scientist, you never know where that next discussion leads.

As the first study group lead, it was a really fun experience to be a community leader. I never thought that I would enjoy it as much as I did. I became involved with hackseq, which is a Vancouver hackathon for genomics. Around the same time, I also joined VanBUG, which is the Vancouver Bioinformatics User Group. It was a fun experience of organizing different communities and bringing them together.

In addition, the study group was great because there were people of different coding skills. I was able to learn coding much faster that way rather than banging my head against a really thick book to try to figure things out. That has been the biggest success story for me.

Did you feel that there was a time when Mozilla disappointed you? What feedback would you give to us to improve?

Part of what makes it challenging is figuring out how to provide continual support to the Study Group members and this is what makes the study group different from some of the other coding models. Specifically for Study Group, it is challenging to maintain that momentum. I am not really sure that is something specifically Mozilla-related. Right now, the burden is on the study group lead to keep the momentum going and that is the biggest issue with keeping the Study Group going.

I’m going to turn, now, to the broadest issue in the Mozilla universe — internet health. How would you define a healthy internet? What does healthy internet mean to you? What would be a healthy internet for you?

No fake news. For me, a healthy internet should not just be about providing knowledge, but it should also be about connecting people. The unfortunate truth is that with more technology, we are becoming more disconnected and more isolated. A healthy internet should provide factual information, but also a way for people to come together and interact with each other.

What does working open mean to you? Also, has there been a time when working open has had an impact on you?

Working open can happen on many different levels. This is something that I’ve had many discussions about with people at the Science Lab and at Study Group. On a very basic level, it involves using specific tools that allow you to share your code, share your thought process, and so on. On a bigger level, you need to share your data and the metadata associated with it, and organize your data in such a way that other people can use it. Of course, if it is not curated properly, other people will not necessarily be able to reproduce your work or take it in a different direction. As a scientist, that is a key part. A third part of working open is to put everything together in such a way that you can communicate ideas to the public, to other scientists, and in an open access journal and so on. These are the ways I see myself working open.

In terms of how it has impacted my life and my work, I have found that talking to scientists and people in the Study Group about the various challenges that I encountered helps me solve the problem much faster. That is one big advantage of working openly. Going back to some of the projects that I have been involved in, all the data are open to the public. For example, all of the assembled genomes from the Burkholderia Project are deposited at the NCBI and open to the public. All the raw sequence data are also there, so if people want to use that dataset, they can. When you think about the amount of money that went into sequencing them, I am sure there are lots of other projects that can spin off of that dataset. Having that in the public sphere where other researchers can access the data is really important for me.

The return on investment is much greater when other people can use what you’ve already done. If you had access to ten skilled collaborators or contributors, what would their skills be and what would you ask them to do?

Having different coding skills would be really useful. It also depends on the question. I’m thinking from a biologist’s perspective, so if we have a biological question that we’re trying to solve, having someone with that expert skill set would be useful. That person doesn’t necessarily have to have coding skills, but a deep understanding of the biological question is key. I’m discussing my dream team, so people with computational skills and people with biology skills. It would also be great to have someone with great communication skills, who can write good blog posts, and hopefully they would be interested in talking to the public about the problem. I think those are the three major skills that I would want in a team.