Using Agile In Data Science Research


“Can Agile be used for data science research projects?” An architect at a local research-heavy technology company posted that question in our community Slack channel. My immediate reaction was “Of course!”, but I didn’t post an answer. While I have led R&D teams, the work we did was more development than research. I was only halfway through a Master of Data Science program, so I couldn’t speak from experience with Data Science projects either. One lecture touched on Agile, but it was from a software development perspective. The final project at the end of our program was the perfect opportunity to prove to myself that it could be done.

Applying Agile to a “Short” Academic/Government Project

For our final project, we had nine weeks to visualize and analyze inconsistent, infrequent GPS data. Nine weeks may be a long time in some contexts, but it is relatively short for a data science project. Our project sponsor and client was the local city government, and the university was grading the work toward our degrees. Our team was three people with very different backgrounds: academic, petroleum engineering, and video game development. I was the only one with Agile experience. Due to personal circumstances, we weren’t colocated. For the last half of the project, we were in three different cities, one of them eight time zones away. These challenges reflect the reality that many research projects face in industry and academia.

Planning the Unknown

At the start of the project, our team had no experience with the domain. Other than simple visualizations of coordinate data, we had no experience with GPS data, mapping software, traffic analysis or urban planning. We needed to figure out how to tackle the project, what packages/algorithms to use, and how to apply them to a large amount of data. We also had no idea what we were going to discover along the way. Yet, we needed to produce a plan for the project proposal. Fortunately, we had a fixed deadline. So, I created a high-level strategy with goals for every sprint:

Figure: High-level strategy and schedule in project proposal, showing sprints for proposal, count, model, scale, analyze and report

This gave us a roadmap of what goals we were trying to achieve in order of priority. It was also flexible enough for us to make changes as we made discoveries.

During the project proposal sprint, we did our literature search and preliminary prototyping. As a result, we had some ideas on how to tackle the project and experiments we needed to do. Our “Count” sprint was finishing the prototyping. Our initial approaches failed, but our goal remained the same. If we weren’t able to achieve the goal for “Count”, we couldn’t complete the project. As we moved into the third sprint, we decided to “Scale” first and added “Visualize”, which turned out to be an important step for determining the right model. The “Model” sprint turned out to be more complex than expected and included the scaling factor.

Our high-level strategy provided guidance and we updated it as things changed. Without that strategy, it would have been easy to lose our way and pull the project in different directions.

Communication and Transparency

Since our team was not colocated, our daily video conference “stand-ups” were critical for staying in sync and adjusting to challenges and discoveries. We discussed where we were, took on work that aligned with our individual strengths and interests, and paired up when needed. Throughout the day, we were in almost continuous communication over our team Slack channel. We met with our university supervisor weekly and provided written progress reports. Those reports also served as a weekly retrospective and check-in on our planning.

We met with our clients weekly to be completely transparent and engage them on the project progress. Although it wasn’t an explicit goal, we could actually show progress each week. We showed the prototypes with small samples of the data and our first visualization at scale. We showed each of the models we experimented with and had a discussion about pros/cons of each. They were the domain experts and would ultimately use what we delivered. We needed their collaboration.

The Final Analysis

At the end of the project, we achieved what we needed to achieve in the time we had. If we had more time, we would have spent it looking at the data to extract more insights. If this were an industry project, we would have added an extra sprint or two with a list of investigations.

With respect to applying Agile, we covered all the essentials using the minimum amount of process:

  • Had a high-level strategy to guide our work and updated it as things changed
  • Prioritized and planned our work each sprint
  • Kept track of our work and what was left to be done
  • Had a regular communication cadence with each other and our clients
  • Regularly showed our work

The one thing we did not do: estimate. This was our first data science project. We did not have enough experience to estimate specific tasks. Plus, when you’re working in the unknown, it’s impossible to estimate effort. However, we had goals with time boxes. This provided enough structure to ensure we could meet our goals and give attention to anything that lagged behind.

How Others Apply Agile to Data Science Research

After completing the project and before writing this article, I compiled some articles on how others apply Agile to Data Science and other research projects.

Academic research:

Research in Industry:

Data Science in Industry:

 


About Liza Wood

After a dozen years leading video game development projects in a variety of roles, I decided to pursue a Master of Data Science at the University of British Columbia. Studying data science doesn’t mean I’m moving away from leading people. Growing data science teams need collaborative, pragmatic, Agile leadership to connect data to all areas of the business. I would like to share that point of view, along with my experiences, on this blog.
