Our new team room. Long a dream of mine, the room has a large scrum board at one end and multiple "information radiators" displaying real-time updates on all our projects.
One year ago, I joined the Dana-Farber Cancer Institute to lead the newly created DFCI Knowledge Systems Group. We are an applied genomics software and data sciences group, focused on enabling cancer genomics research and precision cancer medicine at DFCI. We are also now part of a larger, new computational biology center, headed by Chris Sander, who has recently moved from Sloan Kettering Cancer Center to DFCI.
One year into the new position, here is a quick summary of what we are working on, who we are, how we work, and positions we are currently looking to fill.
Enabling Precision Cancer Medicine
The Knowledge Systems Group is focused on building new genomic information and knowledge systems and analyzing cancer genomic data, all with the larger goal of enabling precision cancer medicine at DFCI.
A large part of our effort is focused on mining of data generated by the Profile project, a multi-institution initiative between DFCI, Brigham and Women's...
After many years of struggling with R, I have now made the transition to Python for data analysis. Python notebooks have also been a revelation, and there is no turning back now.
A few useful resources for those looking to take the plunge:
10 Minutes to Pandas: Good introduction with basic code examples.
Pandas Cookbook: Excellent introduction with sample Python notebooks you can work through yourself. I especially enjoyed the notebooks for analyzing New York City 311 calls.
Pandas, Matplotlib, and NumPy Cheat Sheets: Excellent, concise cheat sheets. Print them out and keep them handy.
Official Pandas Docs: Not the easiest for beginners, but handy nonetheless.
What’s New in Pandas: Release notes; new features appear here first.
Matplotlib: Plotting in Python. So much easier than R!
Seaborn: statistical data visualization: For when Matplotlib doesn't cut it.
Plot.ly: Now fully open source; easily create interactive plots and visualizations.
And, for those looking for a quick, simple mashup of pandas and Plotly, served via Flask, here is a basic hello world application: https://github.com/ecerami/hello_flask.