Bokeh is a Python library for creating interactive plots and figures. With the Bokeh server, you can create fully interactive applications with pull-down menus, sliders and other widgets. This lets you build fluid and responsive web applications: for example, as you move a slider, your plot can respond and update in real time. However, Bokeh plots may be only one part of a larger data analysis platform you are trying to build, and once you move beyond a single small application, it can be hard to scale a Bokeh server application to accommodate everything you need.
The alternative that I recommend is to use Flask, the Python web framework, for your overall web application, and then use Bokeh as one component within it. This gives you complete control over the platform, and makes it easier to integrate other components (even other plotting components)...
With the recent release of Bokeh version 0.12.5, you can now embed Bokeh applications within Jupyter Notebooks. You can therefore more easily prototype your Bokeh applications within Jupyter, for eventual transition into Bokeh server applications or Flask-based applications.
To illustrate the new functionality, I have put together a simple “Hello, World” application using the Iris data set, one of the most commonly used data sets in machine learning. The application enables users to select a feature (Sepal Length, Sepal Width, Petal Length, or Petal Width) via a pull-down menu, and automatically displays a color-coded histogram for the chosen feature. A screenshot of the application is shown below:
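As a rough sketch of how an application like this might be put together (this is not the original code from the post), the example below bins the chosen feature per species with NumPy and wires a Bokeh `Select` widget to redraw the histogram. The names `FEATURES`, `COLORS`, `histogram_by_species` and `make_app` are hypothetical, and the column names are assumed to follow the convention `sepal_length`, `sepal_width`, `petal_length`, `petal_width` and `species`.

```python
import numpy as np

# Assumed column names and colors; adjust to match your copy of the Iris data.
FEATURES = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
COLORS = {"setosa": "#1f77b4", "versicolor": "#ff7f0e", "virginica": "#2ca02c"}


def histogram_by_species(data, feature, bins=10):
    """Bin one feature per species, using bin edges shared by all species."""
    species = np.asarray(data["species"])
    values = np.asarray(data[feature], dtype=float)
    edges = np.histogram_bin_edges(values, bins=bins)
    result = {}
    for name in np.unique(species):
        counts, _ = np.histogram(values[species == name], bins=edges)
        result[str(name)] = dict(top=counts, left=edges[:-1], right=edges[1:])
    return result


def make_app(data):
    """Wrap the histogram and pull-down menu in a Bokeh Application."""
    # Bokeh is imported here so the binning helper above works without it.
    from bokeh.application import Application
    from bokeh.application.handlers import FunctionHandler
    from bokeh.layouts import column
    from bokeh.models import ColumnDataSource, Select
    from bokeh.plotting import figure

    def modify_doc(doc):
        select = Select(title="Feature", value=FEATURES[0], options=FEATURES)
        hists = histogram_by_species(data, select.value)
        sources = {name: ColumnDataSource(data=h) for name, h in hists.items()}

        plot = figure(title=select.value)
        for name, source in sources.items():
            # One semi-transparent set of bars per species.
            plot.quad(top="top", bottom=0, left="left", right="right",
                      source=source, fill_color=COLORS.get(name, "gray"),
                      fill_alpha=0.5, line_color="white")

        def update(attr, old, new):
            # Re-bin for the newly selected feature and push it to the plot.
            for name, h in histogram_by_species(data, new).items():
                sources[name].data = h
            plot.title.text = new

        select.on_change("value", update)
        doc.add_root(column(select, plot))

    return Application(FunctionHandler(modify_doc))
```

In a notebook, you would typically call `output_notebook()` and then `show(make_app(df))`, both from `bokeh.io`, where `df` is a table of the Iris data with the columns assumed above. Depending on your Bokeh version and how Jupyter is served, `show` may also need a `notebook_url` argument.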
A few important points to note:
First, this will only work in the latest version of Bokeh. If you are using pip,...
Our new team room. Long a dream of mine, the team room includes a large scrum board at one end, and multiple "information radiators", displaying real-time updates on all our projects.
One year ago, I joined the Dana-Farber Cancer Institute to lead the newly created DFCI Knowledge Systems Group. We are an applied genomics software and data sciences group, focused on enabling cancer genomics research and precision cancer medicine at DFCI. We are also now part of a larger, new computational biology center, headed by Chris Sander, who has recently moved from Sloan Kettering Cancer Center to DFCI.
One year into the new position, here is a quick summary of what we are working on, who we are, how we work, and positions we are currently looking to fill.
Enabling Precision Cancer Medicine
The Knowledge Systems Group is focused on building...
After many years of struggling with R, I have now made the transition to Python for data analysis. Python notebooks have also been a revelation, and there is no turning back now.
A few useful resources for those looking to take the plunge:
10 Minutes to Pandas: Good introduction with basic code examples.
Pandas Cookbook: Excellent introduction with sample Python notebooks you can work through yourself. I especially enjoyed the notebooks for analyzing New York City 311 calls.
Pandas, Matplotlib, and NumPy Cheat Sheets: Excellent, concise cheat sheets. Print them out and keep them handy.
Official Pandas Docs: Not the easiest for beginners, but handy nonetheless.
What’s New in Pandas: New stuff appears here.
Matplotlib: Plotting in Python. So much easier than R!