Color points in scatter plot of Bokeh


Solution 1

Here's a way that avoids manual mapping to some extent. I recently stumbled on bokeh.palettes in a GitHub issue, and on CategoricalColorMapper in another issue; this approach combines the two. The Bokeh docs list the available palettes and describe CategoricalColorMapper in detail.

I had issues getting this to work directly on a pd.DataFrame, and it also didn't work with your from_df() call. The docs show passing a DataFrame directly to ColumnDataSource, and that worked for me.

import pandas as pd
import bokeh.plotting as bpl
import bokeh.models as bmo
from bokeh.palettes import d3
bpl.output_notebook()


df = pd.DataFrame(
    {
        "journey": ['ch1', 'ch2', 'ch2', 'ch1'],
        "cat": ['a', 'b', 'a', 'c'],
        "kpi1": [1,2,3,4],
        "kpi2": [4,3,2,1]
    }
)
source = bpl.ColumnDataSource(df)

# use whatever palette you want...
palette = d3['Category10'][len(df['cat'].unique())]
color_map = bmo.CategoricalColorMapper(factors=df['cat'].unique(),
                                   palette=palette)

# create figure and plot
p = bpl.figure()
# legend_field requires Bokeh >= 1.4; older releases used legend='cat' instead
p.scatter(x='kpi1', y='kpi2',
          color={'field': 'cat', 'transform': color_map},
          legend_field='cat', source=source)
bpl.show(p)
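
One caveat with the palette lookup above: as far as I can tell, d3['Category10'] is only keyed by sizes 3 through 10, so the lookup would fail for a column with one or two categories. Below is a minimal sketch of a workaround that slices a color list instead. The helper name pick_palette is hypothetical (it is not part of Bokeh), and the hex values are the standard D3 Category10 colors, hard-coded on the assumption that they match Bokeh's palette.

```python
# Hypothetical helper, not part of Bokeh: pick n colors for n categories.
# The hex values are the standard D3 "Category10" colors, hard-coded here
# so the sketch runs without Bokeh installed.
CATEGORY10 = [
    "#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd",
    "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf",
]


def pick_palette(n_categories):
    """Return the first n_categories colors, for 1 to 10 categories."""
    if not 1 <= n_categories <= len(CATEGORY10):
        raise ValueError("need between 1 and 10 categories")
    return CATEGORY10[:n_categories]


cats = ['a', 'b', 'a', 'c']          # same values as df['cat']
factors = sorted(set(cats))          # unique categories in a stable order
palette = pick_palette(len(factors))
```

The resulting factors and palette lists could then be passed to CategoricalColorMapper in place of the unique() array and the d3 lookup.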

Solution 2

For the sake of completeness, here is the adapted code using the low-level plotting interface:

import pandas as pd

import bokeh.plotting as bpl
import bokeh.models as bmo
bpl.output_notebook()


df = pd.DataFrame(
    {
        "journey": ['ch1', 'ch2', 'ch2', 'ch1'],
        "cat": ['a', 'b', 'a', 'c'],
        "kpi1": [1,2,3,4],
        "kpi2": [4,3,2,1],
        "color": ['blue', 'red', 'blue', 'green']
    }
)
df

source = bpl.ColumnDataSource.from_df(df)
hover = bmo.HoverTool(
    tooltips=[
        ('journey', '@journey'),
        ("Cat", '@cat')
    ]
)
p = bpl.figure(tools=[hover])

p.scatter('kpi1', 'kpi2', source=source, color='color')

bpl.show(p)

Note that the colors are "hard-coded" into the data.
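
If typing the color list by hand is error-prone, the column can be derived from cat instead. This is only a sketch: the color_by_cat dict and the gray fallback are my own assumptions, chosen to reproduce the colors used above.

```python
# Assumed mapping from category to color; any CSS color names would do.
color_by_cat = {'a': 'blue', 'b': 'red', 'c': 'green'}

cats = ['a', 'b', 'a', 'c']  # same values as df['cat'] above

# Unknown categories fall back to gray instead of raising KeyError.
colors = [color_by_cat.get(cat, 'gray') for cat in cats]

# With the DataFrame from this answer, the pandas equivalent would be:
#   df['color'] = df['cat'].map(color_by_cat)
```

The mapping still has to be written once, but it no longer needs to be repeated for every row.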

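Here is the alternative using the high-level chart API. Note that bokeh.charts has since been removed from Bokeh (see the comments), so this only runs on old releases: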

import pandas as pd

import bokeh.plotting as bpl
import bokeh.charts as bch
bpl.output_notebook()

df = pd.DataFrame(
    {
        "journey": ['ch1', 'ch2', 'ch2', 'ch1'],
        "cat": ['a', 'b', 'a', 'c'],
        "kpi1": [1,2,3,4],
        "kpi2": [4,3,2,1]
    }
)

tooltips = [
    ('journey', '@journey'),
    ("Cat", '@cat')
]
scatter = bch.Scatter(df, x='kpi1', y='kpi2',
                      color='cat',
                      legend="top_right",
                      tooltips=tooltips
                     )

bch.show(scatter)
Author by

Dror

PhD in computational geometry (mathematics) from the FU-Berlin. Programming in Python, C++ and Mathematica. Mining DATA.

Updated on June 16, 2022

Comments

  • Dror
    Dror almost 2 years

    I have the following simple pandas.DataFrame:

    df = pd.DataFrame(
        {
            "journey": ['ch1', 'ch2', 'ch2', 'ch1'],
            "cat": ['a', 'b', 'a', 'c'],
            "kpi1": [1,2,3,4],
            "kpi2": [4,3,2,1]
        }
    )
    

    Which I plot as follows:

    import bokeh.plotting as bpl
    import bokeh.models as bmo
    bpl.output_notebook()
    source = bpl.ColumnDataSource.from_df(df)
    hover = bmo.HoverTool(
        tooltips=[
            ("index", "@index"),
            ('journey', '@journey'),
            ("Cat", '@cat')
        ]
    )
    p = bpl.figure(tools=[hover])
    
    p.scatter(
        'kpi1', 
        'kpi2', source=source)
    
    bpl.show(p)  # open a browser
    

    I am failing to color-code the dots according to the cat column. Ultimately, I want the first and third points to share one color, and the second and fourth points to each have a different color.

    How can I achieve this using Bokeh?

  • Hendy
    Hendy over 6 years
    bokeh.charts doesn't exist anymore. Is there an up to date way to accomplish this? I really like the idea of not having to manually specify a mapping.
  • Dror
    Dror over 6 years
    Bokeh is changing rather frequently and I don't know. Probably a new question would be the best take.
  • Hendy
    Hendy over 6 years
    Thanks for the reminder! I actually did stumble across a so-so way to do this and posted a new answer. I think it fits here well enough; this is a top google hit and there's precedent to update answers as things change (e.g. a python3 answer where a python2 answer already exists).
  • Thomas
    Thomas almost 6 years
    Is this really the simplest way to do this? That seems really convoluted, I need to add 2 lines and another dict just to do something that in R is <color = c>?
  • Thomas
    Thomas almost 6 years
    Also, this does not really allow for a legend, at least I can't figure out how to create one?
  • Hendy
    Hendy almost 6 years
    I originally came to this question for the same reason, and also having come from R/ggplot2 I'm pretty blown away by the grossness that is the python plotting community. Take a look at plotly; same issue. You have to create lists for each series, which does, indeed, seem convoluted. I posted this answer because it was the simplest I could find at the time (granted, I'm pretty noob-ish with bokeh). If you can find a simpler way, please post another answer.
  • mic
    mic about 2 years
    "the grossness that is the python plotting community": this summarizes my feeling too. Also documentation. I miss ggplot.
  • Hendy
    Hendy about 2 years
    @mic You are in for a treat with plotnine! This answer specifically asked about bokeh, so it's not applicable here... but it's a fun observation that I can fit the whole plot code into a comment, using the df as is (note you have to quote variables in plotnine's ggplot). from plotnine import *; ggplot(df, aes(x='kpi1', y='kpi2', color='cat')) + geom_point()
  • mic
    mic about 2 years
    Yeah I have been following plotnine for a while (I do miss ggplot so much...). Unfortunately I have been creating plots for people that want hover and zoom functionalities, and I cannot really move away from plotly or bokeh. Yours is not only a fun observation, but also one that highlights how easy, streamlined, and readable code could be when using gg.
  • Hendy
    Hendy about 2 years
    @mic Does it have to be Python? Have you seen ggplotly, or considered just wrapping static plots in shiny/dash to get some of that functionality? I'm not sure if hover is possible, but analogous functionality might be accomplished via sliders or buttons to toggle labels (ggrepel layer), and filtering thresholds might be possible. Just an idea!
  • mic
    mic about 2 years
    Yeah it has to be python and it has to run on jupyter notebooks unfortunately. As for some sort of interactivity we have been using the excellent holoviz panel (basically shiny for python, and more), but really we cannot move away from python (decently sized company and moving everyone to R is not an option). The select/deselect functionality can be replicated by panel/ipywidgets, but zooming and hovering make researchers' lives much easier. I was using ggplot/ggplotly in my previous life and I never had to actually go to straight plotly. Thanks for the interest though!