
How to Customize a Simple Bar Chart in Altair

In this tutorial, I describe how to build and customize a simple bar chart using the Altair Python library. The tutorial is organized in three steps:

  • build the basic graph
  • customise the graph
  • add annotations to the graph

As sample data, I use the hybrid car registrations in Italy from 2018 to 2020, which I entered manually.

Load data

First, I import the libraries used throughout the tutorial: altair and pandas.

import altair as alt
import pandas as pd

Now I build the dataset as a list of dicts, each holding a Year and a Value. Then I build a dataframe from this list.

data = [ { 'Year' : 2018, 'Value' : 4800},
         { 'Year' : 2019, 'Value' : 12400},
         { 'Year' : 2020, 'Value' : 32000}]

df = pd.DataFrame(data)
df.head()
Year  Value
2018   4800
2019  12400
2020  32000

Build the basic graph

Before building the bar chart, I define some basic parameters as variables:

  • width: the width of the chart in pixels
  • height: the height of the chart in pixels

width = 300
height = 300

Now I can build the basic bar chart by calling the mark_bar() method of the Chart class. Chart receives as input the dataframe df to be shown. To map the data to the visual properties of the graph, Altair provides the encode() method. With it, I map each axis to a column of the dataframe; the :N suffix marks Year as a nominal (categorical) field. I also specify which information is shown in the tooltip.

bar = alt.Chart(df).mark_bar().encode(
    alt.Y('Value'),
    alt.X('Year:N', title='Anno'),
    tooltip=[alt.Tooltip('Year:N', title='Anno'), alt.Tooltip('Value', format=',', title='Immatricolazioni')]     
)
bar

Customise the graph

The basic graph can be customised to communicate a message to the reader immediately. For example, I could focus the readers’ attention on the fact that hybrid car registrations increased sharply from 2018 to 2020.

I can customise the axes and the graph title, through the following parameters:

  • font: the font family to be used.
  • axis_config: Altair properties applied to both axes, including the font family, the font size and the label rotation. For axes, Altair distinguishes two types of text: label and title. A label is the text attached to each data value, such as the name of a country. The title is the title of the axis itself, such as Country. For each type of text, Altair lets you configure many properties, such as the font size and the font family.
  • scale: Altair properties for the y-axis scale. I set the maximum of the domain to 40,000, i.e. a value greater than the maximum data value.
  • title: the title of the bar chart.

font = 'utopia-std, serif'
axis_config = alt.Axis(labelAngle=0, labelFont=font, titleFont=font, labelFontSize=16, titleFontSize=18)
scale_config = alt.Scale(domain=[0, 40000])
title = 'Immatricolazioni Auto Ibride'

Then, I can focus the readers’ attention on the last year (2020) by highlighting the corresponding bar with a stronger colour than the other bars. I use a dark green to signal that registrations of eco-friendly cars are growing. I set the color property through alt.condition().

bar = alt.Chart(df).mark_bar(tooltip=True).encode(
    alt.Y('Value', axis=axis_config,title=title,scale=scale_config),
    alt.X('Year:N', axis=axis_config, title='Anno'),
    color=alt.condition(
        alt.datum.Year == 2020,  # If the Year is 2020,
        alt.value('#154734'),     # highlight a bar with green.
        alt.value('lightgrey')   # And grey for the rest of the bars
     ),
    tooltip=[alt.Tooltip('Year:N', title='Anno'), alt.Tooltip('Value', format=',', title='Immatricolazioni')]     
)
bar

Add annotations to the graph

Now I can show explicitly the increase from 2018 to 2020, by calculating the percentage increase, i.e. the difference between the 2020 and 2018 values divided by the 2018 value, and adding it to the graph as an annotation.

round(float((df['Value'][2] - df['Value'][0]) / df['Value'][0] * 100), 1)
566.7

The percentage increase is around 567%, so I can add a text to the graph with this information. I use the mark_text() method of the Chart class, and I also specify the font and other style parameters.

text = alt.Chart(df
).mark_text(x=width/2, y=20, dx=-5, fontSize=30, color='green', text='+567% nel 2020', font=font
).encode()

text
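
The usual definition of a percentage increase is (last - first) / first * 100. Rather than typing the number into the annotation by hand, the label can be derived from the data. The following is a minimal sketch; the helper percent_increase and the variable label are hypothetical names, and the two values are the 2018 and 2020 registrations from the dataset.

```python
def percent_increase(first, last):
    """Relative change from first to last, expressed in percent."""
    return (last - first) / first * 100


# Registrations in 2018 and in 2020, as in the dataset
increase = percent_increase(4800, 32000)

# Build the annotation label, rounded to the nearest whole percent
label = f'+{increase:.0f}% nel 2020'
print(label)
```

The resulting string can then be passed to mark_text() through its text parameter.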

Now I build a broken line that connects the 2018 bar with the 2020 one. I build a dataframe containing the four points of the line.

df_line = pd.DataFrame([
    {'Year': 2018, 'Value' : df['Value'].min()}, 
    {'Year': 2018, 'Value' : 35000},
    {'Year': 2020, 'Value' : 35000},
    {'Year': 2020, 'Value' : df['Value'].max()}
])
df_line.head()

Year  Value
2018   4800
2018  35000
2020  35000
2020  32000

Then, I build the line through the mark_line() method of the Chart class. Note that, in this case, Chart receives the df_line dataframe as input.

line = alt.Chart(df_line).mark_line(color='#154734').encode(
    alt.Y('Value', axis=axis_config,title=title,scale=scale_config),
    alt.X('Year:N',axis=axis_config, title='Anno', ),
)
line

Finally, I can combine the three previous graphs by layering them with the + operator. I can also specify some properties of the final chart, such as the title, the width and the height.

final_chart = bar + line + text
final_chart = final_chart.configure_title(
    fontSize=25,
    dy = -15,
    font=font,
).properties(title=title,width=width,height=height)

final_chart

Finally, I save the graph as an HTML page, which can be included anywhere in a website.

final_chart.save('auto_ibride.html', embed_options={'actions': False})
