In this blog post, we'll create a colorful scatterplot to visualize the relationship between two quantitative measurements: culmen length and culmen depth of penguins. The culmen is the upper edge of a bird's beak. By visualizing these measurements, we can gain insights into how different penguin species vary in beak size and shape.
The data
The data we will use is the penguins dataset, which contains information about three species of penguins: Adelie, Chinstrap, and Gentoo. Please download the dataset from the link below.
Now head over to the DataPicta app. To upload the penguins dataset, click Data. In the dialog that opens, click Upload and select the file you just downloaded.
After uploading, you will see the data presented as raw CSV text. It gets better in a moment. Name the dataset penguins and click the Parse button. The data now appears in a table format.
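If you are curious what a parse step like this does conceptually, here is a minimal Python sketch. It is not DataPicta's actual implementation. The function name parse_csv and the three sample rows are assumptions made for illustration, and only the column names come from the dataset used in this post.

```python
import csv
import io

# Illustrative sample in the shape of the penguins CSV.
# The rows are placeholders for this sketch, not the full dataset.
CSV_TEXT = """species,island,culmen_length_mm,culmen_depth_mm
Adelie,Torgersen,39.1,18.7
Chinstrap,Dream,46.5,17.9
Gentoo,Biscoe,46.1,13.2
"""


def parse_csv(text):
    """Turn raw CSV text into a list of row dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


rows = parse_csv(CSV_TEXT)
for row in rows:
    # Every value is still a string at this point; numbers are converted later.
    print(row["species"], row["culmen_length_mm"], row["culmen_depth_mm"])
```

Each row becomes a dictionary keyed by the header line, which is roughly what a table view needs in order to show one column per field.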
A basic scatterplot
We will now focus on creating the chart. No worries, it only takes a minute or two. Since a scatterplot is made of dots, we will add the dot element. Click Element to add an element and choose Dot. A new panel named Dot appears.
For the X-axis, we will select culmen_length_mm and for the Y-axis, we will choose culmen_depth_mm. You should now see a basic scatterplot chart.
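Choosing an X and a Y column amounts to turning each row into one (x, y) pair. The sketch below shows that idea in Python under a few assumptions: to_points is a hypothetical helper, the sample rows are placeholders, and skipping rows with missing or non-numeric values is a choice made here, not a statement about how DataPicta handles them.

```python
# Placeholder rows in the shape of the parsed penguins table.
rows = [
    {"species": "Adelie", "culmen_length_mm": "39.1", "culmen_depth_mm": "18.7"},
    {"species": "Adelie", "culmen_length_mm": "NA", "culmen_depth_mm": "NA"},
    {"species": "Gentoo", "culmen_length_mm": "46.1", "culmen_depth_mm": "13.2"},
]


def to_points(rows, x_col, y_col):
    """Map each row to an (x, y) pair, skipping rows that lack a numeric value."""
    points = []
    for row in rows:
        try:
            x = float(row[x_col])
            y = float(row[y_col])
        except (KeyError, TypeError, ValueError):
            # Missing column, None, or text such as "NA": leave the row out.
            continue
        points.append((x, y))
    return points


points = to_points(rows, "culmen_length_mm", "culmen_depth_mm")
print(points)
```

Each pair corresponds to one dot on the chart: the first number sets the horizontal position and the second sets the vertical position.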
Adding colors
To visually differentiate between the three penguin species, we can add a stroke color to the chart. Open the stroke color menu and select species. This will color-code the data points according to their species, making it easier to identify patterns and trends within each group.
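Color-coding by a categorical column can be thought of as a lookup from each distinct value to a color. Here is a minimal sketch of that idea. The palette values and the name build_color_scale are assumptions for illustration, and the colors DataPicta actually assigns may differ.

```python
# Hypothetical palette; any list of distinct colors would do.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c"]


def build_color_scale(categories, palette):
    """Assign one palette color per distinct category, in order of first appearance."""
    scale = {}
    for category in categories:
        if category not in scale:
            # Cycle through the palette if there are more categories than colors.
            scale[category] = palette[len(scale) % len(palette)]
    return scale


species_column = ["Adelie", "Adelie", "Chinstrap", "Gentoo", "Chinstrap"]
color_scale = build_color_scale(species_column, PALETTE)

# Each point takes the color of its species.
stroke_colors = [color_scale[s] for s in species_column]
print(color_scale)
```

The same lookup is what a legend later displays: each species name next to the color assigned to it.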
Adding tooltips
To make the chart interactive, we can add tooltips to the data points. A tooltip displays additional information about a specific point when you hover over it, without cluttering the chart. The tooltip setting lives under the style tab, not in the data panel: the properties under the style tab do not rely on the data, and tooltips don't change how the data is mapped onto the chart. So click on the style tab and then select tooltip to enable them. Now hover over the chart to see the tooltips.
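Conceptually, a tooltip is just a short text built from the row behind the hovered point. The sketch below illustrates this. The helper tooltip_text, the sample row, and the "name: value" layout are assumptions, not the format DataPicta uses.

```python
def tooltip_text(row, fields):
    """Build a multi-line tooltip with one "name: value" line per requested field."""
    return "\n".join(f"{name}: {row[name]}" for name in fields)


# Placeholder row standing in for the data point under the mouse cursor.
hovered_row = {
    "species": "Adelie",
    "culmen_length_mm": "39.1",
    "culmen_depth_mm": "18.7",
}

text = tooltip_text(hovered_row, ["species", "culmen_length_mm", "culmen_depth_mm"])
print(text)
```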
Adding a legend
This chart uses three scales: the X-axis, the Y-axis, and the color. While the X-axis and Y-axis have clear labels, the colors currently lack a description. Let’s add a legend to identify the meaning of each color.
To modify a scale, we need to add it first. Click Scales and select Color. This will create and open a new panel named Color. Click Legend. This will add a legend to the chart, providing a key for interpreting the colors and their corresponding penguin species.
And that’s it! You’ve created a beautiful and interactive visualization.