Gender, Social Interactions and Interests of Characters Illustrated in Scratch and Python Programming Books for Children

From an early age, girls may opt out of Computer Science (CS) for not fitting the CS stereotypes of being male, asocial and technology-oriented. These stereotypes might be strengthened by children’s books on programming, but little is known about this. Therefore, this paper explores the gender, social interactions and interests of characters illustrated in ten popular extracurricular Scratch and Python children’s books. We found more masculine than feminine characters in all but one book. Furthermore, nearly half of the characters are illustrated alone, and 15% are interacting with computers & robots. Over two-thirds of the characters fit at least one stereotypical trait. With this paper, we aim to create awareness of stereotypes in CS books among creators, publishers and buyers. Making and using more inclusive CS materials will help close the gender gap.


INTRODUCTION
In Computer Science (CS), women are a minority. In 2021, they represented 21.2% of computer scientists in the US [39] and 19.1% of ICT specialists in Europe [12]. If these numbers increase, more people will be available to fill the large number of ICT vacancies, which benefits the economy [17]. Moreover, more women would enter a field with in-demand and high-paying careers [20], which benefits gender equality. Furthermore, increased gender diversity in CS will likely result in less biased products [17], which benefits society.
If we would benefit from more women in CS, then why do we have a gender gap? One of the reasons is the stereotypical image of a computer scientist [6,7,23,24]: being male, asocial and technology-oriented [6,7,15]. These stereotypes develop in children from an early age, as found by analysing children's drawings of a computer scientist [15]. Since girls are less likely to fit the stereotypes, they are more likely to have a low sense of belonging and interest in CS [24].
The stereotypical image of CS is conveyed in multiple ways, including by the people in the field and the media [6,7]. Books and magazines are a source of inspiration for children when drawing scientists [33]. Moreover, Kerkhoven et al. [19] argue that "visual content of educational resources should be gender-balanced in order to prevent a stereotypical view of the roles of men and women [...] in science". Hence, characters illustrated in CS books for children should not reinforce CS stereotypes.
However, visual biases are found in children's books on Science, Technology, Engineering and Mathematics (STEM), which includes but is not limited to CS. These biases manifest in more men being depicted than women [10,27,28,30], as well as gender stereotyping in the activities and occupations illustrated [10,13,27,28,30].
Unfortunately, we know little about the stereotypes in CS books specifically. To our knowledge, and supported by Papadakis [30], almost no work has been published on stereotypes in CS books. Moreover, many analyses of STEM books focus on textbooks used in schools. However, CS is not taught at all in primary or secondary schools in approximately 60 to 70 countries [29]. In many more countries, CS is not a mandatory part of the curriculum. Therefore, we target extracurricular CS books for children.
To narrow the scope of our research, we focus on two commonly used programming languages in CS education [2,25,31]: Scratch and Python. Scratch is a visual language designed for children, while Python is a textual language not explicitly designed for children.
Thus, we examine stereotypes in extracurricular Scratch and Python books for children. In line with de Wit et al. [8], we focus on stereotypes related to gender, social skills, and interests. More specifically, we are interested in whether characters in the books fit the stereotypical traits of being male, working alone and being interested in computers. Similar to Kerkhoven et al. [19], we focus on characters illustrated in books. Additionally, we want to explore whether characters fit multiple stereotypical traits. This results in the following research question:

RQ. To what extent do characters illustrated in Scratch and Python books for children fit the stereotypical CS a) gender, b) social interactions, and c) interests traits?
To answer our research question, we analyse the characters illustrated in ten popular Scratch and Python books for children. We analyse the gender of the character, whether a character is illustrated individually or in a group, the type of character (e.g. a human or a robot) and what activity the character is doing.

BACKGROUND

Stereotypes on people working in CS
From an early age, children develop stereotypes about CS. At the age of 6, children think that boys are better than girls at robotics and programming [23]. At this age, they also believe that girls are less interested in CS than boys [24]. Moreover, asking children aged 8 to 11 to draw a computer scientist showed that they think a computer scientist is male, predominantly uses computers and works alone [15]. However, in another study, children aged 7 to 18 had the counter-stereotypical belief that programmers are social [8].
In the same study, children did believe that programmers are male and like to play video games. Other stereotypical hobbies include watching anime and programming, while non-stereotypical hobbies include playing sports and listening to music [7].
Why do children have this stereotypical image of people working in CS? This image is conveyed in multiple ways, including by the people in the field and the media [6,7]. In a study by Tan et al. [33], 9- and 10-year-olds listed their sources of inspiration when drawing a scientist. Of the 266 students, 34% mentioned books and magazines. Moreover, books help children to understand the social meaning of, among others, gender [4,27,33]. Thus, books potentially impact the development of CS stereotypes.

Stereotypical images in STEM books
Numerous studies have examined stereotypes and gender biases in the images of STEM books. Overall, men are more frequently depicted than women [10,27,28]. However, Spanish mathematics textbooks illustrate an almost equal number of men and women [13]. Additionally, men are more often portrayed as professionals than women [13,27,28]. Moreover, Nigerian STEM textbooks [10] as well as British and Irish chemistry textbooks [28] illustrate more men as scientific professionals. On the other hand, Spanish mathematics textbooks depict more women as STEM professionals [13]. Additionally, female characters are more often depicted as teachers [10], and illustrated in domestic activities [28].
Research that focuses specifically on stereotypes in CS books for children is limited. Papadakis [30] analysed three CS books used in Greek high schools. Their results align with findings for books in other STEM fields: men are depicted more often than women and appear in a greater number of occupational activities. Moreover, they found women illustrated as digital consumers but not as digital producers.

Materials
Since we are interested in popular extracurricular programming books, we based our selection on Amazon's Best Sellers: Best Children's Programming Books (accessed on April 18th 2023). This list contains the 100 most popular programming books for children based on sales. For both Scratch and Python, we selected the five highest-ranked books that met the following criteria: 1) being a physical book, 2) being written in English, and 3) focusing on the specific programming language. These criteria resulted in the selection below. [34]
We analysed every page in the body of the books and thereby excluded the cover, contents, foreword, glossary, index, acknowledgements, references and appendices. This resulted in 1,803 pages, with the number of pages per book listed in Table 1. Since we focus on the characters illustrated in the books, we excluded screen captures, example programs and layout components such as headers.
We identified illustrations as characters when they have a face, including both human and non-human characters. Additionally, we included characters with no face visible if they would have a face when drawn from a different perspective. We also included every occurrence of a character, even if they appear multiple times, because each illustration of the same character may differ. For instance, a character might read a book alone on one page while playing a video game with another character on another page.
We identified 1,639 characters. The number of characters per book is shown in Table 1. Two books, S5 and P2, have no illustrations of characters and are therefore not mentioned in the results.
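The selection rule can be sketched as a simple filter over a ranked list. This is a hypothetical illustration only: the paper describes a manual selection, and the `select_books` helper, the field names and the catalogue entries below are assumptions invented for the example, not data from the Best Sellers list.

```python
def select_books(ranked_books, language, limit=5):
    """Return the `limit` highest-ranked books that meet all three criteria.

    `ranked_books` must already be ordered by best-seller rank (best first).
    """
    selected = []
    for book in ranked_books:
        meets_criteria = (
            book["format"] == "physical"         # criterion 1: physical book
            and book["written_in"] == "English"  # criterion 2: written in English
            and book["focus"] == language        # criterion 3: focuses on the language
        )
        if meets_criteria:
            selected.append(book)
        if len(selected) == limit:
            break
    return selected


# Hypothetical catalogue entries, invented purely for illustration.
ranked = [
    {"title": "A", "format": "physical", "written_in": "English", "focus": "Scratch"},
    {"title": "B", "format": "ebook", "written_in": "English", "focus": "Scratch"},
    {"title": "C", "format": "physical", "written_in": "English", "focus": "Python"},
    {"title": "D", "format": "physical", "written_in": "English", "focus": "Scratch"},
]
scratch_titles = [b["title"] for b in select_books(ranked, "Scratch", limit=2)]
```

Because the list is walked in rank order, the first five matches per language are by construction the five highest-ranked books that satisfy the criteria.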

Measures
We based our measures on the guide 'Promoting Gender Equality through Textbooks' [4]. This guide describes a quantitative method to study textbooks on gendered identities and social roles. For illustrations, it suggests analysing characters' sex, age, actions and attributes. It also describes other features that can be analysed, such as occupational function and the interaction between characters. We took this guideline as a base and made adaptations to analyse the gender, social interactions and interests of the characters illustrated.

Gender.
We measured gender and not sex since we did not look at the biological differences but at the gender expression of the characters [4,5]. To ensure the objectivity of the collected data and minimise inference, we gathered the following data:
• Pronouns
• Masculinity and femininity of characters' appearance
We identified the pronouns of the characters by scanning the surrounding text for references to the characters using the male (he/him), female (she/her) or non-binary (they/them) pronouns.
To label characters' appearance as masculine or feminine, we used categories described by Halim et al. [14], which include dresses and skirts, gender-typed colours, patterns, formal wear and superhero references. We noted the prevalent colours of the clothing and non-human skin, which we refer to as the main colours. To ensure objective and consistent labelling, we used a colour wheel consisting of primary, secondary and tertiary colours with the addition of brown and pink. Furthermore, children consider long hair feminine and short hair masculine [8]. Combining these aspects results in four measures for masculine and feminine appearances: main colours, clothes, accessories and hairstyle.
We combined these four measures to categorise the appearance of characters as masculine or feminine. Each measure has an equal weight. Characters with a masculine appearance have more masculine than feminine aspects. Characters with a feminine appearance, on the other hand, have more feminine than masculine aspects. Characters with no or the same number of masculine and feminine aspects are categorised as neutral.
Characters fit the stereotypical CS gender trait if they are referred to with the male pronoun, or when no pronoun is found and their appearance is labelled as masculine.
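The labelling rules above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' tooling (the paper describes manual categorisation): the function names, the dictionary layout and the string labels are all invented for the example.

```python
from typing import Optional

# The four equally weighted appearance measures.
MEASURES = ("main_colours", "clothes", "accessories", "hairstyle")


def label_appearance(aspects: dict) -> str:
    """Label an appearance as masculine, feminine or neutral.

    `aspects` maps a measure to the set of gendered labels noted for it,
    e.g. {"hairstyle": {"masculine"}} (a hypothetical encoding).
    """
    masculine = sum("masculine" in aspects.get(m, set()) for m in MEASURES)
    feminine = sum("feminine" in aspects.get(m, set()) for m in MEASURES)
    if masculine > feminine:
        return "masculine"
    if feminine > masculine:
        return "feminine"
    return "neutral"  # no gendered aspects, or an equal number of both


def fits_gender_trait(pronoun: Optional[str], aspects: dict) -> bool:
    """Male pronoun, or no pronoun found and a masculine appearance."""
    if pronoun is not None:
        return pronoun == "male"
    return label_appearance(aspects) == "masculine"


# Hypothetical character: blue main colour, short hair, wearing a skirt.
example = {
    "main_colours": {"masculine"},
    "hairstyle": {"masculine"},
    "clothes": {"feminine"},
}
example_label = label_appearance(example)
```

Note the precedence in `fits_gender_trait`: a pronoun, when present, overrides appearance, so a character referred to as "she" does not fit the trait even with a masculine appearance.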

Social interactions.
To determine characters' social interactions, we categorised them as being illustrated alone, in proximity to others, or as having interactions with others. These interactions include looking at each other and performing an activity together.
Characters fit the stereotypical CS social interactions trait if they are illustrated alone.

Interests.
We measure interests in two ways: by the type of character and by the activity the character is doing. We defined the following type categories: animals, computers & robots, fantasy & history, humans, and others. We further specify types when applicable, such as a cat as a specification of an animal.
An activity is categorised in one or more of the following interests: arts, domestics, computers & robots, education, fantasy & history, food, music, outdoors, sports, STEM (excluding computers & robots), vehicles, and others. The categorisation is based on a character's role in society, objects they are interacting with, and physical activity (e.g. swimming). The others category includes characters for which we could not identify an activity.
Characters fit the stereotypical CS interests trait if they are a computer or robot, or engage in a computers & robots activity.
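As a rough sketch, the interests rule is a disjunction over the character's type and its activity interests. The function name and the way a character is encoded here are assumptions for illustration; only the category name comes from the text.

```python
COMPUTERS_AND_ROBOTS = "computers & robots"


def fits_interests_trait(character_type: str, activity_interests: set) -> bool:
    """True if the character is a computer/robot or does a related activity."""
    return (
        character_type == COMPUTERS_AND_ROBOTS
        or COMPUTERS_AND_ROBOTS in activity_interests
    )


# Hypothetical examples: a robot playing football, a human playing football,
# and a human working on a laptop.
robot_playing = fits_interests_trait("computers & robots", {"sports"})
human_playing = fits_interests_trait("humans", {"sports"})
human_coding = fits_interests_trait("humans", {"computers & robots", "education"})
```

Because an activity can fall into more than one interest category, the activity side is modelled as a set rather than a single value.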

Combinations of traits.
In addition to the individual stereotypical traits, we are interested in whether and how they combine. Therefore, we counted the occurrence of each combination of traits. We also looked at whether the social interactions and interests differ based on the characters' gender and whether the social interactions differ per interest.
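Counting combinations amounts to tallying, per character, which of the three traits hold. The sketch below assumes each character has already been reduced to three booleans; the key names and the toy records are hypothetical.

```python
from collections import Counter

TRAITS = ("gender", "social_interactions", "interests")


def trait_combination(character: dict) -> tuple:
    """Return the traits a character fits, as an ordered tuple."""
    return tuple(t for t in TRAITS if character[t])


def count_combinations(characters) -> Counter:
    """Tally how often each combination of traits occurs."""
    return Counter(trait_combination(c) for c in characters)


# Toy records, invented for illustration.
characters = [
    {"gender": True, "social_interactions": True, "interests": False},
    {"gender": True, "social_interactions": True, "interests": False},
    {"gender": False, "social_interactions": False, "interests": False},
    {"gender": True, "social_interactions": True, "interests": True},
]
combos = count_combinations(characters)
```

The empty tuple `()` collects characters that fit none of the traits, and the length of each key gives the number of traits in the combination.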

Procedure
We tested our coding scheme with books in our selection. For practical reasons, we used Dutch translations of some books (in the testing phase only). The first author and a research assistant discussed the first version of the coding scheme and analysed five pages together. After that meeting, the first author filled in the scheme for 10% of the pages from S1 (translation), S3, P4 and P5 (translation). A random generator determined the page numbers analysed. The research assistant also used the scheme for a selection of pages. Thereafter, we revisited our measures and defined standardised categories where possible. In this way, we created the final coding scheme in which we collected data as objectively as possible by only noting what we saw. Although still time-consuming, data collection with the final coding scheme was faster than with the initial one. The research assistant carried out the data collection, noting down questions or doubts and discussing these with the first author. After the data collection, the first author fine-tuned the categories and manually categorised masculine and feminine appearances and activity interests.
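Drawing a random 10% of the pages for the pilot can be sketched with Python's standard `random` module. The paper does not say which random generator was used, so the helper name, the page range and the seed below are assumptions.

```python
import random


def sample_pages(first_page: int, last_page: int, fraction: float = 0.10, seed=None):
    """Pick a random subset of page numbers, without repeats."""
    pages = range(first_page, last_page + 1)
    k = max(1, round(len(pages) * fraction))  # at least one page
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return sorted(rng.sample(pages, k))


# Hypothetical book body running from page 1 to page 200.
pilot_pages = sample_pages(1, 200, seed=42)
```

Fixing the seed is a design choice for reproducibility: a second coder can then be given exactly the same pages.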

RESULTS
We analysed 1,639 characters illustrated in five Scratch and five Python books on whether they fit one or more of the stereotypical CS gender, social interactions or interests traits.

Gender
We analysed the pronouns and appearances of the characters to determine whether characters fit the stereotypical CS gender trait of being male. Of all characters, 275 (or 17%) have blue as one of their main colours, while 250 (or 15%) have pink, magenta or purple as one of their main colours. When looking at the individual books, it stands out that none of the 124 characters in S3 has a feminine main colour, while 33 (or 27%) have masculine blue as one of their main colours.
We identified 20 (or 1%) of the characters with masculine clothes and 43 (or 3%) with feminine clothes. S3 and P4 have no characters with feminine clothing, while P1 and P3 have no masculine ones.
For the accessories, we identified 38 (or 2%) of the characters with at least one masculine accessory and 78 (or 5%) with at least one feminine accessory. Within S3, we identified one character with feminine and none with masculine accessories. In P4, only 1 character (or 2%) has at least one feminine accessory, while 16 (or 24%) have masculine accessories.
We found 460 (or 28%) of the characters with masculine hairstyles and 326 (or 20%) with feminine hairstyles. P4 and S3 have no characters with feminine hair. S3 also has only 4 (or 3%) of its characters with masculine hair. P3 has the most characters with a masculine hairstyle (137 or 39%) and with a feminine hairstyle (100 or 28%).
Based on the pronouns and appearances, we labelled 537 (or 33%) masculine, 390 (or 24%) feminine, and 712 (or 43%) neutral characters. The labelling per book is shown in Figure 1a.

Social interactions
We found that 773 (or 47%) of the characters are illustrated alone. Furthermore, 296 (or 18%) are illustrated within proximity of others, while 570 (or 35%) of the characters interact with others.
In S1, S2, S3, S4 and P4, at least half of the characters are illustrated alone. S3 and P4 have relatively the most characters that fit the stereotypical social interactions trait: 69% and 73% of their characters, respectively, are illustrated alone. However, in P1, P3 and P5, more than 60% of the characters are illustrated together. In P1, relatively the most characters interact with others (22 or 58%). The percentages of social interactions in each book can be found in Figure 1b.

Interests
We analysed interests based on the characters' types and activities.
We identified 678 characters who engage in an activity indicating an interest in arts, computers & robots, domestics, education, fantasy & history, food, music, outdoors, sports, STEM and/or vehicles. Of the 985 characters with other interests, 640 (or 65%) are not interacting with any object. The main activity of characters in the others category is standing (n=624 or 63%). The most popular activity interests are sports (n=134), computers & robots (n=121), and outdoors (n=95), followed by music (n=72) and food (n=55). Of the characters interested in computers & robots, 15 (or 12%) interact with a game controller. Moreover, 65 (or 54%) of the characters interested in computers & robots are illustrated with a monitor, computer and/or laptop.
When combining the type and activity, 246 (or 15%) characters fit the trait of being interested in computers & robots. As shown in Figure 1c, 78% of the characters illustrated in S3 have an interest in computers & robots. This is mostly caused by many characters being robots.

Combination of traits
Of the characters, 531 (or 32%) do not fit any stereotypical CS traits. In contrast, 715 (or 44%) fit one stereotypical trait, 338 (or 21%) fit two stereotypical traits, and 55 (or 3%) fit all three stereotypical traits analysed. Figure 2 shows the combinations of traits per book, while Figure 3 includes some examples of characters with two or three stereotypical traits. S3 has the most characters that fit two stereotypical traits (61 or 49%) and all three traits (24 or 19%). On the other hand, both S4 and P1 have no characters that fit all three stereotypical traits. P1 also has the lowest percentage (4 or 11%) of characters with two stereotypical traits.
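The overall shares reported above can be recomputed from the counts as a quick consistency check. The snippet uses only the four counts given in the text; the variable names are our own.

```python
# Number of characters by how many stereotypical traits they fit.
counts = {0: 531, 1: 715, 2: 338, 3: 55}

total = sum(counts.values())
shares = {traits: round(100 * n / total) for traits, n in counts.items()}

# Characters fitting at least one stereotypical trait.
at_least_one = total - counts[0]
at_least_one_share = at_least_one / total
```

The four counts add up to the 1,639 characters analysed, and the share fitting at least one trait (1,108 of 1,639) is just over two-thirds.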
Of the characters that fit two stereotypical traits, the combination of being masculine and alone is most common with n=270. Furthermore, relatively more masculine characters are illustrated alone (50%) than feminine characters (43%). Moreover, feminine characters interact more often with others (40%) than masculine characters (34%).
We identified 87 (or 5%) of all characters as masculine and interested in computers & robots. Of the characters interested in computers & robots, 49 (or 40%) are labelled masculine and 40 (or 33%) feminine. However, when looking at these numbers relative to gender, 9% of the masculine and 10% of the feminine characters are interested in computers & robots. When only looking at characters of the computer & robot type, we identified 38 (or 28%) masculine characters and 5 (or 4%) feminine ones.
The activity categories with the most masculine characters are STEM, music and arts. Of the characters interested in STEM, 24 (or 67%) are masculine and 5 (or 14%) feminine. Of the characters interested in music, 37 (or 51%) are masculine and 13 (or 18%) feminine. Of the characters interested in arts, 16 (or 44%) are masculine, and 10 (or 28%) are feminine. The activity categories with more feminine than masculine characters are fantasy & history, with 16 (or 43%) feminine and 11 (or 30%) masculine characters, and vehicles, with 11 (or 38%) feminine and 10 (or 34%) masculine characters.
There are 146 (or 9%) characters who are illustrated alone and have an interest in computers & robots. Of these, 81 (or 55%) are of the computer & robot type, 21 (or 14%) are sitting behind a desk or on a chair, and 6 (or 4%) are playing video games. Of the 246 characters interested in computers & robots, 146 (or 59%) are illustrated alone, 34 (or 14%) are in proximity of each other, and 66 (or 27%) interact with others. The activity categories with relatively the highest percentage of characters illustrated alone are domestics (91%), arts (69%) and STEM (69%). The category with relatively the lowest percentage of characters illustrated alone is sports (35%).
Of the 55 characters that fit all three stereotypical traits, 21 (or 35%) are a (partly) blue robot standing alone. Of the characters fitting all three traits, 26 (or 47%) are human. Most of these humans (23 or 88%) interact with a computer or robot-related object.

DISCUSSION
If we want children to have a less stereotypical image of who works in CS, we should ensure CS books do not reinforce stereotypes or even counteract them. However, little is known about stereotypes in CS books for children. Therefore, we examined whether characters illustrated in ten popular extracurricular programming books fit the stereotypical CS gender, social interactions and interests traits.

Interpretation of the results
The books in our selection varied in the number of characters illustrated, with two books (S5 and P2) having no characters. Having no characters does imply that there are no characters with stereotypical traits. However, having no characters could convey that CS does not involve others. Moreover, if children who use these books already have a stereotypical image of CS, then there are no counter-stereotypes to change this image.
We found more masculine than feminine characters, which is in line with findings in other STEM books [10,27,28,30]. However, for some books, the difference is not that big, and one book even has more feminine than masculine characters. Finding some gender balance in books is expected since gender diversity in CS has been a topic of interest for many years. However, the percentage of masculine characters in the individual books ranges from 29 to 36, while the percentage of feminine characters ranges from 1 to 34. So, some books have a similar number of masculine and feminine characters, and some books have masculine characters but almost no feminine characters. However, there are no books that have feminine characters but almost no masculine characters. In other words, there are books targeting boys and books that are gender balanced, but there are no books targeting girls. Therefore, we argue that there should also be books focused on girls to serve a more diverse group of children. In this way, girls can be introduced to CS in a feminine context, which is shown to help in developing interest in science subjects [18]. However, we only looked at a selection of books, and it could be that there are CS books targeted at girls but that they are less popular on Amazon.
Figure 3: Characters fitting two or three stereotypical traits. (a) Stereotypical gender and social interactions in S2 [38]. (b) Stereotypical gender and interests in S1 [37]. (c) Stereotypical social interactions and interests in S3 [36]. (d) Stereotypical gender, social interactions and interests in P4 [3].

We also looked at the combination of gender and the other two traits. We found that masculine characters are more often illustrated alone than feminine characters. This aligns with women more frequently having family and occupational roles that require socially skilled behaviours [11]. For gender and interests, we found a gender balance when looking at the percentage of masculine and feminine characters being interested in computers & robots. However, for the characters interested in (other) STEM fields, there is a gender difference. A possible explanation could be that the creators of the books are more conscious about representing both men and women in computer-related fields, since this is the focus of the books, but are less conscious about other STEM biases. We did not find gender biases in the characters interested in education and domestics, in contrast to others [10,28].
The stereotypical social interactions trait is most common, with almost half of the characters illustrated alone. Similar to the gender trait, some books show a more balanced image of this trait than others. Previous work shows that children who believe that programmers are social are among those with an increased interest in a CS career [9]. It is thus important to illustrate more characters together than alone, especially when they are doing CS-related activities.
All books include characters with an interest in computers & robots, but this is only one of the many interests identified. The most popular activity interest is sports, which is one of the counter-stereotypical hobbies [7]. Moreover, characters frequently engage in music activities, which is another counter-stereotypical hobby [7].
Although we did look at gender, there are more diversity aspects to consider, but these were outside the scope of this research. However, we do want to share some of our observations. We identified the majority of the human characters as light-skinned (n=518) and a minority as medium-skinned (n=165) or dark-skinned (n=99). Moreover, we did not identify any characters with a visible disability.

Limitations
We only looked at ten books based on Amazon Best Sellers. Although Amazon is a popular platform, we do not know if these books represent the ones children are exposed to. Including other sources, such as libraries, might result in a different selection of books.
In this study, we focused on illustrations and excluded text. Although the text might reinforce the same conclusion, it could also contradict the illustrations [4]. However, illustrations take up more space on the page and give a direct portrait of gender roles [4,19]. Moreover, we found only limited references to the illustrations when collecting pronouns. This might be an indication that not many characters appear in the text.
Since we collected data manually, there might be some human errors in the data. However, given the number of characters, these are unlikely to affect the overall conclusions. Moreover, we made the data collection as objective as possible, as described in Section 3.3. This did result in excluding age from our study since this was difficult to identify objectively, given the different drawing styles. Furthermore, the classification of masculine and feminine is likely culturally biased since every society develops its classification based on its own criteria and principles [4].

CONCLUSION
We researched whether the characters within extracurricular CS books for children fit the stereotypical CS gender, social interactions and interests traits. Understanding this is important since stereotypes are a barrier girls face in pursuing a CS career.
We found books targeting boys and books that are gender balanced, but no books targeting girls. Furthermore, almost half of the characters are illustrated alone, and 15% of the characters have an interest in computers & robots. The most common combination of stereotypical traits is gender (being male) and social interactions (being alone).
For future work, we suggest analysing the text, including the exercises and example programs, and the covers of the books since they might contain different stereotypes [1,4]. Moreover, this could reveal (gender) biases in the connotations and attitudes of characters [13,26]. Additionally, we did not consider the background of the creators of the books, which would be interesting to study. Furthermore, we suggest looking for potential biases in other sources within CS education, such as the software and programming languages used to teach programming to children. Lastly, we suggest developing software to detect stereotypical traits in text and illustrations. This can be utilised to analyse more books and can aid creators of CS materials in identifying stereotypes and biases.
With our work, we want to raise awareness of stereotypes in CS children's books among creators, publishers and buyers. Making and using CS materials with a wide variety of characters can spark CS interests in a diverse group of students, including girls.
Main colours
  Masculine: blue
  Feminine: pink, purple, magenta
Clothes
  Masculine: suit, superhero outfits, masculine patterns
  Feminine: dress, skirt, feminine patterns and decorations
Accessories
  Masculine: tie, facial hair, bow, cape
  Feminine: jewellery, makeup, hair accessories, shawl, feminine shoes
Hairstyle
  Masculine: hair length above the ears
  Feminine: hair length below the ears, updo
  Neutral: no hair, afro

Figure 1: The percentage of characters fitting stereotypical traits for all books and per book

Figure 2: The percentage of characters that fit two or three stereotypical CS traits in total and per book

Table 1: Number of pages analysed and characters identified per book