Algorithmic Bias and Visual Perspectives: How AI 'Sees' the World

Acknowledgement: This lesson is derived from the transcript of video(s) created by UNSW.
Learning Objectives
  1. Analyze how training datasets influence the output and cultural bias of generative AI models.
  2. Compare and contrast Western linear perspective with non-Western visual systems (e.g., Mughal, Indian Miniature, Indigenous).
  3. Evaluate the concept of the 'Black Box' in machine learning and its implications for accountability.
  4. Discuss the relationship between text prompts and visual limitations, specifically how the 'sayable' constrains the 'visible'.
  5. Propose ethical considerations for the future development of inclusive AI technologies.
Key Topics

Data Bias and the 'Default' Image

Generative AI models, such as DALL-E or Midjourney, learn to create images by analyzing billions of existing image-text pairs. These datasets are not neutral; they are drawn predominantly from Western media. As a result, when a user prompts for a generic concept like a 'beautiful garden,' the model falls back on the most common patterns in its training data, often a manicured English or French garden, and ignores other traditions such as Mughal or Zen gardens. This reinforces specific cultural norms as 'universal' or 'natural' while rendering others invisible. In STEM, understanding the composition of training data is crucial because it shapes both the accuracy and the cultural relevance of the output.
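
The 'default image' effect can be illustrated with a toy count. The sketch below is only an illustration: the style labels and their proportions are invented, and a real generative model does not literally pick the most frequent label. It simply shows how an unevenly composed dataset makes one style the most likely answer to a generic prompt.

```python
from collections import Counter

# Hypothetical style labels for images captioned "garden". The proportions
# are invented for illustration and do not describe any real dataset.
TOY_DATASET = (
    ["english"] * 55 + ["french"] * 25 + ["zen"] * 12 + ["mughal"] * 8
)

def style_shares(labels):
    """Return each style's share of the dataset as a fraction of 1."""
    counts = Counter(labels)
    total = len(labels)
    return {style: n / total for style, n in counts.items()}

def default_style(labels):
    """Return the most frequent style, a stand-in for the model's 'default'."""
    return Counter(labels).most_common(1)[0][0]

shares = style_shares(TOY_DATASET)
print(shares)
print(default_style(TOY_DATASET))
```

With these invented numbers, a generic 'garden' request lands on the English style, while the Mughal style, at 8 of 100 images, rarely surfaces unless it is named explicitly.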

Further Inquiry

Australian research institutions and government bodies actively study data ethics and the impact of algorithmic bias on society.

Search Terms
  • "Artificial Intelligence Ethics Framework Australia"
  • "Algorithmic bias in data sets"
  • "Human rights and technology"

The Geometry of Vision: Linear vs. Multi-Point Perspective

Computer vision and AI image generation are deeply rooted in mathematical patterns. Since the Renaissance, Western art has prioritized 'linear perspective,' in which parallel lines receding from the viewer converge at a vanishing point, simulating a viewer standing in one fixed spot. This is also how cameras (and thus most training data) record the world. In contrast, other traditions, such as Indian miniature painting, use 'floating' or multi-point perspectives. These allow a viewer to see a scene from above and from the side simultaneously, or to see a character in multiple places within one frame to show the passage of time. Current AI struggles to replicate these non-Western spatial geometries because it models vision on the linear patterns that dominate its training data.
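
The geometry behind linear perspective can be sketched in a few lines. The example below assumes a simple pinhole camera (the function names and the rail coordinates are hypothetical) and contrasts it with a parallel projection, in which depth does not shrink objects. The parallel projection is only a stand-in for a non-converging system; it is not a model of Indian miniature painting.

```python
def perspective_project(point, focal_length=1.0):
    """Pinhole projection: screen position shrinks as depth z grows."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)

def parallel_project(point):
    """Parallel (orthographic) projection: depth is simply dropped."""
    x, y, _ = point
    return (x, y)

# Two parallel rails on the ground, one unit to the left and right of the viewer.
depths = [1.0, 10.0, 100.0, 1000.0]
left_rail = [(-1.0, -1.0, z) for z in depths]
right_rail = [(1.0, -1.0, z) for z in depths]

for left, right in zip(left_rail, right_rail):
    persp_gap = perspective_project(right)[0] - perspective_project(left)[0]
    para_gap = parallel_project(right)[0] - parallel_project(left)[0]
    print(f"depth {left[2]:>6}: perspective gap = {persp_gap:.4f}, "
          f"parallel gap = {para_gap:.1f}")
```

Under the pinhole projection the gap between the rails shrinks toward zero as depth increases, which is the vanishing point. Under the parallel projection the gap stays constant, so no single viewpoint is implied.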

Further Inquiry

To understand different visual systems and Indigenous mapping, Australian cultural institutions provide resources on non-Western art history and First Nations knowledge systems.

Search Terms
  • "Asian art collection perspectives"
  • "Aboriginal Songlines and mapping"
  • "Visual storytelling in Indigenous art"

The Black Box and the Limits of Text-to-Image

Modern AI image generators rely on 'diffusion models,' which start from random noise and refine it step by step into a clear image guided by a text prompt. These systems are often called 'black boxes' because their internal decision-making processes are opaque, often proprietary, and difficult to audit. Furthermore, because these models are trained on paired text and images, they impose a constraint in which the 'visible is enslaved to the sayable.' If an experience (like the sound and temperature of a garden) cannot be easily described in words or captured in a dataset, the AI cannot generate it. This limits AI's ability to capture the 'ineffable' or untranslatable aspects of human experience.
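
The noise-to-image idea can be sketched with a toy loop. This is not a real diffusion model: a real system uses a trained neural network to predict and remove noise, whereas the lookup table, prompts, and three-number 'images' below are all invented for illustration. The sketch shows two things from the paragraph above: refinement from random noise, and the dependence on what has been paired with text.

```python
import random

# Hypothetical "learned" associations between a prompt and a target pattern.
# A real model learns these from data; this table is an invented stand-in.
LEARNED_TARGETS = {
    "english garden": [0.9, 0.1, 0.1],
    "zen garden": [0.1, 0.9, 0.1],
}

def toy_denoise(prompt, steps=50, step_size=0.2, seed=0):
    """Start from random noise and nudge it toward the prompt's target."""
    if prompt not in LEARNED_TARGETS:
        # The toy has nothing to steer toward for text it has never seen.
        raise KeyError(f"no learned association for prompt: {prompt!r}")
    target = LEARNED_TARGETS[prompt]
    rng = random.Random(seed)
    sample = [rng.gauss(0.0, 1.0) for _ in target]  # pure noise
    for _ in range(steps):
        # Each step removes a fraction of the remaining distance to the target.
        sample = [s + step_size * (t - s) for s, t in zip(sample, target)]
    return sample

print(toy_denoise("zen garden"))
```

A real generator would still return some image for an unfamiliar prompt, typically one that drifts toward familiar patterns. The error raised here only makes the gap explicit: anything never paired with words in the training data gives the process no direction.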

Further Inquiry

Several Australian organizations are dedicated to the responsible development of machine learning and the regulation of digital technologies.

Search Terms
  • "Responsible AI development"
  • "Explainable AI (XAI)"
  • "Future of machine learning in Australia"
Knowledge Check
1. According to the transcript, what type of garden is predominantly generated by AI when prompted with 'beautiful garden'?
2. Why do AI systems reinforce Western aesthetic conventions?
3. Approximately how many images were generated by AI in the year following the DALL-E 2 beta launch?
4. What is the 'Black Box' problem referred to in the lesson?
5. How does the perspective in Indian miniature paintings differ from Western linear perspective?
6. What does the speaker mean by 'the visible is enslaved to the sayable'?
7. What architectural feature is central to the Mughal garden described?
8. The transcript mentions 'AI slop'. What does this refer to?
9. What is a potential consequence of allowing AI to flatten visual diversity into a single perspective?
10. What defines the Western perspective established during the Renaissance?