Automated data collection relies heavily on two technologies: web crawlers and web scrapers. While the two are often used together, they serve different purposes. Web crawlers, like those used by Google, are scripts that systematically browse the internet to index content, navigating much as a human reader would by following links from page to page to map where information is located. In contrast, web scrapers extract specific types of data from specific sources (e.g., prices from a marketplace or tweets from a profile) and store them in a structured format for analysis. These tools operate at scale, collecting vast amounts of data, from text posts to biometric information, and eliminating the manual effort such collection would otherwise require.
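To make the distinction concrete, here is a minimal sketch pairing a toy crawler with a toy scraper. The start URL, page structure, and CSS selectors (`div.product`, `span.price`) are hypothetical placeholders, and `requests` with BeautifulSoup stands in for whatever HTTP and parsing stack a real project would use.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def crawl(start_url: str, max_pages: int = 10) -> list[str]:
    """Crawler role: follow links from page to page to map where content lives.

    A production crawler would also restrict itself to permitted domains and
    honour robots.txt; this sketch only illustrates the link-following loop.
    """
    seen, queue = set(), [start_url]
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
        for link in soup.find_all("a", href=True):
            queue.append(urljoin(url, link["href"]))
    return sorted(seen)


def scrape_prices(url: str) -> list[dict]:
    """Scraper role: extract one specific kind of data as structured records.

    The selectors below assume a hypothetical marketplace layout.
    """
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    return [
        {
            "product": item.select_one("h2").get_text(strip=True),
            "price": item.select_one("span.price").get_text(strip=True),
        }
        for item in soup.select("div.product")
    ]


if __name__ == "__main__":
    for page in crawl("https://example.com"):  # crawler discovers the pages
        print(scrape_prices(page))             # scraper extracts the data
```

The division of labour is the point: the crawler only discovers *where* information is, while the scraper pulls *what* is wanted into rows ready for analysis.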
Data scraping is a powerful tool across many sectors. In market research, companies scrape reviews and social media posts to understand customer demographics and sentiment, enabling highly targeted advertising. In criminology, researchers and law enforcement scrape data from the open web and the dark web to identify security risks and understand criminal behaviors, such as the sale of illicit goods or the dynamics of hacker forums. Scraping also yields 'social insights': analyzing public discourse on platforms like Twitter during elections or major events to gauge public opinion. The same ease of access, however, means that personal data, including biometrics, location history, and financial habits, can be aggregated into detailed profiles of individuals.
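As a hedged illustration of that 'social insights' step, the sketch below turns a handful of scraped posts into a structured dataset with a crude lexicon-based sentiment score. The posts and word lists are toy stand-ins for real scraped data, and lexicon scoring is only one of many techniques used in practice.

```python
import csv

# Toy sentiment lexicon; real analyses use far larger lexicons or trained models.
POSITIVE = {"great", "win", "support", "good"}
NEGATIVE = {"bad", "lose", "corrupt", "angry"}

# Stand-in for posts a scraper returned from a platform during an election.
posts = [
    {"user": "a", "text": "Great debate, strong support for the policy"},
    {"user": "b", "text": "Bad night, angry crowd, corrupt process"},
]


def score(text: str) -> int:
    """Count positive words minus negative words as a rough sentiment signal."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


# Store the scored posts in a structured format ready for aggregate analysis.
with open("sentiment.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["user", "text", "sentiment"])
    writer.writeheader()
    for post in posts:
        writer.writerow({**post, "sentiment": score(post["text"])})
```

The same pipeline shape (scrape, score, aggregate) is what makes large-scale profiling possible: once posts are rows in a table, they can be joined with any other scraped attribute of the same individuals.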