Google Maps Data Scraping Using Selenium in PHP

Google Maps is a valuable source of information for businesses, marketers, and developers. Whether you’re looking for local business data, reviews, or geographic coordinates, scraping Google Maps can help. While Python is the more common choice for web scraping, this guide focuses on scraping Google Maps data using Selenium in PHP. Selenium is a browser automation tool; driven from PHP, it can extract content that pages like Google Maps render dynamically with JavaScript.

What You’ll Learn

  • Setting up Selenium in PHP
  • Navigating Google Maps using Selenium
  • Extracting business data (names, addresses, ratings, etc.)
  • Handling pagination
  • Tips for avoiding being blocked

Prerequisites

Before diving into the code, make sure you have:

  • PHP installed on your machine
  • Composer installed for dependency management
  • Basic understanding of PHP and web scraping concepts

Step 1: Setting Up Selenium and PHP

First, you need to install Selenium WebDriver and configure it to work with PHP. Selenium automates browser actions, making it perfect for scraping dynamic websites like Google Maps.

Install Composer if you haven’t already:

    curl -sS https://getcomposer.org/installer | php
    sudo mv composer.phar /usr/local/bin/composer
    

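Install the PHP WebDriver package. It was originally published as facebook/webdriver and is now maintained as php-webdriver/webdriver; the Facebook\WebDriver namespace used in the code below is unchanged:

    composer require php-webdriver/webdriver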
    

Download the ChromeDriver build that matches your Chrome browser version and make sure it is on your PATH. Then start the Selenium standalone server, which listens on port 4444 by default:

    java -jar selenium-server-standalone.jar
    

Now that Selenium and WebDriver are set up, we can begin writing our script to interact with Google Maps.
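Before writing the scraper, it can be worth confirming that the server is actually reachable. The sketch below is one way to do that, assuming the server exposes the WebDriver status endpoint at http://localhost:4444/wd/hub/status and that the response JSON carries a value.ready flag, as the W3C WebDriver status command describes. isSeleniumReady is a hypothetical helper name, not part of any library.

```php
<?php
// Hypothetical helper: decide from a status response body whether the
// Selenium server reports itself ready to create sessions.
function isSeleniumReady(string $statusJson): bool
{
    $data = json_decode($statusJson, true);
    return is_array($data) && ($data['value']['ready'] ?? false) === true;
}

// Assumed endpoint for a server started as above; adjust if yours differs.
$statusUrl = 'http://localhost:4444/wd/hub/status';
$context = stream_context_create(['http' => ['timeout' => 2]]);
$body = @file_get_contents($statusUrl, false, $context); // false if unreachable

if ($body !== false && isSeleniumReady($body)) {
    echo "Selenium server is ready\n";
} else {
    echo "Selenium server is not reachable or not ready\n";
}
```

If the check reports that the server is not ready, the session created in the next step would fail as well, so this is a cheap way to separate setup problems from scraping problems.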

Step 2: Launching a Browser and Navigating to Google Maps

Once Selenium is configured, the next step is to launch a Chrome browser and open Google Maps. Let’s start by initializing the WebDriver and navigating to the website.

    <?php
    require 'vendor/autoload.php'; // Include Composer dependencies
    
    use Facebook\WebDriver\Remote\RemoteWebDriver;
    use Facebook\WebDriver\Remote\DesiredCapabilities;
    use Facebook\WebDriver\WebDriverBy;
    use Facebook\WebDriver\WebDriverExpectedCondition;
    use Facebook\WebDriver\WebDriverKeys;
    
    $host = 'http://localhost:4444/wd/hub'; // URL of the Selenium server
    $capabilities = DesiredCapabilities::chrome();
    
    // Start a new WebDriver session
    $driver = RemoteWebDriver::create($host, $capabilities);
    
    // Open Google Maps
    $driver->get('https://www.google.com/maps');
    
    // Wait up to 10 seconds for the search input to appear, then run a search
    $searchBox = $driver->wait(10)->until(
        WebDriverExpectedCondition::presenceOfElementLocated(WebDriverBy::id('searchboxinput'))
    );
    $searchBox->sendKeys('Restaurants in New York');
    $searchBox->sendKeys(WebDriverKeys::ENTER);
    
    // Give the results panel time to load (a fixed pause, kept simple here)
    sleep(3);
    
    // Further code for scraping goes here...
    
    ?>
    

This code:

  • Launches the Chrome browser using Selenium WebDriver.
  • Navigates to Google Maps.
  • Searches for “Restaurants in New York” using the search input field.
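A fixed sleep(3) either wastes time or is too short on a slow connection. A polling wait is a common alternative, sketched below as a plain PHP function. waitUntil is a hypothetical helper written for this illustration, and the condition is a stand-in so the snippet runs without a browser. In a real script the condition might be a closure that returns $driver->findElements(...) for the result cards, since an empty array is falsy. php-webdriver’s built-in $driver->wait()->until() covers the same need.

```php
<?php
// Hypothetical helper: call $condition repeatedly until it returns a truthy
// value or $timeoutSeconds elapse. Returns that value, or null on timeout.
function waitUntil(callable $condition, float $timeoutSeconds = 10.0, int $intervalMs = 250)
{
    $deadline = microtime(true) + $timeoutSeconds;
    do {
        $value = $condition();
        if ($value) {
            return $value;
        }
        usleep($intervalMs * 1000); // pause between polls
    } while (microtime(true) < $deadline);

    return null;
}

// Stand-in condition: it succeeds on the third poll.
$polls = 0;
$found = waitUntil(function () use (&$polls) {
    $polls++;
    return $polls >= 3 ? 'results' : false;
}, 2.0, 10);

echo $found === null ? "timed out\n" : "ready after $polls polls\n";
```

The helper returns as soon as the condition holds, so a fast page costs only a few polls while a slow one still gets the full timeout.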

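Step 3: Extracting Business Data

After the search results load, we need to extract information like business names, ratings, and addresses. These details are displayed in a list, and you can target them by CSS class. Google changes these class names regularly, so treat the selectors below as placeholders and confirm the current ones in your browser’s developer tools.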

    <?php
    // Assuming $driver has already navigated to the search results
    // and WebDriverBy is imported as in Step 2.
    
    // Return the text of the first element matching $selector inside $parent,
    // or 'N/A' if nothing matches. findElement() throws NoSuchElementException
    // on a miss instead of returning null, so findElements() is used here.
    function textOrNA($parent, string $selector): string
    {
        $matches = $parent->findElements(WebDriverBy::cssSelector($selector));
        return $matches ? $matches[0]->getText() : 'N/A';
    }
    
    // Find the result cards currently on the page
    $results = $driver->findElements(WebDriverBy::cssSelector('.section-result'));
    
    // Loop through each result and extract data
    foreach ($results as $result) {
        $name = textOrNA($result, '.section-result-title span');
        $rating = textOrNA($result, '.cards-rating-score');
        $address = textOrNA($result, '.section-result-location');
    
        // Output the extracted data
        echo "Business Name: $name\n";
        echo "Rating: $rating\n";
        echo "Address: $address\n";
        echo "---------------------------\n";
    }
    ?>
    

Here’s what the script does:

  • It collects the result cards that have already loaded, using the .section-result selector.
  • It loops through each card and extracts the name, rating, and address using their corresponding CSS selectors.
  • Finally, it prints out the extracted data.
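The rating comes back as text, which is awkward to sort or filter on. The sketch below shows one way to turn it into a number. parseRating is a hypothetical helper, and it assumes ratings are on Google’s 0 to 5 scale and may use either a dot or a comma as the decimal separator depending on the page locale.

```php
<?php
// Hypothetical helper: turn scraped rating text into a float, or null when
// the text holds no usable rating (for example the 'N/A' placeholder).
function parseRating(string $text): ?float
{
    // Accept "4.5" as well as the comma decimal "4,5" used in some locales.
    if (!preg_match('/\d+(?:[.,]\d+)?/', $text, $match)) {
        return null;
    }
    $rating = (float) str_replace(',', '.', $match[0]);

    // Anything outside the assumed 0 to 5 scale is treated as not a rating.
    return ($rating >= 0.0 && $rating <= 5.0) ? $rating : null;
}

var_dump(parseRating('4.5'));
var_dump(parseRating('4,7'));
var_dump(parseRating('N/A'));
```

Returning null rather than 0 keeps a missing rating distinguishable from a genuinely low one.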

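Step 4: Handling Pagination

If the results are split across pages, scraping all of them means detecting the “Next” button and clicking it until there are no more pages. As with the result selectors, the button’s class name below is auto-generated and may differ in the layout you see, so check it in your browser’s developer tools.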

    <?php
    $hasNextPage = true;
    
    while ($hasNextPage) {
        // Extract business data from the current page
        $results = $driver->findElements(WebDriverBy::cssSelector('.section-result'));
        foreach ($results as $result) {
            // Extraction logic from the previous section...
        }
    
        // Click the "Next" button if it exists and is enabled; otherwise stop
        try {
            $nextButton = $driver->findElement(WebDriverBy::cssSelector('.n7lv7yjyC35__button-next-icon'));
            if ($nextButton->isEnabled()) {
                $nextButton->click();
                sleep(3);  // Wait for the next page to load
            } else {
                $hasNextPage = false;  // A disabled button means no further page
            }
        } catch (\Facebook\WebDriver\Exception\NoSuchElementException $e) {
            // Fully qualified so the catch matches without an extra "use" import
            $hasNextPage = false;  // Exit loop if "Next" button is not found
        }
    }
    ?>
    

This script handles pagination by:

  • Continuously scraping data from each page.
  • Clicking the “Next” button (if available) to navigate to the next set of results.
  • Looping through all available pages until no more “Next” button is found.
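A loop that relies only on the “Next” button disappearing can run for a long time if the button never goes away. One way to bound it is a page cap, sketched below with stand-in callbacks so it runs without a browser. scrapeAllPages and its parameters are hypothetical names for this illustration. In a real script, $scrapePage would hold the extraction loop and $goToNext would click the button and return false when it is missing.

```php
<?php
// Hypothetical helper: run $scrapePage for each page, advancing with $goToNext
// until it returns false or $maxPages is reached. Returns pages visited.
function scrapeAllPages(callable $scrapePage, callable $goToNext, int $maxPages = 20): int
{
    $page = 0;
    while ($page < $maxPages) {
        $page++;
        $scrapePage($page);
        if (!$goToNext()) {
            break; // no further page available
        }
    }
    return $page;
}

// Stand-ins: pretend there are 3 pages, so "Next" can be clicked twice.
$remaining = 2;
$seen = [];
$visited = scrapeAllPages(
    function (int $page) use (&$seen) { $seen[] = $page; },
    function () use (&$remaining) { return $remaining-- > 0; }
);

echo "Visited $visited pages\n";
```

Keeping the loop separate from the Selenium calls also makes the stopping logic easy to check without opening a browser.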

Step 5: Tips for Avoiding Blocks

Google Maps has anti-scraping measures, and scraping it aggressively could lead to your requests being blocked. Here are a few tips to help avoid detection:

Use Random Delays: Scraping too fast is a red flag for Google. Add random delays between actions to simulate human behavior.

    sleep(rand(2, 5)); // Random delay between 2 and 5 seconds
    

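Rotate User-Agents: Vary the user-agent string so that every session does not present the same browser signature. Overriding navigator.userAgent with JavaScript changes only what page scripts see, not the User-Agent header sent with each request, so pass the string to Chrome as a launch argument before creating the session (this needs use Facebook\WebDriver\Chrome\ChromeOptions; at the top of the script):

    $options = (new ChromeOptions())->addArguments(['--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64)']);
    $capabilities->setCapability(ChromeOptions::CAPABILITY, $options);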
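To rotate rather than fix the string, one option is to pick a user agent at random for each new browser session. The sketch below assumes a small hand-maintained pool. The strings are illustrative placeholders and pickUserAgent is a hypothetical helper, not part of php-webdriver. The result is formatted as Chrome’s --user-agent launch argument.

```php
<?php
// Hypothetical helper: choose one user-agent string at random from a pool.
function pickUserAgent(array $userAgents): string
{
    if ($userAgents === []) {
        throw new InvalidArgumentException('At least one user agent is required.');
    }
    return $userAgents[array_rand($userAgents)];
}

// Illustrative pool; replace with strings for browsers you actually emulate.
$userAgents = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
];

// Build the launch argument for one new session.
$chromeArgument = '--user-agent=' . pickUserAgent($userAgents);
echo $chromeArgument, "\n";
```

Because the argument applies to the whole browser process, rotation happens per session: start a new session with a fresh argument rather than trying to change it mid-run.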
    

Proxies: If you’re scraping large amounts of data, consider rotating proxies to avoid IP bans.
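A simple rotation scheme is round-robin: each new browser session takes the next proxy in a list. The sketch below assumes you already have a list of proxy URLs (the addresses shown are placeholders from a documentation-only IP range) and formats the choice as Chrome’s --proxy-server launch argument. proxyArgument is a hypothetical helper name.

```php
<?php
// Hypothetical helper: pick a proxy round-robin by session number and
// format it as Chrome's --proxy-server launch argument.
function proxyArgument(array $proxies, int $sessionIndex): string
{
    if ($proxies === []) {
        throw new InvalidArgumentException('At least one proxy is required.');
    }
    $proxies = array_values($proxies); // ensure 0-based keys
    $proxy = $proxies[$sessionIndex % count($proxies)];

    return '--proxy-server=' . $proxy;
}

// Placeholder addresses; substitute proxies you are allowed to use.
$proxies = ['http://203.0.113.10:8080', 'http://203.0.113.11:8080'];

// Show which proxy each of the first three sessions would use.
for ($i = 0; $i < 3; $i++) {
    echo proxyArgument($proxies, $i), "\n";
}
```

The returned string can be passed to Chrome as a launch argument when the session is created, in the same way as a user-agent argument.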

Conclusion

Scraping Google Maps data using Selenium in PHP is a powerful way to gather business information, reviews, and location details for various purposes. By following the steps in this guide, you can set up Selenium, navigate Google Maps, extract business details, and handle pagination effectively.

However, always be mindful of Google’s terms of service and ensure that your scraping activities comply with legal and ethical guidelines.
