Sign Up for Abstract Web Scraping API
If you are not familiar with Abstract Web Scraping API, sign up for your free Abstract account to get access to all the APIs. Once logged in, you can access the Web Scraping API from the dashboard.
You can access your API key from the Web Scraping API console.
You can make a test request from the console to scrape the URL provided by default. You should see a large blob of HTML returned by the API. This confirms that your API key works and that you are ready to integrate the API into a PHP application.
Demo Web Scraping App with PHP
Let’s put this Web Scraping API to some use by building a demo application. PHP is one of the most widely used programming languages for building web applications, and Laravel is a popular PHP web framework. We will leverage both of these technologies to show you how to build a basic web scraping application.
The rest of this post walks through creating a Laravel-based web app for web scraping in a few steps. First, here are the prerequisites for setting up the right development environment for this application.
Prerequisites
- PHP Runtime: Make sure you have a PHP 8 runtime available in your development environment.
- PHP Toolchain for Laravel: Make sure you also have the Composer package manager installed.
Step 1: Create a New Laravel Project
Open a terminal and run the composer create-project command to create a new Laravel project named php_abstractapi: composer create-project laravel/laravel php_abstractapi
This will create a directory named php_abstractapi under the present working directory where the command is executed. This is the project directory of this demo app containing all the boilerplate code and dependencies. Make sure to change to this directory for executing all further commands from the terminal.
Open your favorite IDE and explore the structure of the project directory.
Step 2: Test the Default Laravel App
The empty Laravel project can be tested by launching it from the terminal with the php artisan serve command.
This will start a development web server that hosts the default Laravel app at http://127.0.0.1:8000. You can check out the default landing page for this app in the browser.
Step 3: Add the API Credentials for Demo App
Open the environment file for the project and add two new environment variable entries for the Abstract API URL and Abstract API key.
File: .env
Replace the placeholder <YOUR_ABSTRACT_API_KEY> with your Abstract API key and <ABSTRACT_WEB_SCRAPING_API_ENDPOINT> with the API URL. The URL can be found in the live test console within the Abstract API console.
As of now, this URL is set to https://scrape.abstractapi.com/v1
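The two entries described above might look like the fragment below. The variable names ABSTRACT_API_URL and ABSTRACT_API_KEY are assumptions made for illustration; any names work as long as the PHP code reads the same ones.

```
# Abstract Web Scraping API settings (variable names are hypothetical)
ABSTRACT_API_URL=<ABSTRACT_WEB_SCRAPING_API_ENDPOINT>
ABSTRACT_API_KEY=<YOUR_ABSTRACT_API_KEY>
```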
Step 4: Add the HTTP Helper Class for Handling Abstract API
Create a helper class file, AbstractAPI.php, in a Helpers subdirectory under app/Http.
File: app/Http/Helpers/AbstractAPI.php
Add the following PHP code snippet within this file:
This helper class handles the call to Abstract Web Scraping API from the PHP backend.
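A minimal sketch of such a helper is shown here, assuming Laravel's built-in HTTP client (the Http facade) is used for the call. The method name scrape, the environment variable names ABSTRACT_API_URL and ABSTRACT_API_KEY, the 60-second timeout, and the query parameter names api_key and url are assumptions; check them against the request shown in the Abstract console.

```php
<?php

namespace App\Http\Helpers;

use Illuminate\Http\Client\Response;
use Illuminate\Support\Facades\Http;

class AbstractAPI
{
    /**
     * Ask the Abstract Web Scraping API to fetch the given page.
     *
     * ABSTRACT_API_URL and ABSTRACT_API_KEY are hypothetical .env names.
     * Note: env() returns null once the Laravel config is cached, so a
     * production app would normally read these values through config().
     */
    public static function scrape(string $targetUrl): Response
    {
        // Sends a GET request of the form <endpoint>?api_key=...&url=...
        // (parameter names are assumed, not taken from this article).
        return Http::timeout(60)->get(env('ABSTRACT_API_URL'), [
            'api_key' => env('ABSTRACT_API_KEY'),
            'url'     => $targetUrl,
        ]);
    }
}
```

Keeping the API call in one small class means the controller never needs to know the endpoint or the key.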
Step 5: Add a New Controller Named WebscrapeController
From the terminal, add a new controller named WebscrapeController to the project by running php artisan make:controller WebscrapeController.
This will create a new PHP file:
File: app/Http/Controllers/WebscrapeController.php
Replace the default content of the file with the following code:
This controller defines a custom API endpoint, ‘/requestapi’. The endpoint accepts the URL from the frontend UI and passes it to Abstract Web Scraping API, which scrapes the contents of that URL. While handling the Abstract API call, the controller also catches ConnectionException so that invalid URLs do not crash the request.
The controller also returns the home page view for the UI, which is named ‘webscrape’.
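The controller described above could be sketched roughly as follows. The method names index and requestApi, the JSON response shape, the 502 status code, and the AbstractAPI::scrape() call on the Step 4 helper are all assumptions made for illustration, not the article's exact code.

```php
<?php

namespace App\Http\Controllers;

use App\Http\Helpers\AbstractAPI;
use Illuminate\Http\Client\ConnectionException;
use Illuminate\Http\Request;

class WebscrapeController extends Controller
{
    // Home page: renders the 'webscrape' Blade view.
    public function index()
    {
        return view('webscrape');
    }

    // Handles '/requestapi': forwards the submitted URL to the scraping API.
    public function requestApi(Request $request)
    {
        // Reject missing or malformed URLs before calling the API.
        $validated = $request->validate(['url' => 'required|url']);

        try {
            // scrape() is a hypothetical method name on the Step 4 helper.
            $response = AbstractAPI::scrape($validated['url']);
        } catch (ConnectionException $e) {
            // Thrown by Laravel's HTTP client when the connection fails.
            return response()->json(['error' => 'The scraping request failed.'], 502);
        }

        return response()->json([
            'status'  => $response->status(),
            'content' => $response->body(),
        ]);
    }
}
```

Returning JSON keeps the frontend JavaScript simple: it only has to read the content field or show the error message.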
Step 6: Update the App Routes
The app has two routes: ‘/’ displays the home page of the frontend UI, and ‘/requestapi’ triggers the scraping request.
You must register these routes for the demo app in Laravel. To do this, replace the content of the routes definition file.
File: routes/web.php
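A sketch of the two route registrations, assuming the controller exposes methods named index and requestApi (hypothetical names) and that the frontend calls ‘/requestapi’ with a GET request. If the form posts instead, Route::post would be used and the request would need Laravel's CSRF token.

```php
<?php

use App\Http\Controllers\WebscrapeController;
use Illuminate\Support\Facades\Route;

// Home page with the URL form.
Route::get('/', [WebscrapeController::class, 'index']);

// Endpoint called by the frontend JavaScript to trigger a scrape.
Route::get('/requestapi', [WebscrapeController::class, 'requestApi']);
```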
Step 7: Create the HTML and JavaScript for the Demo App UI
At this point, all the backend PHP logic is built for the demo app. Now the last thing is the graphical user interface (UI) which is an HTML page.
For this, create a new view file for the Laravel app under the resources/views subdirectory.
File: resources/views/webscrape.blade.php
Add the following content inside this view:
This is Bootstrap-based HTML for a web form that lets the user enter a URL and submit it. The JavaScript code links the form to the ‘/requestapi’ endpoint of the PHP backend and sends the web scraping request with the URL.
Step 8: Add Bootstrap CSS to the Demo App
To ensure that Bootstrap CSS styles are applied to the frontend UI, download bootstrap.min.css from the link below:
https://getbootstrap.com/docs/5.0/getting-started/download/
Create a sub-directory ‘css’ within the public sub-directory of the project and copy the downloaded bootstrap.min.css into it.
Alternatively, you can add a link to the Bootstrap CDN source in the HTML file header.
With this step, we are done with all the code development for this demo app. Make sure to save all the files before proceeding with the next steps.
Step 9: Relaunch the Laravel Server
Relaunch the Laravel development server that you ran in Step 2 to test the default Laravel app.
You should now see the demo app UI in the browser, matching the view created in Step 7.
Step 10: Test the Demo App
Now you are ready to test the app.
Enter any website URL in the form and click submit. The UI sends the request to the PHP backend and displays a loading icon while waiting for the response.
Behind the scenes, the Laravel framework calls WebscrapeController to get the scraped webpage content from Abstract Web Scraping API and display it in the frontend.
Here is what the scraped content looks like for example.com.
That’s it!
If you have followed the steps this far, pat yourself on the back. You have successfully built and tested a PHP demo app for web scraping.
As you can see, all the heavy lifting of scraping the content was handled by the Abstract Web Scraping API, while you focused on building the UI and the backend logic for handling user requests.
FAQs
How To Scrape Data From Websites Using PHP?
As a programming language suitable for building web applications, PHP is capable of web scraping. You can use a built-in PHP library to run the scraping tasks within the PHP runtime, or leverage an external service. For large-scale scraping projects, it is recommended to use a third-party API. The Abstract Web Scraping API offers a full-fledged, secure and scalable solution for web scraping. It is easy to integrate this API within a PHP application. This API also offers a free tier for basic scraping chores.
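As a rough illustration of the built-in route, the sketch below fetches a page with PHP's cURL extension and pulls out the title and headings with DOMDocument. It assumes the curl and dom extensions are enabled; the function names fetchHtml and extractTitleAndHeadings are made up for this example.

```php
<?php
declare(strict_types=1);

// Download a page with the cURL extension (not called in the demo below).
function fetchHtml(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_TIMEOUT        => 30,
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }
    return $body;
}

// Parse HTML and return the page title plus all h1/h2 headings.
function extractTitleAndHeadings(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $headings = [];
    foreach ($xpath->query('//h1 | //h2') as $node) {
        $headings[] = trim($node->textContent);
    }

    return [
        'title'    => trim((string) $xpath->evaluate('string(//title)')),
        'headings' => $headings,
    ];
}

// Demo on a static string; swap in fetchHtml('https://example.com') for a live page.
$sample = '<html><head><title>Example Domain</title></head>'
        . '<body><h1>Example Domain</h1><p>Some text.</p></body></html>';
$result = extractTitleAndHeadings($sample);
print_r($result);
```

This approach works for simple pages, but it runs from your own IP address and does nothing about blocking, which is where an external API becomes useful.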
How To Crawl A Website in PHP?
There are many ways to crawl a website. You can write the business logic in PHP using the built-in cURL library to scrape the homepage of a website and then parse all the internal links to crawl additional pages. Alternatively, you can use an API for larger websites, which might block repeated scrape requests from the same IP address. With Abstract Web Scraping API, you can undertake large-scale website crawling tasks because the API spreads the scraping requests across a large pool of IP addresses. Moreover, it is easy to integrate this API within PHP using cURL or other HTTP client libraries.
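The parse-then-follow idea can be sketched as below. The function names extractInternalLinks and crawl are hypothetical, and the fetcher is passed in as a callable so that either cURL or a scraping API could supply the HTML. As a simplification, only absolute and root-relative links are followed, and URLs are not normalized.

```php
<?php
declare(strict_types=1);

// Return the unique links in $html that point to the same host as $baseUrl.
function extractInternalLinks(string $html, string $baseUrl): array
{
    if (trim($html) === '') {
        return [];  // DOMDocument::loadHTML() rejects empty input on PHP 8
    }
    $base   = parse_url($baseUrl);
    $origin = $base['scheme'] . '://' . $base['host'];

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        if ($href === '' || $href[0] === '#') {
            continue;  // skip empty links and in-page anchors
        }
        if (str_starts_with($href, '/') && !str_starts_with($href, '//')) {
            $href = $origin . $href;  // resolve root-relative links
        }
        if (parse_url($href, PHP_URL_HOST) === $base['host']) {
            $links[$href] = true;  // array keys de-duplicate the links
        }
    }
    return array_keys($links);
}

// Breadth-first crawl starting at $startUrl, visiting at most $maxPages pages.
function crawl(string $startUrl, callable $fetch, int $maxPages = 10): array
{
    $queue   = [$startUrl];
    $visited = [];
    while ($queue !== [] && count($visited) < $maxPages) {
        $url = array_shift($queue);
        if (isset($visited[$url])) {
            continue;
        }
        $visited[$url] = true;
        foreach (extractInternalLinks($fetch($url), $startUrl) as $link) {
            if (!isset($visited[$link])) {
                $queue[] = $link;
            }
        }
    }
    return array_keys($visited);
}

// Demo with an in-memory "site" instead of real HTTP requests.
$pages = [
    'https://example.com'   => '<body><a href="/a">A</a><a href="https://other.org/">Out</a></body>',
    'https://example.com/a' => '<body><a href="/b">B</a><a href="https://example.com">Home</a></body>',
    'https://example.com/b' => '<body><p>No links here.</p></body>',
];
$visited = crawl('https://example.com', fn (string $url): string => $pages[$url] ?? '');
print_r($visited);
```

In a real crawl, the callable would fetch each page over HTTP, and the page limit keeps the crawl from running away on large sites.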
Which Tool Is Best for Web Scraping?
There are many tools available for performing web scraping. However, if you want to have complete control over it, it is better to write your own web scraping tool. You can easily build a demo web scraping web application using PHP and Abstract API. PHP handles the web app interactions and accepts web scraping requests, while the Abstract Web Scraping API does the heavy lifting of scraping and returning the scraped content. Integrating the API within a PHP application is easy with a plethora of PHP HTTP client libraries, such as cURL. The Abstract Web Scraping API offers a free tier of 1000 requests per month to try the API.