If you’re managing the content for a website, it’s a good practice to keep a content inventory to keep track of your existing content and identify content gaps.
When creating your content inventory, it is incredibly useful if you can automatically fetch the page titles and descriptions from the URL.
Luckily, Google Sheets offers numerous functions to make data manipulation and analysis easier.
In this guide, we’ll walk through a simple method to extract page titles and descriptions from URLs in Google Sheets using a combination of functions.
Or alternatively, simply copy my template with the formula and add your URLs in the first column.
How to Extract Page Titles in Google Sheets
To extract page titles in Google Sheets, use the IMPORTXML function to fetch the title element from a specified URL.
Step 1: Create a Google Sheet with the full list of URLs
Create a new or open an existing Google Sheet where you have a list of URLs from which you want to extract page titles. Make sure the URLs are in a column, such as column A, for easy reference. Don’t forget to add a header row with title.
If you don’t have all the URLs, you can either crawl your website using ScreamingFrog or extract URLs from your sitemap.
Step 2: Use the IMPORTXML Function
IMPORTXML allows you to import data from structured data sources such as XML and HTML. We can use this function to extract page titles from URLs.
In a new column next to your list of URLs, enter the following formula:
=IMPORTXML(A2, "//title")
Replace “A2” with the cell reference containing the URL you want to extract the title from. This formula fetches the title of the webpage specified by the URL.
Step 3: Drag down the formula
Once you’ve entered the formula in the first cell, you can drag the fill handle (the small square at the bottom-right corner of the cell) down to apply the formula to the rest of the cells in the column. Google Sheets will automatically adjust the cell references for each row.
Step 4: Clean Up the Data (Optional)
After extracting the page titles, take some time to review the results and adjust your formulas as needed.
For example, it is common that your page title includes a separator and your website name. This information is usually irrelevant for analysis, so you can clean it up by using additional functions such as SUBSTITUTE, TRIM, or REGEXEXTRACT to remove unwanted characters or extract specific parts of the title.
=SUBSTITUTE(B2, " - SiteName", "")
This formula would replace ” – SiteName” with an empty string, effectively removing it from the title.
How to Extract Meta Descriptions in Google Sheets
To extract meta descriptions, use the IMPORTXML function in Google Sheets to retrieve the content attribute of the meta tag with the name attribute set to ‘description’ from specified URLs.
Step 1: Prepare a Google Sheet with all URLs
Open a new or existing Google Sheet where you have a list of URLs containing the meta descriptions you want to extract. Ensure that the URLs are organized in a column, such as column A.
Step 2: Use the IMPORTXML Function
Similar to extracting page titles, use the IMPORTXML function to fetch meta descriptions from webpages. In a new column adjacent to your list of URLs, enter the following formula:
=IMPORTXML(A2, "//meta[@name='description']/@content")
Replace “A2” with the cell reference containing the URL you want to extract the meta description from. This formula retrieves the content attribute of the meta tag with the name attribute set to ‘description’.
⭐ Pro Tips:
If you encounter issues with loading the formula, consider clearing the cell and type “=” to input the formula again. Or try using as few IMPORTXML as possible in a single sheets/tabs, and use separate tabs for different data instead.
Step 3: Expand the Formula
Drag the fill handle of the cell containing the formula down to apply it to the rest of the cells in the column. Google Sheets will automatically adjust the cell references for each row.
Conclusion
With the IMPORTXML function in Google Sheets, extracting page titles from URLs becomes a straightforward task. However, IMPORTXML is often not the most efficient solution for handling data extraction tasks on large websites.
As the website gets bigger and the data more complex, IMPORTXML may slow down, causing delays and possible problems with the spreadsheet’s performance.
For larger websites with hundreds or thousands of URLs, consider alternative methods, such as ScreamingFrog, to extract website data for SEO analysis.