Convert BeautifulSoup Object to String in Python

BeautifulSoup is a Python library for parsing HTML and XML documents and extracting data from them. A typical workflow uses the requests library to fetch a page's source and BeautifulSoup to parse that source and pull out the data you need. The output below comes from parsing a page this way and printing the prettified soup followed by its type.
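A minimal sketch of this workflow follows. To stay runnable offline, it parses a short hard-coded HTML string instead of downloading a page. That string is an assumed stand-in modeled on the output shown here, with the elided parts left out. With requests, you would pass the text attribute of the response to BeautifulSoup instead. The html.parser backend is also an assumption, since the original parser choice is not shown.

```python
from bs4 import BeautifulSoup

# Assumed stand-in for the downloaded page source. In the workflow described
# above, this string would come from requests.get(url).text instead.
html = """<!DOCTYPE html>
<html>
<body>
<h1>Example Domain</h1>
<p><a href="https://www.iana.org/domains/example">More information...</a></p>
</body>
</html>"""

# Parse the markup with Python's built-in HTML parser (an assumed choice).
soup = BeautifulSoup(html, "html.parser")

# prettify() returns the parsed document as an indented string.
print(soup.prettify())

# The soup itself is a BeautifulSoup object, not a string.
print(type(soup))
```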

Output (truncated):

<!DOCTYPE html>
<html>
 …
   <h1>
	Example Domain
   </h1>
   <p>
	This domain is …
   </p>
   <p>
	<a href="https://www.iana.org/domains/example">
 	More information...
	</a>
   </p>
</html>
<class 'bs4.BeautifulSoup'>

Note: You can inspect the source of a website by visiting it and pressing Ctrl+Shift+I to open the browser's developer tools, or by right-clicking the page and selecting “View Page Source.”

Convert BeautifulSoup Object into Python String

The object printed above is a BeautifulSoup object, not a string. To get the soup as a Python string, pass it to the built-in str function.
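A minimal sketch of the conversion, assuming a small hard-coded HTML snippet and the html.parser backend (both chosen here for illustration, not taken from the original example):

```python
from bs4 import BeautifulSoup

# Illustrative markup; any parsed document behaves the same way.
html = "<html><body><h1>Example Domain</h1></body></html>"
soup = BeautifulSoup(html, "html.parser")

# str() serializes the whole parsed tree back into markup text.
soup_as_string = str(soup)
print(type(soup_as_string))
```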

Output:

<class 'str'>

Note: If you fetch the page with requests, the text attribute of the response already holds the body as a string. For a response object named response, response.text returns that string, so you don’t need BeautifulSoup just to get the raw page source as text.

Converting BeautifulSoup Tag Object to String

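The same approach works for a single element. Searching the soup, for example with soup.find, returns a Tag object rather than a string, as the following output for a time element shows.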
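A sketch that produces a Tag like this one, assuming the time element sits inside a small hard-coded snippet (the wrapping div and the variable names are hypothetical):

```python
from bs4 import BeautifulSoup

# Hypothetical markup containing the <time> element from the output shown here.
html = (
    '<div><time class="time1" datetime="2022-09-28T07:37:15.000Z" '
    'title="Wednesday, September 28, 2022 at 10:37:15 AM">A day ago</time></div>'
)
soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching element as a Tag object (or None).
time_tag = soup.find("time")
print(time_tag)
print(type(time_tag))
```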

Output:

<time class="time1" datetime="2022-09-28T07:37:15.000Z" title="Wednesday, September 28, 2022 at 10:37:15 AM">A day ago</time>
<class 'bs4.element.Tag'>

To convert the Tag into a string, pass it to the str function, just as with the whole soup.
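A short sketch of that cast, again on an assumed hard-coded snippet with hypothetical variable names:

```python
from bs4 import BeautifulSoup

# Hypothetical markup containing a single <time> element.
html = (
    '<time class="time1" datetime="2022-09-28T07:37:15.000Z" '
    'title="Wednesday, September 28, 2022 at 10:37:15 AM">A day ago</time>'
)
time_tag = BeautifulSoup(html, "html.parser").find("time")

# str() serializes the tag, including its attributes and children, as markup.
tag_as_string = str(time_tag)
print(tag_as_string)
print(type(tag_as_string))
```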

Output:

<time class="time1" datetime="2022-09-28T07:37:15.000Z" title="Wednesday, September 28, 2022 at 10:37:15 AM">A day ago</time>
<class 'str'>

To get only the text inside the tag as a string, without the surrounding markup:
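One way to do this is the get_text() method, sketched here on an assumed snippet. The original example may have used a different accessor, such as the text or string attribute.

```python
from bs4 import BeautifulSoup

# Hypothetical markup containing a single <time> element.
html = (
    '<time class="time1" datetime="2022-09-28T07:37:15.000Z" '
    'title="Wednesday, September 28, 2022 at 10:37:15 AM">A day ago</time>'
)
time_tag = BeautifulSoup(html, "html.parser").find("time")

# get_text() joins all text nodes inside the tag into one plain str.
inner_text = time_tag.get_text()
print(inner_text)
```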

Output:

A day ago

You can also call soup.find(<tag>).get(<attribute>) to get the value of <attribute> on <tag>. For most attributes the result is a string, but multi-valued attributes such as class are returned as a list.
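A sketch of the attribute lookup, using the title attribute of an assumed time element (the snippet and variable names are hypothetical):

```python
from bs4 import BeautifulSoup

# Hypothetical markup containing a single <time> element.
html = (
    '<time class="time1" datetime="2022-09-28T07:37:15.000Z" '
    'title="Wednesday, September 28, 2022 at 10:37:15 AM">A day ago</time>'
)
soup = BeautifulSoup(html, "html.parser")

# get() returns the attribute value, or None if the attribute is missing.
title_value = soup.find("time").get("title")
print(title_value)

# Multi-valued attributes such as class come back as a list of strings.
class_value = soup.find("time").get("class")
```

Because get() returns None for a missing attribute instead of raising an error, it is a convenient choice when an attribute may not be present on every element.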

Output:

Wednesday, September 28, 2022 at 10:37:15 AM