How To Convert HTML Code To Plain Text With PHP

HTML is of course a very useful language, but it can be problematic when a script needs to work with the text inside it. This article explores how to convert HTML code into plain text held in a PHP variable.

Let’s see the example HTML that will be parsed into plain text:

HTML

<body>
        <p>
            This is a body of text encapsulated in <b>HTML</b>.
            <b>Let's parse it</b>
        </p>
</body>

Now let’s see it inside a PHP variable.

PHP

$html = "
    <body>
        <p>
            This is a body of text encapsulated in <b>HTML</b>.
            <b>Let's parse it</b>
        </p>
    </body>";

Finally, before we start, here is the output we require, as plain text.

This is a body of text encapsulated in HTML. Let’s parse it

Method One

PHP comes with a brilliant one-liner for this straight out of the box: strip_tags(). This easy-to-use function takes the following two parameters, the second of which is optional.

  1. The HTML string.
  2. Allowable tags, that is, tags that you do not want stripped out of the HTML string (pretty handy actually).

See the following examples of strip_tags() in use. The first uses only the HTML string parameter and the second adds the allowable tags parameter.

Example 1

PHP

$html = "
    <body>
        <p>
            This is a body of text encapsulated in <b>HTML</b>.
            <b>Let's parse it!</b>
        </p>
    </body>";

echo strip_tags($html);

Output

This is a body of text encapsulated in HTML. Let's parse it!
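
One detail worth noting: strip_tags() removes only the tags, so the line breaks and indentation from the original string remain in the returned value (a browser collapses them when rendering). Here is a minimal sketch of tidying that up, assuming a hypothetical helper named htmlToPlainText() that is not part of PHP itself.

```php
<?php
$html = "
    <body>
        <p>
            This is a body of text encapsulated in <b>HTML</b>.
            <b>Let's parse it!</b>
        </p>
    </body>";

// Hypothetical helper (illustration only): strip the tags, then collapse
// every run of whitespace into a single space and trim both ends.
function htmlToPlainText(string $html): string
{
    $text = strip_tags($html);
    $text = preg_replace('/\s+/', ' ', $text);

    return trim($text);
}

echo htmlToPlainText($html);
```

With the sample string above this yields the single line shown in the output, with no leading or trailing whitespace.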

Example 2

PHP

$html = "
    <body>
        <p>
            This is a body of text encapsulated in <b>HTML</b>.
            <b>Let's parse it!</b>
        </p>
    </body>";

echo strip_tags($html, '<b>'); // Don't strip <b> tags

Output

This is a body of text encapsulated in <b>HTML</b>.
<b>Let's parse it!</b>
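
The allowable tags parameter can also list more than one tag. The sketch below uses a made-up sample string to show this. Attributes on the tags that are kept are left untouched. The array form of the second argument assumes PHP 7.4 or later.

```php
<?php
// Made-up sample markup for illustration.
$html = '<p>Read the <a href="/docs">docs</a> for <b>more</b> <i>detail</i>.</p>';

// Keep <b> and <a>; every other tag (here <p> and <i>) is removed.
$kept = strip_tags($html, '<b><a>');

// Assumes PHP 7.4+: the same list written as an array of tag names.
$keptFromArray = strip_tags($html, ['b', 'a']);

echo $kept;
// Read the <a href="/docs">docs</a> for <b>more</b> detail.
```

Because the attributes of kept tags survive, this is best suited to markup you already trust.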

In both examples, strip_tags() does the whole job in a single call. Check out the documentation on the function below.

strip_tags() documentation

Method Two

There is also a class-based library dedicated purely to converting HTML markup into plain text. It’s called html2text, which is a very descriptive package name to say the least. It is free to use under a public license and can be downloaded from its GitHub repository.

Example 1

Firstly, you will need to install this package from its repository page. The quickest way is to use Composer to download and install the package with the following command:

composer require html2text/html2text

If you haven’t got Composer yet and would like to start using it, then follow this tutorial on using Composer.

The class can then be instantiated within your PHP script and used in an object-oriented style. Here is an example of it being used:

PHP

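// Load Composer's autoloader so the Html2Text class can be found.
// Assumes the default vendor/ directory sits next to this script.
require_once __DIR__ . '/vendor/autoload.php';

$html = "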
    <body>
        <p>
            This is a body of text encapsulated in <b>HTML</b>.
        </p>
    </body>";


$html2TextConverter = new \Html2Text\Html2Text($html);

echo $html2TextConverter->getText();

Output

This is a body of text encapsulated in HTML.

Example 2

If you’re not a fan of Composer and like to do things the old-school way, you can simply download the class itself from the GitHub page. The Html2Text.php file is located in the ‘src’ directory of the repository. After downloading, place it in your classes folder and require it as follows.

PHP

require_once('Classes/Html2Text.php');
use Html2Text\Html2Text;

$html = "
    <body>
        <p>
            This is a body of text encapsulated in <b>HTML</b>.
        </p>
    </body>";

$html2TextConverter = new Html2Text($html);

echo $html2TextConverter->getText();

Output

This is a body of text encapsulated in HTML.
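
The two methods can also be combined. The following is only a sketch: toPlainText() is a hypothetical helper, not part of html2text, that uses the library when its class has been loaded and otherwise falls back to strip_tags().

```php
<?php
// Hypothetical helper (illustration only): prefer Html2Text when the class
// is available, otherwise fall back to PHP's built-in strip_tags().
function toPlainText(string $html): string
{
    if (class_exists(\Html2Text\Html2Text::class)) {
        $converter = new \Html2Text\Html2Text($html);

        return trim($converter->getText());
    }

    // Fallback: remove the tags, then collapse runs of whitespace.
    return trim(preg_replace('/\s+/', ' ', strip_tags($html)));
}

echo toPlainText('<p>This is a body of text encapsulated in <b>HTML</b>.</p>');
```

The class_exists() check only succeeds if the Composer autoloader or the require_once line from the examples above has already run. The two branches may format their output differently for more complex markup.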

Summary

You may be surprised how easy it is to convert HTML to tag-less text. It’s super-simple and can be done with clean and understandable code. This is another one of the joys of using PHP. In some other languages, this simple task can be unnecessarily time-consuming. Happy Coding!
