Working with PDF-Formatted Periodicals

Newspapers and magazines are important resources for foreign language teachers, allowing them to bring into the classroom authentic materials which are current and interesting both to the students and to the populations which speak the target language. A frequent problem for people who want to use newspapers and magazines as teaching materials is that they are often inconvenient and expensive to purchase. Many print publications around the world now maintain web sites which make all or most of the content of the print editions available for free to anyone with access to a web browser. This makes cost and availability less of a problem, provided that the teacher has the skills to manipulate the web content and distribute it to his or her students in a useful format.

While most web content, including that of sites associated with print periodicals, is in HTML (web page format), many newspapers and some magazines are available on the Internet as PDF files. This format has some distinct advantages for the teacher who wants to use the content to create printed instructional materials. This page will compare some of the features of the PDF format with those of both HTML (web page) format and popular image formats (such as JPEG) and explain how to "clip" an article from a PDF file to create language teaching materials.

File Formats: PDF, HTML, and Images

PDF format. The Portable Document Format, or PDF, was created by Adobe, makers of the popular software products Adobe Acrobat and Adobe Photoshop. More information about the PDF format can be found on the Adobe web site. A PDF file is a relatively compact file which can contain both text and images. A PDF file will look the same on the screen and in print, regardless of what kind of computer or printer it is viewed on and regardless of what software package was originally used to create it. In this respect, it is like an image file (JPEG, GIF, etc.). But unlike an image file, which will appear blurry if enlarged beyond its original size, a PDF file usually contains text and other content which remains sharp and clear regardless of how much it is enlarged.

Periodicals available on the web in PDF format are generally broken down so that each page is contained in a separate PDF file. The PDF version is generally an exact copy (sometimes with advertising omitted) of the print version with respect to both content and layout. Use of PDF versions of periodicals can therefore give the learner the experience of dealing with authentic printed magazines and newspapers.

PDF files can be read on Windows, OS X, and Linux using the free Adobe Acrobat Reader program and web browser plug-in, available from Adobe, as well as with some free third-party software. Chances are that Acrobat Reader is already installed on your computer.

Web pages in HTML format. HTML (Hypertext Markup Language) and derivatives like XHTML are the formats used for most web pages, such as this one. HTML pages are rendered directly by the browser rather than via a plug-in, and can contain text, images, and form elements (such as checkboxes, buttons, and text input fields). The way in which HTML treats text has a few features which are useful to the teacher. For one, the text is rendered in fonts whose sizes can usually be enlarged or made smaller by the reader by changing the browser's settings. Another feature is that blocks of text can usually be selected with a mouse and copied into a word processor for reformatting. Many print periodicals maintain sites which post content reformatted as web pages in HTML format. While this reformatted content is ideal for reading directly from a computer screen, using articles reformatted in this way has certain disadvantages for the teacher who wants to print out the material for use in the classroom:

  • Essential and non-essential formatting elements. It is usually difficult or impossible to print out web pages in a way which preserves essential formatting elements, such as the placement of images related to the text, without also retaining elements which are annoying in the print medium, such as space wasted by banners and sidebars and inappropriate font sizes.
  • Inappropriate elements. Web page elements which simply do not translate into the non-interactive medium of print either annoy the reader, as in the case of hyperlinks, or waste valuable space, as in the case of buttons, checkboxes, input fields, and other form elements.
  • No pagination. HTML is not a paginated format. A "page" of Web content is intended to be viewed as one continuous scroll, regardless of the length of the content, and browsers offer little or no control over how these scrolls are broken up into multiple physical pages for printing.
  • Web look-and-feel. Print-outs of HTML formatted material usually have the feel of a web page, rather than of a magazine or some other printed format. Articles printed out from a browser do not give the learner the feeling of having worked with a real newspaper or magazine. Conversely, articles culled from web pages and reformatted in a word processor give the student the feeling of working with specially prepared instructional materials rather than with authentic materials.

These disadvantages can be summarized by saying that HTML documents are designed to look like web pages and are intended to be experienced using a web browser.
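For teachers who do choose to reformat a web article in a word processor, the clean-up described above can be partly automated. The following is a minimal sketch, not a complete solution: it uses Python's standard `html.parser` module to keep the readable text of a page while dropping form elements and hyperlink markup. The class name `ArticleTextExtractor`, the function `extract_text`, and the lists of tags are illustrative assumptions and would need adjusting for a real site.

```python
from html.parser import HTMLParser

# Elements whose content is meaningless on paper (an assumed list; adjust as needed).
SKIPPED_TAGS = {"script", "style", "form", "button", "select", "textarea"}
# Elements that end a block of text, so a line break is emitted when they close.
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "li"}


class ArticleTextExtractor(HTMLParser):
    """Collects readable text, dropping form elements and link markup."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0  # greater than 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1
        elif tag == "br" and self.skip_depth == 0:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS and self.skip_depth == 0:
            self.parts.append("\n")

    def handle_data(self, data):
        # Link text is kept because <a> is not skipped; only its markup disappears.
        if self.skip_depth == 0:
            self.parts.append(data)


def extract_text(html):
    """Return the visible article text, one block per line."""
    parser = ArticleTextExtractor()
    parser.feed(html)
    parser.close()
    lines = [" ".join(line.split()) for line in "".join(parser.parts).splitlines()]
    return "\n".join(line for line in lines if line)


sample = (
    '<h1>Title</h1><p>Read <a href="x.html">more</a> here.</p>'
    '<form><input type="text"><button>Go</button></form><p>End</p>'
)
print(extract_text(sample))
```

Note that the result is plain text only: images and layout are lost, which is exactly the trade-off between reformatted and authentic materials discussed above.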

Pages saved as images. There are a few web sites which offer archived versions of printed magazines in image formats such as JPEG. These are generally not of much use to the language teacher. The reason is that to obtain a reasonably small file size, the publisher must save the image at a relatively low resolution. This means that if you try to enlarge the image to make the print easier to read for your students, the image becomes blurry, and hence just as hard to read as it was at the original size. Frequently, such archives are created by breaking down the printed page into several image strips rather than saving it as a single image. While such a page will display correctly in your web browser, it is difficult and impractical to manipulate in any other computer program, because each of the strips must be saved separately and then reassembled in the other program.
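The effect of enlarging a low-resolution image can be shown with a little arithmetic. The sketch below is only an illustration: the function name `effective_dpi` and the 600-pixel image width are assumptions chosen for the example, not figures taken from any particular archive.

```python
def effective_dpi(pixel_width, printed_width_inches):
    """Pixels available per printed inch at the chosen print width."""
    return pixel_width / printed_width_inches


# A hypothetical page image that is 600 pixels wide.
original = effective_dpi(600, 4.0)  # printed 4 inches wide
enlarged = effective_dpi(600, 8.0)  # same image stretched to 8 inches wide

print(original, enlarged)
```

Doubling the printed width halves the number of pixels per inch, because an image file holds a fixed number of pixels. This is why enlarging such an image makes the print bigger but no easier to read.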

PDF Tips

The following tips assume that you have installed Adobe Acrobat Reader and can open PDF files. In addition, you will need a graphics program to manipulate the image you capture from Acrobat Reader. Windows XP has the Paint utility, located under Accessories. If you would like a more sophisticated graphics program, or you are not using Windows XP, you can download the free, open source GIMP graphics program from GIMP.org. Versions of the GIMP are available for Windows, OS X, Linux, and other platforms.

Clipping an article from a PDF page. The following steps describe how to clip an article from a PDF file to produce an image which you can either print out or embed in a word processor document to enrich with additional material. Follow these steps:

1. Open the PDF file in Acrobat Reader and find the article you want to clip. (If Acrobat Reader is properly configured as a browser plug-in, you should be able to open the file by just clicking on the link to the PDF file in your browser.) Here's what a PDF document will look like opened in a browser using the Acrobat Reader plug-in:

PDF file opened in browser using plug-in

2. Zoom in close so that the text of the article is rather large. The size and resolution of the image we are going to make depend on the size and resolution of the view on the screen, so zoom in at this point until the print is large and clear:
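How far to zoom in can be estimated in advance. The sketch below rests on an assumption that may not hold on every computer: that at 100% zoom the viewer shows one inch of the page as 72 screen pixels (a PDF point is 1/72 inch, but the viewer's actual screen resolution setting can differ). The function name `required_zoom_percent` and the sample figures are hypothetical.

```python
def required_zoom_percent(article_width_in, printed_width_in, target_dpi, base_ppi=72.0):
    """Estimate the zoom level needed before taking the snapshot.

    article_width_in: width of the article on the original page, in inches
    printed_width_in: width at which the clipping will be printed, in inches
    target_dpi:       desired pixels per printed inch
    base_ppi:         assumed screen pixels per page inch at 100% zoom
    """
    pixels_needed = printed_width_in * target_dpi
    pixels_at_100_percent = article_width_in * base_ppi
    return 100.0 * pixels_needed / pixels_at_100_percent


# A 3-inch-wide column to be printed 6 inches wide at 150 pixels per inch.
print(round(required_zoom_percent(3.0, 6.0, 150), 1))
```

If the calculated zoom makes the article wider than the window, the selection can still be made by letting the viewer scroll while dragging, as described in the next step.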

zooming in on the PDF file

3. Select the Acrobat Reader snapshot tool:

snapshot tool

This will allow you to create a rectangular selection of the PDF page on your screen. Click and hold down the mouse button at the top left-hand corner of the area you want to clip. Drag your mouse to the lower right-hand corner of the area you want to clip. If that area is below the bottom of the window, drag the cursor below the bottom of the window and Acrobat Reader will slowly scroll the document in that direction. When you have reached the lower right-hand corner of the area you want to clip, let go of the mouse button. When you do this, a dialog box will appear telling you that the area has been copied to the clipboard:

snapshot tool

4. Now open your graphics program to paste the image from the clipboard.

  • If you're using Windows XP Paint, paste the image into a blank document and save in PNG or JPEG format.
  • If you're using the GIMP, from the menu select Edit > Acquire > From Clipboard. Save the image in PNG or JPEG format.

At this point you can either manipulate (scale, trim, rotate, etc.) and print the image from within your graphics program or import the image into a word processor file to add additional material such as vocabulary notes, bibliographic information, and questions.


This work is licensed under a Creative Commons License.

  • You may use and modify the material for any non-commercial purpose.
  • You must credit the UCLA Language Materials Project as the source.
  • If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.